From patchwork Mon Nov 27 18:43:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44832 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:bca6:b0:181:818d:5e7f with SMTP id fx38csp3615231pzb; Mon, 27 Nov 2023 10:44:13 -0800 (PST) X-Google-Smtp-Source: AGHT+IHA/svhmCuZbbW1obPG18OYlis6MF5/hY0Vipfli71oT6t9VBxBYwjdhjCpPMDNSBX/JBxv X-Received: by 2002:aa7:d658:0:b0:54b:1ca8:851c with SMTP id v24-20020aa7d658000000b0054b1ca8851cmr5577464edr.6.1701110653365; Mon, 27 Nov 2023 10:44:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701110653; cv=none; d=google.com; s=arc-20160816; b=ozJmKr+8RQkkf7VExhAv95gjNPRZVY7KxGIzQURp5E4cAu+8TnE2IsT0b499YEPwdc X9UgkNZusHZWw3Ic1zlrFg7rj2QdpoBGhw7gpOKGrKdaut9HDeQ9E3sC3WFeBNPVBvmg jlsJ9QhzqmJ8olreS+qpmbzeRyuwBHUE0QWopz+0HiMYbBvZ3cEmLjepJrCK1uNYvJzE Zo+W6//eS6x/UAGbXEijG3KKt2F1+FaHUtm7HN4P8DNbTYpaiC/Xxw9my/7B/s1KOc0B oXKbX6yPShY19n15NSTetqv4KSb5c3kv+KnSskImkicwLvwZ23M45Cnd69TvF6Ycx1LO M6Pg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=TbAfybeIuOEpocv7SXEf3Hfh55Im6t/cmmQXoSlarLY=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=KDVlpqWWn07C5Flv9EupEjV6VIzMMh4MT5PqeGFWRf7b1vk9qBWzEBOjODHlrIbk3s cAvqQWMvcDTDJ95yf6znyhd3egRDv+hv8yjbR20OsA7HCZ8UvJl7jtgp8UxJc2jgfZsZ 56oUZpx+ywKN+yLSLFp9AONtUPLXbbRkZI2PlDkyMHyrOvaboZ6Ms0Uv2bSwQypUDCXJ EWYaMukIn2kHMvMZmU5r5BG+eWiSrw0ZEkqaO78nM7vvRlPhKy6vmRTrH4kjXSQY4GqP C4LuPghLr1yjcnCs12TzfKqd47PcuxeTDUHp8eWcRqjDQfOhmMl9QRY4kRn2n43A6CWj DyWQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=bCZF4YUp; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id d18-20020a50f692000000b0054b7e310e19si951032edn.531.2023.11.27.10.44.12; Mon, 27 Nov 2023 10:44:13 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=bCZF4YUp; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 01C4B68CCA6; Mon, 27 Nov 2023 20:44:10 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f173.google.com (mail-pf1-f173.google.com [209.85.210.173]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 8386468CC46 for ; Mon, 27 Nov 2023 20:43:59 +0200 (EET) Received: by mail-pf1-f173.google.com with SMTP id d2e1a72fcca58-6cd89f2af9dso1499403b3a.1 for ; Mon, 27 Nov 2023 10:43:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1701110637; x=1701715437; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=FTe1ul1GT2p08ddZvRB1XsnhiVj6bTq3++BHAVJhnS0=; b=bCZF4YUpMnX+JK4pw623YQjMfFCbg9VVvIQ13+xuaiSCu7n4yLaehFUHF493LMdRxV Aupb6XkeojrqRlPUhYxRaOx4CojJ4EHHzlhEpiRp3MGTqqAsEqffix8U8WymfgHTfLxF XH0/yU8+ETXgxSncp+2101PlTk9DHqYUCnGV2DoBK359Kbv9zTF7d6Krr6Vu7FPDMF2U M7UfaNbnpGkIszQMALKjFfsfUzRtULuRzp/itOXGQXlwJdutwBQhrfZIfvsxHHp+YqBF jHSRZ5JFDcjDmwnMl6tgDczNeTyuuBBIZC5cWSFJbibXQQ9VzwcqTcuvsJKcKsJYdnDi /bLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701110637; x=1701715437; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FTe1ul1GT2p08ddZvRB1XsnhiVj6bTq3++BHAVJhnS0=; b=Q8ZWU9UOd/xzwV5tOSiFUVvBkHpGDQ6u1jVaMi+820MyYfGs4lZAQBF6xcA3k3vbsT gzXfBYFV0zAN+vk6VLvdXZ+L9jOVT8AuRbXJLoecvDmykPrrJClbRCg8+Tj8GKnHDDYB gBiOoyznbrx92DtIuO8b9XPFRIumY1vawri1z/sSmWgbSn753YUz3IN8hC7XLupChcgK E1xxamZx7Ji65tCwjS555czhtWhy8UIfMS0WQktVxRDVoOj+SWpeEYBpwBYkAEbwKsg4 kEbI6xn20ydDl6B04J4YFpHv674X6HSzIIYkuLTfdFZ7iGfQ2xc+RF6UpBl/dqiF42Tg gM4Q== X-Gm-Message-State: AOJu0Yz4ZLfRZZYKBXw4gGFcqTSvFUoYlGaaBT0//N6NmFUbUcfgoN8Q ktGB4FChx87AFbtpoObxLFIFz1I2ugU= X-Received: by 2002:a05:6a00:2e05:b0:690:1857:3349 with SMTP id fc5-20020a056a002e0500b0069018573349mr15792935pfb.25.1701110636663; Mon, 27 Nov 2023 10:43:56 -0800 (PST) Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar. [190.225.105.197]) by smtp.gmail.com with ESMTPSA id gx10-20020a056a001e0a00b006c107a9e8f0sm7516720pfb.128.2023.11.27.10.43.55 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Nov 2023 10:43:56 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Mon, 27 Nov 2023 15:43:54 -0300 Message-ID: <20231127184357.3361-1-jamrial@gmail.com> X-Mailer: git-send-email 2.42.1 In-Reply-To: <20231126012858.40388-1-jamrial@gmail.com> References: <20231126012858.40388-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 10/13] avcodec: add an Immersive Audio Model and Formats frame split bsf X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: PkMIDbXBrGZB Signed-off-by: James Almer --- libavcodec/Makefile | 1 + libavcodec/bitstream_filters.c | 1 + libavcodec/iamf_stream_split_bsf.c | 807 +++++++++++++++++++++++++++++ 3 files changed, 809 insertions(+) create mode 100644 libavcodec/iamf_stream_split_bsf.c diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 748806e702..a2c345570e 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -1245,6 +1245,7 @@ OBJS-$(CONFIG_HAPQA_EXTRACT_BSF) += hapqa_extract_bsf.o hap.o OBJS-$(CONFIG_HEVC_METADATA_BSF) += h265_metadata_bsf.o h265_profile_level.o \ h2645data.o OBJS-$(CONFIG_HEVC_MP4TOANNEXB_BSF) += hevc_mp4toannexb_bsf.o +OBJS-$(CONFIG_IAMF_STREAM_SPLIT_BSF) += iamf_stream_split_bsf.o OBJS-$(CONFIG_IMX_DUMP_HEADER_BSF) += imx_dump_header_bsf.o OBJS-$(CONFIG_MEDIA100_TO_MJPEGB_BSF) += media100_to_mjpegb_bsf.o OBJS-$(CONFIG_MJPEG2JPEG_BSF) += mjpeg2jpeg_bsf.o diff --git a/libavcodec/bitstream_filters.c b/libavcodec/bitstream_filters.c index 1e9a676a3d..640b821413 100644 --- a/libavcodec/bitstream_filters.c +++ b/libavcodec/bitstream_filters.c @@ -42,6 +42,7 @@ extern const FFBitStreamFilter ff_h264_redundant_pps_bsf; extern const FFBitStreamFilter ff_hapqa_extract_bsf; extern const FFBitStreamFilter ff_hevc_metadata_bsf; extern const FFBitStreamFilter ff_hevc_mp4toannexb_bsf; +extern const FFBitStreamFilter ff_iamf_stream_split_bsf; extern const FFBitStreamFilter ff_imx_dump_header_bsf; extern const FFBitStreamFilter ff_media100_to_mjpegb_bsf; extern const FFBitStreamFilter ff_mjpeg2jpeg_bsf; diff --git a/libavcodec/iamf_stream_split_bsf.c b/libavcodec/iamf_stream_split_bsf.c new file mode 100644 index 0000000000..28c8101719 --- /dev/null +++ b/libavcodec/iamf_stream_split_bsf.c @@ -0,0 +1,807 @@ +/* + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/dict.h" +#include "libavutil/opt.h" +#include "libavformat/iamf.h" +#include "bsf.h" +#include "bsf_internal.h" +#include "get_bits.h" + +typedef struct ParamDefinition { + AVIAMFParamDefinition *param; + size_t param_size; + int recon_gain_present_bitmask; +} ParamDefinition; + +typedef struct IAMFSplitContext { + AVClass *class; + AVPacket *buffer_pkt; + + ParamDefinition *param_definitions; + unsigned int nb_param_definitions; + + unsigned int *ids; + int nb_ids; + + // AVOptions + int first_index; + + // Packet side data + AVIAMFParamDefinition *mix; + size_t mix_size; + AVIAMFParamDefinition *demix; + size_t demix_size; + AVIAMFParamDefinition *recon; + size_t recon_size; +} IAMFSplitContext; + +static int param_parse(AVBSFContext *ctx, GetBitContext *gb, + unsigned int param_definition_type, + ParamDefinition **out) +{ + IAMFSplitContext *const c = ctx->priv_data; + ParamDefinition *param_definition = NULL; + AVIAMFParamDefinition *param; + unsigned int parameter_id, parameter_rate, param_definition_mode; + unsigned int duration = 0, constant_subblock_duration = 0, num_subblocks = 0; + size_t param_size; + + parameter_id = get_leb(gb); + + for (int i = 0; i < c->nb_param_definitions; i++) + if (c->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &c->param_definitions[i]; + break; + } + + parameter_rate = get_leb(gb); + param_definition_mode = get_bits(gb, 8) >> 7; + + if (param_definition_mode == 0) { + duration = get_leb(gb); + constant_subblock_duration = get_leb(gb); + if (constant_subblock_duration == 0) { + num_subblocks = get_leb(gb); + } else + num_subblocks = duration / constant_subblock_duration; + } + + param = av_iamf_param_definition_alloc(param_definition_type, NULL, num_subblocks, NULL, ¶m_size); + if (!param) + return AVERROR(ENOMEM); + + for (int i = 0; i < num_subblocks; i++) { + if (constant_subblock_duration == 0) + get_leb(gb); // subblock_duration + + switch (param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + skip_bits(gb, 8); // dmixp_mode + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + break; + default: + av_free(param); + return AVERROR_INVALIDDATA; + } + } + + param->parameter_id = parameter_id; + param->parameter_rate = parameter_rate; + param->param_definition_mode = param_definition_mode; + param->duration = duration; + param->constant_subblock_duration = constant_subblock_duration; + param->num_subblocks = num_subblocks; + + if (param_definition) { + if (param_definition->param_size != param_size || memcmp(param_definition->param, param, param_size)) { + av_log(ctx, AV_LOG_ERROR, "Incosistent parameters for parameter_id %u\n", parameter_id); + av_free(param); + return AVERROR_INVALIDDATA; + } + av_freep(¶m); + } else { + param_definition = av_dynarray2_add_nofree((void **)&c->param_definitions, &c->nb_param_definitions, + sizeof(*c->param_definitions), NULL); + if (!param_definition) { + av_free(param); + return AVERROR(ENOMEM); + } + param_definition->param = param; + param_definition->param_size = param_size; + } + if (out) + *out = param_definition; + + return 0; +} + +static int scalable_channel_layout_config(AVBSFContext *ctx, GetBitContext *gb, + ParamDefinition *recon_gain) +{ + int num_layers; + + num_layers = get_bits(gb, 3); + skip_bits(gb, 5); //reserved + + if (num_layers > 6) + return AVERROR_INVALIDDATA; + + for (int i = 0; i < num_layers; i++) { + int output_gain_is_present_flag, recon_gain_is_present; + + skip_bits(gb, 4); // loudspeaker_layout + output_gain_is_present_flag = get_bits1(gb); + recon_gain_is_present = get_bits1(gb); + if (recon_gain) + recon_gain->recon_gain_present_bitmask |= recon_gain_is_present << i; + skip_bits(gb, 2); // reserved + skip_bits(gb, 8); // substream_count + skip_bits(gb, 8); // coupled_substream_count + if (output_gain_is_present_flag) { + skip_bits(gb, 8); // output_gain_flags & reserved + skip_bits(gb, 16); // output_gain + } + } + + return 0; +} + +static int audio_element_obu(AVBSFContext *ctx, uint8_t *buf, unsigned size) +{ + IAMFSplitContext *const c = ctx->priv_data; + GetBitContext gb; + ParamDefinition *recon_gain = NULL; + unsigned audio_element_type; + unsigned num_substreams, num_parameters; + int ret; + + ret = init_get_bits8(&gb, buf, size); + if (ret < 0) + return ret; + + get_leb(&gb); // audio_element_id + audio_element_type = get_bits(&gb, 3); + skip_bits(&gb, 5); // reserved + + get_leb(&gb); // codec_config_id + num_substreams = get_leb(&gb); + for (unsigned i = 0; i < num_substreams; i++) { + unsigned *audio_substream_id = av_dynarray2_add_nofree((void **)&c->ids, &c->nb_ids, + sizeof(*c->ids), NULL); + if (!audio_substream_id) { + return AVERROR(ENOMEM); + } + + *audio_substream_id = get_leb(&gb); + } + + num_parameters = get_leb(&gb); + if (num_parameters && audio_element_type != 0) { + av_log(ctx, AV_LOG_ERROR, "Audio Element parameter count %u is invalid" + " for Scene representations\n", num_parameters); + return AVERROR_INVALIDDATA; + } + + for (int i = 0; i < num_parameters; i++) { + unsigned param_definition_type = get_leb(&gb); + + if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN) + return AVERROR_INVALIDDATA; + else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_DEMIXING) { + ret = param_parse(ctx, &gb, param_definition_type, NULL); + if (ret < 0) + return ret; + skip_bits(&gb, 8); // default_w + } else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + ret = param_parse(ctx, &gb, param_definition_type, &recon_gain); + if (ret < 0) + return ret; + } else { + unsigned param_definition_size = get_leb(&gb); + skip_bits_long(&gb, param_definition_size * 8); + } + } + + if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(ctx, &gb, recon_gain); + if (ret < 0) + return ret; + } + + return 0; +} + +static int label_string(GetBitContext *gb) +{ + int byte; + + do { + byte = get_bits(gb, 8); + } while (byte); + + return 0; +} + +static int mix_presentation_obu(AVBSFContext *ctx, uint8_t *buf, unsigned size) +{ + GetBitContext gb; + unsigned mix_presentation_id, count_label; + unsigned num_submixes, num_elements; + int ret; + + ret = init_get_bits8(&gb, buf, size); + if (ret < 0) + return ret; + + mix_presentation_id = get_leb(&gb); + count_label = get_leb(&gb); + + for (int i = 0; i < count_label; i++) { + ret = label_string(&gb); + if (ret < 0) + return ret; + } + + for (int i = 0; i < count_label; i++) { + ret = label_string(&gb); + if (ret < 0) + return ret; + } + + num_submixes = get_leb(&gb); + for (int i = 0; i < num_submixes; i++) { + unsigned num_layouts; + + num_elements = get_leb(&gb); + + for (int j = 0; j < num_elements; j++) { + unsigned rendering_config_extension_size; + + get_leb(&gb); // audio_element_id + for (int k = 0; k < count_label; k++) { + ret = label_string(&gb); + if (ret < 0) + return ret; + } + + skip_bits(&gb, 8); // headphones_rendering_mode & reserved + rendering_config_extension_size = get_leb(&gb); + skip_bits_long(&gb, rendering_config_extension_size * 8); + + ret = param_parse(ctx, &gb, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL); + if (ret < 0) + return ret; + skip_bits(&gb, 16); // default_mix_gain + } + + ret = param_parse(ctx, &gb, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL); + if (ret < 0) + return ret; + get_bits(&gb, 16); // default_mix_gain + + num_layouts = get_leb(&gb); + for (int j = 0; j < num_layouts; j++) { + int info_type, layout_type; + int byte = get_bits(&gb, 8); + + layout_type = byte >> 6; + if (layout_type < AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS && + layout_type > AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL) { + av_log(ctx, AV_LOG_ERROR, "Invalid Layout type %u in a submix from Mix Presentation %u\n", + layout_type, mix_presentation_id); + return AVERROR_INVALIDDATA; + } + + info_type = get_bits(&gb, 8); + get_bits(&gb, 16); // integrated_loudness + get_bits(&gb, 16); // digital_peak + + if (info_type & 1) + get_bits(&gb, 16); // true_peak + + if (info_type & 2) { + unsigned int num_anchored_loudness = get_bits(&gb, 8); + + for (int k = 0; k < num_anchored_loudness; k++) { + get_bits(&gb, 8); // anchor_element + get_bits(&gb, 16); // anchored_loudness + } + } + + if (info_type & 0xFC) { + unsigned int info_type_size = get_leb(&gb); + skip_bits_long(&gb, info_type_size * 8); + } + } + } + + return 0; +} + +static int find_idx_by_id(AVBSFContext *ctx, unsigned id) +{ + IAMFSplitContext *const c = ctx->priv_data; + + for (int i = 0; i < c->nb_ids; i++) { + unsigned audio_substream_id = c->ids[i]; + + if (audio_substream_id == id) + return i; + } + + av_log(ctx, AV_LOG_ERROR, "Invalid id %d\n", id); + return AVERROR_INVALIDDATA; +} + +static int audio_frame_obu(AVBSFContext *ctx, enum IAMF_OBU_Type type, int *idx, + uint8_t *buf, int *start_pos, unsigned *size, + int id_in_bitstream) +{ + GetBitContext gb; + unsigned audio_substream_id; + int ret; + + ret = init_get_bits8(&gb, buf + *start_pos, *size); + if (ret < 0) + return ret; + + if (id_in_bitstream) { + int pos; + audio_substream_id = get_leb(&gb); + pos = get_bits_count(&gb) / 8; + *start_pos += pos; + *size -= pos; + } else + audio_substream_id = type - IAMF_OBU_IA_AUDIO_FRAME_ID0; + + ret = find_idx_by_id(ctx, audio_substream_id); + if (ret < 0) + return ret; + + *idx = ret; + + return 0; +} + +static const ParamDefinition *get_param_definition(AVBSFContext *ctx, unsigned int parameter_id) +{ + const IAMFSplitContext *const c = ctx->priv_data; + const ParamDefinition *param_definition = NULL; + + for (int i = 0; i < c->nb_param_definitions; i++) + if (c->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &c->param_definitions[i]; + break; + } + + return param_definition; +} + +static int parameter_block_obu(AVBSFContext *ctx, uint8_t *buf, unsigned size) +{ + IAMFSplitContext *const c = ctx->priv_data; + GetBitContext gb; + const ParamDefinition *param_definition; + const AVIAMFParamDefinition *param; + AVIAMFParamDefinition *out_param = NULL; + unsigned int duration, constant_subblock_duration; + unsigned int num_subblocks; + unsigned int parameter_id; + size_t out_param_size; + int ret; + + ret = init_get_bits8(&gb, buf, size); + if (ret < 0) + return ret; + + parameter_id = get_leb(&gb); + + param_definition = get_param_definition(ctx, parameter_id); + if (!param_definition) { + ret = 0; + goto fail; + } + + param = param_definition->param; + if (param->param_definition_mode) { + duration = get_leb(&gb); + constant_subblock_duration = get_leb(&gb); + if (constant_subblock_duration == 0) + num_subblocks = get_leb(&gb); + else + num_subblocks = duration / constant_subblock_duration; + } else { + duration = param->duration; + constant_subblock_duration = param->constant_subblock_duration; + num_subblocks = param->num_subblocks; + if (!num_subblocks) + num_subblocks = duration / constant_subblock_duration; + } + + out_param = av_iamf_param_definition_alloc(param->param_definition_type, NULL, num_subblocks, + NULL, &out_param_size); + if (!out_param) { + ret = AVERROR(ENOMEM); + goto fail; + } + + out_param->parameter_id = param->parameter_id; + out_param->param_definition_type = param->param_definition_type; + out_param->parameter_rate = param->parameter_rate; + out_param->param_definition_mode = param->param_definition_mode; + out_param->duration = duration; + out_param->constant_subblock_duration = constant_subblock_duration; + out_param->num_subblocks = num_subblocks; + + for (int i = 0; i < num_subblocks; i++) { + void *subblock = av_iamf_param_definition_get_subblock(out_param, i); + unsigned int subblock_duration = constant_subblock_duration; + + if (param->param_definition_mode && !constant_subblock_duration) + subblock_duration = get_leb(&gb); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGainParameterData *mix = subblock; + + mix->animation_type = get_leb(&gb); + if (mix->animation_type > AV_IAMF_ANIMATION_TYPE_BEZIER) { + ret = 0; + av_free(out_param); + goto fail; + } + + mix->start_point_value = av_make_q(sign_extend(get_bits(&gb, 16), 16), 1 << 8); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + mix->end_point_value = av_make_q(sign_extend(get_bits(&gb, 16), 16), 1 << 8); + if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) { + mix->control_point_value = av_make_q(sign_extend(get_bits(&gb, 16), 16), 1 << 8); + mix->control_point_relative_time = get_bits(&gb, 8); + } + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfoParameterData *demix = subblock; + + demix->dmixp_mode = get_bits(&gb, 3); + skip_bits(&gb, 5); // reserved + demix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGainParameterData *recon = subblock; + + for (int i = 0; i < 6; i++) { + if (param_definition->recon_gain_present_bitmask & (1 << i)) { + unsigned int recon_gain_flags = get_leb(&gb); + unsigned int bitcount = 7 + 5 * !!(recon_gain_flags & 0x80); + recon_gain_flags = (recon_gain_flags & 0x7F) | ((recon_gain_flags & 0xFF00) >> 1); + for (int j = 0; j < bitcount; j++) { + if (recon_gain_flags & (1 << j)) + recon->recon_gain[i][j] = get_bits(&gb, 8); + } + } + } + recon->subblock_duration = subblock_duration; + break; + } + default: + av_assert0(0); + } + } + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + av_free(c->mix); + c->mix = out_param; + c->mix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + av_free(c->demix); + c->demix = out_param; + c->demix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + av_free(c->recon); + c->recon = out_param; + c->recon_size = out_param_size; + break; + default: + av_assert0(0); + } + + ret = 0; +fail: + if (ret < 0) + av_free(out_param); + + return ret; +} + +static int iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type, + unsigned *skip_samples, unsigned *discard_padding) +{ + GetBitContext gb; + int ret, extension_flag, trimming, start; + unsigned size; + + ret = init_get_bits8(&gb, buf, FFMIN(buf_size, MAX_IAMF_OBU_HEADER_SIZE)); + if (ret < 0) + return ret; + + *type = get_bits(&gb, 5); + /*redundant =*/ get_bits1(&gb); + trimming = get_bits1(&gb); + extension_flag = get_bits1(&gb); + + *obu_size = get_leb(&gb); + if (*obu_size > INT_MAX) + return AVERROR_INVALIDDATA; + + start = get_bits_count(&gb) / 8; + + if (trimming) { + *skip_samples = get_leb(&gb); // num_samples_to_trim_at_end + *discard_padding = get_leb(&gb); // num_samples_to_trim_at_start + } + + if (extension_flag) { + unsigned extension_bytes = get_leb(&gb); + if (extension_bytes > INT_MAX / 8) + return AVERROR_INVALIDDATA; + skip_bits_long(&gb, extension_bytes * 8); + } + + if (get_bits_left(&gb) < 0) + return AVERROR_INVALIDDATA; + + size = *obu_size + start; + if (size > INT_MAX) + return AVERROR_INVALIDDATA; + + *obu_size -= get_bits_count(&gb) / 8 - start; + *start_pos = size - *obu_size; + + return size; +} + +static int iamf_stream_split_filter(AVBSFContext *ctx, AVPacket *out) +{ + IAMFSplitContext *const c = ctx->priv_data; + int ret = 0; + + if (!c->buffer_pkt->data) { + ret = ff_bsf_get_packet_ref(ctx, c->buffer_pkt); + if (ret < 0) + return ret; + } + + while (1) { + enum IAMF_OBU_Type type; + unsigned skip_samples = 0, discard_padding = 0, obu_size; + int len, start_pos, idx; + + len = iamf_parse_obu_header(c->buffer_pkt->data, + c->buffer_pkt->size, + &obu_size, &start_pos, &type, + &skip_samples, &discard_padding); + if (len < 0) { + av_log(ctx, AV_LOG_ERROR, "Failed to read obu\n"); + ret = len; + goto fail; + } + + if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) { + ret = audio_frame_obu(ctx, type, &idx, + c->buffer_pkt->data, &start_pos, + &obu_size, + type == IAMF_OBU_IA_AUDIO_FRAME); + if (ret < 0) + goto fail; + } else { + switch (type) { + case IAMF_OBU_IA_AUDIO_ELEMENT: + ret = audio_element_obu(ctx, c->buffer_pkt->data + start_pos, obu_size); + if (ret < 0) + goto fail; + break; + case IAMF_OBU_IA_MIX_PRESENTATION: + ret = mix_presentation_obu(ctx, c->buffer_pkt->data + start_pos, obu_size); + if (ret < 0) + goto fail; + break; + case IAMF_OBU_IA_PARAMETER_BLOCK: + ret = parameter_block_obu(ctx, c->buffer_pkt->data + start_pos, obu_size); + if (ret < 0) + goto fail; + break; + case IAMF_OBU_IA_SEQUENCE_HEADER: + for (int i = 0; c->param_definitions && i < c->nb_param_definitions; i++) + av_free(c->param_definitions[i].param); + av_freep(&c->param_definitions); + av_freep(&c->ids); + c->nb_param_definitions = 0; + c->nb_ids = 0; + // fall-through + case IAMF_OBU_IA_TEMPORAL_DELIMITER: + av_freep(&c->mix); + av_freep(&c->demix); + av_freep(&c->recon); + c->mix_size = 0; + c->demix_size = 0; + c->recon_size = 0; + break; + } + + c->buffer_pkt->data += len; + c->buffer_pkt->size -= len; + + if (!c->buffer_pkt->size) { + av_packet_unref(c->buffer_pkt); + ret = ff_bsf_get_packet_ref(ctx, c->buffer_pkt); + if (ret < 0) + return ret; + } else if (c->buffer_pkt->size < 0) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + continue; + } + + if (c->buffer_pkt->size > INT_MAX - len) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ret = av_packet_ref(out, c->buffer_pkt); + if (ret < 0) + goto fail; + + if (skip_samples || discard_padding) { + uint8_t *side_data = av_packet_new_side_data(out, AV_PKT_DATA_SKIP_SAMPLES, 10); + if (!side_data) + return AVERROR(ENOMEM); + AV_WL32(side_data, skip_samples); + AV_WL32(side_data + 4, discard_padding); + } + if (c->mix) { + uint8_t *side_data = av_packet_new_side_data(out, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, c->mix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->mix, c->mix_size); + } + if (c->demix) { + uint8_t *side_data = av_packet_new_side_data(out, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, c->demix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->demix, c->demix_size); + } + if (c->recon) { + uint8_t *side_data = av_packet_new_side_data(out, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, c->recon_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->recon, c->recon_size); + } + + out->data += start_pos; + out->size = obu_size; + out->stream_index = idx + c->first_index; + + c->buffer_pkt->data += len; + c->buffer_pkt->size -= len; + + if (!c->buffer_pkt->size) + av_packet_unref(c->buffer_pkt); + else if (c->buffer_pkt->size < 0) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + + return 0; + } + +fail: + if (ret < 0) { + av_packet_unref(out); + av_packet_unref(c->buffer_pkt); + } + + return ret; +} + +static int iamf_stream_split_init(AVBSFContext *ctx) +{ + IAMFSplitContext *const c = ctx->priv_data; + + c->buffer_pkt = av_packet_alloc(); + if (!c->buffer_pkt) + return AVERROR(ENOMEM); + + return 0; +} + +static void iamf_stream_split_flush(AVBSFContext *ctx) +{ + IAMFSplitContext *const c = ctx->priv_data; + + av_packet_unref(c->buffer_pkt); + + av_freep(&c->mix); + av_freep(&c->demix); + av_freep(&c->recon); + c->mix_size = 0; + c->demix_size = 0; + c->recon_size = 0; +} + +static void iamf_stream_split_close(AVBSFContext *ctx) +{ + IAMFSplitContext *const c = ctx->priv_data; + + iamf_stream_split_flush(ctx); + + for (int i = 0; c->param_definitions && i < c->nb_param_definitions; i++) + av_free(c->param_definitions[i].param); + av_freep(&c->param_definitions); + c->nb_param_definitions = 0; + + av_freep(&c->ids); + c->nb_ids = 0; +} + +#define OFFSET(x) offsetof(IAMFSplitContext, x) +#define FLAGS (AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_BSF_PARAM) +static const AVOption iamf_stream_split_options[] = { + { "first_index", "First index to set stream index in output packets", + OFFSET(first_index), AV_OPT_TYPE_INT, { 0 }, 0, INT_MAX, FLAGS }, + { NULL } +}; + +static const AVClass iamf_stream_split_class = { + .class_name = "iamf_stream_split_bsf", + .item_name = av_default_item_name, + .option = iamf_stream_split_options, + .version = LIBAVUTIL_VERSION_INT, +}; + +static const enum AVCodecID iamf_stream_split_codec_ids[] = { + AV_CODEC_ID_PCM_S16LE, AV_CODEC_ID_PCM_S16BE, + AV_CODEC_ID_PCM_S24LE, AV_CODEC_ID_PCM_S24BE, + AV_CODEC_ID_PCM_S32LE, AV_CODEC_ID_PCM_S32BE, + AV_CODEC_ID_OPUS, AV_CODEC_ID_AAC, + AV_CODEC_ID_FLAC, AV_CODEC_ID_NONE, +}; + +const FFBitStreamFilter ff_iamf_stream_split_bsf = { + .p.name = "iamf_stream_split", + .p.codec_ids = iamf_stream_split_codec_ids, + .p.priv_class = &iamf_stream_split_class, + .priv_data_size = sizeof(IAMFSplitContext), + .init = iamf_stream_split_init, + .flush = iamf_stream_split_flush, + .close = iamf_stream_split_close, + .filter = iamf_stream_split_filter, +};