From patchwork Wed Jan 31 17:26:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 45937 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Wed, 31 Jan 2024 14:26:49 -0300 Message-ID: <20240131172654.15869-1-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 Subject: [FFmpeg-devel] [PATCH 1/6 v3] avcodec: add an Immersive Audio Model and Formats frame split bsf
Signed-off-by: James Almer --- doc/bitstream_filters.texi | 13 + libavcodec/bitstream_filters.c | 1 + libavcodec/bsf/Makefile | 1 + libavcodec/bsf/iamf_frame_split_bsf.c | 825 ++++++++++++++++++++++++++ 4 files changed, 840 insertions(+) create mode 100644 libavcodec/bsf/iamf_frame_split_bsf.c diff --git a/doc/bitstream_filters.texi b/doc/bitstream_filters.texi index dc4f85bac0..7e0cfa3e26 100644 --- a/doc/bitstream_filters.texi +++ b/doc/bitstream_filters.texi @@ -465,6 +465,19 @@ Please note that this filter is auto-inserted for MPEG-TS (muxer @code{mpegts}) and raw HEVC/H.265 (muxer @code{h265} or @code{hevc}) output formats. +@section iamf_frame_split + +Split a packet containing one or more Audio Frame OBUs into several +packets each containing the respective extracted raw audio data from +every Audio Frame. +Stream index in output packets will be set based on the order the OBUs +are coded.
+ +@table @option +@item first_index +Lowest stream index value to set in output packets +@end table + @section imxdump Modifies the bitstream to fit in MOV and to be usable by the Final Cut diff --git a/libavcodec/bitstream_filters.c b/libavcodec/bitstream_filters.c index 1e9a676a3d..476331ec8a 100644 --- a/libavcodec/bitstream_filters.c +++ b/libavcodec/bitstream_filters.c @@ -42,6 +42,7 @@ extern const FFBitStreamFilter ff_h264_redundant_pps_bsf; extern const FFBitStreamFilter ff_hapqa_extract_bsf; extern const FFBitStreamFilter ff_hevc_metadata_bsf; extern const FFBitStreamFilter ff_hevc_mp4toannexb_bsf; +extern const FFBitStreamFilter ff_iamf_frame_split_bsf; extern const FFBitStreamFilter ff_imx_dump_header_bsf; extern const FFBitStreamFilter ff_media100_to_mjpegb_bsf; extern const FFBitStreamFilter ff_mjpeg2jpeg_bsf; diff --git a/libavcodec/bsf/Makefile b/libavcodec/bsf/Makefile index 7831b0f2aa..cb23428f4a 100644 --- a/libavcodec/bsf/Makefile +++ b/libavcodec/bsf/Makefile @@ -20,6 +20,7 @@ OBJS-$(CONFIG_H264_REDUNDANT_PPS_BSF) += bsf/h264_redundant_pps.o OBJS-$(CONFIG_HAPQA_EXTRACT_BSF) += bsf/hapqa_extract.o OBJS-$(CONFIG_HEVC_METADATA_BSF) += bsf/h265_metadata.o OBJS-$(CONFIG_HEVC_MP4TOANNEXB_BSF) += bsf/hevc_mp4toannexb.o +OBJS-$(CONFIG_IAMF_FRAME_SPLIT_BSF) += bsf/iamf_frame_split_bsf.o OBJS-$(CONFIG_IMX_DUMP_HEADER_BSF) += bsf/imx_dump_header.o OBJS-$(CONFIG_MEDIA100_TO_MJPEGB_BSF) += bsf/media100_to_mjpegb.o OBJS-$(CONFIG_MJPEG2JPEG_BSF) += bsf/mjpeg2jpeg.o diff --git a/libavcodec/bsf/iamf_frame_split_bsf.c b/libavcodec/bsf/iamf_frame_split_bsf.c new file mode 100644 index 0000000000..3e416e1ca1 --- /dev/null +++ b/libavcodec/bsf/iamf_frame_split_bsf.c @@ -0,0 +1,825 @@ +/* + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include + +#include "libavutil/dict.h" +#include "libavutil/opt.h" +#include "libavformat/iamf.h" +#include "bsf.h" +#include "bsf_internal.h" +#include "get_bits.h" +#include "leb.h" + +typedef struct ParamDefinition { + AVIAMFParamDefinition *param; + size_t param_size; + int mode; + int recon_gain_present_bitmask; +} ParamDefinition; + +typedef struct IAMFSplitContext { + AVClass *class; + AVPacket *buffer_pkt; + + ParamDefinition *param_definitions; + unsigned int nb_param_definitions; + + unsigned int *ids; + int nb_ids; + + // AVOptions + int first_index; + + // Packet side data + AVIAMFParamDefinition *mix; + size_t mix_size; + AVIAMFParamDefinition *demix; + size_t demix_size; + AVIAMFParamDefinition *recon; + size_t recon_size; +} IAMFSplitContext; + +static int param_parse(AVBSFContext *ctx, GetBitContext *gb, + unsigned int type, + ParamDefinition **out) +{ + IAMFSplitContext *const c = ctx->priv_data; + ParamDefinition *param_definition = NULL; + AVIAMFParamDefinition *param; + unsigned int parameter_id, parameter_rate, mode; + unsigned int duration = 0, constant_subblock_duration = 0, nb_subblocks = 0; + size_t param_size; + + parameter_id = get_leb(gb); + + for (int i = 0; i < c->nb_param_definitions; i++) + 
if (c->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &c->param_definitions[i]; + break; + } + + parameter_rate = get_leb(gb); + mode = get_bits(gb, 8) >> 7; + + if (mode == 0) { + duration = get_leb(gb); + constant_subblock_duration = get_leb(gb); + if (constant_subblock_duration == 0) { + nb_subblocks = get_leb(gb); + } else + nb_subblocks = duration / constant_subblock_duration; + } + + param = av_iamf_param_definition_alloc(type, nb_subblocks, &param_size); + if (!param) + return AVERROR(ENOMEM); + + for (int i = 0; i < nb_subblocks; i++) { + if (constant_subblock_duration == 0) + get_leb(gb); // subblock_duration + + switch (type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + skip_bits(gb, 8); // dmixp_mode + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + break; + default: + av_free(param); + return AVERROR_INVALIDDATA; + } + } + + param->parameter_id = parameter_id; + param->parameter_rate = parameter_rate; + param->duration = duration; + param->constant_subblock_duration = constant_subblock_duration; + param->nb_subblocks = nb_subblocks; + + if (param_definition) { + if (param_definition->param_size != param_size || + memcmp(param_definition->param, param, param_size)) { + av_log(ctx, AV_LOG_ERROR, "Inconsistent parameters for parameter_id %u\n", + parameter_id); + av_free(param); + return AVERROR_INVALIDDATA; + } + av_freep(&param); + } else { + ParamDefinition *tmp = av_realloc_array(c->param_definitions, + c->nb_param_definitions + 1, + sizeof(*c->param_definitions)); + if (!tmp) { + av_free(param); + return AVERROR(ENOMEM); + } + c->param_definitions = tmp; + + param_definition = &c->param_definitions[c->nb_param_definitions++]; + param_definition->param = param; + param_definition->mode = !mode; + param_definition->param_size = param_size; + } + if (out) + *out = param_definition; + + return 0; +} + +static int scalable_channel_layout_config(AVBSFContext *ctx, 
GetBitContext *gb, + ParamDefinition *recon_gain) +{ + int nb_layers; + + nb_layers = get_bits(gb, 3); + skip_bits(gb, 5); //reserved + + if (nb_layers > 6) + return AVERROR_INVALIDDATA; + + for (int i = 0; i < nb_layers; i++) { + int output_gain_is_present_flag, recon_gain_is_present; + + skip_bits(gb, 4); // loudspeaker_layout + output_gain_is_present_flag = get_bits1(gb); + recon_gain_is_present = get_bits1(gb); + if (recon_gain) + recon_gain->recon_gain_present_bitmask |= recon_gain_is_present << i; + skip_bits(gb, 2); // reserved + skip_bits(gb, 8); // substream_count + skip_bits(gb, 8); // coupled_substream_count + if (output_gain_is_present_flag) { + skip_bits(gb, 8); // output_gain_flags & reserved + skip_bits(gb, 16); // output_gain + } + } + + return 0; +} + +static int audio_element_obu(AVBSFContext *ctx, uint8_t *buf, unsigned size) +{ + IAMFSplitContext *const c = ctx->priv_data; + GetBitContext gb; + ParamDefinition *recon_gain = NULL; + unsigned audio_element_type; + unsigned num_substreams, num_parameters; + int ret; + + ret = init_get_bits8(&gb, buf, size); + if (ret < 0) + return ret; + + get_leb(&gb); // audio_element_id + audio_element_type = get_bits(&gb, 3); + skip_bits(&gb, 5); // reserved + + get_leb(&gb); // codec_config_id + num_substreams = get_leb(&gb); + for (unsigned i = 0; i < num_substreams; i++) { + unsigned *audio_substream_id = av_dynarray2_add((void **)&c->ids, &c->nb_ids, + sizeof(*c->ids), NULL); + if (!audio_substream_id) + return AVERROR(ENOMEM); + + *audio_substream_id = get_leb(&gb); + } + + num_parameters = get_leb(&gb); + if (num_parameters && audio_element_type != 0) { + av_log(ctx, AV_LOG_ERROR, "Audio Element parameter count %u is invalid" + " for Scene representations\n", num_parameters); + return AVERROR_INVALIDDATA; + } + + for (int i = 0; i < num_parameters; i++) { + unsigned type = get_leb(&gb); + + if (type == AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN) + return AVERROR_INVALIDDATA; + else if (type == 
AV_IAMF_PARAMETER_DEFINITION_DEMIXING) { + ret = param_parse(ctx, &gb, type, NULL); + if (ret < 0) + return ret; + skip_bits(&gb, 8); // default_w + } else if (type == AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + ret = param_parse(ctx, &gb, type, &recon_gain); + if (ret < 0) + return ret; + } else { + unsigned param_definition_size = get_leb(&gb); + skip_bits_long(&gb, param_definition_size * 8); + } + } + + if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(ctx, &gb, recon_gain); + if (ret < 0) + return ret; + } + + return 0; +} + +static int label_string(GetBitContext *gb) +{ + int byte; + + do { + byte = get_bits(gb, 8); + } while (byte); + + return 0; +} + +static int mix_presentation_obu(AVBSFContext *ctx, uint8_t *buf, unsigned size) +{ + GetBitContext gb; + unsigned mix_presentation_id, count_label; + unsigned nb_submixes, nb_elements; + int ret; + + ret = init_get_bits8(&gb, buf, size); + if (ret < 0) + return ret; + + mix_presentation_id = get_leb(&gb); + count_label = get_leb(&gb); + + for (int i = 0; i < count_label; i++) { + ret = label_string(&gb); + if (ret < 0) + return ret; + } + + for (int i = 0; i < count_label; i++) { + ret = label_string(&gb); + if (ret < 0) + return ret; + } + + nb_submixes = get_leb(&gb); + for (int i = 0; i < nb_submixes; i++) { + unsigned nb_layouts; + + nb_elements = get_leb(&gb); + + for (int j = 0; j < nb_elements; j++) { + unsigned rendering_config_extension_size; + + get_leb(&gb); // audio_element_id + for (int k = 0; k < count_label; k++) { + ret = label_string(&gb); + if (ret < 0) + return ret; + } + + skip_bits(&gb, 8); // headphones_rendering_mode & reserved + rendering_config_extension_size = get_leb(&gb); + skip_bits_long(&gb, rendering_config_extension_size * 8); + + ret = param_parse(ctx, &gb, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL); + if (ret < 0) + return ret; + skip_bits(&gb, 16); // default_mix_gain + } + + ret = param_parse(ctx, &gb, 
AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL); + if (ret < 0) + return ret; + get_bits(&gb, 16); // default_mix_gain + + nb_layouts = get_leb(&gb); + for (int j = 0; j < nb_layouts; j++) { + int info_type, layout_type; + int byte = get_bits(&gb, 8); + + layout_type = byte >> 6; + if (layout_type < AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS || + layout_type > AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL) { + av_log(ctx, AV_LOG_ERROR, "Invalid Layout type %d in a submix from " + "Mix Presentation %u\n", + layout_type, mix_presentation_id); + return AVERROR_INVALIDDATA; + } + + info_type = get_bits(&gb, 8); + get_bits(&gb, 16); // integrated_loudness + get_bits(&gb, 16); // digital_peak + + if (info_type & 1) + get_bits(&gb, 16); // true_peak + + if (info_type & 2) { + unsigned int num_anchored_loudness = get_bits(&gb, 8); + + for (int k = 0; k < num_anchored_loudness; k++) { + get_bits(&gb, 8); // anchor_element + get_bits(&gb, 16); // anchored_loudness + } + } + + if (info_type & 0xFC) { + unsigned int info_type_size = get_leb(&gb); + skip_bits_long(&gb, info_type_size * 8); + } + } + } + + return 0; +} + +static int find_idx_by_id(AVBSFContext *ctx, unsigned id) +{ + IAMFSplitContext *const c = ctx->priv_data; + + for (int i = 0; i < c->nb_ids; i++) + if (c->ids[i] == id) + return i; + + av_log(ctx, AV_LOG_ERROR, "Invalid id %u\n", id); + return AVERROR_INVALIDDATA; +} + +static int audio_frame_obu(AVBSFContext *ctx, enum IAMF_OBU_Type type, int *idx, + uint8_t *buf, int *start_pos, unsigned *size, + int id_in_bitstream) +{ + GetBitContext gb; + unsigned audio_substream_id; + int ret; + + ret = init_get_bits8(&gb, buf + *start_pos, *size); + if (ret < 0) + return ret; + + if (id_in_bitstream) { + int pos; + audio_substream_id = get_leb(&gb); + pos = get_bits_count(&gb) / 8; + *start_pos += pos; + *size -= pos; + } else + audio_substream_id = type - IAMF_OBU_IA_AUDIO_FRAME_ID0; + + ret = find_idx_by_id(ctx, audio_substream_id); + if (ret < 0) + return ret; + + *idx = ret; + + 
return 0; +} + +static const ParamDefinition *get_param_definition(AVBSFContext *ctx, + unsigned int parameter_id) +{ + const IAMFSplitContext *const c = ctx->priv_data; + const ParamDefinition *param_definition = NULL; + + for (int i = 0; i < c->nb_param_definitions; i++) + if (c->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &c->param_definitions[i]; + break; + } + + return param_definition; +} + +static int parameter_block_obu(AVBSFContext *ctx, uint8_t *buf, unsigned size) +{ + IAMFSplitContext *const c = ctx->priv_data; + GetBitContext gb; + const ParamDefinition *param_definition; + const AVIAMFParamDefinition *param; + AVIAMFParamDefinition *out_param = NULL; + unsigned int duration, constant_subblock_duration; + unsigned int nb_subblocks; + unsigned int parameter_id; + size_t out_param_size; + int ret; + + ret = init_get_bits8(&gb, buf, size); + if (ret < 0) + return ret; + + parameter_id = get_leb(&gb); + + param_definition = get_param_definition(ctx, parameter_id); + if (!param_definition) { + ret = 0; + goto fail; + } + + param = param_definition->param; + if (!param_definition->mode) { + duration = get_leb(&gb); + constant_subblock_duration = get_leb(&gb); + if (constant_subblock_duration == 0) + nb_subblocks = get_leb(&gb); + else + nb_subblocks = duration / constant_subblock_duration; + } else { + duration = param->duration; + constant_subblock_duration = param->constant_subblock_duration; + nb_subblocks = param->nb_subblocks; + if (!nb_subblocks) + nb_subblocks = duration / constant_subblock_duration; + } + + out_param = av_iamf_param_definition_alloc(param->type, nb_subblocks, + &out_param_size); + if (!out_param) { + ret = AVERROR(ENOMEM); + goto fail; + } + + out_param->parameter_id = param->parameter_id; + out_param->type = param->type; + out_param->parameter_rate = param->parameter_rate; + out_param->duration = duration; + out_param->constant_subblock_duration = constant_subblock_duration; + 
out_param->nb_subblocks = nb_subblocks; + + for (int i = 0; i < nb_subblocks; i++) { + void *subblock = av_iamf_param_definition_get_subblock(out_param, i); + unsigned int subblock_duration = constant_subblock_duration; + + if (!param_definition->mode && !constant_subblock_duration) + subblock_duration = get_leb(&gb); + + switch (param->type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGain *mix = subblock; + + mix->animation_type = get_leb(&gb); + if (mix->animation_type > AV_IAMF_ANIMATION_TYPE_BEZIER) { + ret = 0; + av_free(out_param); + goto fail; + } + + mix->start_point_value = + av_make_q(sign_extend(get_bits(&gb, 16), 16), 1 << 8); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + mix->end_point_value = + av_make_q(sign_extend(get_bits(&gb, 16), 16), 1 << 8); + if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) { + mix->control_point_value = + av_make_q(sign_extend(get_bits(&gb, 16), 16), 1 << 8); + mix->control_point_relative_time = + av_make_q(get_bits(&gb, 8), 1 << 8); + } + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfo *demix = subblock; + + demix->dmixp_mode = get_bits(&gb, 3); + skip_bits(&gb, 5); // reserved + demix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGain *recon = subblock; + + for (int i = 0; i < 6; i++) { + if (param_definition->recon_gain_present_bitmask & (1 << i)) { + unsigned int recon_gain_flags = get_leb(&gb); + unsigned int bitcount = 7 + 5 * !!(recon_gain_flags & 0x80); + recon_gain_flags = + (recon_gain_flags & 0x7F) | ((recon_gain_flags & 0xFF00) >> 1); + for (int j = 0; j < bitcount; j++) { + if (recon_gain_flags & (1 << j)) + recon->recon_gain[i][j] = get_bits(&gb, 8); + } + } + } + recon->subblock_duration = subblock_duration; + break; + } + default: + av_assert0(0); + } + } + + switch (param->type) { + case 
AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + av_free(c->mix); + c->mix = out_param; + c->mix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + av_free(c->demix); + c->demix = out_param; + c->demix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + av_free(c->recon); + c->recon = out_param; + c->recon_size = out_param_size; + break; + default: + av_assert0(0); + } + + ret = 0; +fail: + if (ret < 0) + av_free(out_param); + + return ret; +} + +static int iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, + enum IAMF_OBU_Type *type, + unsigned *skip, unsigned *discard) +{ + GetBitContext gb; + int ret, extension_flag, trimming, start; + unsigned size; + + ret = init_get_bits8(&gb, buf, FFMIN(buf_size, MAX_IAMF_OBU_HEADER_SIZE)); + if (ret < 0) + return ret; + + *type = get_bits(&gb, 5); + /*redundant =*/ get_bits1(&gb); + trimming = get_bits1(&gb); + extension_flag = get_bits1(&gb); + + *obu_size = get_leb(&gb); + if (*obu_size > INT_MAX) + return AVERROR_INVALIDDATA; + + start = get_bits_count(&gb) / 8; + + if (trimming) { + *skip = get_leb(&gb); // num_samples_to_trim_at_end + *discard = get_leb(&gb); // num_samples_to_trim_at_start + } + + if (extension_flag) { + unsigned extension_bytes = get_leb(&gb); + if (extension_bytes > INT_MAX / 8) + return AVERROR_INVALIDDATA; + skip_bits_long(&gb, extension_bytes * 8); + } + + if (get_bits_left(&gb) < 0) + return AVERROR_INVALIDDATA; + + size = *obu_size + start; + if (size > INT_MAX) + return AVERROR_INVALIDDATA; + + *obu_size -= get_bits_count(&gb) / 8 - start; + *start_pos = size - *obu_size; + + return size; +} + +static int iamf_frame_split_filter(AVBSFContext *ctx, AVPacket *out) +{ + IAMFSplitContext *const c = ctx->priv_data; + int ret = 0; + + if (!c->buffer_pkt->data) { + ret = ff_bsf_get_packet_ref(ctx, c->buffer_pkt); + if (ret < 0) + return ret; + } + + while (1) { + enum IAMF_OBU_Type type; + unsigned 
skip_samples = 0, discard_padding = 0, obu_size; + int len, start_pos, idx; + + len = iamf_parse_obu_header(c->buffer_pkt->data, + c->buffer_pkt->size, + &obu_size, &start_pos, &type, + &skip_samples, &discard_padding); + if (len < 0) { + av_log(ctx, AV_LOG_ERROR, "Failed to read obu\n"); + ret = len; + goto fail; + } + + if (type >= IAMF_OBU_IA_AUDIO_FRAME && + type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) { + ret = audio_frame_obu(ctx, type, &idx, + c->buffer_pkt->data, &start_pos, + &obu_size, + type == IAMF_OBU_IA_AUDIO_FRAME); + if (ret < 0) + goto fail; + } else { + switch (type) { + case IAMF_OBU_IA_AUDIO_ELEMENT: + ret = audio_element_obu(ctx, c->buffer_pkt->data + start_pos, + obu_size); + if (ret < 0) + goto fail; + break; + case IAMF_OBU_IA_MIX_PRESENTATION: + ret = mix_presentation_obu(ctx, c->buffer_pkt->data + start_pos, + obu_size); + if (ret < 0) + goto fail; + break; + case IAMF_OBU_IA_PARAMETER_BLOCK: + ret = parameter_block_obu(ctx, c->buffer_pkt->data + start_pos, + obu_size); + if (ret < 0) + goto fail; + break; + case IAMF_OBU_IA_SEQUENCE_HEADER: + for (int i = 0; c->param_definitions && i < c->nb_param_definitions; i++) + av_free(c->param_definitions[i].param); + av_freep(&c->param_definitions); + av_freep(&c->ids); + c->nb_param_definitions = 0; + c->nb_ids = 0; + // fall-through + case IAMF_OBU_IA_TEMPORAL_DELIMITER: + av_freep(&c->mix); + av_freep(&c->demix); + av_freep(&c->recon); + c->mix_size = 0; + c->demix_size = 0; + c->recon_size = 0; + break; + } + + c->buffer_pkt->data += len; + c->buffer_pkt->size -= len; + + if (!c->buffer_pkt->size) { + av_packet_unref(c->buffer_pkt); + ret = ff_bsf_get_packet_ref(ctx, c->buffer_pkt); + if (ret < 0) + return ret; + } else if (c->buffer_pkt->size < 0) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + continue; + } + + if (c->buffer_pkt->size > INT_MAX - len) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ret = av_packet_ref(out, c->buffer_pkt); + if (ret < 0) + goto fail; + + if (skip_samples || 
discard_padding) { + uint8_t *side_data = av_packet_new_side_data(out, + AV_PKT_DATA_SKIP_SAMPLES, 10); + if (!side_data) + return AVERROR(ENOMEM); + AV_WL32(side_data, skip_samples); + AV_WL32(side_data + 4, discard_padding); + } + if (c->mix) { + uint8_t *side_data = av_packet_new_side_data(out, + AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, + c->mix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->mix, c->mix_size); + } + if (c->demix) { + uint8_t *side_data = av_packet_new_side_data(out, + AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, + c->demix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->demix, c->demix_size); + } + if (c->recon) { + uint8_t *side_data = av_packet_new_side_data(out, + AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, + c->recon_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->recon, c->recon_size); + } + + out->data += start_pos; + out->size = obu_size; + out->stream_index = idx + c->first_index; + + c->buffer_pkt->data += len; + c->buffer_pkt->size -= len; + + if (!c->buffer_pkt->size) + av_packet_unref(c->buffer_pkt); + else if (c->buffer_pkt->size < 0) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + + return 0; + } + +fail: + if (ret < 0) { + av_packet_unref(out); + av_packet_unref(c->buffer_pkt); + } + + return ret; +} + +static int iamf_frame_split_init(AVBSFContext *ctx) +{ + IAMFSplitContext *const c = ctx->priv_data; + + c->buffer_pkt = av_packet_alloc(); + if (!c->buffer_pkt) + return AVERROR(ENOMEM); + + return 0; +} + +static void iamf_frame_split_flush(AVBSFContext *ctx) +{ + IAMFSplitContext *const c = ctx->priv_data; + + if (c->buffer_pkt) + av_packet_unref(c->buffer_pkt); + + av_freep(&c->mix); + av_freep(&c->demix); + av_freep(&c->recon); + c->mix_size = 0; + c->demix_size = 0; + c->recon_size = 0; +} + +static void iamf_frame_split_close(AVBSFContext *ctx) +{ + IAMFSplitContext *const c = ctx->priv_data; + + iamf_frame_split_flush(ctx); + av_packet_free(&c->buffer_pkt); + + 
for (int i = 0; c->param_definitions && i < c->nb_param_definitions; i++) + av_free(c->param_definitions[i].param); + av_freep(&c->param_definitions); + c->nb_param_definitions = 0; + + av_freep(&c->ids); + c->nb_ids = 0; +} + +#define OFFSET(x) offsetof(IAMFSplitContext, x) +#define FLAGS (AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_BSF_PARAM) +static const AVOption iamf_frame_split_options[] = { + { "first_index", "First index to set stream index in output packets", + OFFSET(first_index), AV_OPT_TYPE_INT, { 0 }, 0, INT_MAX, FLAGS }, + { NULL } +}; + +static const AVClass iamf_frame_split_class = { + .class_name = "iamf_frame_split_bsf", + .item_name = av_default_item_name, + .option = iamf_frame_split_options, + .version = LIBAVUTIL_VERSION_INT, +}; + +const FFBitStreamFilter ff_iamf_frame_split_bsf = { + .p.name = "iamf_frame_split", + .p.priv_class = &iamf_frame_split_class, + .priv_data_size = sizeof(IAMFSplitContext), + .init = iamf_frame_split_init, + .flush = iamf_frame_split_flush, + .close = iamf_frame_split_close, + .filter = iamf_frame_split_filter, +};
From patchwork Wed Jan 31 17:26:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 45938 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Wed, 31 Jan 2024 14:26:50 -0300 Message-ID: <20240131172654.15869-2-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240131172654.15869-1-jamrial@gmail.com> References: <20240131172654.15869-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 2/6 v3] avformat/demux: support inserting bitstream filters in demuxing scenarios
Packets will be passed to the bsf immediately after
being generated by a demuxer, and no further data will be read from the input
until all packets have been returned by the bsf.

Signed-off-by: James Almer
---
 libavformat/avformat.c |  47 ++++++++++++
 libavformat/demux.c    | 162 ++++++++++++++++++++++++++++++-----------
 libavformat/internal.h |  13 +++-
 libavformat/mux.c      |  43 -----------
 libavformat/mux.h      |  11 ---
 libavformat/rawenc.c   |   1 +
 6 files changed, 181 insertions(+), 96 deletions(-)

diff --git a/libavformat/avformat.c b/libavformat/avformat.c
index 8e8c6fbe55..0e22d47c8b 100644
--- a/libavformat/avformat.c
+++ b/libavformat/avformat.c
@@ -931,3 +931,50 @@ FF_ENABLE_DEPRECATION_WARNINGS
     *pb = NULL;
     return ret;
 }
+
+int ff_stream_add_bitstream_filter(AVStream *st, const char *name, const char *args)
+{
+    int ret;
+    const AVBitStreamFilter *bsf;
+    FFStream *const sti = ffstream(st);
+    AVBSFContext *bsfc;
+
+    av_assert0(!sti->bsfc);
+
+    if (name) {
+        bsf = av_bsf_get_by_name(name);
+        if (!bsf) {
+            av_log(NULL, AV_LOG_ERROR, "Unknown bitstream filter '%s'\n", name);
+            return AVERROR_BSF_NOT_FOUND;
+        }
+        ret = av_bsf_alloc(bsf, &bsfc);
+    } else
+        ret = av_bsf_get_null_filter(&bsfc);
+    if (ret < 0)
+        return ret;
+
+    bsfc->time_base_in = st->time_base;
+    if ((ret = avcodec_parameters_copy(bsfc->par_in, st->codecpar)) < 0) {
+        av_bsf_free(&bsfc);
+        return ret;
+    }
+
+    if (args && bsfc->filter->priv_class) {
+        if ((ret = av_set_options_string(bsfc->priv_data, args, "=", ":")) < 0) {
+            av_bsf_free(&bsfc);
+            return ret;
+        }
+    }
+
+    if ((ret = av_bsf_init(bsfc)) < 0) {
+        av_bsf_free(&bsfc);
+        return ret;
+    }
+
+    sti->bsfc = bsfc;
+
+    av_log(NULL, AV_LOG_VERBOSE,
+           "Automatically inserted bitstream filter '%s'; args='%s'\n",
+           name, args ? args : "");
+    return 1;
+}
diff --git a/libavformat/demux.c b/libavformat/demux.c
index 6f640b92b1..fb9bf9e4ac 100644
--- a/libavformat/demux.c
+++ b/libavformat/demux.c
@@ -540,6 +540,109 @@ static int update_wrap_reference(AVFormatContext *s, AVStream *st, int stream_in
     return 1;
 }
 
+static void update_timestamps(AVFormatContext *s, AVStream *st, AVPacket *pkt)
+{
+    FFStream *const sti = ffstream(st);
+
+    if (update_wrap_reference(s, st, pkt->stream_index, pkt) && sti->pts_wrap_behavior == AV_PTS_WRAP_SUB_OFFSET) {
+        // correct first time stamps to negative values
+        if (!is_relative(sti->first_dts))
+            sti->first_dts = wrap_timestamp(st, sti->first_dts);
+        if (!is_relative(st->start_time))
+            st->start_time = wrap_timestamp(st, st->start_time);
+        if (!is_relative(sti->cur_dts))
+            sti->cur_dts = wrap_timestamp(st, sti->cur_dts);
+    }
+
+    pkt->dts = wrap_timestamp(st, pkt->dts);
+    pkt->pts = wrap_timestamp(st, pkt->pts);
+
+    force_codec_ids(s, st);
+
+    /* TODO: audio: time filter; video: frame reordering (pts != dts) */
+    if (s->use_wallclock_as_timestamps)
+        pkt->dts = pkt->pts = av_rescale_q(av_gettime(), AV_TIME_BASE_Q, st->time_base);
+}
+
+static int filter_packet(AVFormatContext *s, AVStream *st, AVPacket *pkt)
+{
+    FFFormatContext *const si = ffformatcontext(s);
+    FFStream *const sti = ffstream(st);
+    const AVPacket *pkt1;
+    int err;
+
+    if (!sti->bsfc) {
+        const PacketListEntry *pktl = si->raw_packet_buffer.head;
+        if (AVPACKET_IS_EMPTY(pkt))
+            return 0;
+
+        update_timestamps(s, st, pkt);
+
+        if (!pktl && sti->request_probe <= 0)
+            return 0;
+
+        err = avpriv_packet_list_put(&si->raw_packet_buffer, pkt, NULL, 0);
+        if (err < 0) {
+            av_packet_unref(pkt);
+            return err;
+        }
+
+        pkt1 = &si->raw_packet_buffer.tail->pkt;
+        si->raw_packet_buffer_size += pkt1->size;
+
+        if (sti->request_probe <= 0)
+            return 0;
+
+        return probe_codec(s, s->streams[pkt1->stream_index], pkt1);
+    }
+
+    err = av_bsf_send_packet(sti->bsfc, pkt);
+    if (err < 0) {
+        av_log(s, AV_LOG_ERROR,
+               "Failed to send packet to filter %s for stream %d\n",
+               sti->bsfc->filter->name, st->index);
+        return err;
+    }
+
+    do {
+        AVStream *out_st;
+        FFStream *out_sti;
+
+        err = av_bsf_receive_packet(sti->bsfc, pkt);
+        if (err < 0) {
+            if (err == AVERROR(EAGAIN) || err == AVERROR_EOF)
+                return 0;
+            av_log(s, AV_LOG_ERROR, "Error applying bitstream filters to an output "
+                   "packet for stream #%d: %s\n", st->index, av_err2str(err));
+            if (!(s->error_recognition & AV_EF_EXPLODE) && err != AVERROR(ENOMEM))
+                continue;
+            return err;
+        }
+        out_st = s->streams[pkt->stream_index];
+        out_sti = ffstream(out_st);
+
+        update_timestamps(s, out_st, pkt);
+
+        err = avpriv_packet_list_put(&si->raw_packet_buffer, pkt, NULL, 0);
+        if (err < 0) {
+            av_packet_unref(pkt);
+            return err;
+        }
+
+        pkt1 = &si->raw_packet_buffer.tail->pkt;
+        si->raw_packet_buffer_size += pkt1->size;
+
+        if (out_sti->request_probe <= 0)
+            continue;
+
+        err = probe_codec(s, out_st, pkt1);
+        if (err < 0)
+            return err;
+    } while (1);
+
+    return 0;
+}
+
 int ff_read_packet(AVFormatContext *s, AVPacket *pkt)
 {
     FFFormatContext *const si = ffformatcontext(s);
@@ -557,9 +660,6 @@ FF_ENABLE_DEPRECATION_WARNINGS
 
     for (;;) {
         PacketListEntry *pktl = si->raw_packet_buffer.head;
-        AVStream *st;
-        FFStream *sti;
-        const AVPacket *pkt1;
 
         if (pktl) {
             AVStream *const st = s->streams[pktl->pkt.stream_index];
@@ -582,16 +682,27 @@ FF_ENABLE_DEPRECATION_WARNINGS
                We must re-call the demuxer to get the real packet. */
             if (err == FFERROR_REDO)
                 continue;
-            if (!pktl || err == AVERROR(EAGAIN))
+            if (err == AVERROR(EAGAIN))
                 return err;
             for (unsigned i = 0; i < s->nb_streams; i++) {
                 AVStream *const st = s->streams[i];
                 FFStream *const sti = ffstream(st);
+                int ret;
+
+                // Drain buffered packets in the bsf context on eof
+                if (err == AVERROR_EOF)
+                    if ((ret = filter_packet(s, st, pkt)) < 0)
+                        return ret;
+                pktl = si->raw_packet_buffer.head;
+                if (!pktl)
+                    continue;
                 if (sti->probe_packets || sti->request_probe > 0)
-                    if ((err = probe_codec(s, st, NULL)) < 0)
-                        return err;
+                    if ((ret = probe_codec(s, st, NULL)) < 0)
+                        return ret;
                 av_assert0(sti->request_probe <= 0);
             }
+            if (!pktl)
+                return err;
             continue;
         }
 
@@ -616,42 +727,11 @@ FF_ENABLE_DEPRECATION_WARNINGS
         av_assert0(pkt->stream_index < (unsigned)s->nb_streams &&
                    "Invalid stream index.\n");
 
-        st = s->streams[pkt->stream_index];
-        sti = ffstream(st);
-
-        if (update_wrap_reference(s, st, pkt->stream_index, pkt) && sti->pts_wrap_behavior == AV_PTS_WRAP_SUB_OFFSET) {
-            // correct first time stamps to negative values
-            if (!is_relative(sti->first_dts))
-                sti->first_dts = wrap_timestamp(st, sti->first_dts);
-            if (!is_relative(st->start_time))
-                st->start_time = wrap_timestamp(st, st->start_time);
-            if (!is_relative(sti->cur_dts))
-                sti->cur_dts = wrap_timestamp(st, sti->cur_dts);
-        }
-
-        pkt->dts = wrap_timestamp(st, pkt->dts);
-        pkt->pts = wrap_timestamp(st, pkt->pts);
-
-        force_codec_ids(s, st);
-
-        /* TODO: audio: time filter; video: frame reordering (pts != dts) */
-        if (s->use_wallclock_as_timestamps)
-            pkt->dts = pkt->pts = av_rescale_q(av_gettime(), AV_TIME_BASE_Q, st->time_base);
-
-        if (!pktl && sti->request_probe <= 0)
-            return 0;
-
-        err = avpriv_packet_list_put(&si->raw_packet_buffer,
-                                     pkt, NULL, 0);
-        if (err < 0) {
-            av_packet_unref(pkt);
-            return err;
-        }
-        pkt1 = &si->raw_packet_buffer.tail->pkt;
-        si->raw_packet_buffer_size += pkt1->size;
-
-        if ((err = probe_codec(s, st, pkt1)) < 0)
+        err = filter_packet(s, s->streams[pkt->stream_index], pkt);
+        if (err < 0)
             return err;
+        if (!AVPACKET_IS_EMPTY(pkt))
+            return 0;
     }
 }
 
diff --git a/libavformat/internal.h b/libavformat/internal.h
index f93832b3c4..c2738a420f 100644
--- a/libavformat/internal.h
+++ b/libavformat/internal.h
@@ -212,7 +212,7 @@ typedef struct FFStream {
     /**
      * bitstream filter to run on stream
      * - encoding: Set by muxer using ff_stream_add_bitstream_filter
-     * - decoding: unused
+     * - decoding: Set by demuxer using ff_stream_add_bitstream_filter
      */
     struct AVBSFContext *bsfc;
 
@@ -752,4 +752,15 @@ int ff_match_url_ext(const char *url, const char *extensions);
 struct FFOutputFormat;
 void avpriv_register_devices(const struct FFOutputFormat * const o[], const AVInputFormat * const i[]);
 
+/**
+ * Add a bitstream filter to a stream.
+ *
+ * @param st output stream to add a filter to
+ * @param name the name of the filter to add
+ * @param args filter-specific argument string
+ * @return >0 on success;
+ *         AVERROR code on failure
+ */
+int ff_stream_add_bitstream_filter(AVStream *st, const char *name, const char *args);
+
 #endif /* AVFORMAT_INTERNAL_H */
diff --git a/libavformat/mux.c b/libavformat/mux.c
index de10d2c008..4bc8627617 100644
--- a/libavformat/mux.c
+++ b/libavformat/mux.c
@@ -1344,49 +1344,6 @@ int av_get_output_timestamp(struct AVFormatContext *s, int stream,
     return 0;
 }
 
-int ff_stream_add_bitstream_filter(AVStream *st, const char *name, const char *args)
-{
-    int ret;
-    const AVBitStreamFilter *bsf;
-    FFStream *const sti = ffstream(st);
-    AVBSFContext *bsfc;
-
-    av_assert0(!sti->bsfc);
-
-    if (!(bsf = av_bsf_get_by_name(name))) {
-        av_log(NULL, AV_LOG_ERROR, "Unknown bitstream filter '%s'\n", name);
-        return AVERROR_BSF_NOT_FOUND;
-    }
-
-    if ((ret = av_bsf_alloc(bsf, &bsfc)) < 0)
-        return ret;
-
-    bsfc->time_base_in = st->time_base;
-    if ((ret = avcodec_parameters_copy(bsfc->par_in, st->codecpar)) < 0) {
-        av_bsf_free(&bsfc);
-        return ret;
-    }
-
-    if (args && bsfc->filter->priv_class) {
-        if ((ret = av_set_options_string(bsfc->priv_data, args, "=", ":")) < 0) {
-            av_bsf_free(&bsfc);
-            return ret;
-        }
-    }
-
-    if ((ret = av_bsf_init(bsfc)) < 0) {
-        av_bsf_free(&bsfc);
-        return ret;
-    }
-
-    sti->bsfc = bsfc;
-
-    av_log(NULL, AV_LOG_VERBOSE,
-           "Automatically inserted bitstream filter '%s'; args='%s'\n",
-           name, args ? args : "");
-    return 1;
-}
-
 int ff_write_chained(AVFormatContext *dst, int dst_stream, AVPacket *pkt,
                      AVFormatContext *src, int interleave)
 {
diff --git a/libavformat/mux.h b/libavformat/mux.h
index b9ec75641d..ab3e8edd60 100644
--- a/libavformat/mux.h
+++ b/libavformat/mux.h
@@ -171,17 +171,6 @@ const AVPacket *ff_interleaved_peek(AVFormatContext *s, int stream);
 
 int ff_get_muxer_ts_offset(AVFormatContext *s, int stream_index, int64_t *offset);
 
-/**
- * Add a bitstream filter to a stream.
- *
- * @param st output stream to add a filter to
- * @param name the name of the filter to add
- * @param args filter-specific argument string
- * @return >0 on success;
- *         AVERROR code on failure
- */
-int ff_stream_add_bitstream_filter(AVStream *st, const char *name, const char *args);
-
 /**
  * Write a packet to another muxer than the one the user originally
  * intended. Useful when chaining muxers, where one muxer internally
diff --git a/libavformat/rawenc.c b/libavformat/rawenc.c
index f916db13a2..ec31d76d88 100644
--- a/libavformat/rawenc.c
+++ b/libavformat/rawenc.c
@@ -25,6 +25,7 @@
 #include "libavutil/intreadwrite.h"
 
 #include "avformat.h"
+#include "internal.h"
 #include "rawenc.h"
 #include "mux.h"
 
From patchwork Wed Jan 31 17:26:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 45939
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 31 Jan 2024 14:26:51 -0300
Message-ID: <20240131172654.15869-3-jamrial@gmail.com>
In-Reply-To: <20240131172654.15869-1-jamrial@gmail.com>
References: <20240131172654.15869-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 3/6 v3] avformat/mov: make MOVStreamContext refcounted

Signed-off-by: James Almer
---
 libavformat/isom.h |   1 +
 libavformat/mov.c  | 105 +++++++++++++++++++++++++--------------------
 2 files changed, 59 insertions(+), 47 deletions(-)

diff --git a/libavformat/isom.h b/libavformat/isom.h
index 2cf456fee1..a4925b3b08 100644
--- a/libavformat/isom.h
+++ b/libavformat/isom.h
@@ -165,6 +165,7 @@ typedef struct MOVIndexRange {
 
 typedef struct MOVStreamContext {
     AVIOContext *pb;
+    int refcount;
     int pb_is_copied;
     int ffindex; ///< AVStream index
     int next_chunk;
diff --git a/libavformat/mov.c b/libavformat/mov.c
index cf931d4594..555c72dc79 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -211,6 +211,7 @@ static int mov_read_covr(MOVContext *c, AVIOContext *pb, int type, int len)
     }
     st = c->fc->streams[c->fc->nb_streams - 1];
     st->priv_data = sc;
+    sc->refcount = 1;
 
     if (st->attached_pic.size >= 8 && id != AV_CODEC_ID_BMP) {
         if (AV_RB64(st->attached_pic.data) == 0x89504e470d0a1a0a) {
@@ -4654,6 +4655,7 @@ static int mov_read_trak(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     st->codecpar->codec_type = AVMEDIA_TYPE_DATA;
     sc->ffindex = st->index;
     c->trak_index = st->index;
+    sc->refcount = 1;
 
     if ((ret = mov_read_default(c, pb, atom)) < 0)
         return ret;
@@ -4941,6 +4943,7 @@ static int heif_add_stream(MOVContext *c, HEIFItem *item)
     sc = st->priv_data;
     sc->pb = c->fc->pb;
     sc->pb_is_copied = 1;
+    sc->refcount = 1;
 
     // Populate the necessary fields used by mov_build_index.
     sc->stsc_count = 1;
@@ -8587,6 +8590,60 @@ static void mov_free_encryption_index(MOVEncryptionIndex **index) {
     av_freep(index);
 }
 
+static void mov_free_stream_context(AVFormatContext *s, AVStream *st)
+{
+    MOVStreamContext *sc = st->priv_data;
+
+    if (!sc || --sc->refcount) {
+        st->priv_data = NULL;
+        return;
+    }
+
+    av_freep(&sc->ctts_data);
+    for (int i = 0; i < sc->drefs_count; i++) {
+        av_freep(&sc->drefs[i].path);
+        av_freep(&sc->drefs[i].dir);
+    }
+    av_freep(&sc->drefs);
+
+    sc->drefs_count = 0;
+
+    if (!sc->pb_is_copied)
+        ff_format_io_close(s, &sc->pb);
+
+    sc->pb = NULL;
+    av_freep(&sc->chunk_offsets);
+    av_freep(&sc->stsc_data);
+    av_freep(&sc->sample_sizes);
+    av_freep(&sc->keyframes);
+    av_freep(&sc->stts_data);
+    av_freep(&sc->sdtp_data);
+    av_freep(&sc->stps_data);
+    av_freep(&sc->elst_data);
+    av_freep(&sc->rap_group);
+    av_freep(&sc->sync_group);
+    av_freep(&sc->sgpd_sync);
+    av_freep(&sc->sample_offsets);
+    av_freep(&sc->open_key_samples);
+    av_freep(&sc->display_matrix);
+    av_freep(&sc->index_ranges);
+
+    if (sc->extradata)
+        for (int i = 0; i < sc->stsd_count; i++)
+            av_free(sc->extradata[i]);
+    av_freep(&sc->extradata);
+    av_freep(&sc->extradata_size);
+
+    mov_free_encryption_index(&sc->cenc.encryption_index);
+    av_encryption_info_free(sc->cenc.default_encrypted_sample);
+    av_aes_ctr_free(sc->cenc.aes_ctr);
+
+    av_freep(&sc->stereo3d);
+    av_freep(&sc->spherical);
+    av_freep(&sc->mastering);
+    av_freep(&sc->coll);
+}
+
 static int mov_read_close(AVFormatContext *s)
 {
     MOVContext *mov = s->priv_data;
@@ -8594,54 +8651,8 @@ static int mov_read_close(AVFormatContext *s)
 
     for (i = 0; i < s->nb_streams; i++) {
         AVStream *st = s->streams[i];
-        MOVStreamContext *sc = st->priv_data;
-
-        if (!sc)
-            continue;
-
-        av_freep(&sc->ctts_data);
-        for (j = 0; j < sc->drefs_count; j++) {
-            av_freep(&sc->drefs[j].path);
-            av_freep(&sc->drefs[j].dir);
-        }
-        av_freep(&sc->drefs);
-        sc->drefs_count = 0;
-
-        if (!sc->pb_is_copied)
-            ff_format_io_close(s, &sc->pb);
-
-        sc->pb = NULL;
-        av_freep(&sc->chunk_offsets);
-        av_freep(&sc->stsc_data);
-        av_freep(&sc->sample_sizes);
-        av_freep(&sc->keyframes);
-        av_freep(&sc->stts_data);
-        av_freep(&sc->sdtp_data);
-        av_freep(&sc->stps_data);
-        av_freep(&sc->elst_data);
-        av_freep(&sc->rap_group);
-        av_freep(&sc->sync_group);
-        av_freep(&sc->sgpd_sync);
-        av_freep(&sc->sample_offsets);
-        av_freep(&sc->open_key_samples);
-        av_freep(&sc->display_matrix);
-        av_freep(&sc->index_ranges);
-
-        if (sc->extradata)
-            for (j = 0; j < sc->stsd_count; j++)
-                av_free(sc->extradata[j]);
-        av_freep(&sc->extradata);
-        av_freep(&sc->extradata_size);
-
-        mov_free_encryption_index(&sc->cenc.encryption_index);
-        av_encryption_info_free(sc->cenc.default_encrypted_sample);
-        av_aes_ctr_free(sc->cenc.aes_ctr);
-
-        av_freep(&sc->stereo3d);
-        av_freep(&sc->spherical);
-        av_freep(&sc->mastering);
-        av_freep(&sc->coll);
+        mov_free_stream_context(s, st);
     }
 
     av_freep(&mov->dv_demux);
 
From patchwork Wed Jan 31 17:26:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 45940
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 31 Jan 2024 14:26:52 -0300
Message-ID: <20240131172654.15869-4-jamrial@gmail.com>
In-Reply-To: <20240131172654.15869-1-jamrial@gmail.com>
References: <20240131172654.15869-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 4/6 v3] avformat/mov: add support for Immersive Audio Model and Formats in ISOBMFF

Signed-off-by: James Almer
---
 configure            |   2 +-
libavformat/Makefile | 3 +- libavformat/isom.h | 6 + libavformat/mov.c | 283 ++++++++++++++++++++++++++++++++++++++++--- 4 files changed, 276 insertions(+), 18 deletions(-) diff --git a/configure b/configure index 68f675a4bc..42ba5ec502 100755 --- a/configure +++ b/configure @@ -3545,7 +3545,7 @@ matroska_demuxer_suggest="bzlib zlib" matroska_muxer_select="mpeg4audio riffenc aac_adtstoasc_bsf pgs_frame_merge_bsf vp9_superframe_bsf" mlp_demuxer_select="mlp_parser" mmf_muxer_select="riffenc" -mov_demuxer_select="iso_media riffdec" +mov_demuxer_select="iso_media riffdec iamf_frame_split_bsf" mov_demuxer_suggest="zlib" mov_muxer_select="iso_media riffenc rtpenc_chain vp9_superframe_bsf aac_adtstoasc_bsf ac3_parser" mp3_demuxer_select="mpegaudio_parser" diff --git a/libavformat/Makefile b/libavformat/Makefile index 05b9b8a115..4131279e69 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -364,7 +364,8 @@ OBJS-$(CONFIG_MMF_MUXER) += mmf.o rawenc.o OBJS-$(CONFIG_MODS_DEMUXER) += mods.o OBJS-$(CONFIG_MOFLEX_DEMUXER) += moflex.o OBJS-$(CONFIG_MOV_DEMUXER) += mov.o mov_chan.o mov_esds.o \ - qtpalette.o replaygain.o dovi_isom.o + qtpalette.o replaygain.o dovi_isom.o \ + iamf.o OBJS-$(CONFIG_MOV_MUXER) += movenc.o av1.o avc.o hevc.o vvc.o vpcc.o \ movenchint.o mov_chan.o rtp.o \ movenccenc.o movenc_ttml.o rawutils.o \ diff --git a/libavformat/isom.h b/libavformat/isom.h index a4925b3b08..9f202b3d73 100644 --- a/libavformat/isom.h +++ b/libavformat/isom.h @@ -33,6 +33,7 @@ #include "libavutil/stereo3d.h" #include "avio.h" +#include "iamf.h" #include "internal.h" #include "dv.h" @@ -167,6 +168,7 @@ typedef struct MOVStreamContext { AVIOContext *pb; int refcount; int pb_is_copied; + int id; ///< AVStream id int ffindex; ///< AVStream index int next_chunk; unsigned int chunk_count; @@ -261,6 +263,10 @@ typedef struct MOVStreamContext { AVEncryptionInfo *default_encrypted_sample; MOVEncryptionIndex *encryption_index; } cenc; + + IAMFContext *iamf; + uint8_t 
*iamf_descriptors; + int iamf_descriptors_size; } MOVStreamContext; typedef struct HEIFItem { diff --git a/libavformat/mov.c b/libavformat/mov.c index 555c72dc79..981a48e074 100644 --- a/libavformat/mov.c +++ b/libavformat/mov.c @@ -58,6 +58,7 @@ #include "internal.h" #include "avio_internal.h" #include "demux.h" +#include "iamf_parse.h" #include "dovi_isom.h" #include "riff.h" #include "isom.h" @@ -211,6 +212,7 @@ static int mov_read_covr(MOVContext *c, AVIOContext *pb, int type, int len) } st = c->fc->streams[c->fc->nb_streams - 1]; st->priv_data = sc; + sc->id = st->id; sc->refcount = 1; if (st->attached_pic.size >= 8 && id != AV_CODEC_ID_BMP) { @@ -835,6 +837,178 @@ static int mov_read_dac3(MOVContext *c, AVIOContext *pb, MOVAtom atom) return 0; } +static int mov_read_iacb(MOVContext *c, AVIOContext *pb, MOVAtom atom) +{ + AVStream *st; + MOVStreamContext *sc; + FFIOContext b; + AVIOContext *descriptor_pb; + AVDictionary *metadata; + IAMFContext *iamf; + char args[32]; + int64_t start_time, duration; + unsigned size; + int nb_frames, disposition; + int version, ret; + + if (atom.size < 5) + return AVERROR_INVALIDDATA; + + if (c->fc->nb_streams < 1) + return 0; + + version = avio_r8(pb); + if (version != 1) { + av_log(c->fc, AV_LOG_ERROR, "%s configurationVersion %d\n", + version < 1 ? "invalid" : "unsupported", version); + return AVERROR_INVALIDDATA; + } + + size = ffio_read_leb(pb); + if (!size) + return AVERROR_INVALIDDATA; + + st = c->fc->streams[c->fc->nb_streams - 1]; + sc = st->priv_data; + + iamf = sc->iamf = av_mallocz(sizeof(*iamf)); + if (!iamf) + return AVERROR(ENOMEM); + + sc->iamf_descriptors = av_malloc(size); + if (!sc->iamf_descriptors) + return AVERROR(ENOMEM); + + sc->iamf_descriptors_size = size; + ret = avio_read(pb, sc->iamf_descriptors, size); + if (ret != size) + return ret < 0 ? 
ret : AVERROR_INVALIDDATA; + + ffio_init_read_context(&b, sc->iamf_descriptors, size); + descriptor_pb = &b.pub; + + ret = ff_iamfdec_read_descriptors(iamf, descriptor_pb, size, c->fc); + if (ret < 0) + return ret; + + metadata = st->metadata; + st->metadata = NULL; + start_time = st->start_time; + nb_frames = st->nb_frames; + duration = st->duration; + disposition = st->disposition; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = iamf->audio_elements[i]; + AVStreamGroup *stg = + avformat_stream_group_create(c->fc, AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, NULL); + + if (!stg) { + ret = AVERROR(ENOMEM); + goto fail; + } + + stg->id = audio_element->audio_element_id; + stg->params.iamf_audio_element = audio_element->element; + audio_element->element = NULL; + + for (int j = 0; j < audio_element->nb_substreams; j++) { + IAMFSubStream *substream = &audio_element->substreams[j]; + AVStream *stream; + + if (!i && !j) + stream = st; + else + stream = avformat_new_stream(c->fc, NULL); + if (!stream) { + ret = AVERROR(ENOMEM); + goto fail; + } + + stream->start_time = start_time; + stream->nb_frames = nb_frames; + stream->duration = duration; + stream->disposition = disposition; + if (stream != st) { + stream->priv_data = sc; + sc->refcount++; + } + + ret = avcodec_parameters_copy(stream->codecpar, substream->codecpar); + if (ret < 0) + goto fail; + + stream->id = substream->audio_substream_id; + + avpriv_set_pts_info(st, 64, 1, sc->time_scale); + + ret = avformat_stream_group_add_stream(stg, stream); + if (ret < 0) + goto fail; + } + + ret = av_dict_copy(&stg->metadata, metadata, 0); + if (ret < 0) + goto fail; + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i]; + const AVIAMFMixPresentation *mix = mix_presentation->mix; + AVStreamGroup *stg = + avformat_stream_group_create(c->fc, AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, NULL); + + if (!stg) { 
+ ret = AVERROR(ENOMEM); + goto fail; + } + + stg->id = mix_presentation->mix_presentation_id; + stg->params.iamf_mix_presentation = mix_presentation->mix; + mix_presentation->mix = NULL; + + for (int j = 0; j < mix->nb_submixes; j++) { + const AVIAMFSubmix *submix = mix->submixes[j]; + + for (int k = 0; k < submix->nb_elements; k++) { + const AVIAMFSubmixElement *submix_element = submix->elements[k]; + const AVStreamGroup *audio_element = NULL; + + for (int l = 0; l < c->fc->nb_stream_groups; l++) + if (c->fc->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT && + c->fc->stream_groups[l]->id == submix_element->audio_element_id) { + audio_element = c->fc->stream_groups[l]; + break; + } + av_assert0(audio_element); + + for (int l = 0; l < audio_element->nb_streams; l++) { + ret = avformat_stream_group_add_stream(stg, audio_element->streams[l]); + if (ret < 0 && ret != AVERROR(EEXIST)) + goto fail; + } + } + } + + ret = av_dict_copy(&stg->metadata, metadata, 0); + if (ret < 0) + goto fail; + } + + snprintf(args, sizeof(args), "first_index=%d", st->index); + + ret = ff_stream_add_bitstream_filter(st, "iamf_frame_split", args); + if (ret == AVERROR_BSF_NOT_FOUND) { + av_log(c->fc, AV_LOG_ERROR, "iamf_frame_split bitstream filter " + "not found. 
This is a bug, please report it.\n"); + ret = AVERROR_BUG; + } +fail: + av_dict_free(&metadata); + + return ret; +} + static int mov_read_dec3(MOVContext *c, AVIOContext *pb, MOVAtom atom) { AVStream *st; @@ -1380,7 +1554,7 @@ static int64_t get_frag_time(AVFormatContext *s, AVStream *dst_st, // If the stream is referenced by any sidx, limit the search // to fragments that referenced this stream in the sidx if (sc->has_sidx) { - frag_stream_info = get_frag_stream_info(frag_index, index, dst_st->id); + frag_stream_info = get_frag_stream_info(frag_index, index, sc->id); if (frag_stream_info->sidx_pts != AV_NOPTS_VALUE) return frag_stream_info->sidx_pts; if (frag_stream_info->first_tfra_pts != AV_NOPTS_VALUE) @@ -1391,9 +1565,11 @@ static int64_t get_frag_time(AVFormatContext *s, AVStream *dst_st, for (i = 0; i < frag_index->item[index].nb_stream_info; i++) { AVStream *frag_stream = NULL; frag_stream_info = &frag_index->item[index].stream_info[i]; - for (j = 0; j < s->nb_streams; j++) - if (s->streams[j]->id == frag_stream_info->id) + for (j = 0; j < s->nb_streams; j++) { + MOVStreamContext *sc2 = s->streams[j]->priv_data; + if (sc2->id == frag_stream_info->id) frag_stream = s->streams[j]; + } if (!frag_stream) { av_log(s, AV_LOG_WARNING, "No stream matching sidx ID found.\n"); continue; @@ -1459,12 +1635,13 @@ static int update_frag_index(MOVContext *c, int64_t offset) for (i = 0; i < c->fc->nb_streams; i++) { // Avoid building frag index if streams lack track id. 
- if (c->fc->streams[i]->id < 0) { + MOVStreamContext *sc = c->fc->streams[i]->priv_data; + if (sc->id < 0) { av_free(frag_stream_info); return AVERROR_INVALIDDATA; } - frag_stream_info[i].id = c->fc->streams[i]->id; + frag_stream_info[i].id = sc->id; frag_stream_info[i].sidx_pts = AV_NOPTS_VALUE; frag_stream_info[i].tfdt_dts = AV_NOPTS_VALUE; frag_stream_info[i].next_trun_dts = AV_NOPTS_VALUE; @@ -3259,7 +3436,7 @@ static int mov_read_stts(MOVContext *c, AVIOContext *pb, MOVAtom atom) "All samples in data stream index:id [%d:%d] have zero " "duration, stream set to be discarded by default. Override " "using AVStream->discard or -discard for ffmpeg command.\n", - st->index, st->id); + st->index, sc->id); st->discard = AVDISCARD_ALL; } sc->track_end = duration; @@ -4639,6 +4816,50 @@ static void fix_timescale(MOVContext *c, MOVStreamContext *sc) } } +static int mov_update_iamf_streams(MOVContext *c, const AVStream *st) +{ + const MOVStreamContext *sc = st->priv_data; + + for (int i = 0; i < sc->iamf->nb_audio_elements; i++) { + const AVStreamGroup *stg = NULL; + + for (int j = 0; j < c->fc->nb_stream_groups; j++) + if (c->fc->stream_groups[j]->id == sc->iamf->audio_elements[i]->audio_element_id) + stg = c->fc->stream_groups[j]; + av_assert0(stg); + + for (int j = 0; j < stg->nb_streams; j++) { + const FFStream *sti = cffstream(st); + AVStream *out = stg->streams[j]; + FFStream *out_sti = ffstream(stg->streams[j]); + + out->codecpar->bit_rate = 0; + + if (out == st) + continue; + + out->time_base = st->time_base; + out->start_time = st->start_time; + out->duration = st->duration; + out->nb_frames = st->nb_frames; + out->disposition = st->disposition; + out->discard = st->discard; + + av_assert0(!out_sti->index_entries); + out_sti->index_entries = av_malloc(sti->index_entries_allocated_size); + if (!out_sti->index_entries) + return AVERROR(ENOMEM); + + out_sti->index_entries_allocated_size = sti->index_entries_allocated_size; + out_sti->nb_index_entries = 
sti->nb_index_entries; + out_sti->skip_samples = sti->skip_samples; + memcpy(out_sti->index_entries, sti->index_entries, sti->index_entries_allocated_size); + } + } + + return 0; +} + static int mov_read_trak(MOVContext *c, AVIOContext *pb, MOVAtom atom) { AVStream *st; @@ -4702,6 +4923,12 @@ static int mov_read_trak(MOVContext *c, AVIOContext *pb, MOVAtom atom) mov_build_index(c, st); + if (sc->iamf) { + ret = mov_update_iamf_streams(c, st); + if (ret < 0) + return ret; + } + if (sc->dref_id-1 < sc->drefs_count && sc->drefs[sc->dref_id-1].path) { MOVDref *dref = &sc->drefs[sc->dref_id - 1]; if (c->enable_drefs) { @@ -4934,6 +5161,7 @@ static int heif_add_stream(MOVContext *c, HEIFItem *item) st->priv_data = sc; st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO; st->codecpar->codec_id = mov_codec_id(st, item->type); + sc->id = st->id; sc->ffindex = st->index; c->trak_index = st->index; st->avg_frame_rate.num = st->avg_frame_rate.den = 1; @@ -5032,6 +5260,7 @@ static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom) avio_rb32(pb); /* modification time */ } st->id = (int)avio_rb32(pb); /* track id (NOT 0 !)*/ + sc->id = st->id; avio_rb32(pb); /* reserved */ /* highlevel (considering edits) duration in movie timebase */ @@ -5206,7 +5435,8 @@ static int mov_read_tfdt(MOVContext *c, AVIOContext *pb, MOVAtom atom) int64_t base_media_decode_time; for (i = 0; i < c->fc->nb_streams; i++) { - if (c->fc->streams[i]->id == frag->track_id) { + sc = c->fc->streams[i]->priv_data; + if (sc->id == frag->track_id) { st = c->fc->streams[i]; break; } @@ -5259,7 +5489,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom) } for (i = 0; i < c->fc->nb_streams; i++) { - if (c->fc->streams[i]->id == frag->track_id) { + sc = c->fc->streams[i]->priv_data; + if (sc->id == frag->track_id) { st = c->fc->streams[i]; sti = ffstream(st); break; @@ -5562,7 +5793,8 @@ static int mov_read_sidx(MOVContext *c, AVIOContext *pb, MOVAtom atom) track_id = avio_rb32(pb); // 
Reference ID for (i = 0; i < c->fc->nb_streams; i++) { - if (c->fc->streams[i]->id == track_id) { + sc = c->fc->streams[i]->priv_data; + if (sc->id == track_id) { st = c->fc->streams[i]; break; } @@ -6454,7 +6686,8 @@ static int get_current_encryption_info(MOVContext *c, MOVEncryptionIndex **encry frag_stream_info = get_current_frag_stream_info(&c->frag_index); if (frag_stream_info) { for (i = 0; i < c->fc->nb_streams; i++) { - if (c->fc->streams[i]->id == frag_stream_info->id) { + *sc = c->fc->streams[i]->priv_data; + if ((*sc)->id == frag_stream_info->id) { st = c->fc->streams[i]; break; } @@ -7398,7 +7631,7 @@ static int cenc_filter(MOVContext *mov, AVStream* st, MOVStreamContext *sc, AVPa AVEncryptionInfo *encrypted_sample; int encrypted_index, ret; - frag_stream_info = get_frag_stream_info_from_pkt(&mov->frag_index, pkt, st->id); + frag_stream_info = get_frag_stream_info_from_pkt(&mov->frag_index, pkt, sc->id); encrypted_index = current_index; encryption_index = NULL; if (frag_stream_info) { @@ -8175,6 +8408,7 @@ static const MOVParseTableEntry mov_default_parse_table[] = { { MKTAG('i','s','p','e'), mov_read_ispe }, { MKTAG('i','p','r','p'), mov_read_iprp }, { MKTAG('i','i','n','f'), mov_read_iinf }, +{ MKTAG('i','a','c','b'), mov_read_iacb }, { 0, NULL } }; @@ -8406,11 +8640,13 @@ static void mov_read_chapters(AVFormatContext *s) AVStream *st = NULL; FFStream *sti = NULL; chapter_track = mov->chapter_tracks[j]; - for (i = 0; i < s->nb_streams; i++) - if (s->streams[i]->id == chapter_track) { + for (i = 0; i < s->nb_streams; i++) { + sc = mov->fc->streams[i]->priv_data; + if (sc->id == chapter_track) { st = s->streams[i]; break; } + } if (!st) { av_log(s, AV_LOG_ERROR, "Referenced QT chapter track not found\n"); continue; @@ -8642,6 +8878,11 @@ static void mov_free_stream_context(AVFormatContext *s, AVStream *st) av_freep(&sc->spherical); av_freep(&sc->mastering); av_freep(&sc->coll); + + ff_iamf_uninit_context(sc->iamf); + av_freep(&sc->iamf); + 
av_freep(&sc->iamf_descriptors); + sc->iamf_descriptors_size = 0; } static int mov_read_close(AVFormatContext *s) @@ -8896,9 +9137,11 @@ static int mov_read_header(AVFormatContext *s) AVDictionaryEntry *tcr; int tmcd_st_id = -1; - for (j = 0; j < s->nb_streams; j++) - if (s->streams[j]->id == sc->timecode_track) + for (j = 0; j < s->nb_streams; j++) { + MOVStreamContext *sc2 = s->streams[j]->priv_data; + if (sc2->id == sc->timecode_track) tmcd_st_id = j; + } if (tmcd_st_id < 0 || tmcd_st_id == i) continue; @@ -9211,7 +9454,15 @@ static int mov_read_packet(AVFormatContext *s, AVPacket *pkt) if (st->codecpar->codec_id == AV_CODEC_ID_EIA_608 && sample->size > 8) ret = get_eia608_packet(sc->pb, pkt, sample->size); - else + else if (sc->iamf_descriptors_size) { + ret = av_new_packet(pkt, sc->iamf_descriptors_size); + if (ret < 0) + return ret; + pkt->pos = avio_tell(sc->pb); + memcpy(pkt->data, sc->iamf_descriptors, sc->iamf_descriptors_size); + sc->iamf_descriptors_size = 0; + ret = av_append_packet(sc->pb, pkt, sample->size); + } else ret = av_get_packet(sc->pb, pkt, sample->size); if (ret < 0) { if (should_retry(sc->pb, ret)) { From patchwork Wed Jan 31 17:26:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 45941 
From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Wed, 31 Jan 2024 14:26:53 -0300 Message-ID: <20240131172654.15869-5-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240131172654.15869-1-jamrial@gmail.com> References: <20240131172654.15869-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 5/6 v2] avcodec: add an Immersive Audio Model and Formats frame merge bsf X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: LDKOFh2Ad41E Signed-off-by: James Almer --- doc/bitstream_filters.texi | 14 ++ 
libavcodec/bitstream_filters.c | 1 + libavcodec/bsf/Makefile | 1 + libavcodec/bsf/iamf_frame_merge_bsf.c | 228 ++++++++++++++++++++++++++ libavcodec/leb.h | 22 +++ 5 files changed, 266 insertions(+) create mode 100644 libavcodec/bsf/iamf_frame_merge_bsf.c diff --git a/doc/bitstream_filters.texi b/doc/bitstream_filters.texi index 7e0cfa3e26..879182f00f 100644 --- a/doc/bitstream_filters.texi +++ b/doc/bitstream_filters.texi @@ -478,6 +478,20 @@ are coded. Lowest stream index value to set in output packets @end table +@section iamf_frame_merge + +Encapsulate audio data packets from different streams and merge them +into a single packet of Audio Frame OBUs. + +@table @option +@item index_mapping +A :-separated list of stream_index=audio_substream_id entries to set +stream id in output Audio Frame OBUs + +@item out_index +Stream index to set in output packets +@end table + @section imxdump Modifies the bitstream to fit in MOV and to be usable by the Final Cut diff --git a/libavcodec/bitstream_filters.c b/libavcodec/bitstream_filters.c index 476331ec8a..61c090a2f1 100644 --- a/libavcodec/bitstream_filters.c +++ b/libavcodec/bitstream_filters.c @@ -42,6 +42,7 @@ extern const FFBitStreamFilter ff_h264_redundant_pps_bsf; extern const FFBitStreamFilter ff_hapqa_extract_bsf; extern const FFBitStreamFilter ff_hevc_metadata_bsf; extern const FFBitStreamFilter ff_hevc_mp4toannexb_bsf; +extern const FFBitStreamFilter ff_iamf_frame_merge_bsf; extern const FFBitStreamFilter ff_iamf_frame_split_bsf; extern const FFBitStreamFilter ff_imx_dump_header_bsf; extern const FFBitStreamFilter ff_media100_to_mjpegb_bsf; diff --git a/libavcodec/bsf/Makefile b/libavcodec/bsf/Makefile index cb23428f4a..ff024d47f1 100644 --- a/libavcodec/bsf/Makefile +++ b/libavcodec/bsf/Makefile @@ -20,6 +20,7 @@ OBJS-$(CONFIG_H264_REDUNDANT_PPS_BSF) += bsf/h264_redundant_pps.o OBJS-$(CONFIG_HAPQA_EXTRACT_BSF) += bsf/hapqa_extract.o OBJS-$(CONFIG_HEVC_METADATA_BSF) += bsf/h265_metadata.o OBJS-$(CONFIG_HEVC_MP4TOANNEXB_BSF) 
+= bsf/hevc_mp4toannexb.o +OBJS-$(CONFIG_IAMF_FRAME_MERGE_BSF) += bsf/iamf_frame_merge_bsf.o OBJS-$(CONFIG_IAMF_FRAME_SPLIT_BSF) += bsf/iamf_frame_split_bsf.o OBJS-$(CONFIG_IMX_DUMP_HEADER_BSF) += bsf/imx_dump_header.o OBJS-$(CONFIG_MEDIA100_TO_MJPEGB_BSF) += bsf/media100_to_mjpegb.o diff --git a/libavcodec/bsf/iamf_frame_merge_bsf.c b/libavcodec/bsf/iamf_frame_merge_bsf.c new file mode 100644 index 0000000000..98f37be653 --- /dev/null +++ b/libavcodec/bsf/iamf_frame_merge_bsf.c @@ -0,0 +1,228 @@ +/* + * Copyright (c) 2024 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include + +#include "libavutil/dict.h" +#include "libavutil/fifo.h" +#include "libavutil/opt.h" +#include "libavformat/iamf.h" +#include "bsf.h" +#include "bsf_internal.h" +#include "bytestream.h" +#include "get_bits.h" +#include "leb.h" +#include "put_bits.h" + +typedef struct IAMFMergeContext { + AVClass *class; + + AVFifo *fifo; + + // AVOptions + AVDictionary *index_mapping; + int stream_count; + int out_index; +} IAMFMergeContext; + +static int find_id_from_idx(AVBSFContext *ctx, int idx) +{ + IAMFMergeContext *const c = ctx->priv_data; + const AVDictionaryEntry *e = NULL; + + while (e = av_dict_iterate(c->index_mapping, e)) { + char *endptr = NULL; + int id, map_idx = strtol(e->key, &endptr, 0); + if (!endptr || *endptr) + return AVERROR_INVALIDDATA; + endptr = NULL; + id = strtol(e->value, &endptr, 0); + if (!endptr || *endptr) + return AVERROR_INVALIDDATA; + if (map_idx == idx) + return id; + } + + av_log(ctx, AV_LOG_ERROR, "Invalid stream idx %d\n", idx); + return AVERROR_INVALIDDATA; +} + +static int iamf_frame_merge_filter(AVBSFContext *ctx, AVPacket *out) +{ + IAMFMergeContext *const c = ctx->priv_data; + AVPacket *pkt; + int ret; + + while (av_fifo_can_write(c->fifo)) { + ret = ff_bsf_get_packet(ctx, &pkt); + if (ret < 0) + return ret; + av_fifo_write(c->fifo, &pkt, 1); + } + + pkt = NULL; + while (av_fifo_can_read(c->fifo)) { + PutBitContext pb; + PutByteContext p; + uint8_t *side_data, header[MAX_IAMF_OBU_HEADER_SIZE], obu[8]; + unsigned int obu_header; + unsigned int skip_samples = 0, discard_padding = 0; + size_t side_data_size; + int header_size, obu_size, old_out_size = out->size; + int id, type; + + av_packet_free(&pkt); + av_fifo_read(c->fifo, &pkt, 1); + id = find_id_from_idx(ctx, pkt->stream_index); + if (id 
< 0) + return AVERROR_INVALIDDATA; + + type = id <= 17 ? id + IAMF_OBU_IA_AUDIO_FRAME_ID0 : IAMF_OBU_IA_AUDIO_FRAME; + + side_data = av_packet_get_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, + &side_data_size); + + if (side_data && side_data_size >= 10) { + skip_samples = AV_RL32(side_data); + discard_padding = AV_RL32(side_data + 4); + } + + init_put_bits(&pb, (uint8_t *)&obu_header, sizeof(obu_header)); + put_bits(&pb, 5, type); + put_bits(&pb, 1, 0); // obu_redundant_copy + put_bits(&pb, 1, skip_samples || discard_padding); + put_bits(&pb, 1, 0); // obu_extension_flag + flush_put_bits(&pb); + + init_put_bits(&pb, header, sizeof(header)); + if (skip_samples || discard_padding) { + put_leb(&pb, discard_padding); + put_leb(&pb, skip_samples); + } + if (id > 17) + put_leb(&pb, id); + flush_put_bits(&pb); + + header_size = put_bytes_count(&pb, 1); + + init_put_bits(&pb, obu, sizeof(obu)); + put_leb(&pb, header_size + pkt->size); + flush_put_bits(&pb); + + obu_size = put_bytes_count(&pb, 1); + + ret = av_grow_packet(out, 1 + obu_size + header_size + pkt->size); + if (ret < 0) + goto fail; + + bytestream2_init_writer(&p, out->data + old_out_size, 1 + obu_size + header_size + pkt->size); + bytestream2_put_byteu(&p, obu_header); + bytestream2_put_bufferu(&p, obu, obu_size); + bytestream2_put_bufferu(&p, header, header_size); + bytestream2_put_bufferu(&p, pkt->data, pkt->size); + } + + ret = av_packet_copy_props(out, pkt); + if (ret < 0) + goto fail; + out->stream_index = c->out_index; + + ret = 0; +fail: + av_packet_free(&pkt); + if (ret < 0) + av_packet_unref(out); + return ret; +} + +static int iamf_frame_merge_init(AVBSFContext *ctx) +{ + IAMFMergeContext *const c = ctx->priv_data; + + if (!c->index_mapping) { + av_log(ctx, AV_LOG_ERROR, "Empty index map\n"); + return AVERROR(EINVAL); + } + + c->fifo = av_fifo_alloc2(av_dict_count(c->index_mapping), sizeof(AVPacket*), 0); + if (!c->fifo) + return AVERROR(ENOMEM); + + return 0; +} + +static void 
iamf_frame_merge_flush(AVBSFContext *ctx) +{ + IAMFMergeContext *const c = ctx->priv_data; + + while (av_fifo_can_read(c->fifo)) { + AVPacket *pkt; + av_fifo_read(c->fifo, &pkt, 1); + av_packet_free(&pkt); + } + av_fifo_reset2(c->fifo); +} + +static void iamf_frame_merge_close(AVBSFContext *ctx) +{ + IAMFMergeContext *const c = ctx->priv_data; + + if (c->fifo) + iamf_frame_merge_flush(ctx); + av_fifo_freep2(&c->fifo); +} + +#define OFFSET(x) offsetof(IAMFMergeContext, x) +#define FLAGS (AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_BSF_PARAM) +static const AVOption iamf_frame_merge_options[] = { + { "index_mapping", "a :-separated list of stream_index=audio_substream_id entries " + "to set stream id in output Audio Frame OBUs", + OFFSET(index_mapping), AV_OPT_TYPE_DICT, { .str = NULL }, 0, 0, FLAGS }, + { "out_index", "Stream index to set in output packets", + OFFSET(out_index), AV_OPT_TYPE_INT, { 0 }, 0, INT_MAX, FLAGS }, + { NULL } +}; + +static const AVClass iamf_frame_merge_class = { + .class_name = "iamf_frame_merge_bsf", + .item_name = av_default_item_name, + .option = iamf_frame_merge_options, + .version = LIBAVUTIL_VERSION_INT, +}; + +static const enum AVCodecID iamf_frame_merge_codec_ids[] = { + AV_CODEC_ID_PCM_S16LE, AV_CODEC_ID_PCM_S16BE, + AV_CODEC_ID_PCM_S24LE, AV_CODEC_ID_PCM_S24BE, + AV_CODEC_ID_PCM_S32LE, AV_CODEC_ID_PCM_S32BE, + AV_CODEC_ID_OPUS, AV_CODEC_ID_AAC, + AV_CODEC_ID_FLAC, AV_CODEC_ID_NONE, +}; + +const FFBitStreamFilter ff_iamf_frame_merge_bsf = { + .p.name = "iamf_frame_merge", + .p.codec_ids = iamf_frame_merge_codec_ids, + .p.priv_class = &iamf_frame_merge_class, + .priv_data_size = sizeof(IAMFMergeContext), + .init = iamf_frame_merge_init, + .flush = iamf_frame_merge_flush, + .close = iamf_frame_merge_close, + .filter = iamf_frame_merge_filter, +}; diff --git a/libavcodec/leb.h b/libavcodec/leb.h index 5159c434b1..3f00b2988d 100644 --- a/libavcodec/leb.h +++ b/libavcodec/leb.h @@ -25,6 +25,7 @@ #define AVCODEC_LEB_H #include "get_bits.h" 
+#include "put_bits.h" /** * Read a unsigned integer coded as a variable number of up to eight @@ -67,4 +68,25 @@ static inline int64_t get_leb128(GetBitContext *gb) { return ret; } +/** + * Write an unsigned integer coded as a variable number of up to eight + * little-endian bytes, where the MSB in a byte signals another byte + * is coded. + */ +static inline void put_leb(PutBitContext *s, unsigned value) +{ + int len; + uint8_t byte; + + len = (av_log2(value) + 7) / 7; + + for (int i = 0; i < len; i++) { + byte = value >> (7 * i) & 0x7f; + if (i < len - 1) + byte |= 0x80; + + put_bits_no_assert(s, 8, byte); + } +} + #endif /* AVCODEC_LEB_H */ From patchwork Wed Jan 31 17:26:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 45942 
[190.225.105.197]) by smtp.gmail.com with ESMTPSA id k11-20020a170902f28b00b001d8e4b85636sm5762235plc.138.2024.01.31.09.26.50 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Jan 2024 09:26:51 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Wed, 31 Jan 2024 14:26:54 -0300 Message-ID: <20240131172654.15869-6-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240131172654.15869-1-jamrial@gmail.com> References: <20240131172654.15869-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 6/6 v2] avformat/movenc: add support for Immersive Audio Model and Formats in ISOBMFF X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 33qIPBoNSFsq Signed-off-by: James Almer --- configure | 2 +- libavformat/movenc.c | 323 ++++++++++++++++++++++++++++++++++--------- libavformat/movenc.h | 7 + 3 files changed, 269 insertions(+), 63 deletions(-) diff --git a/configure b/configure index 42ba5ec502..6cdd101487 100755 --- a/configure +++ b/configure @@ -3547,7 +3547,7 @@ mlp_demuxer_select="mlp_parser" mmf_muxer_select="riffenc" mov_demuxer_select="iso_media riffdec iamf_frame_split_bsf" mov_demuxer_suggest="zlib" -mov_muxer_select="iso_media riffenc rtpenc_chain vp9_superframe_bsf aac_adtstoasc_bsf ac3_parser" +mov_muxer_select="iso_media riffenc rtpenc_chain vp9_superframe_bsf aac_adtstoasc_bsf iamf_frame_merge_bsf ac3_parser" mp3_demuxer_select="mpegaudio_parser" mp3_muxer_select="mpegaudioheader" mp4_muxer_select="mov_muxer" diff --git a/libavformat/movenc.c b/libavformat/movenc.c index b724bd5ebc..dfa8b6b04e 100644 --- a/libavformat/movenc.c +++ b/libavformat/movenc.c @@ -32,6 +32,7 @@ #include "dovi_isom.h" #include "riff.h" #include 
"avio.h" +#include "iamf_writer.h" #include "isom.h" #include "av1.h" #include "avc.h" @@ -47,6 +48,7 @@ #include "libavcodec/raw.h" #include "internal.h" #include "libavutil/avstring.h" +#include "libavutil/bprint.h" #include "libavutil/channel_layout.h" #include "libavutil/csp.h" #include "libavutil/intfloat.h" @@ -316,6 +318,32 @@ static int mov_write_sdtp_tag(AVIOContext *pb, MOVTrack *track) return update_size(pb, pos); } +static int mov_write_iacb_tag(AVFormatContext *s, AVIOContext *pb, MOVTrack *track) +{ + AVIOContext *dyn_bc; + int64_t pos = avio_tell(pb); + uint8_t *dyn_buf = NULL; + int dyn_size; + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + avio_wb32(pb, 0); + ffio_wfourcc(pb, "iacb"); + avio_w8(pb, 1); // configurationVersion + + ret = ff_iamf_write_descriptors(track->iamf, dyn_bc, s); + if (ret < 0) + return ret; + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + ffio_write_leb(pb, dyn_size); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return update_size(pb, pos); +} + static int mov_write_amr_tag(AVIOContext *pb, MOVTrack *track) { avio_wb32(pb, 0x11); /* size */ @@ -1358,6 +1386,8 @@ static int mov_write_audio_tag(AVFormatContext *s, AVIOContext *pb, MOVMuxContex ret = mov_write_wave_tag(s, pb, track); else if (track->tag == MKTAG('m','p','4','a')) ret = mov_write_esds_tag(pb, track); + else if (track->tag == MKTAG('i','a','m','f')) + ret = mov_write_iacb_tag(mov->fc, pb, track); else if (track->par->codec_id == AV_CODEC_ID_AMR_NB) ret = mov_write_amr_tag(pb, track); else if (track->par->codec_id == AV_CODEC_ID_AC3) @@ -2501,7 +2531,7 @@ static int mov_write_video_tag(AVFormatContext *s, AVIOContext *pb, MOVMuxContex if (track->mode == MODE_AVIF) { mov_write_ccst_tag(pb); - if (s->nb_streams > 0 && track == &mov->tracks[1]) + if (mov->nb_streams > 0 && track == &mov->tracks[1]) mov_write_aux_tag(pb, "auxi"); } @@ -3096,9 +3126,9 @@ static int mov_write_iloc_tag(AVIOContext *pb, MOVMuxContext *mov, 
AVFormatConte avio_wb32(pb, 0); /* Version & flags */ avio_w8(pb, (4 << 4) + 4); /* offset_size(4) and length_size(4) */ avio_w8(pb, 0); /* base_offset_size(4) and reserved(4) */ - avio_wb16(pb, s->nb_streams); /* item_count */ + avio_wb16(pb, mov->nb_streams); /* item_count */ - for (int i = 0; i < s->nb_streams; i++) { + for (int i = 0; i < mov->nb_streams; i++) { avio_wb16(pb, i + 1); /* item_id */ avio_wb16(pb, 0); /* data_reference_index */ avio_wb16(pb, 1); /* extent_count */ @@ -3117,9 +3147,9 @@ static int mov_write_iinf_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatConte avio_wb32(pb, 0); /* size */ ffio_wfourcc(pb, "iinf"); avio_wb32(pb, 0); /* Version & flags */ - avio_wb16(pb, s->nb_streams); /* entry_count */ + avio_wb16(pb, mov->nb_streams); /* entry_count */ - for (int i = 0; i < s->nb_streams; i++) { + for (int i = 0; i < mov->nb_streams; i++) { int64_t infe_pos = avio_tell(pb); avio_wb32(pb, 0); /* size */ ffio_wfourcc(pb, "infe"); @@ -3188,7 +3218,7 @@ static int mov_write_ipco_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatConte int64_t pos = avio_tell(pb); avio_wb32(pb, 0); /* size */ ffio_wfourcc(pb, "ipco"); - for (int i = 0; i < s->nb_streams; i++) { + for (int i = 0; i < mov->nb_streams; i++) { mov_write_ispe_tag(pb, mov, s, i); mov_write_pixi_tag(pb, mov, s, i); mov_write_av1c_tag(pb, &mov->tracks[i]); @@ -3206,9 +3236,9 @@ static int mov_write_ipma_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatConte avio_wb32(pb, 0); /* size */ ffio_wfourcc(pb, "ipma"); avio_wb32(pb, 0); /* Version & flags */ - avio_wb32(pb, s->nb_streams); /* entry_count */ + avio_wb32(pb, mov->nb_streams); /* entry_count */ - for (int i = 0, index = 1; i < s->nb_streams; i++) { + for (int i = 0, index = 1; i < mov->nb_streams; i++) { avio_wb16(pb, i + 1); /* item_ID */ avio_w8(pb, 4); /* association_count */ @@ -4185,7 +4215,7 @@ static int mov_write_covr(AVIOContext *pb, AVFormatContext *s) int64_t pos = 0; int i; - for (i = 0; i < s->nb_streams; i++) { + for (i 
= 0; i < mov->nb_streams; i++) { MOVTrack *trk = &mov->tracks[i]; if (!is_cover_image(trk->st) || trk->cover_image->size <= 0) @@ -4332,7 +4362,7 @@ static int mov_write_meta_tag(AVIOContext *pb, MOVMuxContext *mov, mov_write_pitm_tag(pb, 1); mov_write_iloc_tag(pb, mov, s); mov_write_iinf_tag(pb, mov, s); - if (s->nb_streams > 1) + if (mov->nb_streams > 1) mov_write_iref_tag(pb, mov, s); mov_write_iprp_tag(pb, mov, s); } else { @@ -4583,16 +4613,17 @@ static int mov_setup_track_ids(MOVMuxContext *mov, AVFormatContext *s) if (mov->use_stream_ids_as_track_ids) { int next_generated_track_id = 0; - for (i = 0; i < s->nb_streams; i++) { - if (s->streams[i]->id > next_generated_track_id) - next_generated_track_id = s->streams[i]->id; + for (i = 0; i < mov->nb_streams; i++) { + AVStream *st = mov->tracks[i].st; + if (st->id > next_generated_track_id) + next_generated_track_id = st->id; } for (i = 0; i < mov->nb_tracks; i++) { if (mov->tracks[i].entry <= 0 && !(mov->flags & FF_MOV_FLAG_FRAGMENT)) continue; - mov->tracks[i].track_id = i >= s->nb_streams ? ++next_generated_track_id : s->streams[i]->id; + mov->tracks[i].track_id = i >= mov->nb_streams ? ++next_generated_track_id : mov->tracks[i].st->id; } } else { for (i = 0; i < mov->nb_tracks; i++) { @@ -4629,7 +4660,7 @@ static int mov_write_moov_tag(AVIOContext *pb, MOVMuxContext *mov, } if (mov->chapter_track) - for (i = 0; i < s->nb_streams; i++) { + for (i = 0; i < mov->nb_streams; i++) { mov->tracks[i].tref_tag = MKTAG('c','h','a','p'); mov->tracks[i].tref_id = mov->tracks[mov->chapter_track].track_id; } @@ -4669,7 +4700,7 @@ static int mov_write_moov_tag(AVIOContext *pb, MOVMuxContext *mov, for (i = 0; i < mov->nb_tracks; i++) { if (mov->tracks[i].entry > 0 || mov->flags & FF_MOV_FLAG_FRAGMENT || mov->mode == MODE_AVIF) { - int ret = mov_write_trak_tag(s, pb, mov, &(mov->tracks[i]), i < s->nb_streams ? s->streams[i] : NULL); + int ret = mov_write_trak_tag(s, pb, mov, &(mov->tracks[i]), i < mov->nb_streams ? 
mov->tracks[i].st : NULL); if (ret < 0) return ret; } @@ -5463,8 +5494,8 @@ static int mov_write_ftyp_tag(AVIOContext *pb, AVFormatContext *s) int has_h264 = 0, has_av1 = 0, has_video = 0, has_dolby = 0; int i; - for (i = 0; i < s->nb_streams; i++) { - AVStream *st = s->streams[i]; + for (i = 0; i < mov->nb_streams; i++) { + AVStream *st = mov->tracks[i].st; if (is_cover_image(st)) continue; if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) @@ -5639,8 +5670,8 @@ static int mov_write_identification(AVIOContext *pb, AVFormatContext *s) mov_write_ftyp_tag(pb,s); if (mov->mode == MODE_PSP) { int video_streams_nb = 0, audio_streams_nb = 0, other_streams_nb = 0; - for (i = 0; i < s->nb_streams; i++) { - AVStream *st = s->streams[i]; + for (i = 0; i < mov->nb_streams; i++) { + AVStream *st = mov->tracks[i].st; if (is_cover_image(st)) continue; if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) @@ -5827,7 +5858,7 @@ static int mov_write_squashed_packets(AVFormatContext *s) { MOVMuxContext *mov = s->priv_data; - for (int i = 0; i < s->nb_streams; i++) { + for (int i = 0; i < mov->nb_streams; i++) { MOVTrack *track = &mov->tracks[i]; int ret = AVERROR_BUG; @@ -5868,7 +5899,7 @@ static int mov_flush_fragment(AVFormatContext *s, int force) // of fragments was triggered automatically by an AVPacket, we // already have reliable info for the end of that track, but other // tracks may need to be filled in. 
- for (i = 0; i < s->nb_streams; i++) { + for (i = 0; i < mov->nb_streams; i++) { MOVTrack *track = &mov->tracks[i]; if (!track->end_reliable) { const AVPacket *pkt = ff_interleaved_peek(s, i); @@ -6069,10 +6100,8 @@ static int mov_auto_flush_fragment(AVFormatContext *s, int force) return ret; } -static int check_pkt(AVFormatContext *s, AVPacket *pkt) +static int check_pkt(AVFormatContext *s, MOVTrack *trk, AVPacket *pkt) { - MOVMuxContext *mov = s->priv_data; - MOVTrack *trk = &mov->tracks[pkt->stream_index]; int64_t ref; uint64_t duration; @@ -6110,15 +6139,21 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt) { MOVMuxContext *mov = s->priv_data; AVIOContext *pb = s->pb; - MOVTrack *trk = &mov->tracks[pkt->stream_index]; - AVCodecParameters *par = trk->par; + MOVTrack *trk; + AVCodecParameters *par; AVProducerReferenceTime *prft; unsigned int samples_in_chunk = 0; int size = pkt->size, ret = 0, offset = 0; size_t prft_size; uint8_t *reformatted_data = NULL; - ret = check_pkt(s, pkt); + if (pkt->stream_index < s->nb_streams) + trk = s->streams[pkt->stream_index]->priv_data; + else // Timecode or chapter + trk = &mov->tracks[pkt->stream_index]; + par = trk->par; + + ret = check_pkt(s, trk, pkt); if (ret < 0) return ret; @@ -6208,7 +6243,7 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt) if (par->codec_id == AV_CODEC_ID_AAC && pkt->size > 2 && (AV_RB16(pkt->data) & 0xfff0) == 0xfff0) { - if (!s->streams[pkt->stream_index]->nb_frames) { + if (!trk->st->nb_frames) { av_log(s, AV_LOG_ERROR, "Malformed AAC bitstream detected: " "use the audio bitstream filter 'aac_adtstoasc' to fix it " "('-bsf:a aac_adtstoasc' option with ffmpeg)\n"); @@ -6470,18 +6505,18 @@ err: static int mov_write_single_packet(AVFormatContext *s, AVPacket *pkt) { MOVMuxContext *mov = s->priv_data; - MOVTrack *trk = &mov->tracks[pkt->stream_index]; + MOVTrack *trk = s->streams[pkt->stream_index]->priv_data; AVCodecParameters *par = trk->par; int64_t frag_duration = 0; int 
size = pkt->size; - int ret = check_pkt(s, pkt); + int ret = check_pkt(s, trk, pkt); if (ret < 0) return ret; if (mov->flags & FF_MOV_FLAG_FRAG_DISCONT) { int i; - for (i = 0; i < s->nb_streams; i++) + for (i = 0; i < mov->nb_streams; i++) mov->tracks[i].frag_discont = 1; mov->flags &= ~FF_MOV_FLAG_FRAG_DISCONT; } @@ -6523,7 +6558,7 @@ static int mov_write_single_packet(AVFormatContext *s, AVPacket *pkt) return 0; /* Discard 0 sized packets */ } - if (trk->entry && pkt->stream_index < s->nb_streams) + if (trk->entry && pkt->stream_index < mov->nb_streams) frag_duration = av_rescale_q(pkt->dts - trk->cluster[0].dts, s->streams[pkt->stream_index]->time_base, AV_TIME_BASE_Q); @@ -6578,17 +6613,45 @@ static int mov_write_subtitle_end_packet(AVFormatContext *s, return ret; } +static int mov_filter_packet(AVFormatContext *s, MOVTrack *track, AVPacket *pkt) +{ + int ret; + + if (!track->bsf) + return 0; + + ret = av_bsf_send_packet(track->bsf, pkt); + if (ret < 0) { + av_log(s, AV_LOG_ERROR, + "Failed to send packet to filter %s for stream %d: %s\n", + track->bsf->filter->name, pkt->stream_index, av_err2str(ret)); + return ret; + } + + return av_bsf_receive_packet(track->bsf, pkt); +} + static int mov_write_packet(AVFormatContext *s, AVPacket *pkt) { MOVMuxContext *mov = s->priv_data; MOVTrack *trk; + int ret; if (!pkt) { mov_flush_fragment(s, 1); return 1; } - trk = &mov->tracks[pkt->stream_index]; + trk = s->streams[pkt->stream_index]->priv_data; + + ret = mov_filter_packet(s, trk, pkt); + if (ret < 0) { + if (ret == AVERROR(EAGAIN)) + return 0; + av_log(s, AV_LOG_ERROR, "Error applying bitstream filters to an output " + "packet for stream #%d: %s\n", trk->st->index, av_err2str(ret)); + return ret; + } if (is_cover_image(trk->st)) { int ret; @@ -6789,12 +6852,12 @@ static int mov_create_chapter_track(AVFormatContext *s, int tracknum) } -static int mov_check_timecode_track(AVFormatContext *s, AVTimecode *tc, int src_index, const char *tcstr) +static int 
mov_check_timecode_track(AVFormatContext *s, AVTimecode *tc, AVStream *src_st, const char *tcstr) { int ret; /* compute the frame number */ - ret = av_timecode_init_from_string(tc, s->streams[src_index]->avg_frame_rate, tcstr, s); + ret = av_timecode_init_from_string(tc, src_st->avg_frame_rate, tcstr, s); return ret; } @@ -6802,7 +6865,7 @@ static int mov_create_timecode_track(AVFormatContext *s, int index, int src_inde { MOVMuxContext *mov = s->priv_data; MOVTrack *track = &mov->tracks[index]; - AVStream *src_st = s->streams[src_index]; + AVStream *src_st = mov->tracks[src_index].st; uint8_t data[4]; AVPacket *pkt = mov->pkt; AVRational rate = src_st->avg_frame_rate; @@ -6862,8 +6925,8 @@ static void enable_tracks(AVFormatContext *s) first[i] = -1; } - for (i = 0; i < s->nb_streams; i++) { - AVStream *st = s->streams[i]; + for (i = 0; i < mov->nb_streams; i++) { + AVStream *st = mov->tracks[i].st; if (st->codecpar->codec_type <= AVMEDIA_TYPE_UNKNOWN || st->codecpar->codec_type >= AVMEDIA_TYPE_NB || @@ -6897,6 +6960,9 @@ static void mov_free(AVFormatContext *s) MOVMuxContext *mov = s->priv_data; int i; + for (i = 0; i < s->nb_streams; i++) + s->streams[i]->priv_data = NULL; + if (!mov->tracks) return; @@ -6927,6 +6993,7 @@ static void mov_free(AVFormatContext *s) ffio_free_dyn_buf(&track->mdat_buf); avpriv_packet_list_free(&track->squashed_packet_queue); + av_bsf_free(&track->bsf); } av_freep(&mov->tracks); @@ -6999,6 +7066,92 @@ static int mov_create_dvd_sub_decoder_specific_info(MOVTrack *track, return 0; } +static int mov_init_iamf_track(AVFormatContext *s) +{ + MOVMuxContext *mov = s->priv_data; + MOVTrack *track = &mov->tracks[0]; // IAMF if present is always the first track + const AVBitStreamFilter *filter; + AVBPrint bprint; + AVStream *first_st = NULL; + char *args; + int nb_audio_elements = 0, nb_mix_presentations = 0; + int ret; + + for (int i = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + + if (stg->type == 
AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + nb_audio_elements++; + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + nb_mix_presentations++; + } + + if (!nb_audio_elements && !nb_mix_presentations) + return 0; + + if ((nb_audio_elements < 1 || nb_audio_elements > 2) || nb_mix_presentations < 1) { + av_log(s, AV_LOG_ERROR, "There must be >= 1 and <= 2 IAMF_AUDIO_ELEMENT and at least " + "one IAMF_MIX_PRESENTATION stream groups to write an IAMF track\n"); + return AVERROR(EINVAL); + } + + track->iamf = av_mallocz(sizeof(*track->iamf)); + if (!track->iamf) + return AVERROR(ENOMEM); + + av_bprint_init(&bprint, 0, AV_BPRINT_SIZE_UNLIMITED); + + for (int i = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + + switch(stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + if (!first_st) + first_st = stg->streams[0]; + + for (int j = 0; j < stg->nb_streams; j++) { + av_bprintf(&bprint, "%d=%d%s", stg->streams[j]->index, stg->streams[j]->id, + j < (stg->nb_streams - 1) ? ":" : ""); + stg->streams[j]->priv_data = track; + } + + ret = ff_iamf_add_audio_element(track->iamf, stg, s); + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + ret = ff_iamf_add_mix_presentation(track->iamf, stg, s); + break; + default: + av_assert0(0); + } + if (ret < 0) + return ret; + } + + av_bprint_finalize(&bprint, &args); + + filter = av_bsf_get_by_name("iamf_frame_merge"); + if (!filter) { + av_log(s, AV_LOG_ERROR, "iamf_frame_merge bitstream filter " + "not found.
This is a bug, please report it.\n"); + return AVERROR_BUG; + } + + ret = av_bsf_alloc(filter, &track->bsf); + if (ret < 0) + return ret; + + ret = avcodec_parameters_copy(track->bsf->par_in, first_st->codecpar); + if (ret < 0) + return ret; + + av_opt_set(track->bsf->priv_data, "index_mapping", args, 0); + av_opt_set_int(track->bsf->priv_data, "out_index", first_st->index, 0); + + track->tag = MKTAG('i','a','m','f'); + + return av_bsf_init(track->bsf); +} + static int mov_init(AVFormatContext *s) { MOVMuxContext *mov = s->priv_data; @@ -7136,7 +7289,37 @@ static int mov_init(AVFormatContext *s) s->streams[0]->disposition |= AV_DISPOSITION_DEFAULT; } - mov->nb_tracks = s->nb_streams; + for (i = 0; i < s->nb_stream_groups; i++) { + AVStreamGroup *stg = s->stream_groups[i]; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + continue; + + for (int j = 0; j < stg->nb_streams; j++) { + AVStream *st = stg->streams[j]; + + if (st->priv_data) { + av_log(s, AV_LOG_ERROR, "Stream %d is present in more than one Stream Group of type " + "IAMF Audio Element\n", j); + return AVERROR(EINVAL); + } + st->priv_data = st; + } + + if (!mov->nb_tracks) // We support one track for the entire IAMF structure + mov->nb_tracks++; + } + + for (i = 0; i < s->nb_streams; i++) { + AVStream *st = s->streams[i]; + if (st->priv_data) + continue; + st->priv_data = st; + mov->nb_tracks++; + } + + mov->nb_streams = mov->nb_tracks; + if (mov->mode & (MODE_MP4|MODE_MOV|MODE_IPOD) && s->nb_chapters) mov->chapter_track = mov->nb_tracks++; @@ -7162,7 +7345,7 @@ static int mov_init(AVFormatContext *s) if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO && (t || (t=av_dict_get(st->metadata, "timecode", NULL, 0)))) { AVTimecode tc; - ret = mov_check_timecode_track(s, &tc, i, t->value); + ret = mov_check_timecode_track(s, &tc, st, t->value); if (ret >= 0) mov->nb_meta_tmcd++; } @@ -7211,18 +7394,33 @@ static int mov_init(AVFormatContext *s) } } + ret = mov_init_iamf_track(s); + if (ret < 0) + 
return ret; + + for (int j = 0, i = 0; j < s->nb_streams; j++) { + AVStream *st = s->streams[j]; + + if (st != st->priv_data) + continue; + st->priv_data = &mov->tracks[i++]; + } + for (i = 0; i < s->nb_streams; i++) { AVStream *st= s->streams[i]; - MOVTrack *track= &mov->tracks[i]; + MOVTrack *track = st->priv_data; AVDictionaryEntry *lang = av_dict_get(st->metadata, "language", NULL,0); - track->st = st; - track->par = st->codecpar; + if (!track->st) { + track->st = st; + track->par = st->codecpar; + } track->language = ff_mov_iso639_to_lang(lang?lang->value:"und", mov->mode!=MODE_MOV); if (track->language < 0) track->language = 32767; // Unspecified Macintosh language code track->mode = mov->mode; - track->tag = mov_find_codec_tag(s, track); + if (!track->tag) + track->tag = mov_find_codec_tag(s, track); if (!track->tag) { av_log(s, AV_LOG_ERROR, "Could not find tag for codec %s in stream #%d, " "codec not currently supported in container\n", @@ -7414,25 +7612,26 @@ static int mov_write_header(AVFormatContext *s) { AVIOContext *pb = s->pb; MOVMuxContext *mov = s->priv_data; - int i, ret, hint_track = 0, tmcd_track = 0, nb_tracks = s->nb_streams; + int i, ret, hint_track = 0, tmcd_track = 0, nb_tracks = mov->nb_streams; if (mov->mode & (MODE_MP4|MODE_MOV|MODE_IPOD) && s->nb_chapters) nb_tracks++; if (mov->flags & FF_MOV_FLAG_RTP_HINT) { hint_track = nb_tracks; - for (i = 0; i < s->nb_streams; i++) - if (rtp_hinting_needed(s->streams[i])) + for (i = 0; i < mov->nb_streams; i++) { + if (rtp_hinting_needed(mov->tracks[i].st)) nb_tracks++; + } } if (mov->nb_meta_tmcd) tmcd_track = nb_tracks; - for (i = 0; i < s->nb_streams; i++) { + for (i = 0; i < mov->nb_streams; i++) { int j; - AVStream *st= s->streams[i]; - MOVTrack *track= &mov->tracks[i]; + MOVTrack *track = &mov->tracks[i]; + AVStream *st = track->st; /* copy extradata if it exists */ if (st->codecpar->extradata_size) { @@ -7454,8 +7653,8 @@ static int mov_write_header(AVFormatContext *s) 
&(AVChannelLayout)AV_CHANNEL_LAYOUT_MONO)) continue; - for (j = 0; j < s->nb_streams; j++) { - AVStream *stj= s->streams[j]; + for (j = 0; j < mov->nb_streams; j++) { + AVStream *stj= mov->tracks[j].st; MOVTrack *trackj= &mov->tracks[j]; if (j == i) continue; @@ -7518,8 +7717,8 @@ static int mov_write_header(AVFormatContext *s) return ret; if (mov->flags & FF_MOV_FLAG_RTP_HINT) { - for (i = 0; i < s->nb_streams; i++) { - if (rtp_hinting_needed(s->streams[i])) { + for (i = 0; i < mov->nb_streams; i++) { + if (rtp_hinting_needed(mov->tracks[i].st)) { if ((ret = ff_mov_init_hinting(s, hint_track, i)) < 0) return ret; hint_track++; @@ -7531,8 +7730,8 @@ static int mov_write_header(AVFormatContext *s) const AVDictionaryEntry *t, *global_tcr = av_dict_get(s->metadata, "timecode", NULL, 0); /* Initialize the tmcd tracks */ - for (i = 0; i < s->nb_streams; i++) { - AVStream *st = s->streams[i]; + for (i = 0; i < mov->nb_streams; i++) { + AVStream *st = mov->tracks[i].st; t = global_tcr; if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) { @@ -7541,7 +7740,7 @@ static int mov_write_header(AVFormatContext *s) t = av_dict_get(st->metadata, "timecode", NULL, 0); if (!t) continue; - if (mov_check_timecode_track(s, &tc, i, t->value) < 0) + if (mov_check_timecode_track(s, &tc, st, t->value) < 0) continue; if ((ret = mov_create_timecode_track(s, tmcd_track, i, tc)) < 0) return ret; @@ -7662,7 +7861,7 @@ static int mov_write_trailer(AVFormatContext *s) int64_t moov_pos; if (mov->need_rewrite_extradata) { - for (i = 0; i < s->nb_streams; i++) { + for (i = 0; i < mov->nb_streams; i++) { MOVTrack *track = &mov->tracks[i]; AVCodecParameters *par = track->par; @@ -7802,7 +8001,7 @@ static int avif_write_trailer(AVFormatContext *s) if (mov->moov_written) return 0; mov->is_animated_avif = s->streams[0]->nb_frames > 1; - if (mov->is_animated_avif && s->nb_streams > 1) { + if (mov->is_animated_avif && mov->nb_streams > 1) { // For animated avif with alpha channel, we need to write a tref 
tag // with type "auxl". mov->tracks[1].tref_tag = MKTAG('a', 'u', 'x', 'l'); @@ -7812,7 +8011,7 @@ static int avif_write_trailer(AVFormatContext *s) mov_write_meta_tag(pb, mov, s); moov_size = get_moov_size(s); - for (i = 0; i < s->nb_streams; i++) + for (i = 0; i < mov->nb_tracks; i++) mov->tracks[i].data_offset = avio_tell(pb) + moov_size + 8; if (mov->is_animated_avif) { @@ -7834,7 +8033,7 @@ static int avif_write_trailer(AVFormatContext *s) // write extent offsets. pos_backup = avio_tell(pb); - for (i = 0; i < s->nb_streams; i++) { + for (i = 0; i < mov->nb_streams; i++) { if (extent_offsets[i] != (uint32_t)extent_offsets[i]) { av_log(s, AV_LOG_ERROR, "extent offset does not fit in 32 bits\n"); return AVERROR_INVALIDDATA; diff --git a/libavformat/movenc.h b/libavformat/movenc.h index 60363198c9..fee3e759e0 100644 --- a/libavformat/movenc.h +++ b/libavformat/movenc.h @@ -25,7 +25,9 @@ #define AVFORMAT_MOVENC_H #include "avformat.h" +#include "iamf.h" #include "movenccenc.h" +#include "libavcodec/bsf.h" #include "libavcodec/packet_internal.h" #define MOV_FRAG_INFO_ALLOC_INCREMENT 64 @@ -170,6 +172,10 @@ typedef struct MOVTrack { unsigned int squash_fragment_samples_to_one; //< flag to note formats where all samples for a fragment are to be squashed PacketList squashed_packet_queue; + + AVBSFContext *bsf; + + IAMFContext *iamf; } MOVTrack; typedef enum { @@ -188,6 +194,7 @@ typedef struct MOVMuxContext { const AVClass *av_class; int mode; int64_t time; + int nb_streams; int nb_tracks; int nb_meta_tmcd; ///< number of new created tmcd track based on metadata (aka not data copy) int chapter_track; ///< qt chapter track number
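For readers following the put_leb() helper added to leb.h earlier in this series, and the ffio_write_leb() call that sizes the descriptor payload in the iacb box: below is a minimal standalone sketch of the same LEB128 byte layout. It assumes a 32-bit value and a plain byte buffer in place of PutBitContext or AVIOContext. The names leb128_encode() and leb128_decode() are hypothetical and are not part of FFmpeg. The encoder loops until no bits remain instead of precomputing the length with av_log2(), which should produce the same bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not FFmpeg API): encode a 32-bit value as LEB128.
 * Each byte carries 7 payload bits, least significant group first; the
 * MSB of a byte is set when another byte follows.
 * Returns the number of bytes written (1 to 5).
 */
static size_t leb128_encode(uint32_t value, uint8_t buf[5])
{
    size_t n = 0;

    assert(buf);
    do {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value)
            byte |= 0x80; /* continuation flag */
        buf[n++] = byte;
    } while (value);

    return n;
}

/*
 * Hypothetical helper (not FFmpeg API): decode a LEB128 value of at most
 * five bytes. Returns the number of bytes consumed, or 0 if the input is
 * truncated or does not terminate within five bytes.
 */
static size_t leb128_decode(const uint8_t *buf, size_t size, uint32_t *out)
{
    uint32_t value = 0;

    assert(buf && out);
    for (size_t i = 0; i < size && i < 5; i++) {
        value |= (uint32_t)(buf[i] & 0x7f) << (7 * i);
        if (!(buf[i] & 0x80)) {
            *out = value;
            return i + 1;
        }
    }

    return 0;
}
```

With these helpers, 300 encodes as the two bytes 0xAC 0x02, and any 32-bit value needs at most five bytes, so the "up to eight" in the doc comment is an upper bound that an unsigned argument never reaches.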