From patchwork Sun Jan 30 22:00:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pierre-Anthony Lemieux
X-Patchwork-Id: 33950
From: pal@sandflow.com
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 30 Jan 2022 14:00:54 -0800
Message-Id: <20220130220055.2595-3-pal@sandflow.com>
X-Mailer: git-send-email 2.35.0.windows.1
In-Reply-To: <20220130220055.2595-1-pal@sandflow.com>
References: <20220130220055.2595-1-pal@sandflow.com>
Subject: [FFmpeg-devel] [PATCH v1 3/4] avformat/imf: fix packet pts, dts and muxing
Cc: Pierre-Anthony Lemieux

From: Pierre-Anthony Lemieux

The IMF demuxer does not set the DTS and PTS of packets accurately in all
scenarios. Moreover, audio packets are not trimmed when they exceed the
duration of the underlying resource.

Closes https://trac.ffmpeg.org/ticket/9611

---
 libavformat/imfdec.c | 225 +++++++++++++++++++++++++------------------
 1 file changed, 132 insertions(+), 93 deletions(-)

diff --git a/libavformat/imfdec.c b/libavformat/imfdec.c
index 6b50b582f6..05dcb6ff31 100644
--- a/libavformat/imfdec.c
+++ b/libavformat/imfdec.c
@@ -65,6 +65,7 @@
 #include "avio_internal.h"
 #include "imf.h"
 #include "internal.h"
+#include "libavcodec/packet.h"
 #include "libavutil/avstring.h"
 #include "libavutil/bprint.h"
 #include "libavutil/opt.h"
@@ -97,6 +98,9 @@ typedef struct IMFVirtualTrackResourcePlaybackCtx {
     IMFAssetLocator *locator;
     FFIMFTrackFileResource *resource;
     AVFormatContext *ctx;
+    AVRational start_time;
+    AVRational end_time;
+    AVRational ts_offset;
 } IMFVirtualTrackResourcePlaybackCtx;

 typedef struct IMFVirtualTrackPlaybackCtx {
@@ -108,7 +112,6 @@ typedef struct IMFVirtualTrackPlaybackCtx {
     IMFVirtualTrackResourcePlaybackCtx *resources; /**< Buffer holding the resources */
     int32_t current_resource_index;                /**< Index of the current resource in resources,
                                                         or < 0 if a current resource has yet to be selected */
-    int64_t last_pts;                              /**< Last timestamp */
 } IMFVirtualTrackPlaybackCtx;

 typedef struct IMFContext {
@@ -342,6 +345,7 @@ static int open_track_resource_context(AVFormatContext *s,
     int ret = 0;
     int64_t entry_point;
     AVDictionary *opts = NULL;
+    AVStream *st;

     if (track_resource->ctx) {
         av_log(s,
@@ -383,23 +387,28 @@ static int open_track_resource_context(AVFormatContext *s,
     }
     av_dict_free(&opts);

-    /* Compare the source timebase to the resource edit rate,
-     * considering the first stream of the source file
-     */
-    if (av_cmp_q(track_resource->ctx->streams[0]->time_base,
-                 av_inv_q(track_resource->resource->base.edit_rate)))
+    /* make sure there is only one stream in the file */
+
+    if (track_resource->ctx->nb_streams != 1) {
+        ret = AVERROR_INVALIDDATA;
+        goto cleanup;
+    }
+
+    st = track_resource->ctx->streams[0];
+
+    /* Warn if the resource time base does not match the file time base */
+    if (av_cmp_q(st->time_base, av_inv_q(track_resource->resource->base.edit_rate)))
         av_log(s,
                AV_LOG_WARNING,
-               "Incoherent source stream timebase %d/%d regarding resource edit rate: %d/%d",
-               track_resource->ctx->streams[0]->time_base.num,
-               track_resource->ctx->streams[0]->time_base.den,
+               "Incoherent source stream timebase " AVRATIONAL_FORMAT
+               "regarding resource edit rate: " AVRATIONAL_FORMAT,
+               st->time_base.num,
+               st->time_base.den,
                track_resource->resource->base.edit_rate.den,
                track_resource->resource->base.edit_rate.num);

-    entry_point = (int64_t)track_resource->resource->base.entry_point
-        * track_resource->resource->base.edit_rate.den
-        * AV_TIME_BASE
-        / track_resource->resource->base.edit_rate.num;
+    entry_point = av_rescale_q(track_resource->resource->base.entry_point, st->time_base,
+                               av_inv_q(track_resource->resource->base.edit_rate));

     if (entry_point) {
         av_log(s,
@@ -407,7 +416,7 @@ static int open_track_resource_context(AVFormatContext *s,
                "Seek at resource %s entry point: %" PRIu32 "\n",
                track_resource->locator->absolute_uri,
                track_resource->resource->base.entry_point);
-        ret = avformat_seek_file(track_resource->ctx, -1, entry_point, entry_point, entry_point, 0);
+        ret = avformat_seek_file(track_resource->ctx, 0, entry_point, entry_point, entry_point, 0);
         if (ret < 0) {
             av_log(s,
                    AV_LOG_ERROR,
@@ -470,11 +479,16 @@ static int open_track_file_resource(AVFormatContext *s,
         vt_ctx.locator = asset_locator;
         vt_ctx.resource = track_file_resource;
         vt_ctx.ctx = NULL;
-        track->resources[track->resource_count++] = vt_ctx;
-        track->duration = av_add_q(track->duration,
+        vt_ctx.start_time = track->duration;
+        vt_ctx.ts_offset = av_sub_q(vt_ctx.start_time,
+                                    av_div_q(av_make_q((int)track_file_resource->base.entry_point, 1),
+                                             track_file_resource->base.edit_rate));
+        vt_ctx.end_time = av_add_q(track->duration,
                                    av_make_q((int)track_file_resource->base.duration
                                                  * track_file_resource->base.edit_rate.den,
                                              track_file_resource->base.edit_rate.num));
+        track->resources[track->resource_count++] = vt_ctx;
+        track->duration = vt_ctx.end_time;
     }

     return 0;
@@ -701,11 +715,14 @@ static IMFVirtualTrackPlaybackCtx *get_next_track_with_minimum_timestamp(AVForma
     return track;
 }

-static IMFVirtualTrackResourcePlaybackCtx *get_resource_context_for_timestamp(AVFormatContext *s,
-                                                                              IMFVirtualTrackPlaybackCtx *track)
+static int get_resource_context_for_timestamp(AVFormatContext *s, IMFVirtualTrackPlaybackCtx *track, IMFVirtualTrackResourcePlaybackCtx **resource)
 {
-    AVRational edit_unit_duration = av_inv_q(track->resources[0].resource->base.edit_rate);
-    AVRational cumulated_duration = av_make_q(0, edit_unit_duration.den);
+    *resource = NULL;
+
+    if (av_cmp_q(track->current_timestamp, track->duration) >= 0) {
+        av_log(s, AV_LOG_DEBUG, "Reached the end of the virtual track\n");
+        return AVERROR_EOF;
+    }

     av_log(s,
            AV_LOG_DEBUG,
@@ -714,119 +731,141 @@ static IMFVirtualTrackResourcePlaybackCtx *get_resource_context_for_timestamp(AV
            av_q2d(track->current_timestamp),
            av_q2d(track->duration));
     for (uint32_t i = 0; i < track->resource_count; ++i) {
-        cumulated_duration = av_add_q(cumulated_duration,
-                                      av_make_q((int)track->resources[i].resource->base.duration
-                                                    * edit_unit_duration.num,
-                                                edit_unit_duration.den));

-        if (av_cmp_q(av_add_q(track->current_timestamp, edit_unit_duration), cumulated_duration) <= 0) {
+        if (av_cmp_q(track->resources[i].end_time, track->current_timestamp) > 0) {
             av_log(s,
                    AV_LOG_DEBUG,
-                   "Found resource %d in track %d to read for timestamp %lf "
-                   "(on cumulated=%lf): entry=%" PRIu32
+                   "Found resource %d in track %d to read at timestamp %lf: "
+                   "entry=%" PRIu32
                    ", duration=%" PRIu32
-                   ", editrate=" AVRATIONAL_FORMAT
-                   " | edit_unit_duration=%lf\n",
+                   ", editrate=" AVRATIONAL_FORMAT,
                    i,
                    track->index,
                    av_q2d(track->current_timestamp),
-                   av_q2d(cumulated_duration),
                    track->resources[i].resource->base.entry_point,
                    track->resources[i].resource->base.duration,
-                   AVRATIONAL_ARG(track->resources[i].resource->base.edit_rate),
-                   av_q2d(edit_unit_duration));
+                   AVRATIONAL_ARG(track->resources[i].resource->base.edit_rate));

             if (track->current_resource_index != i) {
+                int ret;
+
                 av_log(s,
                        AV_LOG_DEBUG,
                        "Switch resource on track %d: re-open context\n",
                        track->index);
-                if (open_track_resource_context(s, &(track->resources[i])) != 0)
-                    return NULL;
+
+                ret = open_track_resource_context(s, &(track->resources[i]));
+                if (ret != 0)
+                    return ret;
                 if (track->current_resource_index > 0)
                     avformat_close_input(&track->resources[track->current_resource_index].ctx);
                 track->current_resource_index = i;
             }

-            return &(track->resources[track->current_resource_index]);
+            *resource = &(track->resources[track->current_resource_index]);
+            return 0;
         }
     }
-    return NULL;
+
+    av_log(s, AV_LOG_ERROR, "Could not find IMF track resource to read\n");
+    return AVERROR_STREAM_NOT_FOUND;
+}
+
+static int imf_time_to_ts(int64_t *ts, AVRational t, AVRational time_base)
+{
+    int dst_num;
+    int dst_den;
+    AVRational r;
+
+    r = av_div_q(t, time_base);
+
+    if ((av_reduce(&dst_num, &dst_den, r.num, r.den, INT64_MAX) != 1))
+        return 0;
+
+    if (dst_den != 1)
+        return 0;
+
+    *ts = dst_num;
+
+    return 1;
 }

 static int imf_read_packet(AVFormatContext *s, AVPacket *pkt)
 {
-    IMFContext *c = s->priv_data;
-    IMFVirtualTrackResourcePlaybackCtx *resource_to_read = NULL;
-    AVRational edit_unit_duration;
+    IMFVirtualTrackResourcePlaybackCtx *resource = NULL;
     int ret = 0;
     IMFVirtualTrackPlaybackCtx *track;
-    FFStream *track_stream;
+    int64_t delta_ts;
+    AVStream *st;
+    AVRational next_timestamp;

     track = get_next_track_with_minimum_timestamp(s);

-    if (av_cmp_q(track->current_timestamp, track->duration) == 0)
-        return AVERROR_EOF;
+    ret = get_resource_context_for_timestamp(s, track, &resource);
+    if (ret)
+        return ret;

-    resource_to_read = get_resource_context_for_timestamp(s, track);
+    ret = av_read_frame(resource->ctx, pkt);
+    if (ret) {
+        av_log(s, AV_LOG_ERROR, "Failed to read frame\n");
+        return ret;
+    }

-    if (!resource_to_read) {
-        edit_unit_duration
-            = av_inv_q(track->resources[track->current_resource_index].resource->base.edit_rate);
+    av_log(s, AV_LOG_DEBUG, "Got packet: pts=%" PRId64 ", dts=%" PRId64
+            ", duration=%" PRId64 ", stream_index=%d, pos=%" PRId64
+            ", time_base=" AVRATIONAL_FORMAT "\n", pkt->pts, pkt->dts, pkt->duration,
+            pkt->stream_index, pkt->pos, pkt->time_base.num, pkt->time_base.den);

-        if (av_cmp_q(av_add_q(track->current_timestamp, edit_unit_duration), track->duration) > 0)
-            return AVERROR_EOF;
+    /* IMF resources contain only one stream */

-        av_log(s, AV_LOG_ERROR, "Could not find IMF track resource to read\n");
-        return AVERROR_STREAM_NOT_FOUND;
-    }
+    if (pkt->stream_index != 0)
+        return AVERROR_INVALIDDATA;
+    st = resource->ctx->streams[0];

-    while (!ff_check_interrupt(c->interrupt_callback) && !ret) {
-        ret = av_read_frame(resource_to_read->ctx, pkt);
-        av_log(s,
-               AV_LOG_DEBUG,
-               "Got packet: pts=%" PRId64
-               ", dts=%" PRId64
-               ", duration=%" PRId64
-               ", stream_index=%d, pos=%" PRId64
-               "\n",
-               pkt->pts,
-               pkt->dts,
-               pkt->duration,
-               pkt->stream_index,
-               pkt->pos);
-
-        track_stream = ffstream(s->streams[track->index]);
-        if (ret >= 0) {
-            /* Update packet info from track */
-            if (pkt->dts < track_stream->cur_dts && track->last_pts > 0)
-                pkt->dts = track_stream->cur_dts;
-
-            pkt->pts = track->last_pts;
-            pkt->dts = pkt->dts
-                - (int64_t)track->resources[track->current_resource_index].resource->base.entry_point;
-            pkt->stream_index = track->index;
-
-            /* Update track cursors */
-            track->current_timestamp
-                = av_add_q(track->current_timestamp,
-                           av_make_q((int)pkt->duration
-                                         * resource_to_read->ctx->streams[0]->time_base.num,
-                                     resource_to_read->ctx->streams[0]->time_base.den));
-            track->last_pts += pkt->duration;
+    pkt->stream_index = track->index;

-            return 0;
-        } else if (ret != AVERROR_EOF) {
-            av_log(s,
-                   AV_LOG_ERROR,
-                   "Could not get packet from track %d: %s\n",
-                   track->index,
-                   av_err2str(ret));
-            return ret;
+    /* adjust the packet PTS and DTS based on the temporal position of the resource within the timeline */
+
+    if ((imf_time_to_ts(&delta_ts, resource->ts_offset, st->time_base) == 0))
+        av_log(s, AV_LOG_WARNING, "Incoherent time stamp " AVRATIONAL_FORMAT " for time base " AVRATIONAL_FORMAT,
+               resource->ts_offset.num, resource->ts_offset.den, pkt->time_base.num,
+               pkt->time_base.den);
+    if (pkt->pts != AV_NOPTS_VALUE)
+        pkt->pts += delta_ts;
+    if (pkt->dts != AV_NOPTS_VALUE)
+        pkt->dts += delta_ts;
+
+    /* advance the track timestamp by the packet duration */
+
+    next_timestamp = av_add_q(track->current_timestamp,
+                              av_mul_q(av_make_q((int)pkt->duration, 1), st->time_base));
+
+    /* if necessary, clamp the next timestamp to the end of the current resource */
+
+    if (av_cmp_q(next_timestamp, resource->end_time) > 0) {
+
+        next_timestamp = resource->end_time;
+
+        /* shrink the packet duration */
+
+        if ((imf_time_to_ts(&pkt->duration, av_sub_q(resource->end_time, track->current_timestamp), st->time_base) == 0))
+            av_log(s, AV_LOG_WARNING, "Incoherent time base during packet duration calculation");
+
+        /* shrink the packet size itself for audio samples */
+        /* only AV_CODEC_ID_PCM_S24LE is supported in IMF */
+
+        if (st->codecpar->codec_id == AV_CODEC_ID_PCM_S24LE) {
+            int bytes_per_sample = av_get_exact_bits_per_sample(st->codecpar->codec_id) >> 3;
+            int64_t nbsamples = av_rescale_q(pkt->duration, st->time_base, av_make_q(1, st->codecpar->sample_rate));
+            av_shrink_packet(pkt, nbsamples * st->codecpar->channels * bytes_per_sample);
+        } else {
+            av_log(s, AV_LOG_WARNING, "Cannot shrink packets for non-PCM essence");
         }
     }

-    return AVERROR_EOF;
+    track->current_timestamp = next_timestamp;
+
+    return 0;
 }

 static int imf_close(AVFormatContext *s)
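
The per-resource timestamp offset introduced by this patch can be checked in isolation. The following is a minimal sketch, not the patch's code: it assumes a hypothetical `Rational` type and hypothetical helpers (`rat_make`, `rat_sub`, `rat_div`, `time_to_ts`, `resource_ts_offset`) that stand in for `AVRational`, `av_sub_q`, `av_div_q`, `imf_time_to_ts` and the `ts_offset` computation, so that it builds without libavutil. None of these names exist in FFmpeg, and overflow handling is deliberately omitted.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for AVRational; denominators are assumed non-zero. */
typedef struct Rational {
    int64_t num;
    int64_t den;
} Rational;

static int64_t gcd64(int64_t a, int64_t b)
{
    a = llabs(a);
    b = llabs(b);
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Build a rational in lowest terms with a positive denominator. */
static Rational rat_make(int64_t num, int64_t den)
{
    Rational r;
    int64_t g;

    if (den < 0) {
        num = -num;
        den = -den;
    }
    g = gcd64(num, den);
    r.num = num / g;
    r.den = den / g;
    return r;
}

static Rational rat_sub(Rational a, Rational b)
{
    return rat_make(a.num * b.den - b.num * a.den, a.den * b.den);
}

static Rational rat_div(Rational a, Rational b)
{
    return rat_make(a.num * b.den, a.den * b.num);
}

/* Offset, in seconds, to add to the timestamps of a resource's packets:
 * the resource start on the track timeline minus its entry point,
 * where the entry point is counted in edit units of 1/edit_rate seconds. */
static Rational resource_ts_offset(Rational start_time, uint32_t entry_point,
                                   Rational edit_rate)
{
    return rat_sub(start_time, rat_div(rat_make(entry_point, 1), edit_rate));
}

/* Express time t (seconds) as an integer tick count in time_base.
 * Returns 1 on success, 0 if t is not a whole number of ticks. */
static int time_to_ts(int64_t *ts, Rational t, Rational time_base)
{
    Rational r = rat_div(t, time_base);

    if (r.den != 1)
        return 0;
    *ts = r.num;
    return 1;
}
```

With these assumptions, a resource that starts 2 s into the track with an entry point of 24 edit units at an edit rate of 24/1 has an offset of 1 s, which is 24 ticks in a 1/24 time base. An offset of 1/48 s is not a whole number of ticks in that time base, which corresponds to the case the patch logs as an incoherent time stamp.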