From patchwork Wed Feb 2 00:13:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pierre-Anthony Lemieux
X-Patchwork-Id: 34062
From: pal@sandflow.com
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 1 Feb 2022 16:13:01 -0800
Message-Id: <20220202001302.4430-3-pal@sandflow.com>
X-Mailer: git-send-email 2.35.0.windows.1
In-Reply-To: <20220202001302.4430-1-pal@sandflow.com>
References: <20220202001302.4430-1-pal@sandflow.com>
Subject: [FFmpeg-devel] [PATCH v2 3/4] avformat/imf: fix packet pts, dts and muxing
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Pierre-Anthony Lemieux

From: Pierre-Anthony Lemieux

The IMF demuxer does not set the DTS and PTS of packets accurately in all
scenarios. Moreover, audio packets are not trimmed when they exceed the
duration of the underlying resource.

Addresses https://trac.ffmpeg.org/ticket/9611
---
 libavformat/imfdec.c | 263 ++++++++++++++++++++++++++++---------------
 1 file changed, 171 insertions(+), 92 deletions(-)

diff --git a/libavformat/imfdec.c b/libavformat/imfdec.c
index 658ddc40f2..5be4411cb1 100644
--- a/libavformat/imfdec.c
+++ b/libavformat/imfdec.c
@@ -65,8 +65,10 @@
 #include "avio_internal.h"
 #include "imf.h"
 #include "internal.h"
+#include "libavcodec/packet.h"
 #include "libavutil/avstring.h"
 #include "libavutil/bprint.h"
+#include "libavutil/intreadwrite.h"
 #include "libavutil/opt.h"
 #include "mxf.h"
 #include "url.h"
@@ -97,6 +99,9 @@ typedef struct IMFVirtualTrackResourcePlaybackCtx {
     IMFAssetLocator *locator;
     FFIMFTrackFileResource *resource;
     AVFormatContext *ctx;
+    AVRational start_time;
+    AVRational end_time;
+    AVRational ts_offset;
 } IMFVirtualTrackResourcePlaybackCtx;

 typedef struct IMFVirtualTrackPlaybackCtx {
@@ -108,7 +113,6 @@ typedef struct IMFVirtualTrackPlaybackCtx {
     IMFVirtualTrackResourcePlaybackCtx *resources; /**< Buffer holding the resources */
     int32_t current_resource_index;                /**< Index of the current resource in resources,
                                                         or < 0 if a current resource has yet to be selected */
-    int64_t last_pts;                              /**< Last timestamp */
 } IMFVirtualTrackPlaybackCtx;

 typedef struct IMFContext {
@@ -342,6 +346,7 @@ static int open_track_resource_context(AVFormatContext *s,
     int ret = 0;
     int64_t entry_point;
     AVDictionary *opts = NULL;
+    AVStream *st;

     if (track_resource->ctx) {
         av_log(s,
@@ -383,23 +388,28 @@ static int open_track_resource_context(AVFormatContext *s,
     }
     av_dict_free(&opts);

-    /* Compare the source timebase to the resource edit rate,
-     * considering the first stream of the source file
-     */
-    if (av_cmp_q(track_resource->ctx->streams[0]->time_base,
-                 av_inv_q(track_resource->resource->base.edit_rate)))
+    /* make sure there is only one stream in the file */
+
+    if (track_resource->ctx->nb_streams != 1) {
+        ret = AVERROR_INVALIDDATA;
+        goto cleanup;
+    }
+
+    st = track_resource->ctx->streams[0];
+
+    /* Warn if the resource time base does not match the file time base */
+    if (av_cmp_q(st->time_base, av_inv_q(track_resource->resource->base.edit_rate)))
         av_log(s,
                AV_LOG_WARNING,
-               "Incoherent source stream timebase %d/%d regarding resource edit rate: %d/%d",
-               track_resource->ctx->streams[0]->time_base.num,
-               track_resource->ctx->streams[0]->time_base.den,
+               "Incoherent source stream timebase " AVRATIONAL_FORMAT
+               "regarding resource edit rate: " AVRATIONAL_FORMAT,
+               st->time_base.num,
+               st->time_base.den,
                track_resource->resource->base.edit_rate.den,
                track_resource->resource->base.edit_rate.num);

-    entry_point = (int64_t)track_resource->resource->base.entry_point
-        * track_resource->resource->base.edit_rate.den
-        * AV_TIME_BASE
-        / track_resource->resource->base.edit_rate.num;
+    entry_point = av_rescale_q(track_resource->resource->base.entry_point, st->time_base,
+                               av_inv_q(track_resource->resource->base.edit_rate));

     if (entry_point) {
         av_log(s,
@@ -407,7 +417,7 @@ static int open_track_resource_context(AVFormatContext *s,
                "Seek at resource %s entry point: %" PRIu32 "\n",
                track_resource->locator->absolute_uri,
                track_resource->resource->base.entry_point);
-        ret = avformat_seek_file(track_resource->ctx, -1, entry_point, entry_point, entry_point, 0);
+        ret = avformat_seek_file(track_resource->ctx, 0, entry_point, entry_point, entry_point, 0);
         if (ret < 0) {
             av_log(s,
                    AV_LOG_ERROR,
@@ -470,11 +480,16 @@ static int open_track_file_resource(AVFormatContext *s,
         vt_ctx.locator = asset_locator;
         vt_ctx.resource = track_file_resource;
         vt_ctx.ctx = NULL;
-        track->resources[track->resource_count++] = vt_ctx;
-        track->duration = av_add_q(track->duration,
+        vt_ctx.start_time = track->duration;
+        vt_ctx.ts_offset = av_sub_q(vt_ctx.start_time,
+                                    av_div_q(av_make_q((int)track_file_resource->base.entry_point, 1),
+                                             track_file_resource->base.edit_rate));
+        vt_ctx.end_time = av_add_q(track->duration,
                                    av_make_q((int)track_file_resource->base.duration
                                                  * track_file_resource->base.edit_rate.den,
                                              track_file_resource->base.edit_rate.num));
+        track->resources[track->resource_count++] = vt_ctx;
+        track->duration = vt_ctx.end_time;
     }

     return 0;
@@ -701,11 +716,14 @@ static IMFVirtualTrackPlaybackCtx *get_next_track_with_minimum_timestamp(AVForma
     return track;
 }

-static IMFVirtualTrackResourcePlaybackCtx *get_resource_context_for_timestamp(AVFormatContext *s,
-                                                                              IMFVirtualTrackPlaybackCtx *track)
+static int get_resource_context_for_timestamp(AVFormatContext *s, IMFVirtualTrackPlaybackCtx *track, IMFVirtualTrackResourcePlaybackCtx **resource)
 {
-    AVRational edit_unit_duration = av_inv_q(track->resources[0].resource->base.edit_rate);
-    AVRational cumulated_duration = av_make_q(0, edit_unit_duration.den);
+    *resource = NULL;
+
+    if (av_cmp_q(track->current_timestamp, track->duration) >= 0) {
+        av_log(s, AV_LOG_DEBUG, "Reached the end of the virtual track\n");
+        return AVERROR_EOF;
+    }

     av_log(s,
            AV_LOG_DEBUG,
@@ -714,119 +732,180 @@ static IMFVirtualTrackResourcePlaybackCtx *get_resource_context_for_timestamp(AV
            av_q2d(track->current_timestamp),
            av_q2d(track->duration));
     for (uint32_t i = 0; i < track->resource_count; ++i) {
-        cumulated_duration = av_add_q(cumulated_duration,
-                                      av_make_q((int)track->resources[i].resource->base.duration
-                                                    * edit_unit_duration.num,
-                                                edit_unit_duration.den));

-        if (av_cmp_q(av_add_q(track->current_timestamp, edit_unit_duration), cumulated_duration) <= 0) {
+        if (av_cmp_q(track->resources[i].end_time, track->current_timestamp) > 0) {
             av_log(s,
                    AV_LOG_DEBUG,
-                   "Found resource %d in track %d to read for timestamp %lf "
-                   "(on cumulated=%lf): entry=%" PRIu32
+                   "Found resource %d in track %d to read at timestamp %lf: "
+                   "entry=%" PRIu32
                    ", duration=%" PRIu32
-                   ", editrate=" AVRATIONAL_FORMAT
-                   " | edit_unit_duration=%lf\n",
+                   ", editrate=" AVRATIONAL_FORMAT,
                    i,
                    track->index,
                    av_q2d(track->current_timestamp),
-                   av_q2d(cumulated_duration),
                    track->resources[i].resource->base.entry_point,
                    track->resources[i].resource->base.duration,
-                   AVRATIONAL_ARG(track->resources[i].resource->base.edit_rate),
-                   av_q2d(edit_unit_duration));
+                   AVRATIONAL_ARG(track->resources[i].resource->base.edit_rate));

             if (track->current_resource_index != i) {
+                int ret;
+
                 av_log(s,
                        AV_LOG_DEBUG,
                        "Switch resource on track %d: re-open context\n",
                        track->index);
-                if (open_track_resource_context(s, &(track->resources[i])) != 0)
-                    return NULL;
+
+                ret = open_track_resource_context(s, &(track->resources[i]));
+                if (ret != 0)
+                    return ret;
                 if (track->current_resource_index > 0)
                     avformat_close_input(&track->resources[track->current_resource_index].ctx);
                 track->current_resource_index = i;
             }

-            return &(track->resources[track->current_resource_index]);
+            *resource = &(track->resources[track->current_resource_index]);
+            return 0;
         }
     }
-    return NULL;
+
+    av_log(s, AV_LOG_ERROR, "Could not find IMF track resource to read\n");
+    return AVERROR_STREAM_NOT_FOUND;
+}
+
+static int imf_time_to_ts(int64_t *ts, AVRational t, AVRational time_base)
+{
+    int dst_num;
+    int dst_den;
+    AVRational r;
+
+    r = av_div_q(t, time_base);
+
+    if ((av_reduce(&dst_num, &dst_den, r.num, r.den, INT64_MAX) != 1))
+        return 1;
+
+    if (dst_den != 1)
+        return 1;
+
+    *ts = dst_num;
+
+    return 0;
 }

 static int imf_read_packet(AVFormatContext *s, AVPacket *pkt)
 {
-    IMFContext *c = s->priv_data;
-    IMFVirtualTrackResourcePlaybackCtx *resource_to_read = NULL;
-    AVRational edit_unit_duration;
+    IMFVirtualTrackResourcePlaybackCtx *resource = NULL;
     int ret = 0;
     IMFVirtualTrackPlaybackCtx *track;
-    FFStream *track_stream;
+    int64_t delta_ts;
+    AVStream *st;
+    AVRational next_timestamp;

     track = get_next_track_with_minimum_timestamp(s);

-    if (av_cmp_q(track->current_timestamp, track->duration) == 0)
-        return AVERROR_EOF;
+    ret = get_resource_context_for_timestamp(s, track, &resource);
+    if (ret)
+        return ret;

-    resource_to_read = get_resource_context_for_timestamp(s, track);
+    ret = av_read_frame(resource->ctx, pkt);
+    if (ret) {
+        av_log(s, AV_LOG_ERROR, "Failed to read frame\n");
+        return ret;
+    }

-    if (!resource_to_read) {
-        edit_unit_duration
-            = av_inv_q(track->resources[track->current_resource_index].resource->base.edit_rate);
+    av_log(s, AV_LOG_DEBUG, "Got packet: pts=%" PRId64 ", dts=%" PRId64
+           ", duration=%" PRId64 ", stream_index=%d, pos=%" PRId64
+           ", time_base=" AVRATIONAL_FORMAT "\n", pkt->pts, pkt->dts, pkt->duration,
+           pkt->stream_index, pkt->pos, pkt->time_base.num, pkt->time_base.den);

-        if (av_cmp_q(av_add_q(track->current_timestamp, edit_unit_duration), track->duration) > 0)
-            return AVERROR_EOF;
+    /* IMF resources contain only one stream */

-        av_log(s, AV_LOG_ERROR, "Could not find IMF track resource to read\n");
-        return AVERROR_STREAM_NOT_FOUND;
+    if (pkt->stream_index != 0)
+        return AVERROR_INVALIDDATA;
+    st = resource->ctx->streams[0];
+
+    pkt->stream_index = track->index;
+
+    /* adjust the packet PTS and DTS based on the temporal position of the resource within the timeline */
+
+    ret = imf_time_to_ts(&delta_ts, resource->ts_offset, st->time_base);
+
+    if (!ret) {
+        if (pkt->pts != AV_NOPTS_VALUE)
+            pkt->pts += delta_ts;
+        if (pkt->dts != AV_NOPTS_VALUE)
+            pkt->dts += delta_ts;
+    } else {
+        av_log(s, AV_LOG_WARNING, "Incoherent time stamp " AVRATIONAL_FORMAT " for time base " AVRATIONAL_FORMAT,
+               resource->ts_offset.num, resource->ts_offset.den, pkt->time_base.num,
+               pkt->time_base.den);
     }

-    while (!ff_check_interrupt(c->interrupt_callback) && !ret) {
-        ret = av_read_frame(resource_to_read->ctx, pkt);
-        av_log(s,
-               AV_LOG_DEBUG,
-               "Got packet: pts=%" PRId64
-               ", dts=%" PRId64
-               ", duration=%" PRId64
-               ", stream_index=%d, pos=%" PRId64
-               "\n",
-               pkt->pts,
-               pkt->dts,
-               pkt->duration,
-               pkt->stream_index,
-               pkt->pos);
-
-        track_stream = ffstream(s->streams[track->index]);
-        if (ret >= 0) {
-            /* Update packet info from track */
-            if (pkt->dts < track_stream->cur_dts && track->last_pts > 0)
-                pkt->dts = track_stream->cur_dts;
-
-            pkt->pts = track->last_pts;
-            pkt->dts = pkt->dts
-                - (int64_t)track->resources[track->current_resource_index].resource->base.entry_point;
-            pkt->stream_index = track->index;
-
-            /* Update track cursors */
-            track->current_timestamp
-                = av_add_q(track->current_timestamp,
-                           av_make_q((int)pkt->duration
-                                         * resource_to_read->ctx->streams[0]->time_base.num,
-                                     resource_to_read->ctx->streams[0]->time_base.den));
-            track->last_pts += pkt->duration;
+    /* advance the track timestamp by the packet duration */

-            return 0;
-        } else if (ret != AVERROR_EOF) {
-            av_log(s,
-                   AV_LOG_ERROR,
-                   "Could not get packet from track %d: %s\n",
-                   track->index,
-                   av_err2str(ret));
-            return ret;
+    next_timestamp = av_add_q(track->current_timestamp,
+                              av_mul_q(av_make_q((int)pkt->duration, 1), st->time_base));
+
+    /* if necessary, clamp the next timestamp to the end of the current resource */
+
+    if (av_cmp_q(next_timestamp, resource->end_time) > 0) {
+
+        int64_t new_pkt_dur;
+
+        /* shrink the packet duration */
+
+        ret = imf_time_to_ts(&new_pkt_dur,
+                             av_sub_q(resource->end_time, track->current_timestamp),
+                             st->time_base);
+
+        if (!ret)
+            pkt->duration = new_pkt_dur;
+        else
+            av_log(s, AV_LOG_WARNING, "Incoherent time base in packet duration calculation");
+
+        /* shrink the packet itself for audio essence */
+
+        if (st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
+
+            if (st->codecpar->codec_id == AV_CODEC_ID_PCM_S24LE) {
+                /* AV_CODEC_ID_PCM_S24LE is the only PCM format supported in IMF */
+                /* in this case, explicitly shrink the packet */
+
+                int bytes_per_sample = av_get_exact_bits_per_sample(st->codecpar->codec_id) >> 3;
+                int64_t nbsamples = av_rescale_q(pkt->duration,
+                                                 st->time_base,
+                                                 av_make_q(1, st->codecpar->sample_rate));
+                av_shrink_packet(pkt, nbsamples * st->codecpar->channels * bytes_per_sample);
+
+            } else {
+                /* in all other cases, use side data to skip samples */
+                int64_t skip_samples;
+
+                ret = imf_time_to_ts(&skip_samples,
+                                     av_sub_q(next_timestamp, resource->end_time),
+                                     av_make_q(1, st->codecpar->sample_rate));
+
+                if (ret || skip_samples < 0 || skip_samples > UINT32_MAX) {
+                    av_log(s, AV_LOG_WARNING, "Cannot skip audio samples");
+                } else {
+                    uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, 10);
+                    if (!side_data)
+                        return AVERROR(ENOMEM);
+
+                    AV_WL32(side_data + 4, skip_samples); /* skip from end of this packet */
+                    side_data[6] = 1; /* reason for end is convergence */
+                }
+            }
+
+            next_timestamp = resource->end_time;
+
+        } else {
+            av_log(s, AV_LOG_WARNING, "Non-audio packet duration reduced");
         }
     }

-    return AVERROR_EOF;
+    track->current_timestamp = next_timestamp;
+
+    return 0;
 }

 static int imf_close(AVFormatContext *s)
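
For reference while reading the non-PCM branch of the last hunk: a minimal standalone sketch of the 10-byte AV_PKT_DATA_SKIP_SAMPLES payload, assuming the layout described in libavcodec/packet.h (u32le start skip, u32le end skip, u8 start reason, u8 end reason). It uses no FFmpeg headers; the helper names put_le32, get_le32 and fill_end_skip are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed payload layout (10 bytes), per the comment in libavcodec/packet.h:
 *   bytes 0..3  u32le  samples to skip from the start of the packet
 *   bytes 4..7  u32le  samples to skip from the end of the packet
 *   byte  8     u8     reason for start skip
 *   byte  9     u8     reason for end skip (0 = padding silence, 1 = convergence)
 */
enum { SKIP_SAMPLES_SIZE = 10 };

/* Hypothetical helper: store a 32-bit value in little-endian byte order. */
static inline void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Hypothetical helper: load a 32-bit little-endian value. */
static inline uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: request that end_skip samples be dropped from the
 * end of a packet. The buffer is zeroed first so that the start-skip
 * count and start reason are well defined. */
static inline void fill_end_skip(uint8_t buf[SKIP_SAMPLES_SIZE], uint32_t end_skip)
{
    assert(buf != NULL);
    memset(buf, 0, SKIP_SAMPLES_SIZE);
    put_le32(buf + 4, end_skip); /* bytes 4..7: end-skip sample count */
    buf[9] = 1;                  /* byte 9: end reason is convergence */
}
```

Under this assumed layout, offsets 4 to 7 all belong to the end-skip count and the end-skip reason is the last byte, at offset 9. Zeroing the buffer before filling it keeps the unused start-skip fields from carrying stale values.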