diff mbox series

[FFmpeg-devel] avformat/movenc: Fix tfdt out of sync

Message ID 20210718102232.1382376-1-sehuww@mail.scut.edu.cn
State New
Headers show
Series [FFmpeg-devel] avformat/movenc: Fix tfdt out of sync
Related show

Checks

Context Check Description
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished
andriy/PPC64_make success Make finished
andriy/PPC64_make_fate success Make fate finished

Commit Message

胡玮文 July 18, 2021, 10:22 a.m. UTC
Fix an edge case when auto flushing an empty fragment, and the previous fragment
has inaccurate duration, the track->frag_start would be out of sync from input
dts, and all subsequent tfdt box will have out of sync base_media_decode_time.
This error can accumulate to quite large over long-running streaming.

This can be easily reproduced by remux a variable frame rate source with dash
muxer and set streaming to 1.

When flushing a fragment, we may guess the duration if we have not got the next
package. Then when we get the first packet of the next fragment, we intend to
set the trk->cluster[0].dts to match the previously guessed duration.  However,
at this time trk->track_duration may already be updated from
mov_write_single_packet() when auto flushing the empty fragment (e.g. if
FF_MOV_FLAG_FRAG_EVERY_FRAME is set), but no moof was written to reflect this
change. So the trk->cluster[0].dts do NOT actually match the previously written
duration. Then, when flushing the next non-empty fragment, we will get a wrong
"duration", and track->frag_start will be out of sync. And it cannot get back on
track again.

This error is expected to accumulate over time. trk->cluster[0].dts will only
get too large. it will not get too small because we have check_pkt(), so
track->frag_start will fall more and more behind from input dts.

To fix this, we can directly remove the track->frag_start. It can always be
replaced by "track->cluster[0].dts - track->start_dts".

Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn>
---
 libavformat/movenc.c | 24 ++++--------------------
 libavformat/movenc.h |  1 -
 2 files changed, 4 insertions(+), 21 deletions(-)

Comments

Martin Storsjö July 19, 2021, 8:33 p.m. UTC | #1
Hi,

Thanks for the patch! I'll try to look into it in a while (I'm a bit 
swamped and short on time at the moment), hopefully within a couple days 
or so.

On Sun, 18 Jul 2021, Hu Weiwen wrote:

> Fix an edge case when auto flushing an empty fragment, and the previous fragment
> has inaccurate duration, the track->frag_start would be out of sync from input
> dts, and all subsequent tfdt box will have out of sync base_media_decode_time.
> This error can accumulate to quite large over long-running streaming.
>
> This can be easily reproduced by remux a variable frame rate source with dash
> muxer and set streaming to 1.

Can you provide a specific sample and a command line, and point out what 
values in the output file that are mismatched - so I can zoom in on the 
issue quicker once I have time to sit down and look at it?

// Martin
胡玮文 July 20, 2021, 5:18 a.m. UTC | #2
Hi Martin,

OK, here I provide some codes to demo this bug. Many of these are copied from libavformat/tests/movenc.c

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/opt.h>
#include <libavutil/intreadwrite.h>

static const uint8_t h264_extradata[] = {
    0x01, 0x4d, 0x40, 0x1e, 0xff, 0xe1, 0x00, 0x02, 0x67, 0x4d, 0x01, 0x00, 0x02, 0x68, 0xef
};

AVPacket *pkt;
AVFormatContext *ctx;

void mux_packet(int64_t ts) {
    uint8_t pktdata[8] = { 0 };

    av_packet_unref(pkt);
    pkt->stream_index = 0;
    pkt->pts = pkt->dts = ts;
    AV_WB32(pktdata + 4, pkt->pts);
    pkt->data = pktdata;
    pkt->size = 8;

    av_write_frame(ctx, pkt);
}

int main() {
    avformat_alloc_output_context2(&ctx, NULL, NULL, "bug.mp4");
    // Setting this flag is important
    av_opt_set(ctx, "movflags", "+frag_every_frame", AV_OPT_SEARCH_CHILDREN);
    AVStream *st = avformat_new_stream(ctx, NULL);
    st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
    st->codecpar->codec_id = AV_CODEC_ID_H264;
    st->codecpar->width = 640;
    st->codecpar->height = 480;
    st->time_base.num = 1;
    st->time_base.den = 30;
    st->codecpar->extradata_size = sizeof(h264_extradata);
    st->codecpar->extradata = av_mallocz(st->codecpar->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE);
    memcpy(st->codecpar->extradata, h264_extradata, sizeof(h264_extradata));
    avio_open(&ctx->pb, "bug.mp4", AVIO_FLAG_WRITE);
    avformat_write_header(ctx, NULL);

    pkt = av_packet_alloc();
    mux_packet(0);
    mux_packet(1);
    mux_packet(2);
    av_write_frame(ctx, NULL); // Manually flush a frag
    mux_packet(10);            // Automatically flush an empty frag, because frag_every_frame is set
    mux_packet(11);
    mux_packet(12);

    av_write_trailer(ctx);
    avformat_free_context(ctx);
    av_packet_free(&pkt);
}

After compile and run this code, run this command to inspect the dts (which comes from the out of sync tfdt)

ffprobe -show_packets bug.mp4 | grep dts=

The output is:

dts=0
dts=1
dts=2
dts=2
dts=3
dts=4

With this patch applied, the output is:

dts=0
dts=1
dts=2
dts=10
dts=11
dts=12

Regards,
Hu Weiwen

On Mon, Jul 19, 2021 at 11:33:59PM +0300, Martin Storsjö wrote:
> Hi,
> 
> Thanks for the patch! I'll try to look into it in a while (I'm a bit swamped
> and short on time at the moment), hopefully within a couple days or so.
> 
> On Sun, 18 Jul 2021, Hu Weiwen wrote:
> 
> > Fix an edge case when auto flushing an empty fragment, and the previous fragment
> > has inaccurate duration, the track->frag_start would be out of sync from input
> > dts, and all subsequent tfdt box will have out of sync base_media_decode_time.
> > This error can accumulate to quite large over long-running streaming.
> > 
> > This can be easily reproduced by remux a variable frame rate source with dash
> > muxer and set streaming to 1.
> 
> Can you provide a specific sample and a command line, and point out what
> values in the output file that are mismatched - so I can zoom in on the
> issue quicker once I have time to sit down and look at it?
> 
> // Martin
diff mbox series

Patch

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index c85efe87489..48736acab65 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -4482,8 +4482,7 @@  static int mov_write_tfxd_tag(AVIOContext *pb, MOVTrack *track)
     avio_write(pb, uuid, sizeof(uuid));
     avio_w8(pb, 1);
     avio_wb24(pb, 0);
-    avio_wb64(pb, track->start_dts + track->frag_start +
-                  track->cluster[0].cts);
+    avio_wb64(pb, track->cluster[0].dts + track->cluster[0].cts);
     avio_wb64(pb, track->end_pts -
                   (track->cluster[0].dts + track->cluster[0].cts));
 
@@ -4562,8 +4561,7 @@  static int mov_add_tfra_entries(AVIOContext *pb, MOVMuxContext *mov, int tracks,
         info->size     = size;
         // Try to recreate the original pts for the first packet
         // from the fields we have stored
-        info->time     = track->start_dts + track->frag_start +
-                         track->cluster[0].cts;
+        info->time     = track->cluster[0].dts + track->cluster[0].cts;
         info->duration = track->end_pts -
                          (track->cluster[0].dts + track->cluster[0].cts);
         // If the pts is less than zero, we will have trimmed
@@ -4601,7 +4599,7 @@  static int mov_write_tfdt_tag(AVIOContext *pb, MOVTrack *track)
     ffio_wfourcc(pb, "tfdt");
     avio_w8(pb, 1); /* version */
     avio_wb24(pb, 0);
-    avio_wb64(pb, track->frag_start);
+    avio_wb64(pb, track->cluster[0].dts - track->start_dts);
     return update_size(pb, pos);
 }
 
@@ -4678,8 +4676,7 @@  static int mov_write_sidx_tag(AVIOContext *pb,
 
     if (track->entry) {
         entries = 1;
-        presentation_time = track->start_dts + track->frag_start +
-                            track->cluster[0].cts;
+        presentation_time = track->cluster[0].dts + track->cluster[0].cts;
         duration = track->end_pts -
                    (track->cluster[0].dts + track->cluster[0].cts);
         starts_with_SAP = track->cluster[0].flags & MOV_SYNC_SAMPLE;
@@ -5358,10 +5355,6 @@  static int mov_flush_fragment(AVFormatContext *s, int force)
         mov->moov_written = 1;
         mov->mdat_size = 0;
         for (i = 0; i < mov->nb_streams; i++) {
-            if (mov->tracks[i].entry)
-                mov->tracks[i].frag_start += mov->tracks[i].start_dts +
-                                             mov->tracks[i].track_duration -
-                                             mov->tracks[i].cluster[0].dts;
             mov->tracks[i].entry = 0;
             mov->tracks[i].end_reliable = 0;
         }
@@ -5415,11 +5408,7 @@  static int mov_flush_fragment(AVFormatContext *s, int force)
         MOVTrack *track = &mov->tracks[i];
         int buf_size, write_moof = 1, moof_tracks = -1;
         uint8_t *buf;
-        int64_t duration = 0;
 
-        if (track->entry)
-            duration = track->start_dts + track->track_duration -
-                       track->cluster[0].dts;
         if (mov->flags & FF_MOV_FLAG_SEPARATE_MOOF) {
             if (!track->mdat_buf)
                 continue;
@@ -5439,8 +5428,6 @@  static int mov_flush_fragment(AVFormatContext *s, int force)
             ffio_wfourcc(s->pb, "mdat");
         }
 
-        if (track->entry)
-            track->frag_start += duration;
         track->entry = 0;
         track->entries_flushed = 0;
         track->end_reliable = 0;
@@ -5759,7 +5746,6 @@  int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt)
             /* New fragment, but discontinuous from previous fragments.
              * Pretend the duration sum of the earlier fragments is
              * pkt->dts - trk->start_dts. */
-            trk->frag_start = pkt->dts - trk->start_dts;
             trk->end_pts = AV_NOPTS_VALUE;
             trk->frag_discont = 0;
         }
@@ -5781,12 +5767,10 @@  int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt)
                 /* Pretend the whole stream started at pts=0, with earlier fragments
                  * already written. If the stream started at pts=0, the duration sum
                  * of earlier fragments would have been pkt->pts. */
-                trk->frag_start = pkt->pts;
                 trk->start_dts  = pkt->dts - pkt->pts;
             } else {
                 /* Pretend the whole stream started at dts=0, with earlier fragments
                  * already written, with a duration summing up to pkt->dts. */
-                trk->frag_start = pkt->dts;
                 trk->start_dts  = 0;
             }
             trk->frag_discont = 0;
diff --git a/libavformat/movenc.h b/libavformat/movenc.h
index af1ea0bce64..daeaad1cc68 100644
--- a/libavformat/movenc.h
+++ b/libavformat/movenc.h
@@ -138,7 +138,6 @@  typedef struct MOVTrack {
 
     AVIOContext *mdat_buf;
     int64_t     data_offset;
-    int64_t     frag_start;
     int         frag_discont;
     int         entries_flushed;