Message ID | 20210718102232.1382376-1-sehuww@mail.scut.edu.cn |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel] avformat/movenc: Fix tfdt out of sync | expand |
Context | Check | Description |
---|---|---|
andriy/x86_make | success | Make finished |
andriy/x86_make_fate | success | Make fate finished |
andriy/PPC64_make | success | Make finished |
andriy/PPC64_make_fate | success | Make fate finished |
Hi, Thanks for the patch! I'll try to look into it in a while (I'm a bit swamped and short on time at the moment), hopefully within a couple days or so. On Sun, 18 Jul 2021, Hu Weiwen wrote: > Fix an edge case when auto flushing an empty fragment, and the previous fragment > has inaccurate duration, the track->frag_start would be out of sync from input > dts, and all subsequent tfdt box will have out of sync base_media_decode_time. > This error can accumulate to quite large over long-running streaming. > > This can be easily reproduced by remux a variable frame rate source with dash > muxer and set streaming to 1. Can you provide a specific sample and a command line, and point out what values in the output file that are mismatched - so I can zoom in on the issue quicker once I have time to sit down and look at it? // Martin
Hi Martin, OK, here I provide some codes to demo this bug. Many of these are copied from libavformat/tests/movenc.c #include <libavcodec/avcodec.h> #include <libavformat/avformat.h> #include <libavutil/opt.h> #include <libavutil/intreadwrite.h> static const uint8_t h264_extradata[] = { 0x01, 0x4d, 0x40, 0x1e, 0xff, 0xe1, 0x00, 0x02, 0x67, 0x4d, 0x01, 0x00, 0x02, 0x68, 0xef }; AVPacket *pkt; AVFormatContext *ctx; void mux_packet(int64_t ts) { uint8_t pktdata[8] = { 0 }; av_packet_unref(pkt); pkt->stream_index = 0; pkt->pts = pkt->dts = ts; AV_WB32(pktdata + 4, pkt->pts); pkt->data = pktdata; pkt->size = 8; av_write_frame(ctx, pkt); } int main() { avformat_alloc_output_context2(&ctx, NULL, NULL, "bug.mp4"); // Setting this flag is important av_opt_set(ctx, "movflags", "+frag_every_frame", AV_OPT_SEARCH_CHILDREN); AVStream *st = avformat_new_stream(ctx, NULL); st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO; st->codecpar->codec_id = AV_CODEC_ID_H264; st->codecpar->width = 640; st->codecpar->height = 480; st->time_base.num = 1; st->time_base.den = 30; st->codecpar->extradata_size = sizeof(h264_extradata); st->codecpar->extradata = av_mallocz(st->codecpar->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE); memcpy(st->codecpar->extradata, h264_extradata, sizeof(h264_extradata)); avio_open(&ctx->pb, "bug.mp4", AVIO_FLAG_WRITE); avformat_write_header(ctx, NULL); pkt = av_packet_alloc(); mux_packet(0); mux_packet(1); mux_packet(2); av_write_frame(ctx, NULL); // Manually flush a frag mux_packet(10); // Automatically flush an empty frag, because frag_every_frame is set mux_packet(11); mux_packet(12); av_write_trailer(ctx); avformat_free_context(ctx); av_packet_free(&pkt); } After compile and run this code, run this command to inspect the dts (which comes from the out of sync tfdt) ffprobe -show_packets bug.mp4 | grep dts= The output is: dts=0 dts=1 dts=2 dts=2 dts=3 dts=4 With this patch applied, the output is: dts=0 dts=1 dts=2 dts=10 dts=11 dts=12 Regards, Hu Weiwen On Mon, Jul 19, 2021 at 11:33:59PM +0300, Martin Storsjö wrote: > Hi, > > Thanks for the patch! I'll try to look into it in a while (I'm a bit swamped > and short on time at the moment), hopefully within a couple days or so. > > On Sun, 18 Jul 2021, Hu Weiwen wrote: > > > Fix an edge case when auto flushing an empty fragment, and the previous fragment > > has inaccurate duration, the track->frag_start would be out of sync from input > > dts, and all subsequent tfdt box will have out of sync base_media_decode_time. > > This error can accumulate to quite large over long-running streaming. > > > > This can be easily reproduced by remux a variable frame rate source with dash > > muxer and set streaming to 1. > > Can you provide a specific sample and a command line, and point out what > values in the output file that are mismatched - so I can zoom in on the > issue quicker once I have time to sit down and look at it? > > // Martin
Hi, On Tue, 20 Jul 2021, 胡玮文 wrote: > After compile and run this code, run this command to inspect the dts (which comes from the out of sync tfdt) > > ffprobe -show_packets bug.mp4 | grep dts= > > The output is: > > dts=0 > dts=1 > dts=2 > dts=2 > dts=3 > dts=4 > > With this patch applied, the output is: > > dts=0 > dts=1 > dts=2 > dts=10 > dts=11 > dts=12 Thanks for the repro case, and sorry for the delay in looking at it. I do see the issue, but I disagree with your suggested solution. While your patch does create the correct, intended value in tfdt, you will instead create a file where the dts calculated from adding up previous sample durations differ from what's written in tfdt. So depending on whether the demuxer just accumulates durations or reads tfdt, it will produce a different result. I guess it can be argued that if a demuxer reads a fragmented file, then tfdt should be more authoritative than duration and any other reader is buggy, but nevertheless, the code as is is designed to make sure that tfdt is consistent with the sum of durations. It seems it's possible to fix the same issue differently though, by not adjusting track_duration and end_pts when autoflushing, if there's no samples in the track that are going to be flushed. That way, we retain the existing intended logic of the muxer, while avoiding diverging. The result of your repro example, with my movenc modification, produces this dts sequence: dts=0 dts=1 dts=2 dts=2 dts=11 dts=12 This is, of course less nice than what we had before, but after flushing the fragment containing tfdt=2, duration=0, the only consistent choice we have is to start the next fragment at tfdt/dts=2. However, I'm open to add an option to ignore the end of the previous fragment and make the new fragment start at the exact desired timestamp. // Martin
diff --git a/libavformat/movenc.c b/libavformat/movenc.c index c85efe87489..48736acab65 100644 --- a/libavformat/movenc.c +++ b/libavformat/movenc.c @@ -4482,8 +4482,7 @@ static int mov_write_tfxd_tag(AVIOContext *pb, MOVTrack *track) avio_write(pb, uuid, sizeof(uuid)); avio_w8(pb, 1); avio_wb24(pb, 0); - avio_wb64(pb, track->start_dts + track->frag_start + - track->cluster[0].cts); + avio_wb64(pb, track->cluster[0].dts + track->cluster[0].cts); avio_wb64(pb, track->end_pts - (track->cluster[0].dts + track->cluster[0].cts)); @@ -4562,8 +4561,7 @@ static int mov_add_tfra_entries(AVIOContext *pb, MOVMuxContext *mov, int tracks, info->size = size; // Try to recreate the original pts for the first packet // from the fields we have stored - info->time = track->start_dts + track->frag_start + - track->cluster[0].cts; + info->time = track->cluster[0].dts + track->cluster[0].cts; info->duration = track->end_pts - (track->cluster[0].dts + track->cluster[0].cts); // If the pts is less than zero, we will have trimmed @@ -4601,7 +4599,7 @@ static int mov_write_tfdt_tag(AVIOContext *pb, MOVTrack *track) ffio_wfourcc(pb, "tfdt"); avio_w8(pb, 1); /* version */ avio_wb24(pb, 0); - avio_wb64(pb, track->frag_start); + avio_wb64(pb, track->cluster[0].dts - track->start_dts); return update_size(pb, pos); } @@ -4678,8 +4676,7 @@ static int mov_write_sidx_tag(AVIOContext *pb, if (track->entry) { entries = 1; - presentation_time = track->start_dts + track->frag_start + - track->cluster[0].cts; + presentation_time = track->cluster[0].dts + track->cluster[0].cts; duration = track->end_pts - (track->cluster[0].dts + track->cluster[0].cts); starts_with_SAP = track->cluster[0].flags & MOV_SYNC_SAMPLE; @@ -5358,10 +5355,6 @@ static int mov_flush_fragment(AVFormatContext *s, int force) mov->moov_written = 1; mov->mdat_size = 0; for (i = 0; i < mov->nb_streams; i++) { - if (mov->tracks[i].entry) - mov->tracks[i].frag_start += mov->tracks[i].start_dts + - mov->tracks[i].track_duration - - mov->tracks[i].cluster[0].dts; mov->tracks[i].entry = 0; mov->tracks[i].end_reliable = 0; } @@ -5415,11 +5408,7 @@ static int mov_flush_fragment(AVFormatContext *s, int force) MOVTrack *track = &mov->tracks[i]; int buf_size, write_moof = 1, moof_tracks = -1; uint8_t *buf; - int64_t duration = 0; - if (track->entry) - duration = track->start_dts + track->track_duration - - track->cluster[0].dts; if (mov->flags & FF_MOV_FLAG_SEPARATE_MOOF) { if (!track->mdat_buf) continue; @@ -5439,8 +5428,6 @@ static int mov_flush_fragment(AVFormatContext *s, int force) ffio_wfourcc(s->pb, "mdat"); } - if (track->entry) - track->frag_start += duration; track->entry = 0; track->entries_flushed = 0; track->end_reliable = 0; @@ -5759,7 +5746,6 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt) /* New fragment, but discontinuous from previous fragments. * Pretend the duration sum of the earlier fragments is * pkt->dts - trk->start_dts. */ - trk->frag_start = pkt->dts - trk->start_dts; trk->end_pts = AV_NOPTS_VALUE; trk->frag_discont = 0; } @@ -5781,12 +5767,10 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt) /* Pretend the whole stream started at pts=0, with earlier fragments * already written. If the stream started at pts=0, the duration sum * of earlier fragments would have been pkt->pts. */ - trk->frag_start = pkt->pts; trk->start_dts = pkt->dts - pkt->pts; } else { /* Pretend the whole stream started at dts=0, with earlier fragments * already written, with a duration summing up to pkt->dts. */ - trk->frag_start = pkt->dts; trk->start_dts = 0; } trk->frag_discont = 0; diff --git a/libavformat/movenc.h b/libavformat/movenc.h index af1ea0bce64..daeaad1cc68 100644 --- a/libavformat/movenc.h +++ b/libavformat/movenc.h @@ -138,7 +138,6 @@ typedef struct MOVTrack { AVIOContext *mdat_buf; int64_t data_offset; - int64_t frag_start; int frag_discont; int entries_flushed;
Fix an edge case when auto flushing an empty fragment, and the previous fragment has inaccurate duration, the track->frag_start would be out of sync from input dts, and all subsequent tfdt box will have out of sync base_media_decode_time. This error can accumulate to quite large over long-running streaming. This can be easily reproduced by remux a variable frame rate source with dash muxer and set streaming to 1. When flushing a fragment, we may guess the duration if we have not got the next package. Then when we get the first packet of the next fragment, we intend to set the trk->cluster[0].dts to match the previously guessed duration. However, at this time trk->track_duration may already be updated from mov_write_single_packet() when auto flushing the empty fragment (e.g. if FF_MOV_FLAG_FRAG_EVERY_FRAME is set), but no moof was written to reflect this change. So the trk->cluster[0].dts do NOT actually match the previously written duration. Then, when flushing the next non-empty fragment, we will get a wrong "duration", and track->frag_start will be out of sync. And it cannot get back on track again. This error is expected to accumulate over time. trk->cluster[0].dts will only get too large. it will not get too small because we have check_pkt(), so track->frag_start will fall more and more behind from input dts. To fix this, we can directly remove the track->frag_start. It can always be replaced by "track->cluster[0].dts - track->start_dts". Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn> --- libavformat/movenc.c | 24 ++++-------------------- libavformat/movenc.h | 1 - 2 files changed, 4 insertions(+), 21 deletions(-)