Message ID | 20210521002954.1684-1-michael.dirks@xaymar.com |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel] avformat/matroskaenc: Allow changing the time stamp precision via option | expand |
Context | Check | Description |
---|---|---|
andriy/x86_make | success | Make finished |
andriy/x86_make_fate | success | Make fate finished |
andriy/PPC64_make | success | Make finished |
andriy/PPC64_make_fate | success | Make fate finished |
On 5/20/2021 9:29 PM, michael.dirks@xaymar.com wrote: > From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> > > Adds "timestamp_precision" to the available options for Matroska muxing. > The option enables users and developers to change the precision of the > time stamps in the Matroska container up to 1 nanosecond, which can aid > with the proper detection of constant and variable rate content. > > Work-around fix for: 259, 6406, 7927, 8909 and 9124. > > Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> > --- > doc/muxers.texi | 8 ++++++++ > libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- > 2 files changed, 34 insertions(+), 7 deletions(-) Will apply. Thanks, and sorry for all the bikeshedding.
On 2021-05-21 02:45, James Almer wrote: > On 5/20/2021 9:29 PM, michael.dirks@xaymar.com wrote: >> From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >> >> Adds "timestamp_precision" to the available options for Matroska muxing. >> The option enables users and developers to change the precision of the >> time stamps in the Matroska container up to 1 nanosecond, which can aid >> with the proper detection of constant and variable rate content. >> >> Work-around fix for: 259, 6406, 7927, 8909 and 9124. >> >> Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >> --- >> doc/muxers.texi | 8 ++++++++ >> libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- >> 2 files changed, 34 insertions(+), 7 deletions(-) > > Will apply. Thanks, and sorry for all the bikeshedding. It's fine, it's better to be correct the first time than to discover a problem much later that could have been solved by being correct in the first place.
michael.dirks@xaymar.com: > From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> > > Adds "timestamp_precision" to the available options for Matroska muxing. > The option enables users and developers to change the precision of the > time stamps in the Matroska container up to 1 nanosecond, which can aid > with the proper detection of constant and variable rate content. > > Work-around fix for: 259, 6406, 7927, 8909 and 9124. > > Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> > --- > doc/muxers.texi | 8 ++++++++ > libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- > 2 files changed, 34 insertions(+), 7 deletions(-) > > diff --git a/doc/muxers.texi b/doc/muxers.texi > index e1c6ad0829..8655be94ff 100644 > --- a/doc/muxers.texi > +++ b/doc/muxers.texi > @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this option does not flip the bitmap > which has to be done manually beforehand, e.g. by using the vflip filter. > Default is @var{false} and indicates bitmap is stored top down. > > +@item timestamp_precision > +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, which can > +improve detection of constant rate content in demuxers. Note that some poorly > +implemented demuxers may require a timestamp precision of 1 millisecond, so > +increasing it past that point may result in playback issues. Higher precision > +also reduces the maximum possible timestamp significantly. This should mention that a too high precision will increase container overhead. > +Default is @var{1/1000} (1 millisecond). > + > @end table > > @anchor{md5} > diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c > index 186a25d920..1b911a648c 100644 > --- a/libavformat/matroskaenc.c > +++ b/libavformat/matroskaenc.c > @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { > int default_mode; > > uint32_t segment_uid[4]; > + > + AVRational time_base; > } MatroskaMuxContext; > > /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte uint, > @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) > const AVDictionaryEntry *tag; > int ret, i, version = 2; > int64_t creation_time; > + int64_t time_base = 1; > > if (mkv->mode != MODE_WEBM || > av_dict_get(s->metadata, "stereo_mode", NULL, 0) || > @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext *s) > return ret; > pb = mkv->info.bc; > > - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); > + time_base = av_rescale_q(time_base, mkv->time_base, (AVRational){1, 1000000000}); > + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", time_base); > + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); There is a serious problem here: mkv->time_base is the time base that the muxer uses; yet if mkv->time_base is not a multiple of 1/1000000000, then av_rescale_q will have to round a bit and then the demuxer will read a different time base, leading to a drift of all timestamps. This is not acceptable. > + > if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) > put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); > if (!(s->flags & AVFMT_FLAG_BITEXACT)) { > @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext *s) > int64_t metadata_duration = get_metadata_duration(s); > > if (s->duration > 0) { > - int64_t scaledDuration = av_rescale(s->duration, 1000, AV_TIME_BASE); > + int64_t scaledDuration = av_rescale_q(s->duration, AV_TIME_BASE_Q, mkv->time_base); > put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); > av_log(s, AV_LOG_DEBUG, "Write early duration from recording time = %" PRIu64 "\n", scaledDuration); > } else if (metadata_duration > 0) { > - int64_t scaledDuration = av_rescale(metadata_duration, 1000, AV_TIME_BASE); > + int64_t scaledDuration = av_rescale_q(metadata_duration, AV_TIME_BASE_Q, mkv->time_base); > put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); > av_log(s, AV_LOG_DEBUG, "Write early duration from metadata = %" PRIu64 "\n", scaledDuration); > } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { > @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext *s) > // after 4k and on a keyframe > if (IS_SEEKABLE(pb, mkv)) { > if (mkv->cluster_time_limit < 0) > - mkv->cluster_time_limit = 5000; > + mkv->cluster_time_limit = av_rescale_q(5000, (AVRational){1, 1000}, mkv->time_base); > if (mkv->cluster_size_limit < 0) > mkv->cluster_size_limit = 5 * 1024 * 1024; > } else { > if (mkv->cluster_time_limit < 0) > - mkv->cluster_time_limit = 1000; > + mkv->cluster_time_limit = av_rescale_q(1000, (AVRational){1, 1000}, mkv->time_base); There are three places where you rescale this; it would be better if you did it unconditionally in one place (namely after this block here). > if (mkv->cluster_size_limit < 0) > mkv->cluster_size_limit = 32 * 1024; > } > @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) > } else > mkv->mode = MODE_MATROSKAv2; > > + // WebM requires a timestamp precision of 1ms. > + if (mkv->mode == MODE_WEBM) { > + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { > + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp precision\n"); > + return AVERROR(EINVAL); > + } (This could be put into the same block as the one just above that sets the mode.) > + } > + > mkv->cur_audio_pkt = av_packet_alloc(); > if (!mkv->cur_audio_pkt) > return AVERROR(ENOMEM); > @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) > track->uid = mkv_get_uid(mkv->tracks, i, &c); > } > > - // ms precision is the de-facto standard timescale for mkv files > - avpriv_set_pts_info(st, 64, 1, 1000); > + // Use user-defined timescale. > + avpriv_set_pts_info(st, 64, mkv->time_base.num, mkv->time_base.den); > > if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) { > if (mkv->mode == MODE_WEBM) { > @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) > track->track_num_size = ebml_num_size(track->track_num); > } > > + // Scale the configured cluster_time_limit. > + if (mkv->cluster_time_limit >= 0) > + mkv->cluster_time_limit = av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, mkv->time_base); > + > if (mkv->is_dash && nb_tracks != 1) > return AVERROR(EINVAL); > > @@ -2826,6 +2844,7 @@ static const AVOption options[] = { > { "infer", "For each track type, mark the first track of disposition default as default; if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, 0, FLAGS, "default_mode" }, > { "infer_no_subs", "For each track type, mark the first track of disposition default as default; for audio and video: if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, > { "passthrough", "Use the disposition flag as-is", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, "default_mode" }, > + { "timestamp_precision", "Timestamp precision to use for the entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = 0.001 }, 0.000000001, INT_MAX, FLAGS}, Using a default value that is a valid value makes it impossible to distinguish whether the user set a value at all; but I'd like to preserve that possibility. > { NULL }, > }; > >
On 2021-05-24 22:15, Andreas Rheinhardt wrote: > michael.dirks@xaymar.com: >> From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >> >> Adds "timestamp_precision" to the available options for Matroska muxing. >> The option enables users and developers to change the precision of the >> time stamps in the Matroska container up to 1 nanosecond, which can aid >> with the proper detection of constant and variable rate content. >> >> Work-around fix for: 259, 6406, 7927, 8909 and 9124. >> >> Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >> --- >> doc/muxers.texi | 8 ++++++++ >> libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- >> 2 files changed, 34 insertions(+), 7 deletions(-) >> >> diff --git a/doc/muxers.texi b/doc/muxers.texi >> index e1c6ad0829..8655be94ff 100644 >> --- a/doc/muxers.texi >> +++ b/doc/muxers.texi >> @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this option does not flip the bitmap >> which has to be done manually beforehand, e.g. by using the vflip filter. >> Default is @var{false} and indicates bitmap is stored top down. >> >> +@item timestamp_precision >> +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, which can >> +improve detection of constant rate content in demuxers. Note that some poorly >> +implemented demuxers may require a timestamp precision of 1 millisecond, so >> +increasing it past that point may result in playback issues. Higher precision >> +also reduces the maximum possible timestamp significantly. > This should mention that a too high precision will increase container > overhead. Good point. > >> +Default is @var{1/1000} (1 millisecond). >> + >> @end table >> >> @anchor{md5} >> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c >> index 186a25d920..1b911a648c 100644 >> --- a/libavformat/matroskaenc.c >> +++ b/libavformat/matroskaenc.c >> @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { >> int default_mode; >> >> uint32_t segment_uid[4]; >> + >> + AVRational time_base; >> } MatroskaMuxContext; >> >> /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte uint, >> @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) >> const AVDictionaryEntry *tag; >> int ret, i, version = 2; >> int64_t creation_time; >> + int64_t time_base = 1; >> >> if (mkv->mode != MODE_WEBM || >> av_dict_get(s->metadata, "stereo_mode", NULL, 0) || >> @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext *s) >> return ret; >> pb = mkv->info.bc; >> >> - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); >> + time_base = av_rescale_q(time_base, mkv->time_base, (AVRational){1, 1000000000}); >> + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", time_base); >> + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); > There is a serious problem here: mkv->time_base is the time base that > the muxer uses; yet if mkv->time_base is not a multiple of 1/1000000000, > then av_rescale_q will have to round a bit and then the demuxer will > read a different time base, leading to a drift of all timestamps. This > is not acceptable. This issue is already present in the current version of FFmpeg. Unfortunately even if you tell me this, this is not something I can change: Matroska uses nanosecond precision, and nobody has agreed on what the way forward for timestamps is. You will have to bring up such nanosecond-precision problems with the Matroska specification maintainers itself, which are currently seeking IETF standardization: https://github.com/ietf-wg-cellar/matroska-specification/issues/422 > >> + >> if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) >> put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); >> if (!(s->flags & AVFMT_FLAG_BITEXACT)) { >> @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext *s) >> int64_t metadata_duration = get_metadata_duration(s); >> >> if (s->duration > 0) { >> - int64_t scaledDuration = av_rescale(s->duration, 1000, AV_TIME_BASE); >> + int64_t scaledDuration = av_rescale_q(s->duration, AV_TIME_BASE_Q, mkv->time_base); >> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); >> av_log(s, AV_LOG_DEBUG, "Write early duration from recording time = %" PRIu64 "\n", scaledDuration); >> } else if (metadata_duration > 0) { >> - int64_t scaledDuration = av_rescale(metadata_duration, 1000, AV_TIME_BASE); >> + int64_t scaledDuration = av_rescale_q(metadata_duration, AV_TIME_BASE_Q, mkv->time_base); >> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); >> av_log(s, AV_LOG_DEBUG, "Write early duration from metadata = %" PRIu64 "\n", scaledDuration); >> } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { >> @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext *s) >> // after 4k and on a keyframe >> if (IS_SEEKABLE(pb, mkv)) { >> if (mkv->cluster_time_limit < 0) >> - mkv->cluster_time_limit = 5000; >> + mkv->cluster_time_limit = av_rescale_q(5000, (AVRational){1, 1000}, mkv->time_base); >> if (mkv->cluster_size_limit < 0) >> mkv->cluster_size_limit = 5 * 1024 * 1024; >> } else { >> if (mkv->cluster_time_limit < 0) >> - mkv->cluster_time_limit = 1000; >> + mkv->cluster_time_limit = av_rescale_q(1000, (AVRational){1, 1000}, mkv->time_base); > There are three places where you rescale this; it would be better if you > did it unconditionally in one place (namely after this block here). > I did not want to cut into existing code too much, merely adjust it to the change without changing the main parts. >> if (mkv->cluster_size_limit < 0) >> mkv->cluster_size_limit = 32 * 1024; >> } >> @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) >> } else >> mkv->mode = MODE_MATROSKAv2; >> >> + // WebM requires a timestamp precision of 1ms. >> + if (mkv->mode == MODE_WEBM) { >> + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { >> + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp precision\n"); >> + return AVERROR(EINVAL); >> + } > (This could be put into the same block as the one just above that sets > the mode.) Fair point. >> + } >> + >> mkv->cur_audio_pkt = av_packet_alloc(); >> if (!mkv->cur_audio_pkt) >> return AVERROR(ENOMEM); >> @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) >> track->uid = mkv_get_uid(mkv->tracks, i, &c); >> } >> >> - // ms precision is the de-facto standard timescale for mkv files >> - avpriv_set_pts_info(st, 64, 1, 1000); >> + // Use user-defined timescale. >> + avpriv_set_pts_info(st, 64, mkv->time_base.num, mkv->time_base.den); >> >> if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) { >> if (mkv->mode == MODE_WEBM) { >> @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) >> track->track_num_size = ebml_num_size(track->track_num); >> } >> >> + // Scale the configured cluster_time_limit. >> + if (mkv->cluster_time_limit >= 0) >> + mkv->cluster_time_limit = av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, mkv->time_base); >> + >> if (mkv->is_dash && nb_tracks != 1) >> return AVERROR(EINVAL); >> >> @@ -2826,6 +2844,7 @@ static const AVOption options[] = { >> { "infer", "For each track type, mark the first track of disposition default as default; if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, 0, FLAGS, "default_mode" }, >> { "infer_no_subs", "For each track type, mark the first track of disposition default as default; for audio and video: if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, >> { "passthrough", "Use the disposition flag as-is", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, "default_mode" }, >> + { "timestamp_precision", "Timestamp precision to use for the entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = 0.001 }, 0.000000001, INT_MAX, FLAGS}, > Using a default value that is a valid value makes it impossible to > distinguish whether the user set a value at all; but I'd like to > preserve that possibility. Sorry, could you explain why you want to do this further? I see no code path that would even need to check if the user/caller has set precision, if it's left at default we get the default of 1/1000 - just as previous FFmpeg versions. > >> { NULL }, >> }; >> >> > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Michael Fabian 'Xaymar' Dirks: > On 2021-05-24 22:15, Andreas Rheinhardt wrote: >> michael.dirks@xaymar.com: >>> From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >>> >>> Adds "timestamp_precision" to the available options for Matroska muxing. >>> The option enables users and developers to change the precision of the >>> time stamps in the Matroska container up to 1 nanosecond, which can aid >>> with the proper detection of constant and variable rate content. >>> >>> Work-around fix for: 259, 6406, 7927, 8909 and 9124. >>> >>> Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >>> --- >>> doc/muxers.texi | 8 ++++++++ >>> libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- >>> 2 files changed, 34 insertions(+), 7 deletions(-) >>> >>> diff --git a/doc/muxers.texi b/doc/muxers.texi >>> index e1c6ad0829..8655be94ff 100644 >>> --- a/doc/muxers.texi >>> +++ b/doc/muxers.texi >>> @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this >>> option does not flip the bitmap >>> which has to be done manually beforehand, e.g. by using the vflip >>> filter. >>> Default is @var{false} and indicates bitmap is stored top down. >>> +@item timestamp_precision >>> +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, >>> which can >>> +improve detection of constant rate content in demuxers. Note that >>> some poorly >>> +implemented demuxers may require a timestamp precision of 1 >>> millisecond, so >>> +increasing it past that point may result in playback issues. Higher >>> precision >>> +also reduces the maximum possible timestamp significantly. >> This should mention that a too high precision will increase container >> overhead. > Good point. >> >>> +Default is @var{1/1000} (1 millisecond). >>> + >>> @end table >>> @anchor{md5} >>> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c >>> index 186a25d920..1b911a648c 100644 >>> --- a/libavformat/matroskaenc.c >>> +++ b/libavformat/matroskaenc.c >>> @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { >>> int default_mode; >>> uint32_t segment_uid[4]; >>> + >>> + AVRational time_base; >>> } MatroskaMuxContext; >>> /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte uint, >>> @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) >>> const AVDictionaryEntry *tag; >>> int ret, i, version = 2; >>> int64_t creation_time; >>> + int64_t time_base = 1; >>> if (mkv->mode != MODE_WEBM || >>> av_dict_get(s->metadata, "stereo_mode", NULL, 0) || >>> @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext *s) >>> return ret; >>> pb = mkv->info.bc; >>> - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); >>> + time_base = av_rescale_q(time_base, mkv->time_base, >>> (AVRational){1, 1000000000}); >>> + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", >>> time_base); >>> + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); >> There is a serious problem here: mkv->time_base is the time base that >> the muxer uses; yet if mkv->time_base is not a multiple of 1/1000000000, >> then av_rescale_q will have to round a bit and then the demuxer will >> read a different time base, leading to a drift of all timestamps. This >> is not acceptable. > This issue is already present in the current version of FFmpeg. Matroska's timestamp imprecision currently lead to jitter, but not to drift (this of course presumes that one actually uses the timestamps as-is (instead of summing the durations)). But your patch as-is can lead to drift (when using a wrong timebase), because it adds a whole new type of error: One in which the demuxer and the muxer disagree about the timebase that is in use. Consider a user using 1/750000000 as timebase. 48kHz audio (with a timebase of 1/48000) can be converted accurately to it, so that the muxer receives precise timestamps. Yet the muxer actually writes that the timebase is 1/1000000000 and that is what a demuxer sees. This means that all timestamps (except those for which Matroska uses "absolute timestamps" (Matroska-speak for a time/duration that is always in 1/1000000000 like default duration)) are skewed as-if multiplied by 3/4. > Unfortunately even if you tell me this, this is not something I can > change: Matroska uses nanosecond precision, and nobody has agreed on > what the way forward for timestamps is. You will have to bring up such > nanosecond-precision problems with the Matroska specification > maintainers itself, which are currently seeking IETF standardization: > https://github.com/ietf-wg-cellar/matroska-specification/issues/422 >> Already did so (I am mkver). >>> + >>> if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) >>> put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); >>> if (!(s->flags & AVFMT_FLAG_BITEXACT)) { >>> @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext *s) >>> int64_t metadata_duration = get_metadata_duration(s); >>> if (s->duration > 0) { >>> - int64_t scaledDuration = av_rescale(s->duration, 1000, >>> AV_TIME_BASE); >>> + int64_t scaledDuration = av_rescale_q(s->duration, >>> AV_TIME_BASE_Q, mkv->time_base); >>> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); >>> av_log(s, AV_LOG_DEBUG, "Write early duration from >>> recording time = %" PRIu64 "\n", scaledDuration); >>> } else if (metadata_duration > 0) { >>> - int64_t scaledDuration = av_rescale(metadata_duration, >>> 1000, AV_TIME_BASE); >>> + int64_t scaledDuration = av_rescale_q(metadata_duration, >>> AV_TIME_BASE_Q, mkv->time_base); >>> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); >>> av_log(s, AV_LOG_DEBUG, "Write early duration from >>> metadata = %" PRIu64 "\n", scaledDuration); >>> } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { >>> @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext *s) >>> // after 4k and on a keyframe >>> if (IS_SEEKABLE(pb, mkv)) { >>> if (mkv->cluster_time_limit < 0) >>> - mkv->cluster_time_limit = 5000; >>> + mkv->cluster_time_limit = av_rescale_q(5000, >>> (AVRational){1, 1000}, mkv->time_base); >>> if (mkv->cluster_size_limit < 0) >>> mkv->cluster_size_limit = 5 * 1024 * 1024; >>> } else { >>> if (mkv->cluster_time_limit < 0) >>> - mkv->cluster_time_limit = 1000; >>> + mkv->cluster_time_limit = av_rescale_q(1000, >>> (AVRational){1, 1000}, mkv->time_base); >> There are three places where you rescale this; it would be better if you >> did it unconditionally in one place (namely after this block here). >> > I did not want to cut into existing code too much, merely adjust it to > the change without changing the main parts. >>> if (mkv->cluster_size_limit < 0) >>> mkv->cluster_size_limit = 32 * 1024; >>> } >>> @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) >>> } else >>> mkv->mode = MODE_MATROSKAv2; >>> + // WebM requires a timestamp precision of 1ms. >>> + if (mkv->mode == MODE_WEBM) { >>> + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { >>> + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp >>> precision\n"); >>> + return AVERROR(EINVAL); >>> + } >> (This could be put into the same block as the one just above that sets >> the mode.) > Fair point. >>> + } >>> + >>> mkv->cur_audio_pkt = av_packet_alloc(); >>> if (!mkv->cur_audio_pkt) >>> return AVERROR(ENOMEM); >>> @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) >>> track->uid = mkv_get_uid(mkv->tracks, i, &c); >>> } >>> - // ms precision is the de-facto standard timescale for mkv >>> files >>> - avpriv_set_pts_info(st, 64, 1, 1000); >>> + // Use user-defined timescale. >>> + avpriv_set_pts_info(st, 64, mkv->time_base.num, >>> mkv->time_base.den); >>> if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) { >>> if (mkv->mode == MODE_WEBM) { >>> @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) >>> track->track_num_size = ebml_num_size(track->track_num); >>> } >>> + // Scale the configured cluster_time_limit. >>> + if (mkv->cluster_time_limit >= 0) >>> + mkv->cluster_time_limit = >>> av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, >>> mkv->time_base); >>> + >>> if (mkv->is_dash && nb_tracks != 1) >>> return AVERROR(EINVAL); >>> @@ -2826,6 +2844,7 @@ static const AVOption options[] = { >>> { "infer", "For each track type, mark the first track of >>> disposition default as default; if none exists, mark the first track >>> as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, >>> 0, FLAGS, "default_mode" }, >>> { "infer_no_subs", "For each track type, mark the first track >>> of disposition default as default; for audio and video: if none >>> exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { >>> .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, >>> { "passthrough", "Use the disposition flag as-is", 0, >>> AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, >>> "default_mode" }, >>> + { "timestamp_precision", "Timestamp precision to use for the >>> entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = >>> 0.001 }, 0.000000001, INT_MAX, FLAGS}, >> Using a default value that is a valid value makes it impossible to >> distinguish whether the user set a value at all; but I'd like to >> preserve that possibility. > Sorry, could you explain why you want to do this further? I see no code > path that would even need to check if the user/caller has set precision, > if it's left at default we get the default of 1/1000 - just as previous > FFmpeg versions. >> The default value of 1/1000 is good for most use-cases, but there might be scenarios where it is not (e.g. TrueHD packets have a duration < 1ms, so different packets will end up with the same timestamp when using 1/1000). If the user has set a timestamp precision, then that should be honoured; but if not, then we might use a different value than 1/1000 (in the future; I know that currently no code path that makes use of this exists). - Andreas
On Fri, Jun 4, 2021 at 10:18 PM Andreas Rheinhardt < andreas.rheinhardt@outlook.com> wrote: > Michael Fabian 'Xaymar' Dirks: > > On 2021-05-24 22:15, Andreas Rheinhardt wrote: > >> michael.dirks@xaymar.com: > >>> From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> > >>> > >>> Adds "timestamp_precision" to the available options for Matroska > muxing. > >>> The option enables users and developers to change the precision of the > >>> time stamps in the Matroska container up to 1 nanosecond, which can aid > >>> with the proper detection of constant and variable rate content. > >>> > >>> Work-around fix for: 259, 6406, 7927, 8909 and 9124. > >>> > >>> Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com > > > >>> --- > >>> doc/muxers.texi | 8 ++++++++ > >>> libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- > >>> 2 files changed, 34 insertions(+), 7 deletions(-) > >>> > >>> diff --git a/doc/muxers.texi b/doc/muxers.texi > >>> index e1c6ad0829..8655be94ff 100644 > >>> --- a/doc/muxers.texi > >>> +++ b/doc/muxers.texi > >>> @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this > >>> option does not flip the bitmap > >>> which has to be done manually beforehand, e.g. by using the vflip > >>> filter. > >>> Default is @var{false} and indicates bitmap is stored top down. > >>> +@item timestamp_precision > >>> +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, > >>> which can > >>> +improve detection of constant rate content in demuxers. Note that > >>> some poorly > >>> +implemented demuxers may require a timestamp precision of 1 > >>> millisecond, so > >>> +increasing it past that point may result in playback issues. Higher > >>> precision > >>> +also reduces the maximum possible timestamp significantly. > >> This should mention that a too high precision will increase container > >> overhead. > > Good point. > >> > >>> +Default is @var{1/1000} (1 millisecond). > >>> + > >>> @end table > >>> @anchor{md5} > >>> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c > >>> index 186a25d920..1b911a648c 100644 > >>> --- a/libavformat/matroskaenc.c > >>> +++ b/libavformat/matroskaenc.c > >>> @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { > >>> int default_mode; > >>> uint32_t segment_uid[4]; > >>> + > >>> + AVRational time_base; > >>> } MatroskaMuxContext; > >>> /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte uint, > >>> @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) > >>> const AVDictionaryEntry *tag; > >>> int ret, i, version = 2; > >>> int64_t creation_time; > >>> + int64_t time_base = 1; > >>> if (mkv->mode != MODE_WEBM || > >>> av_dict_get(s->metadata, "stereo_mode", NULL, 0) || > >>> @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext *s) > >>> return ret; > >>> pb = mkv->info.bc; > >>> - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); > >>> + time_base = av_rescale_q(time_base, mkv->time_base, > >>> (AVRational){1, 1000000000}); > >>> + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", > >>> time_base); > >>> + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); > >> There is a serious problem here: mkv->time_base is the time base that > >> the muxer uses; yet if mkv->time_base is not a multiple of 1/1000000000, > >> then av_rescale_q will have to round a bit and then the demuxer will > >> read a different time base, leading to a drift of all timestamps. This > >> is not acceptable. > > This issue is already present in the current version of FFmpeg. > > Matroska's timestamp imprecision currently lead to jitter, but not to > drift (this of course presumes that one actually uses the timestamps > as-is (instead of summing the durations)). But your patch as-is can lead > to drift (when using a wrong timebase), because it adds a whole new type > of error: One in which the demuxer and the muxer disagree about the > timebase that is in use. Consider a user using 1/750000000 as timebase. > 48kHz audio (with a timebase of 1/48000) can be converted accurately to > it, so that the muxer receives precise timestamps. Yet the muxer > actually writes that the timebase is 1/1000000000 and that is what a > demuxer sees. This means that all timestamps (except those for which > Matroska uses "absolute timestamps" (Matroska-speak for a time/duration > that is always in 1/1000000000 like default duration)) are skewed as-if > multiplied by 3/4. > > > Unfortunately even if you tell me this, this is not something I can > > change: Matroska uses nanosecond precision, and nobody has agreed on > > what the way forward for timestamps is. You will have to bring up such > > nanosecond-precision problems with the Matroska specification > > maintainers itself, which are currently seeking IETF standardization: > > https://github.com/ietf-wg-cellar/matroska-specification/issues/422 > >> > > Already did so (I am mkver). > > >>> + > >>> if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) > >>> put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); > >>> if (!(s->flags & AVFMT_FLAG_BITEXACT)) { > >>> @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext *s) > >>> int64_t metadata_duration = get_metadata_duration(s); > >>> if (s->duration > 0) { > >>> - int64_t scaledDuration = av_rescale(s->duration, 1000, > >>> AV_TIME_BASE); > >>> + int64_t scaledDuration = av_rescale_q(s->duration, > >>> AV_TIME_BASE_Q, mkv->time_base); > >>> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); > >>> av_log(s, AV_LOG_DEBUG, "Write early duration from > >>> recording time = %" PRIu64 "\n", scaledDuration); > >>> } else if (metadata_duration > 0) { > >>> - int64_t scaledDuration = av_rescale(metadata_duration, > >>> 1000, AV_TIME_BASE); > >>> + int64_t scaledDuration = av_rescale_q(metadata_duration, > >>> AV_TIME_BASE_Q, mkv->time_base); > >>> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); > >>> av_log(s, AV_LOG_DEBUG, "Write early duration from > >>> metadata = %" PRIu64 "\n", scaledDuration); > >>> } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { > >>> @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext *s) > >>> // after 4k and on a keyframe > >>> if (IS_SEEKABLE(pb, mkv)) { > >>> if (mkv->cluster_time_limit < 0) > >>> - mkv->cluster_time_limit = 5000; > >>> + mkv->cluster_time_limit = av_rescale_q(5000, > >>> (AVRational){1, 1000}, mkv->time_base); > >>> if (mkv->cluster_size_limit < 0) > >>> mkv->cluster_size_limit = 5 * 1024 * 1024; > >>> } else { > >>> if (mkv->cluster_time_limit < 0) > >>> - mkv->cluster_time_limit = 1000; > >>> + mkv->cluster_time_limit = av_rescale_q(1000, > >>> (AVRational){1, 1000}, mkv->time_base); > >> There are three places where you rescale this; it would be better if you > >> did it unconditionally in one place (namely after this block here). > >> > > I did not want to cut into existing code too much, merely adjust it to > > the change without changing the main parts. > >>> if (mkv->cluster_size_limit < 0) > >>> mkv->cluster_size_limit = 32 * 1024; > >>> } > >>> @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) > >>> } else > >>> mkv->mode = MODE_MATROSKAv2; > >>> + // WebM requires a timestamp precision of 1ms. > >>> + if (mkv->mode == MODE_WEBM) { > >>> + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { > >>> + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp > >>> precision\n"); > >>> + return AVERROR(EINVAL); > >>> + } > >> (This could be put into the same block as the one just above that sets > >> the mode.) > > Fair point. > >>> + } > >>> + > >>> mkv->cur_audio_pkt = av_packet_alloc(); > >>> if (!mkv->cur_audio_pkt) > >>> return AVERROR(ENOMEM); > >>> @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) > >>> track->uid = mkv_get_uid(mkv->tracks, i, &c); > >>> } > >>> - // ms precision is the de-facto standard timescale for mkv > >>> files > >>> - avpriv_set_pts_info(st, 64, 1, 1000); > >>> + // Use user-defined timescale. > >>> + avpriv_set_pts_info(st, 64, mkv->time_base.num, > >>> mkv->time_base.den); > >>> if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) { > >>> if (mkv->mode == MODE_WEBM) { > >>> @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) > >>> track->track_num_size = ebml_num_size(track->track_num); > >>> } > >>> + // Scale the configured cluster_time_limit. > >>> + if (mkv->cluster_time_limit >= 0) > >>> + mkv->cluster_time_limit = > >>> av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, > >>> mkv->time_base); > >>> + > >>> if (mkv->is_dash && nb_tracks != 1) > >>> return AVERROR(EINVAL); > >>> @@ -2826,6 +2844,7 @@ static const AVOption options[] = { > >>> { "infer", "For each track type, mark the first track of > >>> disposition default as default; if none exists, mark the first track > >>> as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, > >>> 0, FLAGS, "default_mode" }, > >>> { "infer_no_subs", "For each track type, mark the first track > >>> of disposition default as default; for audio and video: if none > >>> exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { > >>> .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, > >>> { "passthrough", "Use the disposition flag as-is", 0, > >>> AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, > >>> "default_mode" }, > >>> + { "timestamp_precision", "Timestamp precision to use for the > >>> entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = > >>> 0.001 }, 0.000000001, INT_MAX, FLAGS}, > >> Using a default value that is a valid value makes it impossible to > >> distinguish whether the user set a value at all; but I'd like to > >> preserve that possibility. > > Sorry, could you explain why you want to do this further? I see no code > > path that would even need to check if the user/caller has set precision, > > if it's left at default we get the default of 1/1000 - just as previous > > FFmpeg versions. > >> > > The default value of 1/1000 is good for most use-cases, but there might > be scenarios where it is not (e.g. TrueHD packets have a duration < 1ms, > so different packets will end up with the same timestamp when using > 1/1000). If the user has set a timestamp precision, then that should be > honoured; but if not, then we might use a different value than 1/1000 > (in the future; I know that currently no code path that makes use of > this exists). > > Is there any problem to submit this flag while keeping the default value to millisecond. So users of matroska muxer who want to increase the precision can still do it. And the discussion of which default value should be could be done in another patch? > - Andreas > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >
Thierry Foucu: > On Fri, Jun 4, 2021 at 10:18 PM Andreas Rheinhardt < > andreas.rheinhardt@outlook.com> wrote: > >> Michael Fabian 'Xaymar' Dirks: >>> On 2021-05-24 22:15, Andreas Rheinhardt wrote: >>>> michael.dirks@xaymar.com: >>>>> From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >>>>> >>>>> Adds "timestamp_precision" to the available options for Matroska >> muxing. >>>>> The option enables users and developers to change the precision of the >>>>> time stamps in the Matroska container up to 1 nanosecond, which can aid >>>>> with the proper detection of constant and variable rate content. >>>>> >>>>> Work-around fix for: 259, 6406, 7927, 8909 and 9124. >>>>> >>>>> Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com >>> >>>>> --- >>>>> doc/muxers.texi | 8 ++++++++ >>>>> libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- >>>>> 2 files changed, 34 insertions(+), 7 deletions(-) >>>>> >>>>> diff --git a/doc/muxers.texi b/doc/muxers.texi >>>>> index e1c6ad0829..8655be94ff 100644 >>>>> --- a/doc/muxers.texi >>>>> +++ b/doc/muxers.texi >>>>> @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this >>>>> option does not flip the bitmap >>>>> which has to be done manually beforehand, e.g. by using the vflip >>>>> filter. >>>>> Default is @var{false} and indicates bitmap is stored top down. >>>>> +@item timestamp_precision >>>>> +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, >>>>> which can >>>>> +improve detection of constant rate content in demuxers. Note that >>>>> some poorly >>>>> +implemented demuxers may require a timestamp precision of 1 >>>>> millisecond, so >>>>> +increasing it past that point may result in playback issues. Higher >>>>> precision >>>>> +also reduces the maximum possible timestamp significantly. >>>> This should mention that a too high precision will increase container >>>> overhead. >>> Good point. >>>> >>>>> +Default is @var{1/1000} (1 millisecond). >>>>> + >>>>> @end table >>>>> @anchor{md5} >>>>> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c >>>>> index 186a25d920..1b911a648c 100644 >>>>> --- a/libavformat/matroskaenc.c >>>>> +++ b/libavformat/matroskaenc.c >>>>> @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { >>>>> int default_mode; >>>>> uint32_t segment_uid[4]; >>>>> + >>>>> + AVRational time_base; >>>>> } MatroskaMuxContext; >>>>> /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte uint, >>>>> @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) >>>>> const AVDictionaryEntry *tag; >>>>> int ret, i, version = 2; >>>>> int64_t creation_time; >>>>> + int64_t time_base = 1; >>>>> if (mkv->mode != MODE_WEBM || >>>>> av_dict_get(s->metadata, "stereo_mode", NULL, 0) || >>>>> @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext *s) >>>>> return ret; >>>>> pb = mkv->info.bc; >>>>> - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); >>>>> + time_base = av_rescale_q(time_base, mkv->time_base, >>>>> (AVRational){1, 1000000000}); >>>>> + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", >>>>> time_base); >>>>> + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); >>>> There is a serious problem here: mkv->time_base is the time base that >>>> the muxer uses; yet if mkv->time_base is not a multiple of 1/1000000000, >>>> then av_rescale_q will have to round a bit and then the demuxer will >>>> read a different time base, leading to a drift of all timestamps. This >>>> is not acceptable. >>> This issue is already present in the current version of FFmpeg. >> >> Matroska's timestamp imprecision currently lead to jitter, but not to >> drift (this of course presumes that one actually uses the timestamps >> as-is (instead of summing the durations)). But your patch as-is can lead >> to drift (when using a wrong timebase), because it adds a whole new type >> of error: One in which the demuxer and the muxer disagree about the >> timebase that is in use. Consider a user using 1/750000000 as timebase. >> 48kHz audio (with a timebase of 1/48000) can be converted accurately to >> it, so that the muxer receives precise timestamps. Yet the muxer >> actually writes that the timebase is 1/1000000000 and that is what a >> demuxer sees. This means that all timestamps (except those for which >> Matroska uses "absolute timestamps" (Matroska-speak for a time/duration >> that is always in 1/1000000000 like default duration)) are skewed as-if >> multiplied by 3/4. >> >>> Unfortunately even if you tell me this, this is not something I can >>> change: Matroska uses nanosecond precision, and nobody has agreed on >>> what the way forward for timestamps is. You will have to bring up such >>> nanosecond-precision problems with the Matroska specification >>> maintainers itself, which are currently seeking IETF standardization: >>> https://github.com/ietf-wg-cellar/matroska-specification/issues/422 >>>> >> >> Already did so (I am mkver). >> >>>>> + >>>>> if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) >>>>> put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); >>>>> if (!(s->flags & AVFMT_FLAG_BITEXACT)) { >>>>> @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext *s) >>>>> int64_t metadata_duration = get_metadata_duration(s); >>>>> if (s->duration > 0) { >>>>> - int64_t scaledDuration = av_rescale(s->duration, 1000, >>>>> AV_TIME_BASE); >>>>> + int64_t scaledDuration = av_rescale_q(s->duration, >>>>> AV_TIME_BASE_Q, mkv->time_base); >>>>> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); >>>>> av_log(s, AV_LOG_DEBUG, "Write early duration from >>>>> recording time = %" PRIu64 "\n", scaledDuration); >>>>> } else if (metadata_duration > 0) { >>>>> - int64_t scaledDuration = av_rescale(metadata_duration, >>>>> 1000, AV_TIME_BASE); >>>>> + int64_t scaledDuration = av_rescale_q(metadata_duration, >>>>> AV_TIME_BASE_Q, mkv->time_base); >>>>> put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); >>>>> av_log(s, AV_LOG_DEBUG, "Write early duration from >>>>> metadata = %" PRIu64 "\n", scaledDuration); >>>>> } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { >>>>> @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext *s) >>>>> // after 4k and on a keyframe >>>>> if (IS_SEEKABLE(pb, mkv)) { >>>>> if (mkv->cluster_time_limit < 0) >>>>> - mkv->cluster_time_limit = 5000; >>>>> + mkv->cluster_time_limit = av_rescale_q(5000, >>>>> (AVRational){1, 1000}, mkv->time_base); >>>>> if (mkv->cluster_size_limit < 0) >>>>> mkv->cluster_size_limit = 5 * 1024 * 1024; >>>>> } else { >>>>> if (mkv->cluster_time_limit < 0) >>>>> - mkv->cluster_time_limit = 1000; >>>>> + mkv->cluster_time_limit = av_rescale_q(1000, >>>>> (AVRational){1, 1000}, mkv->time_base); >>>> There are three places where you rescale this; it would be better if you >>>> did it unconditionally in one place (namely after this block here). >>>> >>> I did not want to cut into existing code too much, merely adjust it to >>> the change without changing the main parts. >>>>> if (mkv->cluster_size_limit < 0) >>>>> mkv->cluster_size_limit = 32 * 1024; >>>>> } >>>>> @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) >>>>> } else >>>>> mkv->mode = MODE_MATROSKAv2; >>>>> + // WebM requires a timestamp precision of 1ms. >>>>> + if (mkv->mode == MODE_WEBM) { >>>>> + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { >>>>> + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp >>>>> precision\n"); >>>>> + return AVERROR(EINVAL); >>>>> + } >>>> (This could be put into the same block as the one just above that sets >>>> the mode.) >>> Fair point. >>>>> + } >>>>> + >>>>> mkv->cur_audio_pkt = av_packet_alloc(); >>>>> if (!mkv->cur_audio_pkt) >>>>> return AVERROR(ENOMEM); >>>>> @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) >>>>> track->uid = mkv_get_uid(mkv->tracks, i, &c); >>>>> } >>>>> - // ms precision is the de-facto standard timescale for mkv >>>>> files >>>>> - avpriv_set_pts_info(st, 64, 1, 1000); >>>>> + // Use user-defined timescale. >>>>> + avpriv_set_pts_info(st, 64, mkv->time_base.num, >>>>> mkv->time_base.den); >>>>> if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) { >>>>> if (mkv->mode == MODE_WEBM) { >>>>> @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) >>>>> track->track_num_size = ebml_num_size(track->track_num); >>>>> } >>>>> + // Scale the configured cluster_time_limit. >>>>> + if (mkv->cluster_time_limit >= 0) >>>>> + mkv->cluster_time_limit = >>>>> av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, >>>>> mkv->time_base); >>>>> + >>>>> if (mkv->is_dash && nb_tracks != 1) >>>>> return AVERROR(EINVAL); >>>>> @@ -2826,6 +2844,7 @@ static const AVOption options[] = { >>>>> { "infer", "For each track type, mark the first track of >>>>> disposition default as default; if none exists, mark the first track >>>>> as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, >>>>> 0, FLAGS, "default_mode" }, >>>>> { "infer_no_subs", "For each track type, mark the first track >>>>> of disposition default as default; for audio and video: if none >>>>> exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { >>>>> .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, >>>>> { "passthrough", "Use the disposition flag as-is", 0, >>>>> AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, >>>>> "default_mode" }, >>>>> + { "timestamp_precision", "Timestamp precision to use for the >>>>> entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = >>>>> 0.001 }, 0.000000001, INT_MAX, FLAGS}, >>>> Using a default value that is a valid value makes it impossible to >>>> distinguish whether the user set a value at all; but I'd like to >>>> preserve that possibility. >>> Sorry, could you explain why you want to do this further? I see no code >>> path that would even need to check if the user/caller has set precision, >>> if it's left at default we get the default of 1/1000 - just as previous >>> FFmpeg versions. >>>> >> >> The default value of 1/1000 is good for most use-cases, but there might >> be scenarios where it is not (e.g. TrueHD packets have a duration < 1ms, >> so different packets will end up with the same timestamp when using >> 1/1000). If the user has set a timestamp precision, then that should be >> honoured; but if not, then we might use a different value than 1/1000 >> (in the future; I know that currently no code path that makes use of >> this exists). >> >> > Is there any problem to submit this flag while keeping the default value to > millisecond. So users of matroska muxer who want to increase the precision > can still do it. > And the discussion of which default value should be could be done in > another patch? > There is no such problem in general; yet there is a problem with this patch, namely that it allows to choose timebases which are not a multiple of the basic Matroska timebase of 1/1000000000. See above. - Andreas
On Mon, Jun 14, 2021 at 12:09 PM Andreas Rheinhardt < andreas.rheinhardt@outlook.com> wrote: > Thierry Foucu: > > On Fri, Jun 4, 2021 at 10:18 PM Andreas Rheinhardt < > > andreas.rheinhardt@outlook.com> wrote: > > > >> Michael Fabian 'Xaymar' Dirks: > >>> On 2021-05-24 22:15, Andreas Rheinhardt wrote: > >>>> michael.dirks@xaymar.com: > >>>>> From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> > >>>>> > >>>>> Adds "timestamp_precision" to the available options for Matroska > >> muxing. > >>>>> The option enables users and developers to change the precision of > the > >>>>> time stamps in the Matroska container up to 1 nanosecond, which can > aid > >>>>> with the proper detection of constant and variable rate content. > >>>>> > >>>>> Work-around fix for: 259, 6406, 7927, 8909 and 9124. > >>>>> > >>>>> Signed-off-by: Michael Fabian 'Xaymar' Dirks < > michael.dirks@xaymar.com > >>> > >>>>> --- > >>>>> doc/muxers.texi | 8 ++++++++ > >>>>> libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- > >>>>> 2 files changed, 34 insertions(+), 7 deletions(-) > >>>>> > >>>>> diff --git a/doc/muxers.texi b/doc/muxers.texi > >>>>> index e1c6ad0829..8655be94ff 100644 > >>>>> --- a/doc/muxers.texi > >>>>> +++ b/doc/muxers.texi > >>>>> @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this > >>>>> option does not flip the bitmap > >>>>> which has to be done manually beforehand, e.g. by using the vflip > >>>>> filter. > >>>>> Default is @var{false} and indicates bitmap is stored top down. > >>>>> +@item timestamp_precision > >>>>> +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, > >>>>> which can > >>>>> +improve detection of constant rate content in demuxers. Note that > >>>>> some poorly > >>>>> +implemented demuxers may require a timestamp precision of 1 > >>>>> millisecond, so > >>>>> +increasing it past that point may result in playback issues. Higher > >>>>> precision > >>>>> +also reduces the maximum possible timestamp significantly. > >>>> This should mention that a too high precision will increase container > >>>> overhead. > >>> Good point. > >>>> > >>>>> +Default is @var{1/1000} (1 millisecond). > >>>>> + > >>>>> @end table > >>>>> @anchor{md5} > >>>>> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c > >>>>> index 186a25d920..1b911a648c 100644 > >>>>> --- a/libavformat/matroskaenc.c > >>>>> +++ b/libavformat/matroskaenc.c > >>>>> @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { > >>>>> int default_mode; > >>>>> uint32_t segment_uid[4]; > >>>>> + > >>>>> + AVRational time_base; > >>>>> } MatroskaMuxContext; > >>>>> /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte > uint, > >>>>> @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) > >>>>> const AVDictionaryEntry *tag; > >>>>> int ret, i, version = 2; > >>>>> int64_t creation_time; > >>>>> + int64_t time_base = 1; > >>>>> if (mkv->mode != MODE_WEBM || > >>>>> av_dict_get(s->metadata, "stereo_mode", NULL, 0) || > >>>>> @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext > *s) > >>>>> return ret; > >>>>> pb = mkv->info.bc; > >>>>> - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); > >>>>> + time_base = av_rescale_q(time_base, mkv->time_base, > >>>>> (AVRational){1, 1000000000}); > >>>>> + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", > >>>>> time_base); > >>>>> + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); > >>>> There is a serious problem here: mkv->time_base is the time base that > >>>> the muxer uses; yet if mkv->time_base is not a multiple of > 1/1000000000, > >>>> then av_rescale_q will have to round a bit and then the demuxer will > >>>> read a different time base, leading to a drift of all timestamps. This > >>>> is not acceptable. > >>> This issue is already present in the current version of FFmpeg. > >> > >> Matroska's timestamp imprecision currently lead to jitter, but not to > >> drift (this of course presumes that one actually uses the timestamps > >> as-is (instead of summing the durations)). But your patch as-is can lead > >> to drift (when using a wrong timebase), because it adds a whole new type > >> of error: One in which the demuxer and the muxer disagree about the > >> timebase that is in use. Consider a user using 1/750000000 as timebase. > >> 48kHz audio (with a timebase of 1/48000) can be converted accurately to > >> it, so that the muxer receives precise timestamps. Yet the muxer > >> actually writes that the timebase is 1/1000000000 and that is what a > >> demuxer sees. This means that all timestamps (except those for which > >> Matroska uses "absolute timestamps" (Matroska-speak for a time/duration > >> that is always in 1/1000000000 like default duration)) are skewed as-if > >> multiplied by 3/4. > >> > >>> Unfortunately even if you tell me this, this is not something I can > >>> change: Matroska uses nanosecond precision, and nobody has agreed on > >>> what the way forward for timestamps is. You will have to bring up such > >>> nanosecond-precision problems with the Matroska specification > >>> maintainers itself, which are currently seeking IETF standardization: > >>> https://github.com/ietf-wg-cellar/matroska-specification/issues/422 > >>>> > >> > >> Already did so (I am mkver). > >> > >>>>> + > >>>>> if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) > >>>>> put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); > >>>>> if (!(s->flags & AVFMT_FLAG_BITEXACT)) { > >>>>> @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext > *s) > >>>>> int64_t metadata_duration = get_metadata_duration(s); > >>>>> if (s->duration > 0) { > >>>>> - int64_t scaledDuration = av_rescale(s->duration, 1000, > >>>>> AV_TIME_BASE); > >>>>> + int64_t scaledDuration = av_rescale_q(s->duration, > >>>>> AV_TIME_BASE_Q, mkv->time_base); > >>>>> put_ebml_float(pb, MATROSKA_ID_DURATION, > scaledDuration); > >>>>> av_log(s, AV_LOG_DEBUG, "Write early duration from > >>>>> recording time = %" PRIu64 "\n", scaledDuration); > >>>>> } else if (metadata_duration > 0) { > >>>>> - int64_t scaledDuration = av_rescale(metadata_duration, > >>>>> 1000, AV_TIME_BASE); > >>>>> + int64_t scaledDuration = av_rescale_q(metadata_duration, > >>>>> AV_TIME_BASE_Q, mkv->time_base); > >>>>> put_ebml_float(pb, MATROSKA_ID_DURATION, > scaledDuration); > >>>>> av_log(s, AV_LOG_DEBUG, "Write early duration from > >>>>> metadata = %" PRIu64 "\n", scaledDuration); > >>>>> } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { > >>>>> @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext > *s) > >>>>> // after 4k and on a keyframe > >>>>> if (IS_SEEKABLE(pb, mkv)) { > >>>>> if (mkv->cluster_time_limit < 0) > >>>>> - mkv->cluster_time_limit = 5000; > >>>>> + mkv->cluster_time_limit = av_rescale_q(5000, > >>>>> (AVRational){1, 1000}, mkv->time_base); > >>>>> if (mkv->cluster_size_limit < 0) > >>>>> mkv->cluster_size_limit = 5 * 1024 * 1024; > >>>>> } else { > >>>>> if (mkv->cluster_time_limit < 0) > >>>>> - mkv->cluster_time_limit = 1000; > >>>>> + mkv->cluster_time_limit = av_rescale_q(1000, > >>>>> (AVRational){1, 1000}, mkv->time_base); > >>>> There are three places where you rescale this; it would be better if > you > >>>> did it unconditionally in one place (namely after this block here). > >>>> > >>> I did not want to cut into existing code too much, merely adjust it to > >>> the change without changing the main parts. > >>>>> if (mkv->cluster_size_limit < 0) > >>>>> mkv->cluster_size_limit = 32 * 1024; > >>>>> } > >>>>> @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) > >>>>> } else > >>>>> mkv->mode = MODE_MATROSKAv2; > >>>>> + // WebM requires a timestamp precision of 1ms. > >>>>> + if (mkv->mode == MODE_WEBM) { > >>>>> + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { > >>>>> + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp > >>>>> precision\n"); > >>>>> + return AVERROR(EINVAL); > >>>>> + } > >>>> (This could be put into the same block as the one just above that sets > >>>> the mode.) > >>> Fair point. > >>>>> + } > >>>>> + > >>>>> mkv->cur_audio_pkt = av_packet_alloc(); > >>>>> if (!mkv->cur_audio_pkt) > >>>>> return AVERROR(ENOMEM); > >>>>> @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) > >>>>> track->uid = mkv_get_uid(mkv->tracks, i, &c); > >>>>> } > >>>>> - // ms precision is the de-facto standard timescale for mkv > >>>>> files > >>>>> - avpriv_set_pts_info(st, 64, 1, 1000); > >>>>> + // Use user-defined timescale. > >>>>> + avpriv_set_pts_info(st, 64, mkv->time_base.num, > >>>>> mkv->time_base.den); > >>>>> if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) > { > >>>>> if (mkv->mode == MODE_WEBM) { > >>>>> @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) > >>>>> track->track_num_size = ebml_num_size(track->track_num); > >>>>> } > >>>>> + // Scale the configured cluster_time_limit. > >>>>> + if (mkv->cluster_time_limit >= 0) > >>>>> + mkv->cluster_time_limit = > >>>>> av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, > >>>>> mkv->time_base); > >>>>> + > >>>>> if (mkv->is_dash && nb_tracks != 1) > >>>>> return AVERROR(EINVAL); > >>>>> @@ -2826,6 +2844,7 @@ static const AVOption options[] = { > >>>>> { "infer", "For each track type, mark the first track of > >>>>> disposition default as default; if none exists, mark the first track > >>>>> as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, > >>>>> 0, FLAGS, "default_mode" }, > >>>>> { "infer_no_subs", "For each track type, mark the first track > >>>>> of disposition default as default; for audio and video: if none > >>>>> exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { > >>>>> .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, > >>>>> { "passthrough", "Use the disposition flag as-is", 0, > >>>>> AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, > >>>>> "default_mode" }, > >>>>> + { "timestamp_precision", "Timestamp precision to use for the > >>>>> entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = > >>>>> 0.001 }, 0.000000001, INT_MAX, FLAGS}, > >>>> Using a default value that is a valid value makes it impossible to > >>>> distinguish whether the user set a value at all; but I'd like to > >>>> preserve that possibility. > >>> Sorry, could you explain why you want to do this further? I see no code > >>> path that would even need to check if the user/caller has set > precision, > >>> if it's left at default we get the default of 1/1000 - just as previous > >>> FFmpeg versions. > >>>> > >> > >> The default value of 1/1000 is good for most use-cases, but there might > >> be scenarios where it is not (e.g. TrueHD packets have a duration < 1ms, > >> so different packets will end up with the same timestamp when using > >> 1/1000). If the user has set a timestamp precision, then that should be > >> honoured; but if not, then we might use a different value than 1/1000 > >> (in the future; I know that currently no code path that makes use of > >> this exists). > >> > >> > > Is there any problem to submit this flag while keeping the default value > to > > millisecond. So users of matroska muxer who want to increase the > precision > > can still do it. > > And the discussion of which default value should be could be done in > > another patch? > > > There is no such problem in general; yet there is a problem with this > patch, namely that it allows to choose timebases which are not a > multiple of the basic Matroska timebase of 1/1000000000. See above. > Thanks. So, if the option is taking a int64 to represent the timescale, instead of a floating point (so no need to convert from it), just having the option setting will be ok with the community? > > - Andreas > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >
Thierry Foucu: > On Mon, Jun 14, 2021 at 12:09 PM Andreas Rheinhardt < > andreas.rheinhardt@outlook.com> wrote: > >> Thierry Foucu: >>> On Fri, Jun 4, 2021 at 10:18 PM Andreas Rheinhardt < >>> andreas.rheinhardt@outlook.com> wrote: >>> >>>> Michael Fabian 'Xaymar' Dirks: >>>>> On 2021-05-24 22:15, Andreas Rheinhardt wrote: >>>>>> michael.dirks@xaymar.com: >>>>>>> From: Michael Fabian 'Xaymar' Dirks <michael.dirks@xaymar.com> >>>>>>> >>>>>>> Adds "timestamp_precision" to the available options for Matroska >>>> muxing. >>>>>>> The option enables users and developers to change the precision of >> the >>>>>>> time stamps in the Matroska container up to 1 nanosecond, which can >> aid >>>>>>> with the proper detection of constant and variable rate content. >>>>>>> >>>>>>> Work-around fix for: 259, 6406, 7927, 8909 and 9124. >>>>>>> >>>>>>> Signed-off-by: Michael Fabian 'Xaymar' Dirks < >> michael.dirks@xaymar.com >>>>> >>>>>>> --- >>>>>>> doc/muxers.texi | 8 ++++++++ >>>>>>> libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++------- >>>>>>> 2 files changed, 34 insertions(+), 7 deletions(-) >>>>>>> >>>>>>> diff --git a/doc/muxers.texi b/doc/muxers.texi >>>>>>> index e1c6ad0829..8655be94ff 100644 >>>>>>> --- a/doc/muxers.texi >>>>>>> +++ b/doc/muxers.texi >>>>>>> @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this >>>>>>> option does not flip the bitmap >>>>>>> which has to be done manually beforehand, e.g. by using the vflip >>>>>>> filter. >>>>>>> Default is @var{false} and indicates bitmap is stored top down. >>>>>>> +@item timestamp_precision >>>>>>> +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, >>>>>>> which can >>>>>>> +improve detection of constant rate content in demuxers. Note that >>>>>>> some poorly >>>>>>> +implemented demuxers may require a timestamp precision of 1 >>>>>>> millisecond, so >>>>>>> +increasing it past that point may result in playback issues. Higher >>>>>>> precision >>>>>>> +also reduces the maximum possible timestamp significantly. >>>>>> This should mention that a too high precision will increase container >>>>>> overhead. >>>>> Good point. >>>>>> >>>>>>> +Default is @var{1/1000} (1 millisecond). >>>>>>> + >>>>>>> @end table >>>>>>> @anchor{md5} >>>>>>> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c >>>>>>> index 186a25d920..1b911a648c 100644 >>>>>>> --- a/libavformat/matroskaenc.c >>>>>>> +++ b/libavformat/matroskaenc.c >>>>>>> @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { >>>>>>> int default_mode; >>>>>>> uint32_t segment_uid[4]; >>>>>>> + >>>>>>> + AVRational time_base; >>>>>>> } MatroskaMuxContext; >>>>>>> /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte >> uint, >>>>>>> @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) >>>>>>> const AVDictionaryEntry *tag; >>>>>>> int ret, i, version = 2; >>>>>>> int64_t creation_time; >>>>>>> + int64_t time_base = 1; >>>>>>> if (mkv->mode != MODE_WEBM || >>>>>>> av_dict_get(s->metadata, "stereo_mode", NULL, 0) || >>>>>>> @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext >> *s) >>>>>>> return ret; >>>>>>> pb = mkv->info.bc; >>>>>>> - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); >>>>>>> + time_base = av_rescale_q(time_base, mkv->time_base, >>>>>>> (AVRational){1, 1000000000}); >>>>>>> + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", >>>>>>> time_base); >>>>>>> + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); >>>>>> There is a serious problem here: mkv->time_base is the time base that >>>>>> the muxer uses; yet if mkv->time_base is not a multiple of >> 1/1000000000, >>>>>> then av_rescale_q will have to round a bit and then the demuxer will >>>>>> read a different time base, leading to a drift of all timestamps. This >>>>>> is not acceptable. >>>>> This issue is already present in the current version of FFmpeg. >>>> >>>> Matroska's timestamp imprecision currently lead to jitter, but not to >>>> drift (this of course presumes that one actually uses the timestamps >>>> as-is (instead of summing the durations)). But your patch as-is can lead >>>> to drift (when using a wrong timebase), because it adds a whole new type >>>> of error: One in which the demuxer and the muxer disagree about the >>>> timebase that is in use. Consider a user using 1/750000000 as timebase. >>>> 48kHz audio (with a timebase of 1/48000) can be converted accurately to >>>> it, so that the muxer receives precise timestamps. Yet the muxer >>>> actually writes that the timebase is 1/1000000000 and that is what a >>>> demuxer sees. This means that all timestamps (except those for which >>>> Matroska uses "absolute timestamps" (Matroska-speak for a time/duration >>>> that is always in 1/1000000000 like default duration)) are skewed as-if >>>> multiplied by 3/4. >>>> >>>>> Unfortunately even if you tell me this, this is not something I can >>>>> change: Matroska uses nanosecond precision, and nobody has agreed on >>>>> what the way forward for timestamps is. You will have to bring up such >>>>> nanosecond-precision problems with the Matroska specification >>>>> maintainers itself, which are currently seeking IETF standardization: >>>>> https://github.com/ietf-wg-cellar/matroska-specification/issues/422 >>>>>> >>>> >>>> Already did so (I am mkver). >>>> >>>>>>> + >>>>>>> if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) >>>>>>> put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); >>>>>>> if (!(s->flags & AVFMT_FLAG_BITEXACT)) { >>>>>>> @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext >> *s) >>>>>>> int64_t metadata_duration = get_metadata_duration(s); >>>>>>> if (s->duration > 0) { >>>>>>> - int64_t scaledDuration = av_rescale(s->duration, 1000, >>>>>>> AV_TIME_BASE); >>>>>>> + int64_t scaledDuration = av_rescale_q(s->duration, >>>>>>> AV_TIME_BASE_Q, mkv->time_base); >>>>>>> put_ebml_float(pb, MATROSKA_ID_DURATION, >> scaledDuration); >>>>>>> av_log(s, AV_LOG_DEBUG, "Write early duration from >>>>>>> recording time = %" PRIu64 "\n", scaledDuration); >>>>>>> } else if (metadata_duration > 0) { >>>>>>> - int64_t scaledDuration = av_rescale(metadata_duration, >>>>>>> 1000, AV_TIME_BASE); >>>>>>> + int64_t scaledDuration = av_rescale_q(metadata_duration, >>>>>>> AV_TIME_BASE_Q, mkv->time_base); >>>>>>> put_ebml_float(pb, MATROSKA_ID_DURATION, >> scaledDuration); >>>>>>> av_log(s, AV_LOG_DEBUG, "Write early duration from >>>>>>> metadata = %" PRIu64 "\n", scaledDuration); >>>>>>> } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { >>>>>>> @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext >> *s) >>>>>>> // after 4k and on a keyframe >>>>>>> if (IS_SEEKABLE(pb, mkv)) { >>>>>>> if (mkv->cluster_time_limit < 0) >>>>>>> - mkv->cluster_time_limit = 5000; >>>>>>> + mkv->cluster_time_limit = av_rescale_q(5000, >>>>>>> (AVRational){1, 1000}, mkv->time_base); >>>>>>> if (mkv->cluster_size_limit < 0) >>>>>>> mkv->cluster_size_limit = 5 * 1024 * 1024; >>>>>>> } else { >>>>>>> if (mkv->cluster_time_limit < 0) >>>>>>> - mkv->cluster_time_limit = 1000; >>>>>>> + mkv->cluster_time_limit = av_rescale_q(1000, >>>>>>> (AVRational){1, 1000}, mkv->time_base); >>>>>> There are three places where you rescale this; it would be better if >> you >>>>>> did it unconditionally in one place (namely after this block here). >>>>>> >>>>> I did not want to cut into existing code too much, merely adjust it to >>>>> the change without changing the main parts. >>>>>>> if (mkv->cluster_size_limit < 0) >>>>>>> mkv->cluster_size_limit = 32 * 1024; >>>>>>> } >>>>>>> @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) >>>>>>> } else >>>>>>> mkv->mode = MODE_MATROSKAv2; >>>>>>> + // WebM requires a timestamp precision of 1ms. >>>>>>> + if (mkv->mode == MODE_WEBM) { >>>>>>> + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { >>>>>>> + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp >>>>>>> precision\n"); >>>>>>> + return AVERROR(EINVAL); >>>>>>> + } >>>>>> (This could be put into the same block as the one just above that sets >>>>>> the mode.) >>>>> Fair point. >>>>>>> + } >>>>>>> + >>>>>>> mkv->cur_audio_pkt = av_packet_alloc(); >>>>>>> if (!mkv->cur_audio_pkt) >>>>>>> return AVERROR(ENOMEM); >>>>>>> @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) >>>>>>> track->uid = mkv_get_uid(mkv->tracks, i, &c); >>>>>>> } >>>>>>> - // ms precision is the de-facto standard timescale for mkv >>>>>>> files >>>>>>> - avpriv_set_pts_info(st, 64, 1, 1000); >>>>>>> + // Use user-defined timescale. >>>>>>> + avpriv_set_pts_info(st, 64, mkv->time_base.num, >>>>>>> mkv->time_base.den); >>>>>>> if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) >> { >>>>>>> if (mkv->mode == MODE_WEBM) { >>>>>>> @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) >>>>>>> track->track_num_size = ebml_num_size(track->track_num); >>>>>>> } >>>>>>> + // Scale the configured cluster_time_limit. >>>>>>> + if (mkv->cluster_time_limit >= 0) >>>>>>> + mkv->cluster_time_limit = >>>>>>> av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, >>>>>>> mkv->time_base); >>>>>>> + >>>>>>> if (mkv->is_dash && nb_tracks != 1) >>>>>>> return AVERROR(EINVAL); >>>>>>> @@ -2826,6 +2844,7 @@ static const AVOption options[] = { >>>>>>> { "infer", "For each track type, mark the first track of >>>>>>> disposition default as default; if none exists, mark the first track >>>>>>> as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, >>>>>>> 0, FLAGS, "default_mode" }, >>>>>>> { "infer_no_subs", "For each track type, mark the first track >>>>>>> of disposition default as default; for audio and video: if none >>>>>>> exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { >>>>>>> .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, >>>>>>> { "passthrough", "Use the disposition flag as-is", 0, >>>>>>> AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, >>>>>>> "default_mode" }, >>>>>>> + { "timestamp_precision", "Timestamp precision to use for the >>>>>>> entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = >>>>>>> 0.001 }, 0.000000001, INT_MAX, FLAGS}, >>>>>> Using a default value that is a valid value makes it impossible to >>>>>> distinguish whether the user set a value at all; but I'd like to >>>>>> preserve that possibility. >>>>> Sorry, could you explain why you want to do this further? I see no code >>>>> path that would even need to check if the user/caller has set >> precision, >>>>> if it's left at default we get the default of 1/1000 - just as previous >>>>> FFmpeg versions. >>>>>> >>>> >>>> The default value of 1/1000 is good for most use-cases, but there might >>>> be scenarios where it is not (e.g. TrueHD packets have a duration < 1ms, >>>> so different packets will end up with the same timestamp when using >>>> 1/1000). If the user has set a timestamp precision, then that should be >>>> honoured; but if not, then we might use a different value than 1/1000 >>>> (in the future; I know that currently no code path that makes use of >>>> this exists). >>>> >>>> >>> Is there any problem to submit this flag while keeping the default value >> to >>> millisecond. So users of matroska muxer who want to increase the >> precision >>> can still do it. >>> And the discussion of which default value should be could be done in >>> another patch? >>> >> There is no such problem in general; yet there is a problem with this >> patch, namely that it allows to choose timebases which are not a >> multiple of the basic Matroska timebase of 1/1000000000. See above. >> > > Thanks. So, if the option is taking a int64 to represent the timescale, > instead of a floating point (so no need to convert from it), just having > the option setting will be ok with the community? > Do you intend to represent the timescale as in int64_t / 1000000000 (with the numerator given by an option)? While this would indeed lead to the least code, it has a drawback: Should Matroska evolve to drop the hardcoded nanosecond precision, it would not be able to support this. - Andreas
diff --git a/doc/muxers.texi b/doc/muxers.texi index e1c6ad0829..8655be94ff 100644 --- a/doc/muxers.texi +++ b/doc/muxers.texi @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this option does not flip the bitmap which has to be done manually beforehand, e.g. by using the vflip filter. Default is @var{false} and indicates bitmap is stored top down. +@item timestamp_precision +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, which can +improve detection of constant rate content in demuxers. Note that some poorly +implemented demuxers may require a timestamp precision of 1 millisecond, so +increasing it past that point may result in playback issues. Higher precision +also reduces the maximum possible timestamp significantly. +Default is @var{1/1000} (1 millisecond). + @end table @anchor{md5} diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c index 186a25d920..1b911a648c 100644 --- a/libavformat/matroskaenc.c +++ b/libavformat/matroskaenc.c @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext { int default_mode; uint32_t segment_uid[4]; + + AVRational time_base; } MatroskaMuxContext; /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte uint, @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s) const AVDictionaryEntry *tag; int ret, i, version = 2; int64_t creation_time; + int64_t time_base = 1; if (mkv->mode != MODE_WEBM || av_dict_get(s->metadata, "stereo_mode", NULL, 0) || @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext *s) return ret; pb = mkv->info.bc; - put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000); + time_base = av_rescale_q(time_base, mkv->time_base, (AVRational){1, 1000000000}); + av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", time_base); + put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base); + if ((tag = av_dict_get(s->metadata, "title", NULL, 0))) put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value); if (!(s->flags & AVFMT_FLAG_BITEXACT)) { @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext *s) int64_t metadata_duration = get_metadata_duration(s); if (s->duration > 0) { - int64_t scaledDuration = av_rescale(s->duration, 1000, AV_TIME_BASE); + int64_t scaledDuration = av_rescale_q(s->duration, AV_TIME_BASE_Q, mkv->time_base); put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); av_log(s, AV_LOG_DEBUG, "Write early duration from recording time = %" PRIu64 "\n", scaledDuration); } else if (metadata_duration > 0) { - int64_t scaledDuration = av_rescale(metadata_duration, 1000, AV_TIME_BASE); + int64_t scaledDuration = av_rescale_q(metadata_duration, AV_TIME_BASE_Q, mkv->time_base); put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration); av_log(s, AV_LOG_DEBUG, "Write early duration from metadata = %" PRIu64 "\n", scaledDuration); } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) { @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext *s) // after 4k and on a keyframe if (IS_SEEKABLE(pb, mkv)) { if (mkv->cluster_time_limit < 0) - mkv->cluster_time_limit = 5000; + mkv->cluster_time_limit = av_rescale_q(5000, (AVRational){1, 1000}, mkv->time_base); if (mkv->cluster_size_limit < 0) mkv->cluster_size_limit = 5 * 1024 * 1024; } else { if (mkv->cluster_time_limit < 0) - mkv->cluster_time_limit = 1000; + mkv->cluster_time_limit = av_rescale_q(1000, (AVRational){1, 1000}, mkv->time_base); if (mkv->cluster_size_limit < 0) mkv->cluster_size_limit = 32 * 1024; } @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s) } else mkv->mode = MODE_MATROSKAv2; + // WebM requires a timestamp precision of 1ms. + if (mkv->mode == MODE_WEBM) { + if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) { + av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp precision\n"); + return AVERROR(EINVAL); + } + } + mkv->cur_audio_pkt = av_packet_alloc(); if (!mkv->cur_audio_pkt) return AVERROR(ENOMEM); @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s) track->uid = mkv_get_uid(mkv->tracks, i, &c); } - // ms precision is the de-facto standard timescale for mkv files - avpriv_set_pts_info(st, 64, 1, 1000); + // Use user-defined timescale. + avpriv_set_pts_info(st, 64, mkv->time_base.num, mkv->time_base.den); if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) { if (mkv->mode == MODE_WEBM) { @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s) track->track_num_size = ebml_num_size(track->track_num); } + // Scale the configured cluster_time_limit. + if (mkv->cluster_time_limit >= 0) + mkv->cluster_time_limit = av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, mkv->time_base); + if (mkv->is_dash && nb_tracks != 1) return AVERROR(EINVAL); @@ -2826,6 +2844,7 @@ static const AVOption options[] = { { "infer", "For each track type, mark the first track of disposition default as default; if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, 0, FLAGS, "default_mode" }, { "infer_no_subs", "For each track type, mark the first track of disposition default as default; for audio and video: if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" }, { "passthrough", "Use the disposition flag as-is", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, "default_mode" }, + { "timestamp_precision", "Timestamp precision to use for the entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = 0.001 }, 0.000000001, INT_MAX, FLAGS}, { NULL }, };