[FFmpeg-devel] Match source video timestamp

Message ID AM4PR0401MB17310E228366A1A1CCFE9B7EF5320@AM4PR0401MB1731.eurprd04.prod.outlook.com
State Superseded
Commit Message

Eran Kornblau March 28, 2017, 12:12 p.m. UTC
Hi all,

I'm trying to transcode a video file (MP4, H.264 baseline, VFR, AAC) with ffmpeg, and I would like the
frame timestamps in the transcoded file to match the source video exactly, so that both renditions can
play together adaptively without issues (in my experience, DASH is very sensitive to timestamp continuity;
a difference as small as a couple of frames between renditions causes playback to fail).

After trying all sorts of parameters (-vsync 0/2, -copytb, with and without -r, -video_track_timescale)
without success, I found that the timestamps mismatch because ffmpeg uses 1/frame_rate as the time base
of the encoder. With CFR this is probably OK, but with VFR it quantizes the timestamps, so they come out
different from the source.
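
To illustrate the quantization, here is a minimal standalone sketch. It does not use libavutil: rescale_ts() is a simplified stand-in for av_rescale_q() (round to nearest, non-negative input only), and the 1/90000 source time base and 1/25 encoder time base are assumed example values, not taken from my actual file.

```c
#include <assert.h>
#include <stdint.h>

/* Rescale ts from time base src_num/src_den to dst_num/dst_den, rounding
 * to nearest. Simplified stand-in for libavutil's av_rescale_q(): assumes
 * ts >= 0 and values small enough that the products fit in int64_t. */
static int64_t rescale_ts(int64_t ts, int src_num, int src_den,
                          int dst_num, int dst_den)
{
    int64_t num = (int64_t)src_num * dst_den;
    int64_t den = (int64_t)src_den * dst_num;

    assert(ts >= 0 && num > 0 && den > 0);
    return (ts * num + den / 2) / den;
}

/* Convert a source timestamp (in 1/src_den units) to the encoder time base
 * (1/enc_den) and back, returning the drift in source ticks. A non-zero
 * result means the encoder time base could not represent the timestamp. */
static int64_t roundtrip_error(int64_t ts, int src_den, int enc_den)
{
    int64_t enc  = rescale_ts(ts,  1, src_den, 1, enc_den);
    int64_t back = rescale_ts(enc, 1, enc_den, 1, src_den);

    return back - ts;
}
```

With these assumed values, a frame at tick 3000 (33.3 ms) lands on encoder tick 1 and comes back as 3600, a drift of 600 ticks, while roundtrip_error(3000, 90000, 90000) is 0 because the encoder time base equals the source one.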

Just for testing, I made this change:
root@pa-front-vod-stg2 FFmpeg# git diff

With it I am able to get a perfect match (an identical stts atom) between the source and transcoded videos.
This is the command line I'm using (the -r is probably meaningless):
ffmpeg -threads 1 -i input.mp4 -c:v libx264  -subq 2 -qcomp 0.6 -qmin 10 -qmax 50 -qdiff 4 -coder 0 -x264opts stitchable -vprofile baseline -force_key_frames source -pix_fmt yuv420p -b:v 400k -s 640x480 -r 25.174 -g 86400 -aspect 640:480 -c:a copy -map_chapters -1 -map_metadata -1  -f mp4 -flags +loop+mv4 -cmp 256 -partitions +parti4x4+partp8x8+partb8x8 -trellis 1 -refs 1 -me_range 16 -keyint_min 20 -sc_threshold 0 -i_qfactor 0.71 -bt 100k -maxrate 400k -bufsize 1200k -rc_eq 'blurCplx^(1-qComp)' -level 30  -vsync 2 -threads 4  -y output.mp4

I'm thinking about adding options for this, -video_timescale / -audio_timescale (following the convention of
movenc's video_track_timescale), that would accept these values:

1. 0 (default): the existing behavior, 1/frame rate for video and 1/sampling rate for audio

2. -1: match the input stream (as in the test patch)

3. >0: a fixed value (e.g. passing 1000 uses a time base of 1/1000)
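
The three cases could be sketched roughly as follows for video. This is only an illustration of the proposed semantics, not code from ffmpeg.c: the Rational struct is a stand-in for AVRational, and pick_video_time_base() and its parameters are hypothetical names.

```c
#include <assert.h>

/* Minimal rational type standing in for libavutil's AVRational. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Hypothetical helper: choose the video encoder time base from the
 * proposed -video_timescale value.
 *    0  -> 1/frame_rate (existing behavior)
 *   -1  -> copy the input stream time base
 *   >0  -> fixed 1/timescale */
static Rational pick_video_time_base(int timescale, Rational frame_rate,
                                     Rational input_tb)
{
    /* Values below -1 are not defined by the proposal. */
    assert(timescale >= -1);

    if (timescale > 0)
        return (Rational){ 1, timescale };
    if (timescale == -1)
        return input_tb;

    /* Default: invert the frame rate, as av_inv_q() does in the current code. */
    return (Rational){ frame_rate.den, frame_rate.num };
}
```

For example, with a 25 fps output and an input time base of 1/90000, the values 0, -1 and 1000 would give 1/25, 1/90000 and 1/1000 respectively.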

Does this make sense? Any other ideas for solving this?

Thank you!

diff --git a/ffmpeg.c b/ffmpeg.c
index 3b91710..9cba0d5 100644
--- a/ffmpeg.c
+++ b/ffmpeg.c
@@ -3351,7 +3351,7 @@  static int transcode_init(void)
                 enc_ctx->time_base      = (AVRational){ 1, enc_ctx->sample_rate };
             case AVMEDIA_TYPE_VIDEO:
-                enc_ctx->time_base = av_inv_q(ost->frame_rate);
+                enc_ctx->time_base = ist->st->time_base; //av_inv_q(ost->frame_rate);
                 if (!(enc_ctx->time_base.num && enc_ctx->time_base.den))
                     enc_ctx->time_base = ost->filter->filter->inputs[0]->time_base;
                 if (   av_q2d(enc_ctx->time_base) < 0.001 && video_sync_method != VSYNC_PASSTHROUGH