From patchwork Wed Nov  6 02:36:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: manuelyuan <manuelyuan@163.com>
X-Patchwork-Id: 16118
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id A0E76448A4F
	for <patchwork@ffaux-bg.ffmpeg.org>;
	Wed,  6 Nov 2019 04:36:22 +0200 (EET)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7A82A68B058;
	Wed,  6 Nov 2019 04:36:22 +0200 (EET)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from m13-60.163.com (m13-60.163.com [220.181.13.60])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4749868B04E
	for <ffmpeg-devel@ffmpeg.org>; Wed,  6 Nov 2019 04:36:15 +0200 (EET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=163.com;
	s=s110527; h=Date:From:Subject:MIME-Version:Message-ID; bh=ljOf1
	dpXpU23oYge0Xxk3HZsn0b3nnXhm05h4YCzny8=; b=XjM+G8KIdyQPt7fbov+gH
	fLxTjmuTrBt3AK+tFUsT8wPaZpDmjAS1gggHLKWdy3Kw2PXEMaGVWRSxyzE6yhMJ
	yQAhe14xaAVe8t7lsuFPBdJ4vbnx9uhHLlDM9RHJQnfe5F5/LuWKEgXAHM0npNN3
	ovJ0LKpMfUEsCMaFcaQSrA=
Received: from manuelyuan$163.com ( [14.17.22.33] ) by ajax-webmail-wmsvr60
	(Coremail) ; Wed, 6 Nov 2019 10:36:11 +0800 (CST)
X-Originating-IP: [14.17.22.33]
Date: Wed, 6 Nov 2019 10:36:11 +0800 (CST)
From: manuelyuan  <manuelyuan@163.com>
To: ffmpeg-devel@ffmpeg.org
X-Priority: 3
X-Mailer: Coremail Webmail Server Version XT5.0.10 build 20190724(ac680a23)
	Copyright (c) 2002-2019 www.mailtech.cn 163com
MIME-Version: 1.0
Message-ID: <6543c657.3c3b.16e3e91c7b6.Coremail.manuelyuan@163.com>
X-Coremail-Locale: zh_CN
X-CM-TRANSID: PMGowADnHYmbMcJdwjsRAQ--.12714W
X-CM-SenderInfo: 5pdq3vpo1xt0i6rwjhhfrp/1tbiSg9lslPAImd3cwABsB
X-Coremail-Antispam: 1U5529EdanIXcx71UUUUU7vcSsGvfC2KfnxnUU==
Subject: [FFmpeg-devel] [PATCH] lavf/movenc: Replace dts by pts to calculate
	duration
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

From: Mengyang Yuan <manuelyuan@163.com>

In this case, the input video is of dynamic frame rate and we don't want to
duplicate or drop frames, but the output video duration calculated by DTS is
incorrect, I solved it by using PTS.
There are many UGC videos with dynamic frame rates, which are represented by
PTS jumps. After transcoding with ffmpeg -vsync 0 or -vsync 2, the output
video duration becomes longer.By reading the code of x264/encoder/encoder.c,
I found that in order to predict the B frame, x264 needs to ensure that there
are enough reference frames when DTS = 0, so the DTS of these reference frames
will subtract the cache time. However, the cache time includes the part of PTS
jumps, which results in the abnormal small DTS.

Signed-off-by: Mengyang Yuan <manuelyuan@163.com>
---
 libavformat/movenc.c | 23 ++++++++++++++---------
 libavformat/movenc.h |  2 ++
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index 715bec1c2f..206aa48d8c 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -1015,7 +1015,7 @@ static int get_cluster_duration(MOVTrack *track, int cluster_idx)
         return 0;
 
     if (cluster_idx + 1 == track->entry)
-        next_dts = track->track_duration + track->start_dts;
+        next_dts = track->end_dts;
     else
         next_dts = track->cluster[cluster_idx + 1].dts;
 
@@ -5149,8 +5149,7 @@ static int mov_flush_fragment(AVFormatContext *s, int force)
         mov->mdat_size = 0;
         for (i = 0; i < mov->nb_streams; i++) {
             if (mov->tracks[i].entry)
-                mov->tracks[i].frag_start += mov->tracks[i].start_dts +
-                                             mov->tracks[i].track_duration -
+                mov->tracks[i].frag_start += mov->tracks[i].end_dts -
                                              mov->tracks[i].cluster[0].dts;
             mov->tracks[i].entry = 0;
             mov->tracks[i].end_reliable = 0;
@@ -5208,7 +5207,7 @@ static int mov_flush_fragment(AVFormatContext *s, int force)
         int64_t duration = 0;
 
         if (track->entry)
-            duration = track->start_dts + track->track_duration -
+            duration = track->end_dts -
                        track->cluster[0].dts;
         if (mov->flags & FF_MOV_FLAG_SEPARATE_MOOF) {
             if (!track->mdat_buf)
@@ -5281,7 +5280,7 @@ static int check_pkt(AVFormatContext *s, AVPacket *pkt)
         ref = trk->cluster[trk->entry - 1].dts;
     } else if (   trk->start_dts != AV_NOPTS_VALUE
                && !trk->frag_discont) {
-        ref = trk->start_dts + trk->track_duration;
+        ref = trk->end_dts;
     } else
         ref = pkt->dts; // Skip tests for the first packet
 
@@ -5494,7 +5493,7 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt)
              * of the last packet of the previous fragment based on track_duration,
              * which might not exactly match our dts. Therefore adjust the dts
              * of this packet to be what the previous packets duration implies. */
-            trk->cluster[trk->entry].dts = trk->start_dts + trk->track_duration;
+            trk->cluster[trk->entry].dts = trk->end_dts;
             /* We also may have written the pts and the corresponding duration
              * in sidx/tfrf/tfxd tags; make sure the sidx pts and duration match up with
              * the next fragment. This means the cts of the first sample must
@@ -5546,13 +5545,17 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt)
                    "this case.\n",
                    pkt->stream_index, pkt->dts);
     }
-    trk->track_duration = pkt->dts - trk->start_dts + pkt->duration;
-    trk->last_sample_is_subtitle_end = 0;
-
     if (pkt->pts == AV_NOPTS_VALUE) {
         av_log(s, AV_LOG_WARNING, "pts has no value\n");
         pkt->pts = pkt->dts;
     }
+    if (trk->start_pts == AV_NOPTS_VALUE) {
+        trk->start_pts = pkt->pts;
+    }
+    trk->track_duration = FFMAX(pkt->pts - trk->start_pts + pkt->duration, trk->track_duration);
+    trk->end_dts = pkt->dts + pkt->duration;
+    trk->last_sample_is_subtitle_end = 0;
+
     if (pkt->dts != pkt->pts)
         trk->flags |= MOV_TRACK_CTTS;
     trk->cluster[trk->entry].cts   = pkt->pts - pkt->dts;
@@ -6295,7 +6298,9 @@ static int mov_init(AVFormatContext *s)
          * this is updated. */
         track->hint_track = -1;
         track->start_dts  = AV_NOPTS_VALUE;
+        track->start_pts  = AV_NOPTS_VALUE;
         track->start_cts  = AV_NOPTS_VALUE;
+        track->end_dts    = AV_NOPTS_VALUE;
         track->end_pts    = AV_NOPTS_VALUE;
         track->dts_shift  = AV_NOPTS_VALUE;
         if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
diff --git a/libavformat/movenc.h b/libavformat/movenc.h
index 68d6f23a5a..ddad2631d7 100644
--- a/libavformat/movenc.h
+++ b/libavformat/movenc.h
@@ -116,7 +116,9 @@ typedef struct MOVTrack {
     uint32_t    tref_tag;
     int         tref_id; ///< trackID of the referenced track
     int64_t     start_dts;
+    int64_t     start_pts;
     int64_t     start_cts;
+    int64_t     end_dts;
     int64_t     end_pts;
     int         end_reliable;
     int64_t     dts_shift;