From patchwork Wed Sep 14 20:57:29 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Niedermayer <michael@niedermayer.cc>
X-Patchwork-Id: 573
Delivered-To: ffmpegpatchwork@gmail.com
Received: by 10.103.140.134 with SMTP id o128csp2857100vsd;
	Wed, 14 Sep 2016 13:57:51 -0700 (PDT)
X-Received: by 10.28.203.142 with SMTP id b136mr1222536wmg.36.1473886671225;
	Wed, 14 Sep 2016 13:57:51 -0700 (PDT)
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
	by mx.google.com with ESMTP id
	a63si28517308wmi.22.2016.09.14.13.57.48;
	Wed, 14 Sep 2016 13:57:51 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
	designates 79.124.17.100 as permitted sender)
	client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
	spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
	designates 79.124.17.100 as permitted sender)
	smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4F913689DC4;
	Wed, 14 Sep 2016 23:57:33 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net
	[217.70.183.195])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id E53C1689BC3
	for <ffmpeg-devel@ffmpeg.org>; Wed, 14 Sep 2016 23:57:26 +0300 (EEST)
Received: from mfilter1-d.gandi.net (mfilter1-d.gandi.net [217.70.178.130])
	by relay3-d.mail.gandi.net (Postfix) with ESMTP id 59E02A80D0
	for <ffmpeg-devel@ffmpeg.org>; Wed, 14 Sep 2016 22:57:39 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter1-d.gandi.net
Received: from relay3-d.mail.gandi.net ([IPv6:::ffff:217.70.183.195])
	by mfilter1-d.gandi.net (mfilter1-d.gandi.net [::ffff:10.0.15.180])
	(amavisd-new, port 10024)
	with ESMTP id xuyqWLeR-M4N for <ffmpeg-devel@ffmpeg.org>;
	Wed, 14 Sep 2016 22:57:37 +0200 (CEST)
X-Originating-IP: 213.47.41.20
Received: from localhost (213-47-41-20.cable.dynamic.surfer.at
	[213.47.41.20]) (Authenticated sender: michael@niedermayer.cc)
	by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id CAB7EA80D7
	for <ffmpeg-devel@ffmpeg.org>; Wed, 14 Sep 2016 22:57:36 +0200 (CEST)
Date: Wed, 14 Sep 2016 22:57:29 +0200
From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Message-ID: <20160914205729.GS4975@nb4>
References: 
 <CAGD_KHe6prnc6X6puNcAAfHTcxhaNFYQY30Q41w48z-XvuyHUA@mail.gmail.com>
	<20160828101015.GY5460@nb4>
	<CAGD_KHfgs54s5ji3seSZoL0MXjUVq--wZbim2rt+-T7Trjtj9w@mail.gmail.com>
	<20160831122314.GX4692@nb4>
	<CAGD_KHdC2EyVwBMV=_eWzPpsHoKGqgDnbagwVk63GWUgHYhUeg@mail.gmail.com>
	<20160902225604.GA4692@nb4>
	<CAGD_KHf4w6hmvSWzHdqeLawTjYdnth6YgBvvpzfb=yDweTvFZQ@mail.gmail.com>
	<20160903234812.GK4692@nb4>
	<CAGD_KHdbkiFiYwawnkrzYMAy6wzkYO80Dk_A8mfdtUEJEQny9Q@mail.gmail.com>
	<CAGD_KHck=mBWL0bbVzHHTcbBWruNw8eaeDOAqQENHR767FYSww@mail.gmail.com>
MIME-Version: 1.0
In-Reply-To: 
 <CAGD_KHck=mBWL0bbVzHHTcbBWruNw8eaeDOAqQENHR767FYSww@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [FFmpeg-devel] [PATCH 4/4] lavf/mov: Add support for edit list
	parsing.
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <http://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <http://ffmpeg.org/pipermail/ffmpeg-devel/>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <http://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

On Wed, Sep 14, 2016 at 12:20:52AM -0700, Sasi Inguva wrote:
> On Tue, Sep 13, 2016 at 4:39 PM, Sasi Inguva <isasi@google.com> wrote:
> 
> > Sorry for the very late reply. I was busy with other things.
> >
> > On Sat, Sep 3, 2016 at 4:48 PM, Michael Niedermayer <
> > michael@niedermayer.cc> wrote:
> >
> >> On Sat, Sep 03, 2016 at 12:06:39PM -0700, Sasi Inguva wrote:
> >> > Hi Michael,
> >> >
> >> > In fact from audacity I see that out-ingu.wav out-mp3.wav are almost
> >> > equivalent,
> >>
> >> They do not match. (and that is alot more vissible if you scale the
> >> time axis up a bit)
> >>
> >> You also dont use the existing API for handling initial padding/skip
> >> And you didnt reply to this concern
> >> its probably not that hard to fix that ...
> >> instead of droping just at packet granularity you would set the stuff
> >> for droping at sample granularity (too)
> >>
> >
> > Yes. Looking at it more closely now, they don't matrch exactly and this is
> > because as you said, number of samples to drop is not exactly multiple of
> > number of packets. In that case we need to partially discard some samples
> > of a packet. This can be done by using AV_PKT_SIDE_DATA_SKIP_SAMPLES.  I
> > can change the code to use this packet side data instead of discard packet
> > flag, if it is ok.

ok


> >
> >
> >>
> >> also maybe you missed my question about your oppinion on the correct
> >> timestamp output:
> >> "My first question is, entirely independant of the implementation from
> >>  the patches. What is the correct output ? (my primary focus is on
> >>  the timestamps)"
> >>
> >> Iam curious because to me the timestamps of the dropped packets look
> >> wrong and id like to know your oppinion about this. Is this a
> >> implementation issue (if so please explain) or is there some reason
> >> independant of the implementation  (if so please explain too)
> >>
> >
> > The correct output according to me should be - we should show exactly the
> > same presentation timestmap indicated by the MOV stts, ctts atoms, for the
> > samples that fall inside the edits. As long as the PTS is according to what
> > the edit and stts,ctts atoms say. I don't really have to worry about what
> > the decoding timestamp for those packets should be (DTS might as well be
> > AV_NOPTS_VALUE for all packets) .
> >
> > In the current implementation, we directly assign the timestamp in the
> > AVIndex to  pkt->dts . http://git.videolan.org/?p=ffmpeg.git;a=blob;f=
> > libavformat/mov.c;h=6e80b93271a4f998af6ba1af795d7d7c5d67f5a1;hb=HEAD#l5332
> > . The implementation of the edit list code is such that, it needs to be
> > filled with "linear monotonically increasing timestamps for the
> > non-discarded packets " to have the samples denoted by the editlists to
> > have linear monotonically increasing timestamps, in essence, to avoid a big
> > gap between the presentation of the end of first edit list and the start of
> > next edit list because of the discarded packets.
> >
> Hence while parsing a new edit list we bump down the index entry timestamp
> > to (frame_duration + last non-discarded packet from previous edit list ) .
> > And that's why we get non-monotonic timestamps as a whole in the
> > AVIndexEntry .
> >
> >  I think my wording is confusing here. Just clarifying . For one editlist/
> within one editlist, the AVIndexEntry samples pertaining to that edit, need
> to be filled with "monotonic timestamps which increase in steps of their
> corresponding 'stts' atom entries ", to achieve the correct presentation
> timestamp for those samples.
> 
>  When we start parsing the next edit list entry, and start adding index
> entries to AVIndexEntry we need to avoid a big gap between the presentation
> of the end of first edit and the start of next edit, that might happen
> because of the DISCARD packets added to the AVIndexEntry while parsing the
> first edit. Hence we start adding index entries from a  bumped down value
> corresponding to (frame_duration + last non-discarded packet from previous
> edit ) . And that's why we get non-monotonic timestamps as a whole in the
> AVIndexEntry .
> 
> 
> > Let us take an example of a file with two edit lists. First edit list
> > ending on a B frame. This is what ffprobe -show_packet -pf compact looks
> > like . Where 'D' stands for discarded frame. ( I've attached a 6th patch to
> > ffprobe to achieve this formatting ).
> > ./ffprobe -show_packets -print_format compact mov-2elist-bpyramid-1elist-
> > ends-on-bref.mov
> > .....
> > packet|codec_type=video|stream_index=0|pts=10752|pts_
> > time=0.875000|dts=8192|dts_time=0.666667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=1281|pos=39546|flags=__ - Pframe
> > packet|codec_type=video|stream_index=0|pts=9728|pts_
> > time=0.791667|dts=8704|dts_time=0.708333|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=749|pos=40827|flags=__ - B pyramidal
> > ref
> > packet|codec_type=video|stream_index=0|pts=9216|pts_
> > time=0.750000|dts=9216|dts_time=0.750000|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=487|pos=41576|flags=__ - B
> > packet|codec_type=video|stream_index=0|pts=10240|pts_
> > time=0.833333|dts=9728|dts_time=0.791667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=486|pos=42063|flags=__ - B
> > packet|codec_type=video|stream_index=0|pts=12800|pts_
> > time=1.041667|dts=10240|dts_time=0.833333|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=3455|pos=42549|flags=_D - P
> > packet|codec_type=video|stream_index=0|pts=11776|pts_
> > time=0.958333|dts=10752|dts_time=0.875000|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=848|pos=46004|flags=_D - B pyramidal
> > ref
> > packet|codec_type=video|stream_index=0|pts=11264|pts_
> > time=0.916667|dts=11264|dts_time=0.916667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=655|pos=46852|flags=__ - B
> > packet|codec_type=video|stream_index=0|pts=12288|pts_
> > time=1.000000|dts=11776|dts_time=0.958333|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=901|pos=47507|flags=_D - B
> > packet|codec_type=video|stream_index=0|pts=13312|pts_
> > time=1.083333|dts=12288|dts_time=1.000000|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=5112|pos=48408|flags=KD  - end 1st edit
> > packet|codec_type=video|stream_index=0|pts=11772|pts_
> > time=0.958008|dts=10748|dts_time=0.874674|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=5112|pos=48408|flags=K_ - start 2nd
> > edit
> > packet|codec_type=video|stream_index=0|pts=13820|pts_
> > time=1.124674|dts=11260|dts_time=0.916341|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=1179|pos=53520|flags=__
> > .....
> >
> > The implementation basically increases dts by the amounts that stts atom
> > indicates, and as you can see after the end of the first edit list and
> > start of the next edit list, it bumps down the dts from 12288 to 10650 to
> > make the pts linearly increasing from the last non-discarded frame 11264 ->
> > 11772 .
> >
> > As I understand from DTS semantics in ffmpeg we have two constraints on
> > DTS.
> > i) DTS should be monotonically increasing  ii) For every packet PTS >= DTS
> > should be true.
> >
> >  Just reordering PTS to DTS doesn't work either. For example, here is the
> > DTS  I get from "ffprobe -show_packets " when I do "pkt->dts =
> > AV_NOPTS_VALUE" for all packets in mov_read_packet function
> > http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/mov.c;h=
> > 6e80b93271a4f998af6ba1af795d7d7c5d67f5a1;hb=HEAD#l5332  . Ffmpeg should
> > use update_dts_from_pts function to reoder pts to dts -
> > http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/utils.c;h=
> > 76cbff4ef67937499a7c113dcaacb88389e6015e;hb=HEAD#l1005 .
> >
> > ./ffprobe -show_packets -print_format compact mov-2elist-bpyramid-1elist-
> > ends-on-bref.mov
> > packet|codec_type=video|stream_index=0|pts=0|pts_time=
> > 0.000000|dts=N/A|dts_time=N/A|duration=512|duration_time=0.
> > 041667|convergence_duration=N/A|convergence_duration_time=N/
> > A|size=5683|pos=36|flags=K_
> > packet|codec_type=video|stream_index=0|pts=2048|pts_
> > time=0.166667|dts=N/A|dts_time=N/A|duration=512|duration_time=0.041667|
> > convergence_duration=N/A|convergence_duration_time=N/A|
> > size=4260|pos=5719|flags=__
> > packet|codec_type=video|stream_index=0|pts=1024|pts_
> > time=0.083333|dts=0|dts_time=0.000000|duration=512|duration_time=0.041667|
> > convergence_duration=N/A|convergence_duration_time=N/A|
> > size=3098|pos=9979|flags=__
> > packet|codec_type=video|stream_index=0|pts=512|pts_
> > time=0.041667|dts=512|dts_time=0.041667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=786|pos=13077|flags=__
> > ....
> > packet|codec_type=video|stream_index=0|pts=10752|pts_
> > time=0.875000|dts=8192|dts_time=0.666667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=1281|pos=39546|flags=__
> > packet|codec_type=video|stream_index=0|pts=9728|pts_
> > time=0.791667|dts=8704|dts_time=0.708333|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=749|pos=40827|flags=__
> > packet|codec_type=video|stream_index=0|pts=9216|pts_
> > time=0.750000|dts=9216|dts_time=0.750000|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=487|pos=41576|flags=__
> > packet|codec_type=video|stream_index=0|pts=10240|pts_
> > time=0.833333|dts=9728|dts_time=0.791667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=486|pos=42063|flags=__
> > packet|codec_type=video|stream_index=0|pts=12800|pts_
> > time=1.041667|dts=10240|dts_time=0.833333|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=3455|pos=42549|flags=_D
> > packet|codec_type=video|stream_index=0|pts=11776|pts_
> > time=0.958333|dts=10752|dts_time=0.875000|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=848|pos=46004|flags=_D
> > packet|codec_type=video|stream_index=0|pts=11264|pts_
> > time=0.916667|dts=11264|dts_time=0.916667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=655|pos=46852|flags=__
> > packet|codec_type=video|stream_index=0|pts=12288|pts_
> > time=1.000000|dts=11776|dts_time=0.958333|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=901|pos=47507|flags=_D
> > packet|codec_type=video|stream_index=0|pts=13312|pts_
> > time=1.083333|dts=12288|dts_time=1.000000|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=5112|pos=48408|flags=KD
> > packet|codec_type=video|stream_index=0|pts=11772|pts_
> > time=0.958008|dts=11772|dts_time=0.958008|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=5112|pos=48408|flags=K_
> > packet|codec_type=video|stream_index=0|pts=13820|pts_
> > time=1.124674|dts=12800|dts_time=1.041667|duration=512|
> > duration_time=0.041667|convergence_duration=N/A|
> > convergence_duration_time=N/A|size=1179|pos=53520|flags=__
> > ....
> > As we can see the dts are still non monotonic.
> >
> > The problem here is that in case of edit lists, we are trying to make the
> > PTS "not grow" when the packets are being discarded. If we look at the
> > non-discarded packets in presentation order, in the above example the PTS
> > is approx. linearly growing 10752, 11264, 11772, with a difference of
> > frame_duration=512 approx. However there are 3 discarded frames that we are
> > outputting in between. So, to grow DTS monotonically linearly we need to
> > grow DTS at a higher frame rate i.e. smaller frame duration.
> >

> > If we want to make DTS  satisfy the two constraints described above , I
> > can recompute the DTS based on an assumed constant frame rate, and shift
> > the DTS so that PTS > DTS always, but this solution wont work for videos
> > with variable frame rates.
> > So in short , these are the options available I see for the DTS
> > i) Keep the change as it is , with non monotonic DTS
> > ii) Mark DTS=AV_NOPTS_VALUE for discarded frames. Maybe it is right
> > semantic to have DTS=N/A for some packets. I don't know.
> > iii) Recompute the DTS as stated above.

I guess i or ii are the least evil
staying with i and changing if it causes issues is probably least work


> >
> > Attaching all the 6 patches and the test file.

the mov patch doesnt apply cleanly, 1 patch per mail is preferred as
well

the ffprobe patch breaks fate
 Strings Metadata|8
-video|0|37|1.480000|34|1.360000|1|0.040000|N/A|N/A|24786|211456|K|1
+video|0|37|1.480000|34|1.360000|1|0.040000|N/A|N/A|24786|211456|K_|1
 Strings Metadata|8


also the mov patch breaks fate:

Test copy-trac236 failed. Look at tests/data/fate/copy-trac236.err for details.
make: *** [fate-copy-trac236] Error 1

either way, new fate samples uploaded

Thanks

[...]

--- ./tests/ref/fate/copy-trac236       2016-09-14 15:56:10.845886847 +0200
+++ tests/data/fate/copy-trac236        2016-09-14 22:44:44.494403284 +0200
@@ -1,5 +1,5 @@
-9b95afdb39b426a33bc962889f820aed *tests/data/fate/copy-trac236.mov
-630802 tests/data/fate/copy-trac236.mov
+d6e3d97b522ce881ed29c5da74cc7e63 *tests/data/fate/copy-trac236.mov
+630810 tests/data/fate/copy-trac236.mov
 #tb 0: 100/2997
 #media_type 0: video
 #codec_id 0: rawvideo