From patchwork Wed Sep 14 20:57:29 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Niedermayer X-Patchwork-Id: 573 Delivered-To: ffmpegpatchwork@gmail.com Received: by 10.103.140.134 with SMTP id o128csp2857100vsd; Wed, 14 Sep 2016 13:57:51 -0700 (PDT) X-Received: by 10.28.203.142 with SMTP id b136mr1222536wmg.36.1473886671225; Wed, 14 Sep 2016 13:57:51 -0700 (PDT) Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id a63si28517308wmi.22.2016.09.14.13.57.48; Wed, 14 Sep 2016 13:57:51 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4F913689DC4; Wed, 14 Sep 2016 23:57:33 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [217.70.183.195]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id E53C1689BC3 for ; Wed, 14 Sep 2016 23:57:26 +0300 (EEST) Received: from mfilter1-d.gandi.net (mfilter1-d.gandi.net [217.70.178.130]) by relay3-d.mail.gandi.net (Postfix) with ESMTP id 59E02A80D0 for ; Wed, 14 Sep 2016 22:57:39 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at mfilter1-d.gandi.net Received: from relay3-d.mail.gandi.net ([IPv6:::ffff:217.70.183.195]) by mfilter1-d.gandi.net (mfilter1-d.gandi.net [::ffff:10.0.15.180]) (amavisd-new, port 10024) with ESMTP id xuyqWLeR-M4N for ; Wed, 14 Sep 2016 22:57:37 +0200 (CEST) X-Originating-IP: 213.47.41.20 Received: from localhost (213-47-41-20.cable.dynamic.surfer.at [213.47.41.20]) (Authenticated sender: michael@niedermayer.cc) by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id CAB7EA80D7 for ; Wed, 14 Sep 2016 22:57:36 +0200 (CEST) Date: Wed, 14 Sep 2016 22:57:29 +0200 From: Michael Niedermayer To: FFmpeg development discussions and patches Message-ID: <20160914205729.GS4975@nb4> References: <20160828101015.GY5460@nb4> <20160831122314.GX4692@nb4> <20160902225604.GA4692@nb4> <20160903234812.GK4692@nb4> MIME-Version: 1.0 In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [FFmpeg-devel] [PATCH 4/4] lavf/mov: Add support for edit list parsing. X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" On Wed, Sep 14, 2016 at 12:20:52AM -0700, Sasi Inguva wrote: > On Tue, Sep 13, 2016 at 4:39 PM, Sasi Inguva wrote: > > > Sorry for the very late reply. I was busy with other things. > > > > On Sat, Sep 3, 2016 at 4:48 PM, Michael Niedermayer < > > michael@niedermayer.cc> wrote: > > > >> On Sat, Sep 03, 2016 at 12:06:39PM -0700, Sasi Inguva wrote: > >> > Hi Michael, > >> > > >> > In fact from audacity I see that out-ingu.wav out-mp3.wav are almost > >> > equivalent, > >> > >> They do not match. (and that is alot more vissible if you scale the > >> time axis up a bit) > >> > >> You also dont use the existing API for handling initial padding/skip > >> And you didnt reply to this concern > >> its probably not that hard to fix that ... > >> instead of droping just at packet granularity you would set the stuff > >> for droping at sample granularity (too) > >> > > > > Yes. Looking at it more closely now, they don't matrch exactly and this is > > because as you said, number of samples to drop is not exactly multiple of > > number of packets. In that case we need to partially discard some samples > > of a packet. This can be done by using AV_PKT_SIDE_DATA_SKIP_SAMPLES. I > > can change the code to use this packet side data instead of discard packet > > flag, if it is ok. ok > > > > > >> > >> also maybe you missed my question about your oppinion on the correct > >> timestamp output: > >> "My first question is, entirely independant of the implementation from > >> the patches. What is the correct output ? (my primary focus is on > >> the timestamps)" > >> > >> Iam curious because to me the timestamps of the dropped packets look > >> wrong and id like to know your oppinion about this. Is this a > >> implementation issue (if so please explain) or is there some reason > >> independant of the implementation (if so please explain too) > >> > > > > The correct output according to me should be - we should show exactly the > > same presentation timestmap indicated by the MOV stts, ctts atoms, for the > > samples that fall inside the edits. As long as the PTS is according to what > > the edit and stts,ctts atoms say. I don't really have to worry about what > > the decoding timestamp for those packets should be (DTS might as well be > > AV_NOPTS_VALUE for all packets) . > > > > In the current implementation, we directly assign the timestamp in the > > AVIndex to pkt->dts . http://git.videolan.org/?p=ffmpeg.git;a=blob;f= > > libavformat/mov.c;h=6e80b93271a4f998af6ba1af795d7d7c5d67f5a1;hb=HEAD#l5332 > > . The implementation of the edit list code is such that, it needs to be > > filled with "linear monotonically increasing timestamps for the > > non-discarded packets " to have the samples denoted by the editlists to > > have linear monotonically increasing timestamps, in essence, to avoid a big > > gap between the presentation of the end of first edit list and the start of > > next edit list because of the discarded packets. > > > Hence while parsing a new edit list we bump down the index entry timestamp > > to (frame_duration + last non-discarded packet from previous edit list ) . > > And that's why we get non-monotonic timestamps as a whole in the > > AVIndexEntry . > > > > I think my wording is confusing here. Just clarifying . For one editlist/ > within one editlist, the AVIndexEntry samples pertaining to that edit, need > to be filled with "monotonic timestamps which increase in steps of their > corresponding 'stts' atom entries ", to achieve the correct presentation > timestamp for those samples. > > When we start parsing the next edit list entry, and start adding index > entries to AVIndexEntry we need to avoid a big gap between the presentation > of the end of first edit and the start of next edit, that might happen > because of the DISCARD packets added to the AVIndexEntry while parsing the > first edit. Hence we start adding index entries from a bumped down value > corresponding to (frame_duration + last non-discarded packet from previous > edit ) . And that's why we get non-monotonic timestamps as a whole in the > AVIndexEntry . > > > > Let us take an example of a file with two edit lists. First edit list > > ending on a B frame. This is what ffprobe -show_packet -pf compact looks > > like . Where 'D' stands for discarded frame. ( I've attached a 6th patch to > > ffprobe to achieve this formatting ). > > ./ffprobe -show_packets -print_format compact mov-2elist-bpyramid-1elist- > > ends-on-bref.mov > > ..... > > packet|codec_type=video|stream_index=0|pts=10752|pts_ > > time=0.875000|dts=8192|dts_time=0.666667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=1281|pos=39546|flags=__ - Pframe > > packet|codec_type=video|stream_index=0|pts=9728|pts_ > > time=0.791667|dts=8704|dts_time=0.708333|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=749|pos=40827|flags=__ - B pyramidal > > ref > > packet|codec_type=video|stream_index=0|pts=9216|pts_ > > time=0.750000|dts=9216|dts_time=0.750000|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=487|pos=41576|flags=__ - B > > packet|codec_type=video|stream_index=0|pts=10240|pts_ > > time=0.833333|dts=9728|dts_time=0.791667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=486|pos=42063|flags=__ - B > > packet|codec_type=video|stream_index=0|pts=12800|pts_ > > time=1.041667|dts=10240|dts_time=0.833333|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=3455|pos=42549|flags=_D - P > > packet|codec_type=video|stream_index=0|pts=11776|pts_ > > time=0.958333|dts=10752|dts_time=0.875000|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=848|pos=46004|flags=_D - B pyramidal > > ref > > packet|codec_type=video|stream_index=0|pts=11264|pts_ > > time=0.916667|dts=11264|dts_time=0.916667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=655|pos=46852|flags=__ - B > > packet|codec_type=video|stream_index=0|pts=12288|pts_ > > time=1.000000|dts=11776|dts_time=0.958333|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=901|pos=47507|flags=_D - B > > packet|codec_type=video|stream_index=0|pts=13312|pts_ > > time=1.083333|dts=12288|dts_time=1.000000|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=5112|pos=48408|flags=KD - end 1st edit > > packet|codec_type=video|stream_index=0|pts=11772|pts_ > > time=0.958008|dts=10748|dts_time=0.874674|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=5112|pos=48408|flags=K_ - start 2nd > > edit > > packet|codec_type=video|stream_index=0|pts=13820|pts_ > > time=1.124674|dts=11260|dts_time=0.916341|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=1179|pos=53520|flags=__ > > ..... > > > > The implementation basically increases dts by the amounts that stts atom > > indicates, and as you can see after the end of the first edit list and > > start of the next edit list, it bumps down the dts from 12288 to 10650 to > > make the pts linearly increasing from the last non-discarded frame 11264 -> > > 11772 . > > > > As I understand from DTS semantics in ffmpeg we have two constraints on > > DTS. > > i) DTS should be monotonically increasing ii) For every packet PTS >= DTS > > should be true. > > > > Just reordering PTS to DTS doesn't work either. For example, here is the > > DTS I get from "ffprobe -show_packets " when I do "pkt->dts = > > AV_NOPTS_VALUE" for all packets in mov_read_packet function > > http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/mov.c;h= > > 6e80b93271a4f998af6ba1af795d7d7c5d67f5a1;hb=HEAD#l5332 . Ffmpeg should > > use update_dts_from_pts function to reoder pts to dts - > > http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/utils.c;h= > > 76cbff4ef67937499a7c113dcaacb88389e6015e;hb=HEAD#l1005 . > > > > ./ffprobe -show_packets -print_format compact mov-2elist-bpyramid-1elist- > > ends-on-bref.mov > > packet|codec_type=video|stream_index=0|pts=0|pts_time= > > 0.000000|dts=N/A|dts_time=N/A|duration=512|duration_time=0. > > 041667|convergence_duration=N/A|convergence_duration_time=N/ > > A|size=5683|pos=36|flags=K_ > > packet|codec_type=video|stream_index=0|pts=2048|pts_ > > time=0.166667|dts=N/A|dts_time=N/A|duration=512|duration_time=0.041667| > > convergence_duration=N/A|convergence_duration_time=N/A| > > size=4260|pos=5719|flags=__ > > packet|codec_type=video|stream_index=0|pts=1024|pts_ > > time=0.083333|dts=0|dts_time=0.000000|duration=512|duration_time=0.041667| > > convergence_duration=N/A|convergence_duration_time=N/A| > > size=3098|pos=9979|flags=__ > > packet|codec_type=video|stream_index=0|pts=512|pts_ > > time=0.041667|dts=512|dts_time=0.041667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=786|pos=13077|flags=__ > > .... > > packet|codec_type=video|stream_index=0|pts=10752|pts_ > > time=0.875000|dts=8192|dts_time=0.666667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=1281|pos=39546|flags=__ > > packet|codec_type=video|stream_index=0|pts=9728|pts_ > > time=0.791667|dts=8704|dts_time=0.708333|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=749|pos=40827|flags=__ > > packet|codec_type=video|stream_index=0|pts=9216|pts_ > > time=0.750000|dts=9216|dts_time=0.750000|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=487|pos=41576|flags=__ > > packet|codec_type=video|stream_index=0|pts=10240|pts_ > > time=0.833333|dts=9728|dts_time=0.791667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=486|pos=42063|flags=__ > > packet|codec_type=video|stream_index=0|pts=12800|pts_ > > time=1.041667|dts=10240|dts_time=0.833333|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=3455|pos=42549|flags=_D > > packet|codec_type=video|stream_index=0|pts=11776|pts_ > > time=0.958333|dts=10752|dts_time=0.875000|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=848|pos=46004|flags=_D > > packet|codec_type=video|stream_index=0|pts=11264|pts_ > > time=0.916667|dts=11264|dts_time=0.916667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=655|pos=46852|flags=__ > > packet|codec_type=video|stream_index=0|pts=12288|pts_ > > time=1.000000|dts=11776|dts_time=0.958333|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=901|pos=47507|flags=_D > > packet|codec_type=video|stream_index=0|pts=13312|pts_ > > time=1.083333|dts=12288|dts_time=1.000000|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=5112|pos=48408|flags=KD > > packet|codec_type=video|stream_index=0|pts=11772|pts_ > > time=0.958008|dts=11772|dts_time=0.958008|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=5112|pos=48408|flags=K_ > > packet|codec_type=video|stream_index=0|pts=13820|pts_ > > time=1.124674|dts=12800|dts_time=1.041667|duration=512| > > duration_time=0.041667|convergence_duration=N/A| > > convergence_duration_time=N/A|size=1179|pos=53520|flags=__ > > .... > > As we can see the dts are still non monotonic. > > > > The problem here is that in case of edit lists, we are trying to make the > > PTS "not grow" when the packets are being discarded. If we look at the > > non-discarded packets in presentation order, in the above example the PTS > > is approx. linearly growing 10752, 11264, 11772, with a difference of > > frame_duration=512 approx. However there are 3 discarded frames that we are > > outputting in between. So, to grow DTS monotonically linearly we need to > > grow DTS at a higher frame rate i.e. smaller frame duration. > > > > If we want to make DTS satisfy the two constraints described above , I > > can recompute the DTS based on an assumed constant frame rate, and shift > > the DTS so that PTS > DTS always, but this solution wont work for videos > > with variable frame rates. > > So in short , these are the options available I see for the DTS > > i) Keep the change as it is , with non monotonic DTS > > ii) Mark DTS=AV_NOPTS_VALUE for discarded frames. Maybe it is right > > semantic to have DTS=N/A for some packets. I don't know. > > iii) Recompute the DTS as stated above. I guess i or ii are the least evil staying with i and changing if it causes issues is probably least work > > > > Attaching all the 6 patches and the test file. the mov patch doesnt apply cleanly, 1 patch per mail is preferred as well the ffprobe patch breaks fate Strings Metadata|8 -video|0|37|1.480000|34|1.360000|1|0.040000|N/A|N/A|24786|211456|K|1 +video|0|37|1.480000|34|1.360000|1|0.040000|N/A|N/A|24786|211456|K_|1 Strings Metadata|8 also the mov patch breaks fate: Test copy-trac236 failed. Look at tests/data/fate/copy-trac236.err for details. make: *** [fate-copy-trac236] Error 1 either way, new fate samples uploaded Thanks [...] --- ./tests/ref/fate/copy-trac236 2016-09-14 15:56:10.845886847 +0200 +++ tests/data/fate/copy-trac236 2016-09-14 22:44:44.494403284 +0200 @@ -1,5 +1,5 @@ -9b95afdb39b426a33bc962889f820aed *tests/data/fate/copy-trac236.mov -630802 tests/data/fate/copy-trac236.mov +d6e3d97b522ce881ed29c5da74cc7e63 *tests/data/fate/copy-trac236.mov +630810 tests/data/fate/copy-trac236.mov #tb 0: 100/2997 #media_type 0: video #codec_id 0: rawvideo