diff mbox

[FFmpeg-devel,1/2] mov: fix decode of fragments that overlap in time

Message ID 20170929155409.8000-1-jstebbins@jetheaddev.com
State Superseded
Headers show

Commit Message

John Stebbins Sept. 29, 2017, 3:54 p.m. UTC
When keyframe intervals of dash segments are not perfectly aligned,
fragments in the stream can overlap in time. Append new "trun" index
entries to the end of the index instead of sorting by timestamp.
Sorting by timestamp causes packets to be read out of decode order and
results in decode errors.
---
 libavformat/mov.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Michael Niedermayer Oct. 4, 2017, 10:50 a.m. UTC | #1
On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
> When keyframe intervals of dash segments are not perfectly aligned,
> fragments in the stream can overlap in time. Append new "trun" index
> entries to the end of the index instead of sorting by timestamp.
> Sorting by timestamp causes packets to be read out of decode order and
> results in decode errors.
> ---
>  libavformat/mov.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/libavformat/mov.c b/libavformat/mov.c
> index 899690d920..c7422cd9ed 100644
> --- a/libavformat/mov.c
> +++ b/libavformat/mov.c
> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
>          if (keyframe)
>              distance = 0;
> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
> +                                     keyframe ? AVINDEX_KEYFRAME : 0);

can this lead to timestamps being out of order not just changing
from strictly monotone to monotone ?

Maybe iam missing somehing but out of order could/would cause problems
with av_index_search_timestamp() and possibly others

[...]
John Stebbins Oct. 4, 2017, 3:18 p.m. UTC | #2
On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
> On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
>> When keyframe intervals of dash segments are not perfectly aligned,
>> fragments in the stream can overlap in time. Append new "trun" index
>> entries to the end of the index instead of sorting by timestamp.
>> Sorting by timestamp causes packets to be read out of decode order and
>> results in decode errors.
>> ---
>>  libavformat/mov.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>> index 899690d920..c7422cd9ed 100644
>> --- a/libavformat/mov.c
>> +++ b/libavformat/mov.c
>> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
>>          if (keyframe)
>>              distance = 0;
>> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
>> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
>> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
>> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
> can this lead to timestamps being out of order not just changing
> from strictly monotone to monotone ?
>
> Maybe iam missing somehing but out of order could/would cause problems
> with av_index_search_timestamp() and possibly others
>
>

I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
last fragment and the new fragment interleaved together causing decoding errors.

Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
think av_index_search_timestamp will do the right thing.
Michael Niedermayer Oct. 4, 2017, 5:13 p.m. UTC | #3
On Wed, Oct 04, 2017 at 08:18:59AM -0700, John Stebbins wrote:
> 
> On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
> > On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
> >> When keyframe intervals of dash segments are not perfectly aligned,
> >> fragments in the stream can overlap in time. Append new "trun" index
> >> entries to the end of the index instead of sorting by timestamp.
> >> Sorting by timestamp causes packets to be read out of decode order and
> >> results in decode errors.
> >> ---
> >>  libavformat/mov.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/libavformat/mov.c b/libavformat/mov.c
> >> index 899690d920..c7422cd9ed 100644
> >> --- a/libavformat/mov.c
> >> +++ b/libavformat/mov.c
> >> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> >>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
> >>          if (keyframe)
> >>              distance = 0;
> >> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
> >> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
> >> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
> >> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
> > can this lead to timestamps being out of order not just changing
> > from strictly monotone to monotone ?
> >
> > Maybe iam missing somehing but out of order could/would cause problems
> > with av_index_search_timestamp() and possibly others
> >
> >
> 
> I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
> ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
> fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
> fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
> nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
> remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
> last fragment and the new fragment interleaved together causing decoding errors.
> 
> Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
> in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
> overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
> think av_index_search_timestamp will do the right thing.

yes, that makes sense now.
Please correct me if iam wrong but then patch 1 would introduce a
issue that the 2nd fixes. So both patches should be merged to avoid
this

But theres another problem, trun can be read out of order, when one
seeks around, so the next might have to be put elsewhere than after the
previous

thanks

[...]
John Stebbins Oct. 4, 2017, 5:58 p.m. UTC | #4
On 10/04/2017 10:13 AM, Michael Niedermayer wrote:
> On Wed, Oct 04, 2017 at 08:18:59AM -0700, John Stebbins wrote:
>> On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
>>> On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
>>>> When keyframe intervals of dash segments are not perfectly aligned,
>>>> fragments in the stream can overlap in time. Append new "trun" index
>>>> entries to the end of the index instead of sorting by timestamp.
>>>> Sorting by timestamp causes packets to be read out of decode order and
>>>> results in decode errors.
>>>> ---
>>>>  libavformat/mov.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>>>> index 899690d920..c7422cd9ed 100644
>>>> --- a/libavformat/mov.c
>>>> +++ b/libavformat/mov.c
>>>> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>>>>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
>>>>          if (keyframe)
>>>>              distance = 0;
>>>> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
>>>> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
>>>> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
>>>> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
>>> can this lead to timestamps being out of order not just changing
>>> from strictly monotone to monotone ?
>>>
>>> Maybe iam missing somehing but out of order could/would cause problems
>>> with av_index_search_timestamp() and possibly others
>>>
>>>
>> I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
>> ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
>> fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
>> fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
>> nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
>> remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
>> last fragment and the new fragment interleaved together causing decoding errors.
>>
>> Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
>> in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
>> overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
>> think av_index_search_timestamp will do the right thing.
> yes, that makes sense now.
> Please correct me if iam wrong but then patch 1 would introduce a
> issue that the 2nd fixes. So both patches should be merged to avoid
> this
>
> But theres another problem, trun can be read out of order, when one
> seeks around, so the next might have to be put elsewhere than after the
> previous
>
> thanks
>

Hmm, can you describe the circumstances where this would happen.  I looked at the seek code and can't see any way for it
to seek to the middle somewhere without first reading previous trun.  It looks to me like if avformat_seek_file or
av_seek_frame fails to find the desired timestamp in the index it falls back to seek_frame_generic which seeks to the
position of the last sample in the index and performs av_read_frame until it gets to the timestamp it wants.  Is there a
path I've missed where it can skip to the middle of the file somehow?
Michael Niedermayer Oct. 4, 2017, 10:21 p.m. UTC | #5
On Wed, Oct 04, 2017 at 10:58:19AM -0700, John Stebbins wrote:
> On 10/04/2017 10:13 AM, Michael Niedermayer wrote:
> > On Wed, Oct 04, 2017 at 08:18:59AM -0700, John Stebbins wrote:
> >> On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
> >>> On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
> >>>> When keyframe intervals of dash segments are not perfectly aligned,
> >>>> fragments in the stream can overlap in time. Append new "trun" index
> >>>> entries to the end of the index instead of sorting by timestamp.
> >>>> Sorting by timestamp causes packets to be read out of decode order and
> >>>> results in decode errors.
> >>>> ---
> >>>>  libavformat/mov.c | 4 ++--
> >>>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
> >>>> index 899690d920..c7422cd9ed 100644
> >>>> --- a/libavformat/mov.c
> >>>> +++ b/libavformat/mov.c
> >>>> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> >>>>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
> >>>>          if (keyframe)
> >>>>              distance = 0;
> >>>> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
> >>>> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
> >>>> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
> >>>> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
> >>> can this lead to timestamps being out of order not just changing
> >>> from strictly monotone to monotone ?
> >>>
> >>> Maybe iam missing somehing but out of order could/would cause problems
> >>> with av_index_search_timestamp() and possibly others
> >>>
> >>>
> >> I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
> >> ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
> >> fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
> >> fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
> >> nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
> >> remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
> >> last fragment and the new fragment interleaved together causing decoding errors.
> >>
> >> Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
> >> in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
> >> overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
> >> think av_index_search_timestamp will do the right thing.
> > yes, that makes sense now.
> > Please correct me if iam wrong but then patch 1 would introduce a
> > issue that the 2nd fixes. So both patches should be merged to avoid
> > this
> >
> > But theres another problem, trun can be read out of order, when one
> > seeks around, so the next might have to be put elsewhere than after the
> > previous
> >
> > thanks
> >
> 
> Hmm, can you describe the circumstances where this would happen.  I looked at the seek code and can't see any way for it
> to seek to the middle somewhere without first reading previous trun.  It looks to me like if avformat_seek_file or
> av_seek_frame fails to find the desired timestamp in the index it falls back to seek_frame_generic which seeks to the
> position of the last sample in the index and performs av_read_frame until it gets to the timestamp it wants.  Is there a
> path I've missed where it can skip to the middle of the file somehow?

I used
-rw-r----- 1 michael michael 66908195 Dec 11  2015 buck480p30_na.mp4
./ffplay buck480p30_na.mp4

(i can upload this if needed, i dont know where its from exactly)

and when seeking around by using the right mouse buttonq it sometimes read
trun chunks with lower times than previous (seen from the av_logs in
there)

I hope i made no mistake and would assume this happens with any file
with these chunks

...
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 151, offset 60134, dts 450000, size 194, distance 25, keyframe 0
...
Seek to 68% ( 0:07:11) of total duration ( 0:10:34)
...
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 2b74fd6, dts 38757000, size 8284, distance 0, keyframe 1
...
Seek to 14% ( 0:01:29) of total duration ( 0:10:34)
...
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 959164, dts 7749000, size 55027, distance 0, keyframe 1

[...]
John Stebbins Oct. 5, 2017, 4:45 p.m. UTC | #6
On 10/04/2017 03:21 PM, Michael Niedermayer wrote:
> On Wed, Oct 04, 2017 at 10:58:19AM -0700, John Stebbins wrote:
>> On 10/04/2017 10:13 AM, Michael Niedermayer wrote:
>>> On Wed, Oct 04, 2017 at 08:18:59AM -0700, John Stebbins wrote:
>>>> On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
>>>>> On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
>>>>>> When keyframe intervals of dash segments are not perfectly aligned,
>>>>>> fragments in the stream can overlap in time. Append new "trun" index
>>>>>> entries to the end of the index instead of sorting by timestamp.
>>>>>> Sorting by timestamp causes packets to be read out of decode order and
>>>>>> results in decode errors.
>>>>>> ---
>>>>>>  libavformat/mov.c | 4 ++--
>>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>>>>>> index 899690d920..c7422cd9ed 100644
>>>>>> --- a/libavformat/mov.c
>>>>>> +++ b/libavformat/mov.c
>>>>>> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>>>>>>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
>>>>>>          if (keyframe)
>>>>>>              distance = 0;
>>>>>> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
>>>>>> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
>>>>>> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
>>>>>> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
>>>>> can this lead to timestamps being out of order not just changing
>>>>> from strictly monotone to monotone ?
>>>>>
>>>>> Maybe iam missing somehing but out of order could/would cause problems
>>>>> with av_index_search_timestamp() and possibly others
>>>>>
>>>>>
>>>> I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
>>>> ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
>>>> fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
>>>> fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
>>>> nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
>>>> remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
>>>> last fragment and the new fragment interleaved together causing decoding errors.
>>>>
>>>> Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
>>>> in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
>>>> overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
>>>> think av_index_search_timestamp will do the right thing.
>>> yes, that makes sense now.
>>> Please correct me if iam wrong but then patch 1 would introduce a
>>> issue that the 2nd fixes. So both patches should be merged to avoid
>>> this
>>>
>>> But theres another problem, trun can be read out of order, when one
>>> seeks around, so the next might have to be put elsewhere than after the
>>> previous
>>>
>>> thanks
>>>
>> Hmm, can you describe the circumstances where this would happen.  I looked at the seek code and can't see any way for it
>> to seek to the middle somewhere without first reading previous trun.  It looks to me like if avformat_seek_file or
>> av_seek_frame fails to find the desired timestamp in the index it falls back to seek_frame_generic which seeks to the
>> position of the last sample in the index and performs av_read_frame until it gets to the timestamp it wants.  Is there a
>> path I've missed where it can skip to the middle of the file somehow?
> I used
> -rw-r----- 1 michael michael 66908195 Dec 11  2015 buck480p30_na.mp4
> ./ffplay buck480p30_na.mp4
>
> (i can upload this if needed, i dont know where its from exactly)
>
> and when seeking around by using the right mouse buttonq it sometimes read
> trun chunks with lower times than previous (seen from the av_logs in
> there)
>
> I hope i made no mistake and would assume this happens with any file
> with these chunks
>
> ...
> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 151, offset 60134, dts 450000, size 194, distance 25, keyframe 0
> ...
> Seek to 68% ( 0:07:11) of total duration ( 0:10:34)
> ...
> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 2b74fd6, dts 38757000, size 8284, distance 0, keyframe 1
> ...
> Seek to 14% ( 0:01:29) of total duration ( 0:10:34)
> ...
> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 959164, dts 7749000, size 55027, distance 0, keyframe 1
>
>

When seeking mov_read_trun is getting called repeatedly for the same fragment which has a number of undesirable side
effects, even without my patch.  The following things get updated to incorrect values when seeking backward and the trun
is re-read:

sc->data_size
sc->duration_for_fps
sc->nb_frames_for_fps
sc->track_end

The trun is getting re-read in mov_switch_root because headers_read in MOVFragmentIndex has not yet been set for the
fragment.  I think a solution to this is to set headers_read for the appropriate entry in MOVFragmentIndex when the trun
is read the first time.  Does this sound like the right approach?
John Stebbins Oct. 5, 2017, 9:38 p.m. UTC | #7
On 10/05/2017 09:45 AM, John Stebbins wrote:
> On 10/04/2017 03:21 PM, Michael Niedermayer wrote:
>> On Wed, Oct 04, 2017 at 10:58:19AM -0700, John Stebbins wrote:
>>> On 10/04/2017 10:13 AM, Michael Niedermayer wrote:
>>>> On Wed, Oct 04, 2017 at 08:18:59AM -0700, John Stebbins wrote:
>>>>> On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
>>>>>> On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
>>>>>>> When keyframe intervals of dash segments are not perfectly aligned,
>>>>>>> fragments in the stream can overlap in time. Append new "trun" index
>>>>>>> entries to the end of the index instead of sorting by timestamp.
>>>>>>> Sorting by timestamp causes packets to be read out of decode order and
>>>>>>> results in decode errors.
>>>>>>> ---
>>>>>>>  libavformat/mov.c | 4 ++--
>>>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>>>>>>> index 899690d920..c7422cd9ed 100644
>>>>>>> --- a/libavformat/mov.c
>>>>>>> +++ b/libavformat/mov.c
>>>>>>> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>>>>>>>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
>>>>>>>          if (keyframe)
>>>>>>>              distance = 0;
>>>>>>> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
>>>>>>> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
>>>>>>> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
>>>>>>> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
>>>>>> can this lead to timestamps being out of order not just changing
>>>>>> from strictly monotone to monotone ?
>>>>>>
>>>>>> Maybe iam missing somehing but out of order could/would cause problems
>>>>>> with av_index_search_timestamp() and possibly others
>>>>>>
>>>>>>
>>>>> I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
>>>>> ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
>>>>> fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
>>>>> fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
>>>>> nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
>>>>> remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
>>>>> last fragment and the new fragment interleaved together causing decoding errors.
>>>>>
>>>>> Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
>>>>> in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
>>>>> overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
>>>>> think av_index_search_timestamp will do the right thing.
>>>> yes, that makes sense now.
>>>> Please correct me if iam wrong but then patch 1 would introduce a
>>>> issue that the 2nd fixes. So both patches should be merged to avoid
>>>> this
>>>>
>>>> But theres another problem, trun can be read out of order, when one
>>>> seeks around, so the next might have to be put elsewhere than after the
>>>> previous
>>>>
>>>> thanks
>>>>
>>> Hmm, can you describe the circumstances where this would happen.  I looked at the seek code and can't see any way for it
>>> to seek to the middle somewhere without first reading previous trun.  It looks to me like if avformat_seek_file or
>>> av_seek_frame fails to find the desired timestamp in the index it falls back to seek_frame_generic which seeks to the
>>> position of the last sample in the index and performs av_read_frame until it gets to the timestamp it wants.  Is there a
>>> path I've missed where it can skip to the middle of the file somehow?
>> I used
>> -rw-r----- 1 michael michael 66908195 Dec 11  2015 buck480p30_na.mp4
>> ./ffplay buck480p30_na.mp4
>>
>> (i can upload this if needed, i dont know where its from exactly)
>>
>> and when seeking around by using the right mouse buttonq it sometimes read
>> trun chunks with lower times than previous (seen from the av_logs in
>> there)
>>
>> I hope i made no mistake and would assume this happens with any file
>> with these chunks
>>
>> ...
>> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 151, offset 60134, dts 450000, size 194, distance 25, keyframe 0
>> ...
>> Seek to 68% ( 0:07:11) of total duration ( 0:10:34)
>> ...
>> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 2b74fd6, dts 38757000, size 8284, distance 0, keyframe 1
>> ...
>> Seek to 14% ( 0:01:29) of total duration ( 0:10:34)
>> ...
>> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 959164, dts 7749000, size 55027, distance 0, keyframe 1
>>
>>
> When seeking mov_read_trun is getting called repeatedly for the same fragment which has a number of undesirable side
> effects, even without my patch.  The following things get updated to incorrect values when seeking backward and the trun
> is re-read:
>
> sc->data_size
> sc->duration_for_fps
> sc->nb_frames_for_fps
> sc->track_end
>
> The trun is getting re-read in mov_switch_root because headers_read in MOVFragmentIndex has not yet been set for the
> fragment.  I think a solution to this is to set headers_read for the appropriate entry in MOVFragmentIndex when the trun
> is read the first time.  Does this sound like the right approach?
>

I got the analysis wrong above.  It's not re-reading the trun.  What's happening is that while seeking forward it can
skip one or more trun.  Then seeking back, it will read that trun.  So, as you said, re-ordering of the index will be
necessary to handle seeking past a trun. 

It can seek forward past a trun because the sidx has the offset to each moof and is used by mov_seek_stream.  I missed
this earlier.

Since the trun can overlap, reordering shouldn't be done by simply sorting by the index_enties timestamps though.  I'm
thinking a good way would be to add a index_entry member to MOVFragmentIndexItem that records where in index_entries the
samples for the trun were written.  The position in index_entries of a *new* trun would be determined by looking at the
position of the MOVFragmentIndexItem that corresponds to that trun and  finding for the next MOVFragmentIndexItem that
has a valid index_entry set (which means it's trun was read and samples inserted into the index). If no next valid
index_entry is found, the samples of the new trun get appended.  If a valid index_entry is found, open a hole in
index_entries before that entry and populate the samples from the new trun in the hole. Then fix up the index_entry
members of MOVFragmentIndexItems to account for the hole.

Looking again at sc->* most of these are accurate I think.  I question sc->track_end though.  It seem like is should not
be going backwards when seeking backwards.

Am I on the right track now?
Michael Niedermayer Oct. 6, 2017, 11:20 p.m. UTC | #8
On Thu, Oct 05, 2017 at 02:38:48PM -0700, John Stebbins wrote:
> On 10/05/2017 09:45 AM, John Stebbins wrote:
> > On 10/04/2017 03:21 PM, Michael Niedermayer wrote:
> >> On Wed, Oct 04, 2017 at 10:58:19AM -0700, John Stebbins wrote:
> >>> On 10/04/2017 10:13 AM, Michael Niedermayer wrote:
> >>>> On Wed, Oct 04, 2017 at 08:18:59AM -0700, John Stebbins wrote:
> >>>>> On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
> >>>>>> On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
> >>>>>>> When keyframe intervals of dash segments are not perfectly aligned,
> >>>>>>> fragments in the stream can overlap in time. Append new "trun" index
> >>>>>>> entries to the end of the index instead of sorting by timestamp.
> >>>>>>> Sorting by timestamp causes packets to be read out of decode order and
> >>>>>>> results in decode errors.
> >>>>>>> ---
> >>>>>>>  libavformat/mov.c | 4 ++--
> >>>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
> >>>>>>> index 899690d920..c7422cd9ed 100644
> >>>>>>> --- a/libavformat/mov.c
> >>>>>>> +++ b/libavformat/mov.c
> >>>>>>> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> >>>>>>>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
> >>>>>>>          if (keyframe)
> >>>>>>>              distance = 0;
> >>>>>>> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
> >>>>>>> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
> >>>>>>> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
> >>>>>>> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
> >>>>>> can this lead to timestamps being out of order not just changing
> >>>>>> from strictly monotone to monotone ?
> >>>>>>
> >>>>>> Maybe iam missing somehing but out of order could/would cause problems
> >>>>>> with av_index_search_timestamp() and possibly others
> >>>>>>
> >>>>>>
> >>>>> I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
> >>>>> ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
> >>>>> fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
> >>>>> fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
> >>>>> nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
> >>>>> remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
> >>>>> last fragment and the new fragment interleaved together causing decoding errors.
> >>>>>
> >>>>> Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
> >>>>> in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
> >>>>> overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
> >>>>> think av_index_search_timestamp will do the right thing.
> >>>> yes, that makes sense now.
> >>>> Please correct me if iam wrong but then patch 1 would introduce a
> >>>> issue that the 2nd fixes. So both patches should be merged to avoid
> >>>> this
> >>>>
> >>>> But theres another problem, trun can be read out of order, when one
> >>>> seeks around, so the next might have to be put elsewhere than after the
> >>>> previous
> >>>>
> >>>> thanks
> >>>>
> >>> Hmm, can you describe the circumstances where this would happen.  I looked at the seek code and can't see any way for it
> >>> to seek to the middle somewhere without first reading previous trun.  It looks to me like if avformat_seek_file or
> >>> av_seek_frame fails to find the desired timestamp in the index it falls back to seek_frame_generic which seeks to the
> >>> position of the last sample in the index and performs av_read_frame until it gets to the timestamp it wants.  Is there a
> >>> path I've missed where it can skip to the middle of the file somehow?
> >> I used
> >> -rw-r----- 1 michael michael 66908195 Dec 11  2015 buck480p30_na.mp4
> >> ./ffplay buck480p30_na.mp4
> >>
> >> (i can upload this if needed, i dont know where its from exactly)
> >>
> >> and when seeking around by using the right mouse buttonq it sometimes read
> >> trun chunks with lower times than previous (seen from the av_logs in
> >> there)
> >>
> >> I hope i made no mistake and would assume this happens with any file
> >> with these chunks
> >>
> >> ...
> >> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 151, offset 60134, dts 450000, size 194, distance 25, keyframe 0
> >> ...
> >> Seek to 68% ( 0:07:11) of total duration ( 0:10:34)
> >> ...
> >> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 2b74fd6, dts 38757000, size 8284, distance 0, keyframe 1
> >> ...
> >> Seek to 14% ( 0:01:29) of total duration ( 0:10:34)
> >> ...
> >> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 959164, dts 7749000, size 55027, distance 0, keyframe 1
> >>
> >>
> > When seeking mov_read_trun is getting called repeatedly for the same fragment which has a number of undesirable side
> > effects, even without my patch.  The following things get updated to incorrect values when seeking backward and the trun
> > is re-read:
> >
> > sc->data_size
> > sc->duration_for_fps
> > sc->nb_frames_for_fps
> > sc->track_end
> >
> > The trun is getting re-read in mov_switch_root because headers_read in MOVFragmentIndex has not yet been set for the
> > fragment.  I think a solution to this is to set headers_read for the appropriate entry in MOVFragmentIndex when the trun
> > is read the first time.  Does this sound like the right approach?
> >
> 
> I got the analysis wrong above.  It's not re-reading the trun.  What's happening is that while seeking forward it can
> skip one or more trun.  Then seeking back, it will read that trun.  So, as you said, re-ordering of the index will be
> necessary to handle seeking past a trun. 
> 
> It can seek forward past a trun because the sidx has the offset to each moof and is used by mov_seek_stream.  I missed
> this earlier.
> 
> Since the trun can overlap, reordering shouldn't be done by simply sorting by the index_enties timestamps though.  I'm
> thinking a good way would be to add a index_entry member to MOVFragmentIndexItem that records where in index_entries the
> samples for the trun were written.  The position in index_entries of a *new* trun would be determined by looking at the
> position of the MOVFragmentIndexItem that corresponds to that trun and  finding for the next MOVFragmentIndexItem that
> has a valid index_entry set (which means it's trun was read and samples inserted into the index). If no next valid
> index_entry is found, the samples of the new trun get appended.  If a valid index_entry is found, open a hole in
> index_entries before that entry and populate the samples from the new trun in the hole. Then fix up the index_entry
> members of MOVFragmentIndexItems to account for the hole.
> 
> Looking again at sc->* most of these are accurate I think.  I question sc->track_end though.  It seem like is should not
> be going backwards when seeking backwards.
> 
> Am I on the right track now?

I think so but this code has become quite complex, its very possible
there are aspects iam missing

[...]
John Stebbins Oct. 9, 2017, 8:08 p.m. UTC | #9
On 10/06/2017 04:20 PM, Michael Niedermayer wrote:
> On Thu, Oct 05, 2017 at 02:38:48PM -0700, John Stebbins wrote:
>> On 10/05/2017 09:45 AM, John Stebbins wrote:
>>> On 10/04/2017 03:21 PM, Michael Niedermayer wrote:
>>>> On Wed, Oct 04, 2017 at 10:58:19AM -0700, John Stebbins wrote:
>>>>> On 10/04/2017 10:13 AM, Michael Niedermayer wrote:
>>>>>> On Wed, Oct 04, 2017 at 08:18:59AM -0700, John Stebbins wrote:
>>>>>>> On 10/04/2017 03:50 AM, Michael Niedermayer wrote:
>>>>>>>> On Fri, Sep 29, 2017 at 08:54:08AM -0700, John Stebbins wrote:
>>>>>>>>> When keyframe intervals of dash segments are not perfectly aligned,
>>>>>>>>> fragments in the stream can overlap in time. Append new "trun" index
>>>>>>>>> entries to the end of the index instead of sorting by timestamp.
>>>>>>>>> Sorting by timestamp causes packets to be read out of decode order and
>>>>>>>>> results in decode errors.
>>>>>>>>> ---
>>>>>>>>>  libavformat/mov.c | 4 ++--
>>>>>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>>>>>>>>> index 899690d920..c7422cd9ed 100644
>>>>>>>>> --- a/libavformat/mov.c
>>>>>>>>> +++ b/libavformat/mov.c
>>>>>>>>> @@ -4340,8 +4340,8 @@ static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>>>>>>>>>                                    MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
>>>>>>>>>          if (keyframe)
>>>>>>>>>              distance = 0;
>>>>>>>>> -        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
>>>>>>>>> -                                        keyframe ? AVINDEX_KEYFRAME : 0);
>>>>>>>>> +        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
>>>>>>>>> +                                     keyframe ? AVINDEX_KEYFRAME : 0);
>>>>>>>> can this lead to timestamps being out of order not just changing
>>>>>>>> from strictly monotone to monotone ?
>>>>>>>>
>>>>>>>> Maybe iam missing somehing but out of order could/would cause problems
>>>>>>>> with av_index_search_timestamp() and possibly others
>>>>>>>>
>>>>>>>>
>>>>>>> I'm not sure I understand the question.  But I think I can answer.  The new fragment can start before the last fragment
>>>>>>> ends. I'll make a concrete example.  Lets say the new fragment's first DTS is 10 frames before the end of the previous
>>>>>>> fragment. So the first DTS of the new fragment is before the timestamp of 10 entries in the index from the previous
>>>>>>> fragment.  av_add_index_entry searches the existing index and inserts the first sample of the new fragment in position
>>>>>>> nb_index_entries - 10 (and shifts the existing entries).  The next 9 samples of the new fragment get intermixed with the
>>>>>>> remaining 9 samples of the previous fragment, sorted by DTS. When the samples are read out, you get samples from the
>>>>>>> last fragment and the new fragment interleaved together causing decoding errors.
>>>>>>>
>>>>>>> Using add_index_entry will result in the timestamps in the index going backwards by 10 frames at the fragment boundary
>>>>>>> in this example.  In the other patch that accompanied this one, I've marked the samples from the new fragment that
>>>>>>> overlap previous samples with AVINDEX_DISCARD. ff_index_search_timestamp appears to be AVINDEX_DISCARD aware.  So I
>>>>>>> think av_index_search_timestamp will do the right thing.
>>>>>> yes, that makes sense now.
>>>>>> Please correct me if iam wrong but then patch 1 would introduce a
>>>>>> issue that the 2nd fixes. So both patches should be merged to avoid
>>>>>> this
>>>>>>
>>>>>> But theres another problem, trun can be read out of order, when one
>>>>>> seeks around, so the next might have to be put elsewhere than after the
>>>>>> previous
>>>>>>
>>>>>> thanks
>>>>>>
>>>>> Hmm, can you describe the circumstances where this would happen.  I looked at the seek code and can't see any way for it
>>>>> to seek to the middle somewhere without first reading previous trun.  It looks to me like if avformat_seek_file or
>>>>> av_seek_frame fails to find the desired timestamp in the index it falls back to seek_frame_generic which seeks to the
>>>>> position of the last sample in the index and performs av_read_frame until it gets to the timestamp it wants.  Is there a
>>>>> path I've missed where it can skip to the middle of the file somehow?
>>>> I used
>>>> -rw-r----- 1 michael michael 66908195 Dec 11  2015 buck480p30_na.mp4
>>>> ./ffplay buck480p30_na.mp4
>>>>
>>>> (i can upload this if needed, i dont know where its from exactly)
>>>>
>>>> and when seeking around by using the right mouse buttonq it sometimes read
>>>> trun chunks with lower times than previous (seen from the av_logs in
>>>> there)
>>>>
>>>> I hope i made no mistake and would assume this happens with any file
>>>> with these chunks
>>>>
>>>> ...
>>>> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 151, offset 60134, dts 450000, size 194, distance 25, keyframe 0
>>>> ...
>>>> Seek to 68% ( 0:07:11) of total duration ( 0:10:34)
>>>> ...
>>>> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 2b74fd6, dts 38757000, size 8284, distance 0, keyframe 1
>>>> ...
>>>> Seek to 14% ( 0:01:29) of total duration ( 0:10:34)
>>>> ...
>>>> [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f3884000940] AVIndex stream 0, sample 152, offset 959164, dts 7749000, size 55027, distance 0, keyframe 1
>>>>
>>>>
>>> When seeking mov_read_trun is getting called repeatedly for the same fragment which has a number of undesirable side
>>> effects, even without my patch.  The following things get updated to incorrect values when seeking backward and the trun
>>> is re-read:
>>>
>>> sc->data_size
>>> sc->duration_for_fps
>>> sc->nb_frames_for_fps
>>> sc->track_end
>>>
>>> The trun is getting re-read in mov_switch_root because headers_read in MOVFragmentIndex has not yet been set for the
>>> fragment.  I think a solution to this is to set headers_read for the appropriate entry in MOVFragmentIndex when the trun
>>> is read the first time.  Does this sound like the right approach?
>>>
>> I got the analysis wrong above.  It's not re-reading the trun.  What's happening is that while seeking forward it can
>> skip one or more trun.  Then seeking back, it will read that trun.  So, as you said, re-ordering of the index will be
>> necessary to handle seeking past a trun. 
>>
>> It can seek forward past a trun because the sidx has the offset to each moof and is used by mov_seek_stream.  I missed
>> this earlier.
>>
>> Since the trun can overlap, reordering shouldn't be done by simply sorting by the index_enties timestamps though.  I'm
>> thinking a good way would be to add a index_entry member to MOVFragmentIndexItem that records where in index_entries the
>> samples for the trun were written.  The position in index_entries of a *new* trun would be determined by looking at the
>> position of the MOVFragmentIndexItem that corresponds to that trun and  finding for the next MOVFragmentIndexItem that
>> has a valid index_entry set (which means it's trun was read and samples inserted into the index). If no next valid
>> index_entry is found, the samples of the new trun get appended.  If a valid index_entry is found, open a hole in
>> index_entries before that entry and populate the samples from the new trun in the hole. Then fix up the index_entry
>> members of MOVFragmentIndexItems to account for the hole.
>>
>> Looking again at sc->* most of these are accurate I think.  I question sc->track_end though.  It seem like is should not
>> be going backwards when seeking backwards.
>>
>> Am I on the right track now?
> I think so but this code has become quite complex, its very possible
> there are aspects iam missing
>
>

Yes, it is complex.  I've been working on a patch that implements the above proposal.  I had to change the layout and
usage of MOVFragmentIndex to accommodate what is needed.  I believe I have reduced the complexity of this particular
area some and made the code more understandable.  But you'll have to be the judge of that. 

I tested my original sample with overlaping fragments, your buck480p30_na.mp4 sample, and another sample I created that
has the same general structure as buck, but is much longer and adds an audio track.  I also ran fate and a valgrind
test. It would be good to test other samples affected by the code that's changed (e.g. something with a mfra/tfra), but
I don't have such samples.

While doing fate testing, I found a bug in the original code.  The sidx earliest_presentation_time is being used as a
DTS in mov_read_trun in the original code, it should be treated as PTS.  I've left this bug in place in the patch with a
FIXME comment so that this patch can be evaluated independent of any fate changes.  I'll submit another patch with a fix
and fate updates once this patch gets reviewed.

I also think that tfdt base_media_decode_time should be used when available as the dts in mov_read_trun instead of the
sidx earliest_presentation_time.  So the patch to fix the PTS bug will switch that if nobody objects.

Updated patch follows...
diff mbox

Patch

diff --git a/libavformat/mov.c b/libavformat/mov.c
index 899690d920..c7422cd9ed 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -4340,8 +4340,8 @@  static int mov_read_trun(MOVContext *c, AVIOContext *pb, MOVAtom atom)
                                   MOV_FRAG_SAMPLE_FLAG_DEPENDS_YES));
         if (keyframe)
             distance = 0;
-        ctts_index = av_add_index_entry(st, offset, dts, sample_size, distance,
-                                        keyframe ? AVINDEX_KEYFRAME : 0);
+        ctts_index = add_index_entry(st, offset, dts, sample_size, distance,
+                                     keyframe ? AVINDEX_KEYFRAME : 0);
         if (ctts_index >= 0 && old_nb_index_entries < st->nb_index_entries) {
             unsigned int size_needed = st->nb_index_entries * sizeof(*sc->ctts_data);
             unsigned int request_size = size_needed > sc->ctts_allocated_size ?