diff mbox series

[FFmpeg-devel,RFC,-,DO,NOT,MERGE] Revert "mov: Discard invalid CTTS."

Message ID 20210308155519.807360-1-derek.buitenhuis@gmail.com
State New
Headers show
Series [FFmpeg-devel,RFC,-,DO,NOT,MERGE] Revert "mov: Discard invalid CTTS." | expand

Checks

Context Check Description
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished
andriy/PPC64_make success Make finished
andriy/PPC64_make_fate success Make fate finished

Commit Message

Derek Buitenhuis March 8, 2021, 3:55 p.m. UTC
This reverts commit 4093220029a4d77f272c491e9299680480a08c00.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
---
I have CC'd Michael here, as he is the original author here and the "soltuion" is nor clear.

To explain this RFC:

We (Vimeo) have started seeing an uptick in broken MP4 files (what creates them is under
investigation; we have reached out to some users in hopes that we can identify the broken
muxer and contact its authors). They are broken in a very specific way: The CTTS box starts
off normal, but then at some point, the duration field of the CTTS entries starts to increment
on each entry, and this continues until the end of the file, resulting in incorrect and insane
PTS/DTS differences (like PTS/DTS differing by 15 minutes / 10,000 frames). I have linked both
a text dump of the boxes and a trimmed moov box as an example in [1] and [2]. I can only provide
a full file (~12 GiB) privately.

Thes files *happen* to decode linearly just fine, of course, because FFmpeg doesn't care about the
CTTS info in that case, but if you seek, everything will explode, as you would expect.

So, anyway, in the meanwhile, I was writing a simple detection filter for these sorts of files,
and I noticed it only detected some of them, and this lead me to 1fb9efbda0149c3491529ea5fa92cfdcb8cebfaa.
What happens is that if the file is long enough, the incrementing CTTS entry durations eventually
trigger this, as far as I can tell, totally arbitrary check for 1<<28, and that results in the
entire CTTS being thrown away, and libavformat setting PTS==DTS for all packets, which both
makes detection impossible, and breaks seeking in a completely different way.

So the options are:
    * Revert 1fb9efbda0149c3491529ea5fa92cfdcb8cebfaa and detect them as I was already.
    * Change the check to something else or some other value, which would still be totally
      arbitrary - but maybe something like >16 frames. This would still leave seeking totally
      broken, of course, since it will still set PTS==DTS for all packets. This is still maybe
      detectable for API users by checking b_frames and has_b_frames of video_delay, but
      PTS==DTS? Seems kind gross, but... eh.
    * (Possibly in addition to other changes) Make the check an error, or make seeking return an
      error when this sort of file is detected.
    * Something I haven't listed here.

So, basically, all options suck, and I thought some people may have opinions on the least bad
option, here - specifically Michael, as the author of the original patch.

People: Please comment if you feel strongly about one solution or another.

[1] http://chromashift.org/ctts_sadness/broken_cut.mp4.xz
[2] http://chromashift.org/ctts_sadness/broken.boxes.txt.xz

---
 libavformat/mov.c | 7 -------
 1 file changed, 7 deletions(-)

Comments

Derek Buitenhuis March 8, 2021, 4:13 p.m. UTC | #1
On 08/03/2021 15:55, Derek Buitenhuis wrote:
> So the options are:
>     * Revert 1fb9efbda0149c3491529ea5fa92cfdcb8cebfaa and detect them as I was already.

Copy-paste fail. This should say 4093220029a4d77f272c491e9299680480a08c00.

- Derek
Michael Niedermayer March 8, 2021, 5:29 p.m. UTC | #2
On Mon, Mar 08, 2021 at 03:55:19PM +0000, Derek Buitenhuis wrote:
> This reverts commit 4093220029a4d77f272c491e9299680480a08c00.
> 
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
> I have CC'd Michael here, as he is the original author here and the "soltuion" is nor clear.
> 
> To explain this RFC:
> 
> We (Vimeo) have started seeing an uptick in broken MP4 files (what creates them is under
> investigation; we have reached out to some users in hopes that we can identify the broken
> muxer and contact its authors). They are broken in a very specific way: The CTTS box starts
> off normal, but then at some point, the duration field of the CTTS entries starts to increment
> on each entry, and this continues until the end of the file, resulting in incorrect and insane
> PTS/DTS differences (like PTS/DTS differing by 15 minutes / 10,000 frames). I have linked both
> a text dump of the boxes and a trimmed moov box as an example in [1] and [2]. I can only provide
> a full file (~12 GiB) privately.
> 
> Thes files *happen* to decode linearly just fine, of course, because FFmpeg doesn't care about the
> CTTS info in that case, but if you seek, everything will explode, as you would expect.
> 
> So, anyway, in the meanwhile, I was writing a simple detection filter for these sorts of files,
> and I noticed it only detected some of them, and this lead me to 1fb9efbda0149c3491529ea5fa92cfdcb8cebfaa.
> What happens is that if the file is long enough, the incrementing CTTS entry durations eventually
> trigger this, as far as I can tell, totally arbitrary check for 1<<28, and that results in the
> entire CTTS being thrown away, and libavformat setting PTS==DTS for all packets, which both
> makes detection impossible, and breaks seeking in a completely different way.
> 
> So the options are:
>     * Revert 1fb9efbda0149c3491529ea5fa92cfdcb8cebfaa and detect them as I was already.
>     * Change the check to something else or some other value, which would still be totally
>       arbitrary - but maybe something like >16 frames. This would still leave seeking totally
>       broken, of course, since it will still set PTS==DTS for all packets. This is still maybe
>       detectable for API users by checking b_frames and has_b_frames of video_delay, but
>       PTS==DTS? Seems kind gross, but... eh.
>     * (Possibly in addition to other changes) Make the check an error, or make seeking return an
>       error when this sort of file is detected.
>     * Something I haven't listed here.
> 
> So, basically, all options suck, and I thought some people may have opinions on the least bad
> option, here - specifically Michael, as the author of the original patch.

I would try to detect the specific abberant muxer based on the input. 
Then have custom code in the demuxer which is based on reverse engnenering the 
issue to do a best effort to recover as much as is possible. And also print a big 
warning about the broken input / muxer so the user doesnt stay unaware.

You can see this also in 2 ways
1. It sucks its not our job to workaround someone elses bugs
2. It cool, it gives FFmpegs demuxer a unique advantage over competing demuxers

Implementing this will be more enjoyable to whoever does, if (s)he sees it as in
"2."

Thanks

[...]
Derek Buitenhuis March 10, 2021, 3:29 p.m. UTC | #3
On 08/03/2021 17:29, Michael Niedermayer wrote:
> I would try to detect the specific abberant muxer based on the input. 
> Then have custom code in the demuxer which is based on reverse engnenering the 
> issue to do a best effort to recover as much as is possible. And also print a big 
> warning about the broken input / muxer so the user doesnt stay unaware.

I mean, I already listed the options in my email. Best you can do is try and check
for insane PTS/DTS diff (I proposed 16 frames), and the 'fix' of dropping the CTTS
info just makes it broken in a different way.

> You can see this also in 2 ways
> 1. It sucks its not our job to workaround someone elses bugs
> 2. It cool, it gives FFmpegs demuxer a unique advantage over competing demuxers

As an API user, I do not see returning made up PTS/DTS rather than what is in file
as an advantage. At least if I have what is in the file, I can make my own decisions
in my program, on how to handle it.

As it stands right now, as detailed in my previous mail, I have no way to even detect
these files since avformat handles them differently depending on if the file hits
the arbitary number of 1<<28.

What I'm looking for here is something actionable: I know what the possibilities are
for changes/fixes. Which the community prefers are not clear. As it stands, it is
just broken.

- Derek
Michael Niedermayer March 10, 2021, 5:24 p.m. UTC | #4
On Wed, Mar 10, 2021 at 03:29:46PM +0000, Derek Buitenhuis wrote:
> On 08/03/2021 17:29, Michael Niedermayer wrote:
> > I would try to detect the specific abberant muxer based on the input. 
> > Then have custom code in the demuxer which is based on reverse engnenering the 
> > issue to do a best effort to recover as much as is possible. And also print a big 
> > warning about the broken input / muxer so the user doesnt stay unaware.
> 
> I mean, I already listed the options in my email. Best you can do is try and check
> for insane PTS/DTS diff (I proposed 16 frames), and the 'fix' of dropping the CTTS
> info just makes it broken in a different way.

> 
> > You can see this also in 2 ways
> > 1. It sucks its not our job to workaround someone elses bugs
> > 2. It cool, it gives FFmpegs demuxer a unique advantage over competing demuxers
> 
> As an API user, I do not see returning made up PTS/DTS rather than what is in file
> as an advantage. At least if I have what is in the file, I can make my own decisions
> in my program, on how to handle it.
> 
> As it stands right now, as detailed in my previous mail, I have no way to even detect
> these files since avformat handles them differently depending on if the file hits
> the arbitary number of 1<<28.
> 
> What I'm looking for here is something actionable: I know what the possibilities are
> for changes/fixes. Which the community prefers are not clear. As it stands, it is
> just broken.

what does the muxer exactly do ?

Let me give an example (totally made up but to explain my point)
if each ctts value for example has its file position added to it
that can be reverted and the original ctts fully recovered.
(also a user app cannot do this, only the demuxer could as the user app
 has no knowledge of the atoms file position)

Does it store uninintialized memory then it would not be revertable but
it did not sound like thats whats happening.

about never modifying data, for that a system similar to workaround_bugs
we have on the decoder side could be added. But its not really the users
job to do this. The demuxer should already return as much recoverable
data as possible.

thanks

[...]
Derek Buitenhuis March 10, 2021, 5:49 p.m. UTC | #5
On 10/03/2021 17:24, Michael Niedermayer wrote:
> what does the muxer exactly do ?

I provided an explanation of what is happening during the broken muxing 
in my original email, as well as a sample, and a text dump of the boxes.
Please see those.

Can you please actually fully read my emails and samples before going on
long tangents about theortical things or what could be?

> Let me give an example (totally made up but to explain my point)
> if each ctts value for example has its file position added to it
> that can be reverted and the original ctts fully recovered.
> (also a user app cannot do this, only the demuxer could as the user app
>  has no knowledge of the atoms file position)
> 
> Does it store uninintialized memory then it would not be revertable but
> it did not sound like thats whats happening.

This example is neither useful nor relevant to the case at hand. I described
what is happening, and provided a sample: mid-way through the CTTS box, the entry
durations start incrementing, i.e. growing, rather than matching the reordering.
Please see the sample I provided, and also the dump of the boxes as text I provided.

These theortical and vague replies are not actionable. They're bordering on philosophical
musings.

> about never modifying data, for that a system similar to workaround_bugs
> we have on the decoder side could be added. But its not really the users
> job to do this. The demuxer should already return as much recoverable
> data as possible.

But it's not. It's returning NO data rather than what is in the file. It
is completely dropping all timestamps and making them up, based on nothing.
There is *not* better. It's not fixing or salvaging anything. It's just making
it broken in a different way. And it is not even doing it consistently. It is
doing it only when an arbitrary number is reached, so as an API user, I can't
even consistently handle these.

I gave *exact* options to choose from in my original mail, as a well as a sample
and a description of the problem - I am asking which should be chosen, or to
suggest a different one in clear, actionable terms.

- Derek
Michael Niedermayer March 11, 2021, 8:36 a.m. UTC | #6
On Wed, Mar 10, 2021 at 05:49:45PM +0000, Derek Buitenhuis wrote:
> On 10/03/2021 17:24, Michael Niedermayer wrote:
> > what does the muxer exactly do ?
> 
> I provided an explanation of what is happening during the broken muxing 
> in my original email, as well as a sample, and a text dump of the boxes.
> Please see those.

These are not enough to unambigously reverse engeneer the bug in the muxer
is it true for every output of the muxer, does it always happen at the
same position ?
is the runaway delta always 8 ?
does it always coincide with the 2nd entry of stts ?
is it a coincidence that the first and 2nd stts entries differ by 8 too ?


> 
> Can you please actually fully read my emails and samples before going on
> long tangents about theortical things or what could be?
> 
> > Let me give an example (totally made up but to explain my point)
> > if each ctts value for example has its file position added to it
> > that can be reverted and the original ctts fully recovered.
> > (also a user app cannot do this, only the demuxer could as the user app
> >  has no knowledge of the atoms file position)
> > 
> > Does it store uninintialized memory then it would not be revertable but
> > it did not sound like thats whats happening.
> 
> This example is neither useful nor relevant to the case at hand. I described
> what is happening, and provided a sample: mid-way through the CTTS box, the entry
> durations start incrementing, i.e. growing, rather than matching the reordering.
> Please see the sample I provided, and also the dump of the boxes as text I provided.
> 
> These theortical and vague replies are not actionable. They're bordering on philosophical
> musings.

I was trying to describe how I would go fixing this if i was working on it.
And thus how someone else could fix it as well.
What you seem to want is the result to a snippet of your "homework". Yes that would be
easier but if i knew what exactly to do then i would already have had to test
it and so would already have an implementation of it and would need to have
all samples and much more time. 


> 
> > about never modifying data, for that a system similar to workaround_bugs
> > we have on the decoder side could be added. But its not really the users
> > job to do this. The demuxer should already return as much recoverable
> > data as possible.
> 
> But it's not. It's returning NO data rather than what is in the file. It
> is completely dropping all timestamps and making them up, based on nothing.
> There is *not* better. It's not fixing or salvaging anything. It's just making
> it broken in a different way. And it is not even doing it consistently. It is
> doing it only when an arbitrary number is reached, 

yes but that code is unrelated, with or without it the ctts are trash for this file
this needs code prior to that test that fixes the ctts and then this would not
trigger anymore as the ctts would be corrected


> so as an API user, I can't
> even consistently handle these.

the API user should receive valid timestamps and not need to handle that.
it cannot be expected that every API user carries around workarounds for random
muxer bugs. That would be really alot of duplicated code


> 
> I gave *exact* options to choose from in my original mail, as a well as a sample
> and a description of the problem - I am asking which should be chosen, or to
> suggest a different one in clear, actionable terms.

I cant give an exact solution as it requires testing that solution against as many
samples as possible (requires both the samples and the time to do the work)
Maybe something like subtracting i*8 from entry 2+ works maybe the stts entry 1-2
time should be used instead of 8, maybe we need to take the first - last ctts to
get the slope to correct it.
then we could compare the normally read CTTS vs. the "corrected" CTTS and check which
is "flatter" as in maybe sum of abs diff from stts. with a strong bias toward the
normal. 
This needs to be implemented to know if it even works, its not unlikely that this
would require adjustments to work well, i cannot in 15min just from a text dump
tell how to best detect and correct this abberation. Maybe theres also some
metadata in the file that could be used to limit to which files to attempt to
apply this ... (i didnt see any but again i didnt look very hard)
The first goal must of course be not to break any correct files which such
compensation.

Thanks

[...]
Derek Buitenhuis March 11, 2021, 11:50 a.m. UTC | #7
On 11/03/2021 08:36, Michael Niedermayer wrote:
> These are not enough to unambigously reverse engeneer the bug in the muxer
> is it true for every output of the muxer, does it always happen at the
> same position ?
> is the runaway delta always 8 ?
> does it always coincide with the 2nd entry of stts ?
> is it a coincidence that the first and 2nd stts entries differ by 8 too ?

I think you and I have very different opinions on how this shuld be fixed. The files
are broken.

Nothing is consisent between hese files, and it is not a good use of anyones time
to try and reverse engineer an unknown muxer we don't have access to based on ses
of files, so that we can remove or change a a hack from years ago. What you are
asking for is ridiculous.

Even if some files showd a pattern, coding that into a demuxer is *wrong* and fragile
at best. It just leads to more confusion and more hacks down the road when some files
are broken in slightly different ways. It's adding fragile hacks on top of fragile hacks,
to work around a bug that an unknown muxer had once.

Frankly, I regret asking for input if this is the result.


> I was trying to describe how I would go fixing this if i was working on it.
> And thus how someone else could fix it as well.

This is not useful - I can do that till the cows come home,

> What you seem to want is the result to a snippet of your "homework". Yes that would be
> easier but if i knew what exactly to do then i would already have had to test
> it and so would already have an implementation of it and would need to have
> all samples and much more time. 

This is personally insulting for a few reasons:

1. You are implying the digging I've already done is not good enough or that I'm
   lazy for not followng your extremely vague 'well you could do this' email
   based on nto even checking anything. I did the digging that is possible with
   what is available to me, and to what I think is a good thechnical solution,
   and I've recieved no feedback on those except 'maybe do some more digging and
   add some hacks'. This is demotivating and insulting, and this crap is part of
   the reason I've distanced myself from this community.
2. 'all samples' do not exist. It is infuriating to imply I should reverse engineer
   some unknown muxer, with a limited set of samples, and add a hack for them, in
   order to justify a previous entirely arbitrary hack added for a single file.

To state out this outright: The commit this revers is *wrong* and *terrible* and
should never have been pushed in the first place. I was trying to be ammicable here
engage in discussion on how best to remove or change it, but all I got was the
'Bring Me A Rock, Bring Me A Different Rock' scenario. Your barrier to add the hack
was 0, my barrier to remove it is high.

> yes but that code is unrelated, with or without it the ctts are trash for this file
> this needs code prior to that test that fixes the ctts and then this would not
> trigger anymore as the ctts would be corrected

It does not *fix* anything. Even on the old file. It removes the ctts, but it does not *fix*
it. Making up timestamps where PTS==DTS for all timestas is not *fixing* anything.

What you actually mean to say is 'it made ffmpeg.c decode linearly', which is something enirely
differnet. It can't even seek. It only works linearly because of the parser. The DTS are entirely
broken in your 'fix'.
> the API user should receive valid timestamps and not need to handle that.
> it cannot be expected that every API user carries around workarounds for random
> muxer bugs. That would be really alot of duplicated code

This is a HARD disagree from me. A demuxe should not be 'fixing' timstamps. This is
an applciation level problem.

But even ignoring that, a demuxer should provide *consistent* timestamps for simila files,
not different based on some arbitrary number hack added in for a single file.

> I cant give an exact solution as it requires testing that solution against as many
> samples as possible (requires both the samples and the time to do the work)
> Maybe something like subtracting i*8 from entry 2+ works maybe the stts entry 1-2
> time should be used instead of 8, maybe we need to take the first - last ctts to
> get the slope to correct it.
> then we could compare the normally read CTTS vs. the "corrected" CTTS and check which
> is "flatter" as in maybe sum of abs diff from stts. with a strong bias toward the
> normal. 
> This needs to be implemented to know if it even works, its not unlikely that this
> would require adjustments to work well, i cannot in 15min just from a text dump
> tell how to best detect and correct this abberation. Maybe theres also some
> metadata in the file that could be used to limit to which files to attempt to
> apply this ... (i didnt see any but again i didnt look very hard)
> The first goal must of course be not to break any correct files which such
> compensation.

This is insane. Totally insane. I have no other words for this. 

Given that others on the list (Anton, James, etc.) are probably avoiding responding to this
crazy thread, I am asking for someone to come in an moderate this formally, be it by choice
in responding or by he recently approved technical committee.

We cannot continue with this ridiculousness.

I have CC'd j-b.

- Derek
Paul B Mahol March 11, 2021, 12:07 p.m. UTC | #8
On Thu, Mar 11, 2021 at 12:58 PM Derek Buitenhuis <
derek.buitenhuis@gmail.com> wrote:

> On 11/03/2021 08:36, Michael Niedermayer wrote:
> > These are not enough to unambigously reverse engeneer the bug in the
> muxer
> > is it true for every output of the muxer, does it always happen at the
> > same position ?
> > is the runaway delta always 8 ?
> > does it always coincide with the 2nd entry of stts ?
> > is it a coincidence that the first and 2nd stts entries differ by 8 too ?
>
> I think you and I have very different opinions on how this shuld be fixed.
> The files
> are broken.
>
> Nothing is consisent between hese files, and it is not a good use of
> anyones time
> to try and reverse engineer an unknown muxer we don't have access to based
> on ses
> of files, so that we can remove or change a a hack from years ago. What
> you are
> asking for is ridiculous.
>
> Even if some files showd a pattern, coding that into a demuxer is *wrong*
> and fragile
> at best. It just leads to more confusion and more hacks down the road when
> some files
> are broken in slightly different ways. It's adding fragile hacks on top of
> fragile hacks,
> to work around a bug that an unknown muxer had once.
>
> Frankly, I regret asking for input if this is the result.
>
>
> > I was trying to describe how I would go fixing this if i was working on
> it.
> > And thus how someone else could fix it as well.
>
> This is not useful - I can do that till the cows come home,
>
> > What you seem to want is the result to a snippet of your "homework". Yes
> that would be
> > easier but if i knew what exactly to do then i would already have had to
> test
> > it and so would already have an implementation of it and would need to
> have
> > all samples and much more time.
>
> This is personally insulting for a few reasons:
>
> 1. You are implying the digging I've already done is not good enough or
> that I'm
>    lazy for not followng your extremely vague 'well you could do this'
> email
>    based on nto even checking anything. I did the digging that is possible
> with
>    what is available to me, and to what I think is a good thechnical
> solution,
>    and I've recieved no feedback on those except 'maybe do some more
> digging and
>    add some hacks'. This is demotivating and insulting, and this crap is
> part of
>    the reason I've distanced myself from this community.
> 2. 'all samples' do not exist. It is infuriating to imply I should reverse
> engineer
>    some unknown muxer, with a limited set of samples, and add a hack for
> them, in
>    order to justify a previous entirely arbitrary hack added for a single
> file.
>
> To state out this outright: The commit this revers is *wrong* and
> *terrible* and
> should never have been pushed in the first place. I was trying to be
> ammicable here
> engage in discussion on how best to remove or change it, but all I got was
> the
> 'Bring Me A Rock, Bring Me A Different Rock' scenario. Your barrier to add
> the hack
> was 0, my barrier to remove it is high.
>
> > yes but that code is unrelated, with or without it the ctts are trash
> for this file
> > this needs code prior to that test that fixes the ctts and then this
> would not
> > trigger anymore as the ctts would be corrected
>
> It does not *fix* anything. Even on the old file. It removes the ctts, but
> it does not *fix*
> it. Making up timestamps where PTS==DTS for all timestas is not *fixing*
> anything.
>
> What you actually mean to say is 'it made ffmpeg.c decode linearly', which
> is something enirely
> differnet. It can't even seek. It only works linearly because of the
> parser. The DTS are entirely
> broken in your 'fix'.
> > the API user should receive valid timestamps and not need to handle that.
> > it cannot be expected that every API user carries around workarounds for
> random
> > muxer bugs. That would be really alot of duplicated code
>
> This is a HARD disagree from me. A demuxe should not be 'fixing'
> timstamps. This is
> an applciation level problem.
>
> But even ignoring that, a demuxer should provide *consistent* timestamps
> for simila files,
> not different based on some arbitrary number hack added in for a single
> file.
>
> > I cant give an exact solution as it requires testing that solution
> against as many
> > samples as possible (requires both the samples and the time to do the
> work)
> > Maybe something like subtracting i*8 from entry 2+ works maybe the stts
> entry 1-2
> > time should be used instead of 8, maybe we need to take the first - last
> ctts to
> > get the slope to correct it.
> > then we could compare the normally read CTTS vs. the "corrected" CTTS
> and check which
> > is "flatter" as in maybe sum of abs diff from stts. with a strong bias
> toward the
> > normal.
> > This needs to be implemented to know if it even works, its not unlikely
> that this
> > would require adjustments to work well, i cannot in 15min just from a
> text dump
> > tell how to best detect and correct this abberation. Maybe theres also
> some
> > metadata in the file that could be used to limit to which files to
> attempt to
> > apply this ... (i didnt see any but again i didnt look very hard)
> > The first goal must of course be not to break any correct files which
> such
> > compensation.
>
> This is insane. Totally insane. I have no other words for this.
>
> Given that others on the list (Anton, James, etc.) are probably avoiding
> responding to this
> crazy thread, I am asking for someone to come in an moderate this
> formally, be it by choice
> in responding or by he recently approved technical committee.
>
> We cannot continue with this ridiculousness.
>
> I have CC'd j-b.
>

IMHO, please just revert hackish commits if they do not follow
specifications.

No need for TC or similar.


>
> - Derek
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Michael Niedermayer March 11, 2021, 3:38 p.m. UTC | #9
On Thu, Mar 11, 2021 at 11:50:03AM +0000, Derek Buitenhuis wrote:
> On 11/03/2021 08:36, Michael Niedermayer wrote:
> > These are not enough to unambigously reverse engeneer the bug in the muxer
> > is it true for every output of the muxer, does it always happen at the
> > same position ?
> > is the runaway delta always 8 ?
> > does it always coincide with the 2nd entry of stts ?
> > is it a coincidence that the first and 2nd stts entries differ by 8 too ?
> 
> I think you and I have very different opinions on how this shuld be fixed. The files
> are broken.
> 
> Nothing is consisent between hese files, and it is not a good use of anyones time
> to try and reverse engineer an unknown muxer we don't have access to based on ses
> of files, so that we can remove or change a a hack from years ago. What you are
> asking for is ridiculous.
> 
> Even if some files showd a pattern, coding that into a demuxer is *wrong* and fragile
> at best. It just leads to more confusion and more hacks down the road when some files
> are broken in slightly different ways. It's adding fragile hacks on top of fragile hacks,
> to work around a bug that an unknown muxer had once.
> 
> Frankly, I regret asking for input if this is the result.
> 
> 
> > I was trying to describe how I would go fixing this if i was working on it.
> > And thus how someone else could fix it as well.
> 
> This is not useful - I can do that till the cows come home,
> 
> > What you seem to want is the result to a snippet of your "homework". Yes that would be
> > easier but if i knew what exactly to do then i would already have had to test
> > it and so would already have an implementation of it and would need to have
> > all samples and much more time. 
> 
> This is personally insulting for a few reasons:
> 
> 1. You are implying the digging I've already done is not good enough or that I'm
>    lazy for not followng your extremely vague 'well you could do this' email
>    based on nto even checking anything. I did the digging that is possible with
>    what is available to me, and to what I think is a good thechnical solution,
>    and I've recieved no feedback on those except 'maybe do some more digging and
>    add some hacks'. This is demotivating and insulting, and this crap is part of
>    the reason I've distanced myself from this community.
> 2. 'all samples' do not exist. It is infuriating to imply I should reverse engineer
>    some unknown muxer, with a limited set of samples, and add a hack for them, in
>    order to justify a previous entirely arbitrary hack added for a single file.
> 
> To state out this outright: The commit this revers is *wrong* and *terrible* and
> should never have been pushed in the first place. I was trying to be ammicable here
> engage in discussion on how best to remove or change it, but all I got was the
> 'Bring Me A Rock, Bring Me A Different Rock' scenario. Your barrier to add the hack
> was 0, my barrier to remove it is high.
> 
> > yes but that code is unrelated, with or without it the ctts are trash for this file
> > this needs code prior to that test that fixes the ctts and then this would not
> > trigger anymore as the ctts would be corrected
> 
> It does not *fix* anything. Even on the old file. It removes the ctts, but it does not *fix*
> it. Making up timestamps where PTS==DTS for all timestas is not *fixing* anything.
> 
> What you actually mean to say is 'it made ffmpeg.c decode linearly', which is something enirely
> differnet. It can't even seek. It only works linearly because of the parser. The DTS are entirely
> broken in your 'fix'.
> > the API user should receive valid timestamps and not need to handle that.
> > it cannot be expected that every API user carries around workarounds for random
> > muxer bugs. That would be really alot of duplicated code
> 
> This is a HARD disagree from me. A demuxe should not be 'fixing' timstamps. This is
> an applciation level problem.
> 
> But even ignoring that, a demuxer should provide *consistent* timestamps for simila files,
> not different based on some arbitrary number hack added in for a single file.
> 
> > I cant give an exact solution as it requires testing that solution against as many
> > samples as possible (requires both the samples and the time to do the work)
> > Maybe something like subtracting i*8 from entry 2+ works maybe the stts entry 1-2
> > time should be used instead of 8, maybe we need to take the first - last ctts to
> > get the slope to correct it.
> > then we could compare the normally read CTTS vs. the "corrected" CTTS and check which
> > is "flatter" as in maybe sum of abs diff from stts. with a strong bias toward the
> > normal. 
> > This needs to be implemented to know if it even works, its not unlikely that this
> > would require adjustments to work well, i cannot in 15min just from a text dump
> > tell how to best detect and correct this abberation. Maybe theres also some
> > metadata in the file that could be used to limit to which files to attempt to
> > apply this ... (i didnt see any but again i didnt look very hard)
> > The first goal must of course be not to break any correct files which such
> > compensation.
> 
> This is insane. Totally insane. I have no other words for this. 
> 
> Given that others on the list (Anton, James, etc.) are probably avoiding responding to this
> crazy thread, I am asking for someone to come in an moderate this formally, be it by choice
> in responding or by he recently approved technical committee.
> 
> We cannot continue with this ridiculousness.

You explicitly asked me to comment, and also stated that all options you listed suck:

"So, basically, all options suck, and I thought some people may have opinions on the least bad
 option, here - specifically Michael, as the author of the original patch."

I just wanted to help and so
I described a solution that will likely allow demuxing these files without
errors. Its more work and you would workaround the muxer bug in the demuxer.
Which is, lets call it unpopular.
You are opposed to that ? then just pretend i never replied.
What i wrote was there to help not to restrict or object to anything.

The fix for Ticket 385 is not nice and if that would be done better iam
all for it. But i dont think reverting that will magically fix these other
broken files. (Thats not an objection!) It just leaves it to the user
application to deal with the muxer bugs.


> 
> I have CC'd j-b.

Iam not sure jb is interrested in this but ill leave him in CC

Thanks

[...]
Derek Buitenhuis March 11, 2021, 3:57 p.m. UTC | #10
On 11/03/2021 15:38, Michael Niedermayer wrote:
> You explicitly asked me to comment, and also stated that all options you listed suck:

Indeed, I was hoping for an insight that wasn't even worse hacks, though, or
an opinion on which was Least Bad. Maybe I should have been more clear in
my intention.

> "So, basically, all options suck, and I thought some people may have opinions on the least bad
>  option, here - specifically Michael, as the author of the original patch."

All options will suck, because the files are broken. Maybe I didn't need to state it.

> I just wanted to help and so

I'm not disparaging that, apologies if it came out like that. I am frustrated.

> I described a solution that will likely allow demuxing these files without
> errors. Its more work and you would workaround the muxer bug in the demuxer.

Indeed, but I do not agree with this approach, as it is fragile at best, and I
do not think it belong in a demuxer, but the layer above.

> Which is, lets call it unpopular.

For me it is, at least. I was hoping others would chime in with opinions, too.

> You are opposed to that ? then just pretend i never replied.

I don't think adding hacks to a demuxer is the correct place for this, no.
Whether the rest of the community shares that opinion, I do not know.

> What i wrote was there to help not to restrict or object to anything.

I didn't read it that way...

> The fix for Ticket 385 is not nice and if that would be done better iam
> all for it. But i dont think reverting that will magically fix these other
> broken files. (Thats not an objection!) It just leaves it to the user
> application to deal with the muxer bugs.

Which is where I, personally, think it should be done. A demuxer should
demux, not mess around with what it demuxes. This kind of stuff belongs
in ffmpeg.c timestamp code, or generic timestamp code in avformat.

> Iam not sure jb is interrested in this but ill leave him in CC

You never know what j-b is interested in, he likes Go2Meeting, after
all.

- Derek
Anton Khirnov March 11, 2021, 6:01 p.m. UTC | #11
Quoting Derek Buitenhuis (2021-03-11 12:50:03)
> > the API user should receive valid timestamps and not need to handle that.
> > it cannot be expected that every API user carries around workarounds for random
> > muxer bugs. That would be really alot of duplicated code
> 
> This is a HARD disagree from me. A demuxe should not be 'fixing' timstamps. This is
> an applciation level problem.

I entirely agree with this. Libavformat should export what is stored in
the file as accurately as possible. It does not have enough information
about the context/use case to fix timestamps, nor whether fixing is even
required.

We can and eventually should provide optional code for timestamps
generation/manipulation, e.g. in the form of a bitstream filter which
would have access to accurate reordering information. But it should be
entirely separate from the demuxer.
Paul B Mahol March 11, 2021, 6:22 p.m. UTC | #12
On Thu, Mar 11, 2021 at 7:01 PM Anton Khirnov <anton@khirnov.net> wrote:

> Quoting Derek Buitenhuis (2021-03-11 12:50:03)
> > > the API user should receive valid timestamps and not need to handle
> that.
> > > it cannot be expected that every API user carries around workarounds
> for random
> > > muxer bugs. That would be really alot of duplicated code
> >
> > This is a HARD disagree from me. A demuxe should not be 'fixing'
> timstamps. This is
> > an applciation level problem.
>
> I entirely agree with this. Libavformat should export what is stored in
> the file as accurately as possible. It does not have enough information
> about the context/use case to fix timestamps, nor whether fixing is even
> required.
>
> We can and eventually should provide optional code for timestamps
> generation/manipulation, e.g. in the form of a bitstream filter which
> would have access to accurate reordering information. But it should be
> entirely separate from the demuxer.
>


There is already setts bitstream filter for this.



>
> --
> Anton Khirnov
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Michael Niedermayer March 11, 2021, 8:56 p.m. UTC | #13
On Thu, Mar 11, 2021 at 07:01:21PM +0100, Anton Khirnov wrote:
> Quoting Derek Buitenhuis (2021-03-11 12:50:03)
> > > the API user should receive valid timestamps and not need to handle that.
> > > it cannot be expected that every API user carries around workarounds for random
> > > muxer bugs. That would be really alot of duplicated code
> > 
> > This is a HARD disagree from me. A demuxe should not be 'fixing' timstamps. This is
> > an applciation level problem.
> 
> I entirely agree with this. Libavformat should export what is stored in
> the file as accurately as possible. It does not have enough information
> about the context/use case to fix timestamps, nor whether fixing is even
> required.
> 
> We can and eventually should provide optional code for timestamps
> generation/manipulation, e.g. in the form of a bitstream filter which
> would have access to accurate reordering information. But it should be
> entirely separate from the demuxer.

1. (doesnt apply for this case probably but can to other bugs)
you assume that what is stored can even be read and exported without 
modifying/fixing it in the process.
There certainly have been encoder/muxer bugs where this is not true

2. (also doesnt apply here but its by a hair only, if this was dts
instead of cts that is off, it would)
Timestamps can be used for deciding which packet to return next
from the demuxer. That is interleaving. If that is not fixed at the
demuxer level the output would be very hard to use, requiring potential
massive buffering to compensate

3. Timestamps are used for seeking. If the timestamps are all off then
not only would the demuxer output have to have its timestamps fixed
but also any seek requests would need to have them unfixed as the
demuxers internal timestamp list has not been fixed. And then there
is no real gurantee that a bug results in a list without duplicates
so seeking to a time 10 in a damaged list might not be unique. There
might be multiple 10 in there

4. There are other time related fields, like duration, start time and so on
if a +i*8 error is not removed close to where its read than it may
require some care so that demuxer interaction with duration 
and user interaction with duration behaves as intended. 

5. When dts and pts diverge massivly due to apparently a muxer bug as in
this case here. How do we ensure all applications handle this correctly ?
How does code after the demuxer be that a filter or the user app know
what is correct ?
At the level of the demuxer we can check for metadata and other hints
about the buggy muxer, we have access to all timestamps of all packets
of all streams. We can even seek to any point of the input file if we
need to to identify the buggy muxer. Then this information will tell us
if dts or pts is correct and what to do with the other.
after the demuxer a filter would generally have 1 packet at a time,
the first error it will see is a error of 8ms in the pts which then will
grow with each packet. Iam not so sure how one would detect this,
at the start this could be correct. just when the pts drift seconds away
from the dts it becomes clear that something is wrong.

So i dont know, but to me fixing this sort of bug at the demuxer level
seems alot easier. But iam absolutely not against any other way of fixing
this and timestamp (error) correction and bug workaround filters certainly
are very usefull too

all above are intended as suggestions and comments to help, nothing 
is intended as an objection or demand

Thanks

[...]
diff mbox series

Patch

diff --git a/libavformat/mov.c b/libavformat/mov.c
index 1c07cff6b5..256e7d376f 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -3079,13 +3079,6 @@  static int mov_read_ctts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
         av_log(c->fc, AV_LOG_TRACE, "count=%d, duration=%d\n",
                 count, duration);
 
-        if (FFNABS(duration) < -(1<<28) && i+2<entries) {
-            av_log(c->fc, AV_LOG_WARNING, "CTTS invalid\n");
-            av_freep(&sc->ctts_data);
-            sc->ctts_count = 0;
-            return 0;
-        }
-
         if (i+2<entries)
             mov_update_dts_shift(sc, duration, c->fc);
     }