diff mbox

[FFmpeg-devel,RFC] avformat/utils: always seek back after avformat_find_stream_info()

Message ID 20190410001448.30951-1-ffmpeg@tmm1.net
State New
Headers show

Commit Message

Aman Karmani April 10, 2019, 12:14 a.m. UTC
From: Aman Gupta <aman@tmm1.net>

Previously, the initial seek position was recorded into
old_offset at the beginning of avformat_find_stream_info(),
and passed into estimate_timings(). In the case of mpegts
with a known filesize, it was further passed into
estimate_timings_from_pts() which called avio_seek(SEEK_SET)
after doing its timing related seeks. (Interestingly, this
seeked all the way back to the initial position before
the probe, rather than only back before the timing
related seeks to the end of the file).

With this commit, we pull the avio_seek() out of the mpegts
specific code-path and unconditionally seek all streams back
to their original position after probing is complete.

This effectively prevents data that is consumed during
avformat_find_stream_info() from being discarded, so packets
contained at the beginning of a file are still passed back
to the user for playback.

Signed-off-by: Aman Gupta <aman@tmm1.net>
---
 libavformat/utils.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Hendrik Leppkes April 10, 2019, 4:40 a.m. UTC | #1
On Wed, Apr 10, 2019 at 2:21 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
>
> From: Aman Gupta <aman@tmm1.net>
>
> Previously, the initial seek position was recorded into
> old_offset at the beginning of avformat_find_stream_info(),
> and passed into estimate_timings(). In the case of mpegts
> with a known filesize, it was further passed into
> estimate_timings_from_pts() which called avio_seek(SEEK_SET)
> after doing its timing related seeks. (Interestingly, this
> seeked all the way back to the initial position before
> the probe, rather than only back before the timing
> related seeks to the end of the file).
>
> With this commit, we pull the avio_seek() out of the mpegts
> specific code-path and unconditionally seek all streams back
> to their original position after probing is complete.
>
> This effectively prevents data that is consumed during
> avformat_find_stream_info() from being discarded, so packets
> contained at the beginning of a file are still passed back
> to the user for playback.
>

I don't think I ever had a case where data was apparently lost from
the beginning of the stream. Can you give examples of what this fixes?

- Hendrik
Aman Karmani April 10, 2019, 7:30 a.m. UTC | #2
On Tue, Apr 9, 2019 at 9:49 PM Hendrik Leppkes <h.leppkes@gmail.com> wrote:

> On Wed, Apr 10, 2019 at 2:21 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
> >
> > From: Aman Gupta <aman@tmm1.net>
> >
> > Previously, the initial seek position was recorded into
> > old_offset at the beginning of avformat_find_stream_info(),
> > and passed into estimate_timings(). In the case of mpegts
> > with a known filesize, it was further passed into
> > estimate_timings_from_pts() which called avio_seek(SEEK_SET)
> > after doing its timing related seeks. (Interestingly, this
> > seeked all the way back to the initial position before
> > the probe, rather than only back before the timing
> > related seeks to the end of the file).
> >
> > With this commit, we pull the avio_seek() out of the mpegts
> > specific code-path and unconditionally seek all streams back
> > to their original position after probing is complete.
> >
> > This effectively prevents data that is consumed during
> > avformat_find_stream_info() from being discarded, so packets
> > contained at the beginning of a file are still passed back
> > to the user for playback.
> >
>
> I don't think I ever had a case where data was apparently lost from
> the beginning of the stream. Can you give examples of what this fixes?


Perhaps I'm missing something.

If I see the debug log message "After avformat_find_stream_info() pos:
1000000 bytes  seeks:0", are the packets from the first megabyte of the
file still returned from API? Maybe they are being cached somewhere and I
did not notice.

This patch was based mostly on a reading of the code. I will run some more
detailed tests and provide an example/test of the new expected behavior.

Aman


>
> - Hendrik
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Hendrik Leppkes April 10, 2019, 7:59 a.m. UTC | #3
On Wed, Apr 10, 2019 at 9:55 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
>
> On Tue, Apr 9, 2019 at 9:49 PM Hendrik Leppkes <h.leppkes@gmail.com> wrote:
>
> > On Wed, Apr 10, 2019 at 2:21 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
> > >
> > > From: Aman Gupta <aman@tmm1.net>
> > >
> > > Previously, the initial seek position was recorded into
> > > old_offset at the beginning of avformat_find_stream_info(),
> > > and passed into estimate_timings(). In the case of mpegts
> > > with a known filesize, it was further passed into
> > > estimate_timings_from_pts() which called avio_seek(SEEK_SET)
> > > after doing its timing related seeks. (Interestingly, this
> > > seeked all the way back to the initial position before
> > > the probe, rather than only back before the timing
> > > related seeks to the end of the file).
> > >
> > > With this commit, we pull the avio_seek() out of the mpegts
> > > specific code-path and unconditionally seek all streams back
> > > to their original position after probing is complete.
> > >
> > > This effectively prevents data that is consumed during
> > > avformat_find_stream_info() from being discarded, so packets
> > > contained at the beginning of a file are still passed back
> > > to the user for playback.
> > >
> >
> > I don't think I ever had a case where data was apparently lost from
> > the beginning of the stream. Can you give examples of what this fixes?
>
>
> Perhaps I'm missing something.
>
> If I see the debug log message "After avformat_find_stream_info() pos:
> 1000000 bytes  seeks:0", are the packets from the first megabyte of the
> file still returned from API? Maybe they are being cached somewhere and I
> did not notice.
>

As far as I know, the packets read during stream info probing are
being cached in a buffer and returned to the caller on av_read_frame.

- Hendrik
Aman Karmani April 10, 2019, 8:16 a.m. UTC | #4
On Wed, Apr 10, 2019 at 1:00 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote:

> On Wed, Apr 10, 2019 at 9:55 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
> >
> > On Tue, Apr 9, 2019 at 9:49 PM Hendrik Leppkes <h.leppkes@gmail.com>
> wrote:
> >
> > > On Wed, Apr 10, 2019 at 2:21 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
> > > >
> > > > From: Aman Gupta <aman@tmm1.net>
> > > >
> > > > Previously, the initial seek position was recorded into
> > > > old_offset at the beginning of avformat_find_stream_info(),
> > > > and passed into estimate_timings(). In the case of mpegts
> > > > with a known filesize, it was further passed into
> > > > estimate_timings_from_pts() which called avio_seek(SEEK_SET)
> > > > after doing its timing related seeks. (Interestingly, this
> > > > seeked all the way back to the initial position before
> > > > the probe, rather than only back before the timing
> > > > related seeks to the end of the file).
> > > >
> > > > With this commit, we pull the avio_seek() out of the mpegts
> > > > specific code-path and unconditionally seek all streams back
> > > > to their original position after probing is complete.
> > > >
> > > > This effectively prevents data that is consumed during
> > > > avformat_find_stream_info() from being discarded, so packets
> > > > contained at the beginning of a file are still passed back
> > > > to the user for playback.
> > > >
> > >
> > > I don't think I ever had a case where data was apparently lost from
> > > the beginning of the stream. Can you give examples of what this fixes?
> >
> >
> > Perhaps I'm missing something.
> >
> > If I see the debug log message "After avformat_find_stream_info() pos:
> > 1000000 bytes  seeks:0", are the packets from the first megabyte of the
> > file still returned from API? Maybe they are being cached somewhere and I
> > did not notice.
> >
>
> As far as I know, the packets read during stream info probing are
> being cached in a buffer and returned to the caller on av_read_frame.


Okay that seems to be the missing piece. If someone can point me to where
this buffer is in the code, that would be helpful.

In that case, it makes sense the only seek happens in the mpegts
estimate_by_pts case since that seeks all the way to the end. I'm still not
sure why it seeks back to the beginning (before probe start) position
though, and why that wouldn't cause packets to get doubled up.

Aman


>
> - Hendrik
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Diego Felix de Souza via ffmpeg-devel April 10, 2019, 9:57 a.m. UTC | #5
Aman Gupta:
> On Wed, Apr 10, 2019 at 1:00 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote:
> 
>> On Wed, Apr 10, 2019 at 9:55 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
>>>
>>> On Tue, Apr 9, 2019 at 9:49 PM Hendrik Leppkes <h.leppkes@gmail.com>
>> wrote:
>>>
>>>> On Wed, Apr 10, 2019 at 2:21 AM Aman Gupta <ffmpeg@tmm1.net> wrote:
>>>>>
>>>>> From: Aman Gupta <aman@tmm1.net>
>>>>>
>>>>> Previously, the initial seek position was recorded into
>>>>> old_offset at the beginning of avformat_find_stream_info(),
>>>>> and passed into estimate_timings(). In the case of mpegts
>>>>> with a known filesize, it was further passed into
>>>>> estimate_timings_from_pts() which called avio_seek(SEEK_SET)
>>>>> after doing its timing related seeks. (Interestingly, this
>>>>> seeked all the way back to the initial position before
>>>>> the probe, rather than only back before the timing
>>>>> related seeks to the end of the file).
>>>>>
>>>>> With this commit, we pull the avio_seek() out of the mpegts
>>>>> specific code-path and unconditionally seek all streams back
>>>>> to their original position after probing is complete.
>>>>>
>>>>> This effectively prevents data that is consumed during
>>>>> avformat_find_stream_info() from being discarded, so packets
>>>>> contained at the beginning of a file are still passed back
>>>>> to the user for playback.
>>>>>
>>>>
>>>> I don't think I ever had a case where data was apparently lost from
>>>> the beginning of the stream. Can you give examples of what this fixes?
>>>
>>>
>>> Perhaps I'm missing something.
>>>
>>> If I see the debug log message "After avformat_find_stream_info() pos:
>>> 1000000 bytes  seeks:0", are the packets from the first megabyte of the
>>> file still returned from API? Maybe they are being cached somewhere and I
>>> did not notice.
>>>
>>
>> As far as I know, the packets read during stream info probing are
>> being cached in a buffer and returned to the caller on av_read_frame.
> 
> 
> Okay that seems to be the missing piece. If someone can point me to where
> this buffer is in the code, that would be helpful.
> 
The packet queues are in the AVFormatInternal struct and the packets
are put there via ff_packet_list_put. The most interesting thing for
you might be ff_read_packet.

- Andreas
diff mbox

Patch

diff --git a/libavformat/utils.c b/libavformat/utils.c
index 9b3f0d28e6..94d166869a 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -2786,7 +2786,7 @@  static void estimate_timings_from_bit_rate(AVFormatContext *ic)
 #define DURATION_MAX_RETRY 6
 
 /* only usable for MPEG-PS streams */
-static void estimate_timings_from_pts(AVFormatContext *ic, int64_t old_offset)
+static void estimate_timings_from_pts(AVFormatContext *ic)
 {
     AVPacket pkt1, *pkt = &pkt1;
     AVStream *st;
@@ -2905,7 +2905,6 @@  static void estimate_timings_from_pts(AVFormatContext *ic, int64_t old_offset)
 skip_duration_calc:
     fill_all_stream_timings(ic);
 
-    avio_seek(ic->pb, old_offset, SEEK_SET);
     for (i = 0; i < ic->nb_streams; i++) {
         int j;
 
@@ -2918,7 +2917,7 @@  skip_duration_calc:
     }
 }
 
-static void estimate_timings(AVFormatContext *ic, int64_t old_offset)
+static void estimate_timings(AVFormatContext *ic)
 {
     int64_t file_size;
 
@@ -2934,7 +2933,7 @@  static void estimate_timings(AVFormatContext *ic, int64_t old_offset)
          !strcmp(ic->iformat->name, "mpegts")) &&
         file_size && (ic->pb->seekable & AVIO_SEEKABLE_NORMAL)) {
         /* get accurate estimate from the PTSes */
-        estimate_timings_from_pts(ic, old_offset);
+        estimate_timings_from_pts(ic);
         ic->duration_estimation_method = AVFMT_DURATION_FROM_PTS;
     } else if (has_duration(ic)) {
         /* at least one component has timings - we use them for all
@@ -4053,7 +4052,9 @@  FF_ENABLE_DEPRECATION_WARNINGS
     }
 
     if (probesize)
-        estimate_timings(ic, old_offset);
+        estimate_timings(ic);
+    if ((ic->pb->seekable & AVIO_SEEKABLE_NORMAL))
+        avio_seek(ic->pb, old_offset, SEEK_SET);
 
     av_opt_set(ic, "skip_clear", "0", AV_OPT_SEARCH_CHILDREN);