[FFmpeg-devel,v5,1/1] avformat/demux: Add duration_probesize AVOption

Message ID 20240328175736.161733-2-nicolas.gaullier@cji.paris
State New
Series avformat/demux: Add duration_probesize AVOption

Checks

Context                          Check    Description
yinshiyou/make_loongarch64       success  Make finished
yinshiyou/make_fate_loongarch64  success  Make fate finished
andriy/make_x86                  success  Make finished
andriy/make_fate_x86             success  Make fate finished

Commit Message

Nicolas Gaullier March 28, 2024, 5:57 p.m. UTC
Add yet another probesize, used to get the durations when
estimate_timings_from_pts is required. It is aimed at users who want
better duration probing for its own sake, or who call
avformat_find_stream_info indirectly and need exact values: concatdec,
for example, especially when streamcopying on top of it.
The current code is a performance trade-off that can fail to get video
stream durations in a scenario with high bitrates and buffering for
files ending cleanly (as opposed to live captures): the physical gap
between the last video packet and the last audio packet is very high in
such a case.

Default behaviour is unchanged: 250k up to 250k << 6 (step by step).
Setting this new option has two effects:
- override the maximum probesize (currently 250k << 6)
- reduce the number of steps to 1 instead of 6, this is to avoid
detecting the audio "too early" and failing to reach a video packet.
Even if a single audio stream duration is found but not the other
audio/video stream durations, there will be a retry, so at the end the
full user-overridden probesize will be used as expected by the user.

Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
---
 doc/APIchanges              |  3 +++
 doc/formats.texi            | 19 ++++++++++++++++++-
 libavformat/avformat.h      | 16 ++++++++++++++--
 libavformat/demux.c         | 13 ++++++++-----
 libavformat/options_table.h |  1 +
 libavformat/version.h       |  2 +-
 6 files changed, 45 insertions(+), 9 deletions(-)
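
The probing schedule described in the commit message can be sketched in isolation. The block below is only an illustration: it copies the three macros and the offset/read-limit expressions from the demux.c hunk, while ProbeStep, probe_max_retry() and probe_step() are hypothetical names that do not exist in libavformat. It does not model packet reading or the early-exit conditions of the real loop.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the patched demux.c. */
#define DURATION_DEFAULT_MAX_READ_SIZE 250000LL
#define DURATION_DEFAULT_MAX_RETRY 6
#define DURATION_MAX_RETRY 1

/* Hypothetical helper type: one probing step at the end of the file. */
typedef struct ProbeStep {
    int64_t offset;     /* position the step seeks to */
    int64_t read_limit; /* read_size threshold that ends the step */
} ProbeStep;

/* Highest retry index allowed for an option value (0 selects the default). */
static int probe_max_retry(int64_t duration_probesize)
{
    return duration_probesize ? DURATION_MAX_RETRY : DURATION_DEFAULT_MAX_RETRY;
}

/* Seek offset and read limit for one retry, mirroring the patched expressions. */
static ProbeStep probe_step(int64_t filesize, int64_t duration_probesize, int retry)
{
    int64_t base = duration_probesize ? duration_probesize >> DURATION_MAX_RETRY
                                      : DURATION_DEFAULT_MAX_READ_SIZE;
    ProbeStep step;

    assert(retry >= 0 && retry <= probe_max_retry(duration_probesize));

    step.offset = filesize - (base << retry);
    if (step.offset < 0)
        step.offset = 0; /* small files are probed from the start */
    step.read_limit = base << (retry > 0 ? retry - 1 : 0);
    return step;
}
```

For a 100000000-byte file, probe_step(100000000, 0, 6) seeks 16000000 bytes before the end (250000 << 6), which is the default ceiling. With duration_probesize set to 4000000, only retries 0 and 1 exist: they seek 2000000 and 4000000 bytes before the end, so the second pass covers the full user-supplied size.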

Comments

Stefano Sabatini March 28, 2024, 7:11 p.m. UTC | #1
On date Thursday 2024-03-28 18:57:36 +0100, Nicolas Gaullier wrote:
> Add yet another probesize, used to get the durations when
> estimate_timings_from_pts is required. It is aimed at users who want
> better duration probing for its own sake, or who call
> avformat_find_stream_info indirectly and need exact values: concatdec,
> for example, especially when streamcopying on top of it.
> The current code is a performance trade-off that can fail to get video
> stream durations in a scenario with high bitrates and buffering for
> files ending cleanly (as opposed to live captures): the physical gap
> between the last video packet and the last audio packet is very high in
> such a case.
> 
> Default behaviour is unchanged: 250k up to 250k << 6 (step by step).
> Setting this new option has two effects:
> - override the maximum probesize (currently 250k << 6)
> - reduce the number of steps to 1 instead of 6, this is to avoid
> detecting the audio "too early" and failing to reach a video packet.
> Even if a single audio stream duration is found but not the other
> audio/video stream durations, there will be a retry, so at the end the
> full user-overridden probesize will be used as expected by the user.
> 
> Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
> ---
>  doc/APIchanges              |  3 +++
>  doc/formats.texi            | 19 ++++++++++++++++++-
>  libavformat/avformat.h      | 16 ++++++++++++++--
>  libavformat/demux.c         | 13 ++++++++-----
>  libavformat/options_table.h |  1 +
>  libavformat/version.h       |  2 +-
>  6 files changed, 45 insertions(+), 9 deletions(-)
> 
> diff --git a/doc/APIchanges b/doc/APIchanges
> index aa102b4925..f709db536d 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
>  
>  API changes, most recent first:
>  
> +2024-03-28 - xxxxxxxxxx - lavf 61.3.100 - avformat.h
> +  Add AVFormatContext.duration_probesize.
> +
>  2024-03-27 - xxxxxxxxxx - lavu 59.10.100 - frame.h
>    Add AVSideDataDescriptor, enum AVSideDataProps, and
>    av_frame_side_data_desc().
> diff --git a/doc/formats.texi b/doc/formats.texi
> index 69fc1457a4..3fe7fa9d8d 100644
> --- a/doc/formats.texi
> +++ b/doc/formats.texi
> @@ -225,9 +225,26 @@ Specifies the maximum number of streams. This can be used to reject files that
>  would require too many resources due to a large number of streams.
>  
>  @item skip_estimate_duration_from_pts @var{bool} (@emph{input})
> -Skip estimation of input duration when calculated using PTS.
> +Skip estimation of input duration if it requires an additional probing for PTS at end of file.
>  At present, applicable for MPEG-PS and MPEG-TS.
>  
> +@item duration_probesize @var{integer} (@emph{input})
> +Set probing size, in bytes, for input duration estimation when it actually requires
> +an additional probing for PTS at end of file (at present: MPEG-PS and MPEG-TS).
> +It is aimed at users interested in better durations probing for itself, or indirectly
> +because using the concat demuxer, for example.

> +The typical use case is an MPEG-TS CBR with a high bitrate, high video buffering and
> +ending cleanly with similar PTS for video and audio: in such a scenario, the large
> +physical gap between the last video packet and the last audio packet makes it necessary
> +to read many bytes in order to get the video stream duration.
> +Another use case is where the default probing behaviour only reaches a single video frame which is
> +not the last one of the stream due to frame reordering, so the duration is not accurate.


> +Setting the duration_probesize has a performance impact even for small files because the probing
> +size is fixed.

nit++:
setting the @option{duration_probesize}
or ...
setting this option

> +Default behaviour is a general purpose trade-off, largely adaptive: the probing size may range from
> +250000 up to 16M, but it is not extended to get streams durations at all costs.

I'm a bit concerned about whether we should really mention these values:
they are currently hardcoded, so the documentation might become
inconsistent if they are updated (we could probably just mention that it
is adaptive, and so avoid exposing the internal thresholds).

> +Must be an integer not less than 1, or 0 for default behaviour.
> +

[...]

Looks good to me otherwise, thanks.
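
The "typical use case" in the documentation text (audio found early, video packet out of reach) can be illustrated with a toy model. This is a sketch under stated assumptions, not the libavformat code: TailStream and probe_tail() are hypothetical, a stream counts as "found" as soon as its last packet lies inside the bytes covered from the end of the file, and the stop rule (one more pass after the first hit, as the commit message describes) is an assumption about the part of the loop that the patch does not show.

```c
#include <stdint.h>

/* Hypothetical simplified stream: distance of its last packet from EOF. */
typedef struct TailStream {
    int64_t last_pkt_distance; /* bytes before end of file */
    int found;                 /* set once a duration could be derived */
} TailStream;

/* Toy model of the retry loop; returns how many stream durations were found. */
static int probe_tail(int64_t filesize, int64_t duration_probesize,
                      TailStream *streams, int nb_streams)
{
    /* Same parameter selection as the patch: 0 keeps the default heuristic. */
    int64_t base  = duration_probesize ? duration_probesize >> 1 : 250000LL;
    int max_retry = duration_probesize ? 1 : 6;
    int found_any = 0, retry = 0, is_end, nb_found, i;
    int64_t offset;

    do {
        /* Assumption: a hit in an earlier pass makes this pass the last one. */
        is_end = found_any;
        offset = filesize - (base << retry);
        if (offset < 0)
            offset = 0;

        nb_found = 0;
        for (i = 0; i < nb_streams; i++) {
            /* Bytes covered so far, counted back from the end of the file. */
            if (streams[i].last_pkt_distance <= filesize - offset) {
                streams[i].found = 1;
                found_any = 1;
            }
            nb_found += streams[i].found;
        }
        if (!is_end)
            is_end = nb_found == nb_streams;
    } while (!is_end && offset && ++retry <= max_retry);

    return nb_found;
}
```

In this model, with the last audio packet 100000 bytes and the last video packet 3000000 bytes before the end of a 100000000-byte file, the default schedule finds the audio in the first 250000 bytes, does one more pass covering 500000 bytes, and stops without the video duration. With duration_probesize set to 4000000, the second pass covers 4000000 bytes and both durations are found.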

Patch

diff --git a/doc/APIchanges b/doc/APIchanges
index aa102b4925..f709db536d 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@  The last version increases of all libraries were on 2024-03-07
 
 API changes, most recent first:
 
+2024-03-28 - xxxxxxxxxx - lavf 61.3.100 - avformat.h
+  Add AVFormatContext.duration_probesize.
+
 2024-03-27 - xxxxxxxxxx - lavu 59.10.100 - frame.h
   Add AVSideDataDescriptor, enum AVSideDataProps, and
   av_frame_side_data_desc().
diff --git a/doc/formats.texi b/doc/formats.texi
index 69fc1457a4..3fe7fa9d8d 100644
--- a/doc/formats.texi
+++ b/doc/formats.texi
@@ -225,9 +225,26 @@  Specifies the maximum number of streams. This can be used to reject files that
 would require too many resources due to a large number of streams.
 
 @item skip_estimate_duration_from_pts @var{bool} (@emph{input})
-Skip estimation of input duration when calculated using PTS.
+Skip estimation of input duration if it requires an additional probing for PTS at end of file.
 At present, applicable for MPEG-PS and MPEG-TS.
 
+@item duration_probesize @var{integer} (@emph{input})
+Set probing size, in bytes, for input duration estimation when it actually requires
+an additional probing for PTS at end of file (at present: MPEG-PS and MPEG-TS).
+It is aimed at users interested in better durations probing for itself, or indirectly
+because using the concat demuxer, for example.
+The typical use case is an MPEG-TS CBR with a high bitrate, high video buffering and
+ending cleanly with similar PTS for video and audio: in such a scenario, the large
+physical gap between the last video packet and the last audio packet makes it necessary
+to read many bytes in order to get the video stream duration.
+Another use case is where the default probing behaviour only reaches a single video frame which is
+not the last one of the stream due to frame reordering, so the duration is not accurate.
+Setting the duration_probesize has a performance impact even for small files because the probing
+size is fixed.
+Default behaviour is a general purpose trade-off, largely adaptive: the probing size may range from
+250000 up to 16M, but it is not extended to get streams durations at all costs.
+Must be an integer not less than 1, or 0 for default behaviour.
+
 @item strict, f_strict @var{integer} (@emph{input/output})
 Specify how strictly to follow the standards. @code{f_strict} is deprecated and
 should be used only via the @command{ffmpeg} tool.
diff --git a/libavformat/avformat.h b/libavformat/avformat.h
index de40397676..8afdcd9fd0 100644
--- a/libavformat/avformat.h
+++ b/libavformat/avformat.h
@@ -1439,7 +1439,7 @@  typedef struct AVFormatContext {
      *
      * @note this is \e not  used for determining the \ref AVInputFormat
      *       "input format"
-     * @sa format_probesize
+     * @see format_probesize
      */
     int64_t probesize;
 
@@ -1667,6 +1667,8 @@  typedef struct AVFormatContext {
      * Skip duration calcuation in estimate_timings_from_pts.
      * - encoding: unused
      * - decoding: set by user
+     *
+     * @see duration_probesize
      */
     int skip_estimate_duration_from_pts;
 
@@ -1729,7 +1731,7 @@  typedef struct AVFormatContext {
      *
      * Demuxing only, set by the caller before avformat_open_input().
      *
-     * @sa probesize
+     * @see probesize
      */
     int format_probesize;
 
@@ -1870,6 +1872,16 @@  typedef struct AVFormatContext {
      * @return 0 on success, a negative AVERROR code on failure
      */
     int (*io_close2)(struct AVFormatContext *s, AVIOContext *pb);
+
+    /**
+     * Maximum number of bytes read from input in order to determine stream durations
+     * when using estimate_timings_from_pts in avformat_find_stream_info().
+     * Demuxing only, set by the caller before avformat_find_stream_info().
+     * Can be set to 0 to let avformat choose using a heuristic.
+     *
+     * @see skip_estimate_duration_from_pts
+     */
+    int64_t duration_probesize;
 } AVFormatContext;
 
 /**
diff --git a/libavformat/demux.c b/libavformat/demux.c
index 147f3b93ac..cc40a8ca46 100644
--- a/libavformat/demux.c
+++ b/libavformat/demux.c
@@ -1803,8 +1803,9 @@  static void estimate_timings_from_bit_rate(AVFormatContext *ic)
                "Estimating duration from bitrate, this may be inaccurate\n");
 }
 
-#define DURATION_MAX_READ_SIZE 250000LL
-#define DURATION_MAX_RETRY 6
+#define DURATION_DEFAULT_MAX_READ_SIZE 250000LL
+#define DURATION_DEFAULT_MAX_RETRY 6
+#define DURATION_MAX_RETRY 1
 
 /* only usable for MPEG-PS streams */
 static void estimate_timings_from_pts(AVFormatContext *ic, int64_t old_offset)
@@ -1812,6 +1813,8 @@  static void estimate_timings_from_pts(AVFormatContext *ic, int64_t old_offset)
     FFFormatContext *const si = ffformatcontext(ic);
     AVPacket *const pkt = si->pkt;
     int num, den, read_size, ret;
+    int64_t duration_max_read_size = ic->duration_probesize ? ic->duration_probesize >> DURATION_MAX_RETRY : DURATION_DEFAULT_MAX_READ_SIZE;
+    int duration_max_retry = ic->duration_probesize ? DURATION_MAX_RETRY : DURATION_DEFAULT_MAX_RETRY;
     int found_duration = 0;
     int is_end;
     int64_t filesize, offset, duration;
@@ -1847,7 +1850,7 @@  static void estimate_timings_from_pts(AVFormatContext *ic, int64_t old_offset)
     filesize = ic->pb ? avio_size(ic->pb) : 0;
     do {
         is_end = found_duration;
-        offset = filesize - (DURATION_MAX_READ_SIZE << retry);
+        offset = filesize - (duration_max_read_size << retry);
         if (offset < 0)
             offset = 0;
 
@@ -1856,7 +1859,7 @@  static void estimate_timings_from_pts(AVFormatContext *ic, int64_t old_offset)
         for (;;) {
             AVStream *st;
             FFStream *sti;
-            if (read_size >= DURATION_MAX_READ_SIZE << (FFMAX(retry - 1, 0)))
+            if (read_size >= duration_max_read_size << (FFMAX(retry - 1, 0)))
                 break;
 
             do {
@@ -1910,7 +1913,7 @@  static void estimate_timings_from_pts(AVFormatContext *ic, int64_t old_offset)
         }
     } while (!is_end &&
              offset &&
-             ++retry <= DURATION_MAX_RETRY);
+             ++retry <= duration_max_retry);
 
     av_opt_set_int(ic, "skip_changes", 0, AV_OPT_SEARCH_CHILDREN);
 
diff --git a/libavformat/options_table.h b/libavformat/options_table.h
index b9dca147f9..311880d24d 100644
--- a/libavformat/options_table.h
+++ b/libavformat/options_table.h
@@ -108,6 +108,7 @@  static const AVOption avformat_options[] = {
 {"max_streams", "maximum number of streams", OFFSET(max_streams), AV_OPT_TYPE_INT, { .i64 = 1000 }, 0, INT_MAX, D },
 {"skip_estimate_duration_from_pts", "skip duration calculation in estimate_timings_from_pts", OFFSET(skip_estimate_duration_from_pts), AV_OPT_TYPE_BOOL, {.i64 = 0}, 0, 1, D},
 {"max_probe_packets", "Maximum number of packets to probe a codec", OFFSET(max_probe_packets), AV_OPT_TYPE_INT, { .i64 = 2500 }, 0, INT_MAX, D },
+{"duration_probesize", "Maximum number of bytes to probe the durations of the streams in estimate_timings_from_pts", OFFSET(duration_probesize), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, INT64_MAX, D},
 {NULL},
 };
 
diff --git a/libavformat/version.h b/libavformat/version.h
index 904e7f06aa..7ff1483912 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -31,7 +31,7 @@ 
 
 #include "version_major.h"
 
-#define LIBAVFORMAT_VERSION_MINOR   2
+#define LIBAVFORMAT_VERSION_MINOR   3
 #define LIBAVFORMAT_VERSION_MICRO 100
 
 #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR, \