diff mbox

[FFmpeg-devel] libavutil/timestamp.h: Fix loss of precision in timestamps for silencedetect on long files

Message ID e27f4d28-36d4-35a1-5ff9-d21e351107e2@yahoo.com
State New
Headers show

Commit Message

Thilo Borgmann via ffmpeg-devel April 7, 2019, 7:47 a.m. UTC
[Second try submitting to the list. This patch now passes fate.]

When the silencedetect filter is run against long files, the output
timestamps gradually lose precision as the scan proceeds further into
the file. This is because the output is formatted (in
libavutil/timestamp.h) as "%.6g", which limits the total field
length. Eventually, for offsets greater than 100000 seconds (about 28
hours), fractions of a second disappear altogether, and the
timestamps are logged as whole seconds. This is insufficient
precision for my purposes. I propose changing the format to "%.6f",
which will give microsecond precision (probably overkill but safe)
for all timestamps regardless of offset.

The timestamp string length is limited to 32 characters
(AV_TS_MAX_STRING_SIZE), so this should still be plenty long enough
with the increased length (up to 10^25 seconds).

My interest is in fixing this problem for silencedetect, which
formats the timestamps by calling the macro av_ts2timestr, defined in
timestamp.h. Since av_ts2timestr is also used in many other places (I
count 21 c files), I have created a new macro, av_ts2timestr_format,
with a format string added as a parameter, and left the original
macro interface as is for other usages, to limit the scope of this
change. The same or similar change could be made for other cases
where better precision is desired.
From 5492506534bf863cbf1ee8f09a5e59b4ee111226 Mon Sep 17 00:00:00 2001
From: Allan Cady <allancady@yahoo.com>
Date: Sun, 7 Apr 2019 00:07:58 -0700
Subject: [PATCH] libavutil/timestamp.h: Fix loss of precision in timestamps
 for silencedetect on long files

When the silencedetect filter is run against long files, the output
timestamps gradually lose precision as the scan proceeds further into
the file. This is because the output is formatted (in
libavutil/timestamp.h) as "%.6g", which limits the total field
length. Eventually, for offsets greater than 100000 seconds (about 28
hours), fractions of a second disappear altogether, and the
timestamps are logged as whole seconds. This is insufficient
precision for my purposes. I propose changing the format to "%.6f",
which will give microsecond precision (probably overkill but safe)
for all timestamps regardless of offset.

The timestamp string length is limited to 32 characters
(AV_TS_MAX_STRING_SIZE), so this should still be plenty long enough
with the increased length (up to 10^25 seconds).

My interest is in fixing this problem for silencedetect, which
formats the timestamps by calling the macro av_ts2timestr, defined in
timestamp.h. Since av_ts2timestr is also used in many other places (I
count 21 c files), I have created a new macro, av_ts2timestr_format,
with a format string added as a parameter, and left the original
macro interface as is for other usages, to limit the scope of this
change. The same or similar change could be made for other cases
where better precision is desired.
---
 libavfilter/af_silencedetect.c               | 14 ++++++++------
 libavutil/timestamp.h                        | 13 ++++++++-----
 tests/ref/fate/filter-metadata-silencedetect |  2 +-
 3 files changed, 17 insertions(+), 12 deletions(-)

Comments

James Almer April 7, 2019, 1:45 p.m. UTC | #1
On 4/7/2019 4:47 AM, Allan Cady via ffmpeg-devel wrote:
> [Second try submitting to the list. This patch now passes fate.]
> 
> When the silencedetect filter is run against long files, the output
> timestamps gradually lose precision as the scan proceeds further into
> the file. This is because the output is formatted (in
> libavutil/timestamp.h) as "%.6g", which limits the total field
> length. Eventually, for offsets greater than 100000 seconds (about 28
> hours), fractions of a second disappear altogether, and the
> timestamps are logged as whole seconds. This is insufficient
> precision for my purposes. I propose changing the format to "%.6f",
> which will give microsecond precision (probably overkill but safe)
> for all timestamps regardless of offset.
> 
> The timestamp string length is limited to 32 characters
> (AV_TS_MAX_STRING_SIZE), so this should still be plenty long enough
> with the increased length (up to 10^25 seconds).
> 
> My interest is in fixing this problem for silencedetect, which
> formats the timestamps by calling the macro av_ts2timestr, defined in
> timestamp.h. Since av_ts2timestr is also used in many other places (I
> count 21 c files), I have created a new macro, av_ts2timestr_format,
> with a format string added as a parameter, and left the original
> macro interface as is for other usages, to limit the scope of this
> change. The same or similar change could be made for other cases
> where better precision is desired.
> 
> 0001-libavutil-timestamp.h-Fix-loss-of-precision-in-times.patch
> 
> From 5492506534bf863cbf1ee8f09a5e59b4ee111226 Mon Sep 17 00:00:00 2001
> From: Allan Cady <allancady@yahoo.com>
> Date: Sun, 7 Apr 2019 00:07:58 -0700
> Subject: [PATCH] libavutil/timestamp.h: Fix loss of precision in timestamps
>  for silencedetect on long files
> 
> When the silencedetect filter is run against long files, the output
> timestamps gradually lose precision as the scan proceeds further into
> the file. This is because the output is formatted (in
> libavutil/timestamp.h) as "%.6g", which limits the total field
> length. Eventually, for offsets greater than 100000 seconds (about 28
> hours), fractions of a second disappear altogether, and the
> timestamps are logged as whole seconds. This is insufficient
> precision for my purposes. I propose changing the format to "%.6f",
> which will give microsecond precision (probably overkill but safe)
> for all timestamps regardless of offset.
> 
> The timestamp string length is limited to 32 characters
> (AV_TS_MAX_STRING_SIZE), so this should still be plenty long enough
> with the increased length (up to 10^25 seconds).
> 
> My interest is in fixing this problem for silencedetect, which
> formats the timestamps by calling the macro av_ts2timestr, defined in
> timestamp.h. Since av_ts2timestr is also used in many other places (I
> count 21 c files), I have created a new macro, av_ts2timestr_format,
> with a format string added as a parameter, and left the original
> macro interface as is for other usages, to limit the scope of this
> change. The same or similar change could be made for other cases
> where better precision is desired.
> ---
>  libavfilter/af_silencedetect.c               | 14 ++++++++------
>  libavutil/timestamp.h                        | 13 ++++++++-----
>  tests/ref/fate/filter-metadata-silencedetect |  2 +-
>  3 files changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/libavfilter/af_silencedetect.c b/libavfilter/af_silencedetect.c
> index 3a71f39..2da8dbe 100644
> --- a/libavfilter/af_silencedetect.c
> +++ b/libavfilter/af_silencedetect.c
> @@ -32,6 +32,8 @@
>  #include "avfilter.h"
>  #include "internal.h"
>  
> +const char TIMESTAMP_FMT_SILENCEDETECT[] = "%.6f";
> +
>  typedef struct SilenceDetectContext {
>      const AVClass *class;
>      double noise;               ///< noise amplitude ratio
> @@ -86,11 +88,11 @@ static av_always_inline void update(SilenceDetectContext *s, AVFrame *insamples,
>                  s->start[channel] = insamples->pts + av_rescale_q(current_sample / s->channels + 1 - nb_samples_notify * s->independent_channels / s->channels,
>                          (AVRational){ 1, s->last_sample_rate }, time_base);
>                  set_meta(insamples, s->mono ? channel + 1 : 0, "silence_start",
> -                        av_ts2timestr(s->start[channel], &time_base));
> +                        av_ts2timestr_fmt(s->start[channel], &time_base, TIMESTAMP_FMT_SILENCEDETECT));
>                  if (s->mono)
>                      av_log(s, AV_LOG_INFO, "channel: %d | ", channel);
>                  av_log(s, AV_LOG_INFO, "silence_start: %s\n",
> -                        av_ts2timestr(s->start[channel], &time_base));
> +                        av_ts2timestr_fmt(s->start[channel], &time_base, TIMESTAMP_FMT_SILENCEDETECT));
>              }
>          }
>      } else {
> @@ -101,15 +103,15 @@ static av_always_inline void update(SilenceDetectContext *s, AVFrame *insamples,
>              int64_t duration_ts = end_pts - s->start[channel];
>              if (insamples) {
>                  set_meta(insamples, s->mono ? channel + 1 : 0, "silence_end",
> -                        av_ts2timestr(end_pts, &time_base));
> +                        av_ts2timestr_fmt(end_pts, &time_base, TIMESTAMP_FMT_SILENCEDETECT));
>                  set_meta(insamples, s->mono ? channel + 1 : 0, "silence_duration",
> -                        av_ts2timestr(duration_ts, &time_base));
> +                        av_ts2timestr_fmt(duration_ts, &time_base, TIMESTAMP_FMT_SILENCEDETECT));
>              }
>              if (s->mono)
>                  av_log(s, AV_LOG_INFO, "channel: %d | ", channel);
>              av_log(s, AV_LOG_INFO, "silence_end: %s | silence_duration: %s\n",
> -                    av_ts2timestr(end_pts, &time_base),
> -                    av_ts2timestr(duration_ts, &time_base));
> +                    av_ts2timestr_fmt(end_pts, &time_base, TIMESTAMP_FMT_SILENCEDETECT),
> +                    av_ts2timestr_fmt(duration_ts, &time_base, TIMESTAMP_FMT_SILENCEDETECT));
>          }
>          s->nb_null_samples[channel] = 0;
>          s->start[channel] = INT64_MIN;
> diff --git a/libavutil/timestamp.h b/libavutil/timestamp.h
> index e082f01..fd9baaf 100644
> --- a/libavutil/timestamp.h
> +++ b/libavutil/timestamp.h
> @@ -31,6 +31,7 @@
>  #endif
>  
>  #define AV_TS_MAX_STRING_SIZE 32
> +#define TIMESTAMP_FMT_GENERAL "%.6g"

timestamp.h is an installed library, so you need to add an AV_ prefix to
this define.
Also, TS instead of TIMESTAMP is shorter and consistent with the above
define. Similarly, maybe use DEFAULT instead of GENERAL.

>  
>  /**
>   * Fill the provided buffer with a string containing a timestamp
> @@ -55,24 +56,26 @@ static inline char *av_ts_make_string(char *buf, int64_t ts)
>  
>  /**
>   * Fill the provided buffer with a string containing a timestamp time
> - * representation.
> + * representation, allowing format specification.
>   *
>   * @param buf a buffer with size in bytes of at least AV_TS_MAX_STRING_SIZE
>   * @param ts the timestamp to represent
>   * @param tb the timebase of the timestamp
> + * @param format format string for timestamp output, e.g. "%.5f".

Instead of %.5f, mention the define you added above.

>   * @return the buffer in input
>   */
> -static inline char *av_ts_make_time_string(char *buf, int64_t ts, AVRational *tb)
> +static inline char *av_ts_make_time_string_format(char *buf, int64_t ts, AVRational *tb, const char *format)

Being public API, you can't just remove an existing function. Add the
new av_ts_make_time_string_format() function alongside the existing
av_ts_make_time_string() one, making the latter call the former with the
default value as argument for format.

If you consider the old function av_ts_make_time_string() becomes
superfluous after this, then you can simply deprecate it and schedule it
for removal using a new FF_API_ define in version.h

>  {
>      if (ts == AV_NOPTS_VALUE) snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS");
> -    else                      snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.6g", av_q2d(*tb) * ts);
> +    else                      snprintf(buf, AV_TS_MAX_STRING_SIZE, format, av_q2d(*tb) * ts);
>      return buf;
>  }
>  
>  /**
> - * Convenience macro, the return value should be used only directly in
> + * Convenience macros, the return value should be used only directly in
>   * function arguments but never stand-alone.
>   */
> -#define av_ts2timestr(ts, tb) av_ts_make_time_string((char[AV_TS_MAX_STRING_SIZE]){0}, ts, tb)
> +#define av_ts2timestr_fmt(ts, tb, fmt) av_ts_make_time_string_format((char[AV_TS_MAX_STRING_SIZE]){0}, ts, tb, fmt)
> +#define av_ts2timestr(ts, tb) av_ts_make_time_string_format((char[AV_TS_MAX_STRING_SIZE]){0}, ts, tb, TIMESTAMP_FMT_GENERAL)
>  
>  #endif /* AVUTIL_TIMESTAMP_H */
> diff --git a/tests/ref/fate/filter-metadata-silencedetect b/tests/ref/fate/filter-metadata-silencedetect
> index 16a9d07..5d8063c 100644
> --- a/tests/ref/fate/filter-metadata-silencedetect
> +++ b/tests/ref/fate/filter-metadata-silencedetect
> @@ -1,5 +1,5 @@
>  pkt_pts=0|tag:lavfi.silence_duration=0.523107|tag:lavfi.silence_end=0.690023|tag:lavfi.silence_start=0.736417
> -pkt_pts=46080|tag:lavfi.silence_start=1.27626|tag:lavfi.silence_end=1.80751|tag:lavfi.silence_duration=0.531247
> +pkt_pts=46080|tag:lavfi.silence_start=1.276259|tag:lavfi.silence_end=1.807506|tag:lavfi.silence_duration=0.531247
>  pkt_pts=92160
>  pkt_pts=138240
>  pkt_pts=184320
> -- 2.7.4
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
diff mbox

Patch

diff --git a/libavfilter/af_silencedetect.c b/libavfilter/af_silencedetect.c
index 3a71f39..2da8dbe 100644
--- a/libavfilter/af_silencedetect.c
+++ b/libavfilter/af_silencedetect.c
@@ -32,6 +32,8 @@ 
 #include "avfilter.h"
 #include "internal.h"
 
+const char TIMESTAMP_FMT_SILENCEDETECT[] = "%.6f";
+
 typedef struct SilenceDetectContext {
     const AVClass *class;
     double noise;               ///< noise amplitude ratio
@@ -86,11 +88,11 @@  static av_always_inline void update(SilenceDetectContext *s, AVFrame *insamples,
                 s->start[channel] = insamples->pts + av_rescale_q(current_sample / s->channels + 1 - nb_samples_notify * s->independent_channels / s->channels,
                         (AVRational){ 1, s->last_sample_rate }, time_base);
                 set_meta(insamples, s->mono ? channel + 1 : 0, "silence_start",
-                        av_ts2timestr(s->start[channel], &time_base));
+                        av_ts2timestr_fmt(s->start[channel], &time_base, TIMESTAMP_FMT_SILENCEDETECT));
                 if (s->mono)
                     av_log(s, AV_LOG_INFO, "channel: %d | ", channel);
                 av_log(s, AV_LOG_INFO, "silence_start: %s\n",
-                        av_ts2timestr(s->start[channel], &time_base));
+                        av_ts2timestr_fmt(s->start[channel], &time_base, TIMESTAMP_FMT_SILENCEDETECT));
             }
         }
     } else {
@@ -101,15 +103,15 @@  static av_always_inline void update(SilenceDetectContext *s, AVFrame *insamples,
             int64_t duration_ts = end_pts - s->start[channel];
             if (insamples) {
                 set_meta(insamples, s->mono ? channel + 1 : 0, "silence_end",
-                        av_ts2timestr(end_pts, &time_base));
+                        av_ts2timestr_fmt(end_pts, &time_base, TIMESTAMP_FMT_SILENCEDETECT));
                 set_meta(insamples, s->mono ? channel + 1 : 0, "silence_duration",
-                        av_ts2timestr(duration_ts, &time_base));
+                        av_ts2timestr_fmt(duration_ts, &time_base, TIMESTAMP_FMT_SILENCEDETECT));
             }
             if (s->mono)
                 av_log(s, AV_LOG_INFO, "channel: %d | ", channel);
             av_log(s, AV_LOG_INFO, "silence_end: %s | silence_duration: %s\n",
-                    av_ts2timestr(end_pts, &time_base),
-                    av_ts2timestr(duration_ts, &time_base));
+                    av_ts2timestr_fmt(end_pts, &time_base, TIMESTAMP_FMT_SILENCEDETECT),
+                    av_ts2timestr_fmt(duration_ts, &time_base, TIMESTAMP_FMT_SILENCEDETECT));
         }
         s->nb_null_samples[channel] = 0;
         s->start[channel] = INT64_MIN;
diff --git a/libavutil/timestamp.h b/libavutil/timestamp.h
index e082f01..fd9baaf 100644
--- a/libavutil/timestamp.h
+++ b/libavutil/timestamp.h
@@ -31,6 +31,7 @@ 
 #endif
 
 #define AV_TS_MAX_STRING_SIZE 32
+#define TIMESTAMP_FMT_GENERAL "%.6g"
 
 /**
  * Fill the provided buffer with a string containing a timestamp
@@ -55,24 +56,26 @@  static inline char *av_ts_make_string(char *buf, int64_t ts)
 
 /**
  * Fill the provided buffer with a string containing a timestamp time
- * representation.
+ * representation, allowing format specification.
  *
  * @param buf a buffer with size in bytes of at least AV_TS_MAX_STRING_SIZE
  * @param ts the timestamp to represent
  * @param tb the timebase of the timestamp
+ * @param format format string for timestamp output, e.g. "%.5f".
  * @return the buffer in input
  */
-static inline char *av_ts_make_time_string(char *buf, int64_t ts, AVRational *tb)
+static inline char *av_ts_make_time_string_format(char *buf, int64_t ts, AVRational *tb, const char *format)
 {
     if (ts == AV_NOPTS_VALUE) snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS");
-    else                      snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.6g", av_q2d(*tb) * ts);
+    else                      snprintf(buf, AV_TS_MAX_STRING_SIZE, format, av_q2d(*tb) * ts);
     return buf;
 }
 
 /**
- * Convenience macro, the return value should be used only directly in
+ * Convenience macros, the return value should be used only directly in
  * function arguments but never stand-alone.
  */
-#define av_ts2timestr(ts, tb) av_ts_make_time_string((char[AV_TS_MAX_STRING_SIZE]){0}, ts, tb)
+#define av_ts2timestr_fmt(ts, tb, fmt) av_ts_make_time_string_format((char[AV_TS_MAX_STRING_SIZE]){0}, ts, tb, fmt)
+#define av_ts2timestr(ts, tb) av_ts_make_time_string_format((char[AV_TS_MAX_STRING_SIZE]){0}, ts, tb, TIMESTAMP_FMT_GENERAL)
 
 #endif /* AVUTIL_TIMESTAMP_H */
diff --git a/tests/ref/fate/filter-metadata-silencedetect b/tests/ref/fate/filter-metadata-silencedetect
index 16a9d07..5d8063c 100644
--- a/tests/ref/fate/filter-metadata-silencedetect
+++ b/tests/ref/fate/filter-metadata-silencedetect
@@ -1,5 +1,5 @@ 
 pkt_pts=0|tag:lavfi.silence_duration=0.523107|tag:lavfi.silence_end=0.690023|tag:lavfi.silence_start=0.736417
-pkt_pts=46080|tag:lavfi.silence_start=1.27626|tag:lavfi.silence_end=1.80751|tag:lavfi.silence_duration=0.531247
+pkt_pts=46080|tag:lavfi.silence_start=1.276259|tag:lavfi.silence_end=1.807506|tag:lavfi.silence_duration=0.531247
 pkt_pts=92160
 pkt_pts=138240
 pkt_pts=184320