diff mbox series

[FFmpeg-devel,RFC] vf_fps: Requantize pts of CFR videos

Message ID 837ff72bd80c351924dc4cf0d96c003f082681f3.camel@kepstin.ca
State New
Headers show
Series [FFmpeg-devel,RFC] vf_fps: Requantize pts of CFR videos | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished
andriy/make_ppc success Make finished
andriy/make_fate_ppc success Make fate finished

Commit Message

Calvin Walton Dec. 22, 2021, 4:20 p.m. UTC
This is mostly to avoid oddities in small framerate adjustments when you
have input videos from containers such as matroska, where the pts values
are quantized with a large enough error to cause issues.

An example, when converting from 24/1.001 fps to 24 fps with round=down
from an mkv source (inlink time base is 1/1000, outlink is 1001/24000):

  In PTS | Out PTS  | Rounded Out PTS
  69292  | 1663.008 | 1663
  69333  | 1663.992 | 1663
  69375  | 1665.000 | 1665

In this example, the fps filter would drop the frame with pts 69292,
then duplicate the frame with pts 69333, an undesirable result.

By first requantizing the input pts to the inlink frame rate, the result
looks much nicer:

  In PTS | Req. PTS | Out PTS
  69292	 | 1661     | 1662
  69333  | 1662     | 1663
  69375  | 1663     | 1664

(Note that the same rounding mode is used for both conversions,
resulting in the final out pts being a bit lower in this case. With the
normal nearest mode, it would be closer.)

I've verified that in conversions of longer mkv files to "close"
framerates that previously had issues due to quantization, this
significantly reduces the number of incorrectly dropped or duplicated
frames.

The potential downside of this change is that if an input file is
probed as CFR but is actually VFR, then the results will be poor
(you'll get unnecessarily dropped frames or added judder). A new
option, "requantize", is added to allow disabling this behaviour in
those cases.

Signed-off-by: Calvin Walton <calvin.walton@kepstin.ca>

---
 libavfilter/vf_fps.c | 48 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 41 insertions(+), 7 deletions(-)

Comments

Michael Niedermayer Dec. 22, 2021, 10:04 p.m. UTC | #1
On Wed, Dec 22, 2021 at 11:20:26AM -0500, Calvin Walton wrote:
> This is mostly to avoid oddities in small framerate adjustments when you
> have input videos from containers such as matroska, where the pts values
> are quantized with a large enough error to cause issues.
> 
> An example, when converting from 24/1.001 fps to 24 fps with round=down
> from an mkv source (inlink time base is 1/1000, outlink is 1001/24000):
> 
>   In PTS | Out PTS  | Rounded Out PTS
>   69292  | 1663.008 | 1663
>   69333  | 1663.992 | 1663
>   69375  | 1665.000 | 1665
> 
> In this example, the fps filter would drop the frame with pts 69292,
> then duplicate the frame with pts 69333, an undesirable result.
> 
> By first requantizing the input pts to the inlink frame rate, the result
> looks much nicer:
> 
>   In PTS | Req. PTS | Out PTS
>   69292	 | 1661     | 1662
>   69333  | 1662     | 1663
>   69375  | 1663     | 1664
> 
> (Note that the same rounding mode is used for both conversions,
> resulting in the final out pts being a bit lower in this case. With the
> normal nearest mode, it would be closer.)
> 
> I've verified that in conversions of longer mkv files to "close"
> framerates that previously had issues due to quantization, this
> significantly reduces the number of incorrectly dropped or duplicated
> frames.
> 
> The potential downside of this change is that if an input file is
> probed as CFR but is actually VFR, then the results will be poor
> (you'll get unnecessarily dropped frames or added judder). A new
> option, "requantize", is added to allow disabling this behaviour in
> those cases.
> 
> Signed-off-by: Calvin Walton <calvin.walton@kepstin.ca>
> 
> ---
>  libavfilter/vf_fps.c | 48 +++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 41 insertions(+), 7 deletions(-)

breaks a 60fps 1/60000 timebase input downconverted with -vf fps=30:-0.01
the expected result is identical to -vf fps=30 but

-i 60fps.mov -bitexact -vf fps=30:-0.01  -t 2 -v 99 fps.flv
the expected output is that either all odd or all even frames are returned but
its a mix after this patch
I cannot share the input file but this should reproduce with any 60fps
input i hope. If not say so and ill try to find some other file that shows this

thx

[...]
Calvin Walton Dec. 23, 2021, 2:57 a.m. UTC | #2
On Wed, 2021-12-22 at 23:04 +0100, Michael Niedermayer wrote:
> On Wed, Dec 22, 2021 at 11:20:26AM -0500, Calvin Walton wrote:
> > This is mostly to avoid oddities in small framerate adjustments
> > when you
> > have input videos from containers such as matroska, where the pts
> > values
> > are quantized with a large enough error to cause issues.
> > 
> 
> breaks a 60fps 1/60000 timebase input downconverted with -vf fps=30:-
> 0.01
> the expected result is identical to -vf fps=30 but
> 
> -i 60fps.mov -bitexact -vf fps=30:-0.01  -t 2 -v 99 fps.flv
> the expected output is that either all odd or all even frames are
> returned but
> its a mix after this patch
> I cannot share the input file but this should reproduce with any
> 60fps
> input i hope. If not say so and ill try to find some other file that
> shows this

I'll take a look at this. Hopefully I can reproduce it with an
artificially created file (if so, I'll make a test).

If you could provide the logs from the fps filter at 'verbose' log
level for the initial segment of the file (first ~1 second of so) that
would provide helpful information.

It's possible that the reason you're having a problem with fps=30:-0.01
is that I'm applying the start offset in the wrong position in the
conversion; I'll try moving it around.
Calvin Walton Dec. 23, 2021, 2:59 a.m. UTC | #3
On Wed, 2021-12-22 at 21:57 -0500, Calvin Walton wrote:
> 
> If you could provide the logs from the fps filter at 'verbose' log
> level for the initial segment of the file (first ~1 second of so)
> that
> would provide helpful information.

I said "verbose", but I meant "debug" here :/
diff mbox series

Patch

diff --git a/libavfilter/vf_fps.c b/libavfilter/vf_fps.c
index 99e679441e..d010083a35 100644
--- a/libavfilter/vf_fps.c
+++ b/libavfilter/vf_fps.c
@@ -74,6 +74,7 @@  typedef struct FPSContext {
     char *framerate;        ///< expression that defines the target framerate
     int rounding;           ///< AVRounding method for timestamps
     int eof_action;         ///< action performed for last frame in FIFO
+    int requantize;         ///< whether to requantize timestamps of cfr inputs
 
     /* Set during outlink configuration */
     int64_t  in_pts_off;    ///< input frame pts offset for start_time handling
@@ -111,6 +112,8 @@  static const AVOption fps_options[] = {
     { "eof_action", "action performed for last frame", OFFSET(eof_action), AV_OPT_TYPE_INT, { .i64 = EOF_ACTION_ROUND }, 0, EOF_ACTION_NB-1, V|F, "eof_action" },
         { "round", "round similar to other frames",  0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_ROUND }, 0, 0, V|F, "eof_action" },
         { "pass",  "pass through last frame",        0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_PASS  }, 0, 0, V|F, "eof_action" },
+    { "requantize", "requantize input timestamps in CFR video based on framerate",
+      OFFSET(requantize), AV_OPT_TYPE_BOOL, { .i64 = 1 }, 0, 1, V|F },
     { NULL }
 };
 
@@ -177,6 +180,7 @@  static int config_props(AVFilterLink* outlink)
     FPSContext      *s      = ctx->priv;
 
     double var_values[VARS_NB], res;
+    AVRational in_time_base;
     int ret;
 
     var_values[VAR_SOURCE_FPS]    = av_q2d(inlink->frame_rate);
@@ -193,6 +197,18 @@  static int config_props(AVFilterLink* outlink)
     outlink->frame_rate = av_d2q(res, INT_MAX);
     outlink->time_base  = av_inv_q(outlink->frame_rate);
 
+    /* Disable requantization if input video is VFR */
+    if (s->requantize && inlink->frame_rate.num == 1 && inlink->frame_rate.den == 0) {
+        av_log(ctx, AV_LOG_INFO, "Not requantizing input timestamps; video is VFR\n");
+        s->requantize = 0;
+    }
+
+    in_time_base = inlink->time_base;
+    if (s->requantize) {
+        in_time_base = av_inv_q(inlink->frame_rate);
+        av_log(ctx, AV_LOG_VERBOSE, "Requantizing input timestamps to time_base=%d/%d\n", in_time_base.num, in_time_base.den);
+    }
+
     /* Calculate the input and output pts offsets for start_time */
     if (s->start_time != DBL_MAX && s->start_time != AV_NOPTS_VALUE) {
         double first_pts = s->start_time * AV_TIME_BASE;
@@ -201,7 +217,7 @@  static int config_props(AVFilterLink* outlink)
                    s->start_time);
             return AVERROR(EINVAL);
         }
-        s->in_pts_off  = av_rescale_q_rnd(first_pts, AV_TIME_BASE_Q, inlink->time_base,
+        s->in_pts_off  = av_rescale_q_rnd(first_pts, AV_TIME_BASE_Q, in_time_base,
                                           s->rounding | AV_ROUND_PASS_MINMAX);
         s->out_pts_off = av_rescale_q_rnd(first_pts, AV_TIME_BASE_Q, outlink->time_base,
                                           s->rounding | AV_ROUND_PASS_MINMAX);
@@ -220,7 +236,8 @@  static int read_frame(AVFilterContext *ctx, FPSContext *s, AVFilterLink *inlink,
 {
     AVFrame *frame;
     int ret;
-    int64_t in_pts;
+    AVRational in_time_base;
+    int64_t orig_pts, in_pts;
 
     /* Must only be called when we have buffer room available */
     av_assert1(s->frames_count < 2);
@@ -231,16 +248,24 @@  static int read_frame(AVFilterContext *ctx, FPSContext *s, AVFilterLink *inlink,
     if (ret < 0)
         return ret;
 
+    /* Requantize CFR video pts based on frame rate */
+    in_time_base = inlink->time_base;
+    orig_pts = in_pts = frame->pts;
+    if (s->requantize) {
+        in_time_base = av_inv_q(inlink->frame_rate);
+        in_pts = av_rescale_q_rnd(in_pts, inlink->time_base, in_time_base,
+                                  s->rounding | AV_ROUND_PASS_MINMAX);
+    }
+
     /* Convert frame pts to output timebase.
      * The dance with offsets is required to match the rounding behaviour of the
      * previous version of the fps filter when using the start_time option. */
-    in_pts = frame->pts;
     frame->pts = s->out_pts_off + av_rescale_q_rnd(in_pts - s->in_pts_off,
-                                                   inlink->time_base, outlink->time_base,
+                                                   in_time_base, outlink->time_base,
                                                    s->rounding | AV_ROUND_PASS_MINMAX);
 
-    av_log(ctx, AV_LOG_DEBUG, "Read frame with in pts %"PRId64", out pts %"PRId64"\n",
-           in_pts, frame->pts);
+    av_log(ctx, AV_LOG_DEBUG, "Read frame with in pts %"PRId64", req pts %"PRId64", out pts %"PRId64"\n",
+           orig_pts, in_pts, frame->pts);
 
     s->frames[s->frames_count++] = frame;
     s->frames_in++;
@@ -304,7 +329,16 @@  static int write_frame(AVFilterContext *ctx, FPSContext *s, AVFilterLink *outlin
 static void update_eof_pts(AVFilterContext *ctx, FPSContext *s, AVFilterLink *inlink, AVFilterLink *outlink, int64_t status_pts)
 {
     int eof_rounding = (s->eof_action == EOF_ACTION_PASS) ? AV_ROUND_UP : s->rounding;
-    s->status_pts = av_rescale_q_rnd(status_pts, inlink->time_base, outlink->time_base,
+    AVRational in_time_base;
+
+    /* Requantize CFR video pts based on frame rate */
+    in_time_base = inlink->time_base;
+    if (s->requantize) {
+        in_time_base = av_inv_q(inlink->frame_rate);
+        status_pts = av_rescale_q_rnd(status_pts, inlink->time_base, in_time_base,
+                                      eof_rounding | AV_ROUND_PASS_MINMAX);
+    }
+    s->status_pts = av_rescale_q_rnd(status_pts, in_time_base, outlink->time_base,
                                      eof_rounding | AV_ROUND_PASS_MINMAX);
 
     av_log(ctx, AV_LOG_DEBUG, "EOF is at pts %"PRId64"\n", s->status_pts);