[FFmpeg-devel,DISCUSS] nvenc: Add encoder flush API.

Message ID 1574125994-7782-1-git-send-email-joshua.allmann@gmail.com
State New

Commit Message

Josh Allmann Nov. 19, 2019, 1:13 a.m. UTC
This patch is meant to be an entry point for discussion around an
issue we are having with flushing the nvenc encoder while doing
segmented transcoding. Hopefully there will be a less kludgey
workaround than this.

First, some background on where this is coming from. We do
segmented transcoding on Nvidia GPUs using the libav* libraries [0].
The flow is roughly this:

1. Segment incoming stream
2. Send each segment to a transcoder

We've noticed a significant overhead around setting up new transcode
sessions / contexts for each segment, and this overhead is magnified
the more streams a given machine is processing, regardless of the
number of attached GPUs [1].

Now, the logical solution here would be to reuse the GPU sessions
for segments during a given stream. However, there is a problem
around flushing internal decode / encode buffers. Because we do
segmented transcoding [2], we need to ensure that all stages in the
transcode pipeline are completely flushed in between each segment.

Here is what we do for each stage of decode, filter and encode:

* Decoding : Cache the first packet of each segment. When the
  IO layer EOFs, feed the cached packet with a sentinel pts of -1.
  (This doesn't seem to cause issues with h264_cuvid.) Once a frame
  is returned from the decoder with the sentinel pts set, we know
  the decoder is flushed of legitimate input. For a typical 2-second
  segment, this adds about 6 frames (~10%) of overhead, which is
  tolerable because decoding is usually less expensive than encoding.
  No changes are required to FFmpeg itself, which is nice.

* Filtering : Close the filtergraph (via av_buffersrc_close) and
  re-initialize the filter with each segment. Again, the overhead here
  seems tolerable. We have not seen a straightforward way to drain the
  filtergraph without also closing and re-opening it.

* Encoding : This patch.

  We add a very special "av_nvenc_flush" API to signal end-of-stream
  in the same way as `avcodec_send_frame(ctx, NULL)` but bypassing
  all the higher-level libavcodec machinery before hitting nvenc.
  This seems to successfully drain pending frames. Afterwards,
  we can continue to send frames for the next segments via
  `avcodec_send_frame` and the internal state will more-or-less
  reinitialize as if nothing had happened.

  Now, it is quite likely that this behavior is entirely accidental,
  and should not be expected to be stable in the future.

  While the nvenc encoder itself does seem to be "resumable" according
  to the documentation around the `NV_ENC_FLAGS_EOS` flag (cf.
  NVIDIA Video Encoder API Programming Guide), FFmpeg has no such
  mode. So we've had to sort of inject one in here.
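
The sentinel-pts drain described for the decoding stage can be sketched
as follows. This is a toy model, not the code in our transcoder:
`ToyDecoder`, `toy_decode`, `drain_with_sentinel` and `TOY_DELAY` are
hypothetical names standing in for a hardware decoder that holds back a
fixed number of frames, and only the pts is modelled. It also assumes
the sentinel packet is re-fed until a frame carrying the sentinel pts
comes back out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SENTINEL_PTS (-1) /* sentinel pts value used in the text */
#define TOY_DELAY 3       /* hypothetical: frames the toy decoder holds back */

/* Hypothetical stand-in for a decoder with a fixed output delay. */
typedef struct ToyDecoder {
    int64_t queue[TOY_DELAY]; /* pts of frames still inside the decoder */
    size_t  count;            /* number of valid entries in queue */
} ToyDecoder;

/* Feed one packet. Returns 1 and stores the oldest pts in *out_pts
 * once the delay queue is full, otherwise returns 0. */
static int toy_decode(ToyDecoder *d, int64_t pts, int64_t *out_pts)
{
    assert(d->count <= TOY_DELAY);
    if (d->count < TOY_DELAY) {
        d->queue[d->count++] = pts;
        return 0;
    }
    *out_pts = d->queue[0];
    for (size_t i = 1; i < TOY_DELAY; i++)
        d->queue[i - 1] = d->queue[i];
    d->queue[TOY_DELAY - 1] = pts;
    return 1;
}

/* Re-feed the sentinel until a frame with the sentinel pts comes out.
 * Real frames are written to out[]; returns how many were drained,
 * or -1 if out[] is too small. */
static int drain_with_sentinel(ToyDecoder *d, int64_t *out, size_t cap)
{
    size_t n = 0;
    for (;;) {
        int64_t pts;
        if (!toy_decode(d, SENTINEL_PTS, &pts))
            continue;          /* decoder still filling up */
        if (pts == SENTINEL_PTS)
            break;             /* all legitimate input has come out */
        if (n == cap)
            return -1;
        out[n++] = pts;
    }
    d->count = 0;              /* drop the sentinel frames left inside */
    return (int)n;
}
```

The sentinel frames that remain inside the decoder are the overhead
mentioned above; in this toy model they are simply discarded before the
next segment starts.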

The questions here are:

* Are these workarounds reasonable for the problem of Nvidia GPU
  sessions taking a long time to initialize when transcoding under
  load?

* Is there an alternative to carrying around this patch to flush
  the encoder in between segments?

* If there is no alternative, would you be open to a more formalized
  addition to the avcodec API around "flushable" or "resumable"
  encoders?

Thanks for your thoughts!

Josh

[0] https://github.com/livepeer/lpms

[1] https://gist.github.com/j0sh/ae9e5a97e794e364a6dfe513fa2591c2

[2] For historical reasons that we cannot easily change right now
---
 libavcodec/avcodec.h | 2 ++
 libavcodec/nvenc.c   | 5 +++++
 2 files changed, 7 insertions(+)

Comments

Philip Langdale Dec. 20, 2019, 11:36 p.m. UTC | #1
On 2019-11-18 17:13, Josh Allmann wrote:
> This patch is meant to be an entry point for discussion around an
> issue we are having with flushing the nvenc encoder while doing
> segmented transcoding. Hopefully there will be a less kludgey
> workaround than this.

Hi Josh,

I happened to see your email recently, and took a quick look into
this. It seems that encoders are allowed to implement .flush() and
then avcodec_flush_buffers() can be called on them like on a
decoder. So I've posted a patch that does this (with the same impl
that you had). If that works for you, then that's all it takes -
no need for a new API call because there's already one you can use.
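
To illustrate the lifecycle this implies (signal end-of-stream, drain,
then flush so the same context accepts the next segment), here is a
self-contained toy state machine. It does not call libavcodec:
`ToyEncoder`, `toy_send_frame`, `toy_receive_packet`, `toy_flush` and
the `TOY_*` codes are hypothetical stand-ins, loosely modelled on
`avcodec_send_frame()`, `avcodec_receive_packet()`,
`avcodec_flush_buffers()` and the `AVERROR(EAGAIN)` / `AVERROR_EOF`
return values. The two-frame lookahead is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for the toy encoder. */
enum { TOY_OK = 0, TOY_EAGAIN = -1, TOY_EOF = -2 };

typedef struct ToyEncoder {
    int pending;  /* frames buffered but not yet output */
    int draining; /* set once end-of-stream has been signalled */
} ToyEncoder;

/* has_frame == 0 plays the role of sending a NULL frame. */
static int toy_send_frame(ToyEncoder *e, int has_frame)
{
    if (e->draining)
        return TOY_EOF;        /* no input accepted after end-of-stream */
    if (!has_frame) {
        e->draining = 1;
        return TOY_OK;
    }
    e->pending++;
    return TOY_OK;
}

static int toy_receive_packet(ToyEncoder *e)
{
    assert(e->pending >= 0);
    if (e->draining) {
        if (e->pending == 0)
            return TOY_EOF;    /* fully drained */
        e->pending--;
        return TOY_OK;
    }
    if (e->pending <= 2)       /* toy lookahead: hold back two frames */
        return TOY_EAGAIN;
    e->pending--;
    return TOY_OK;
}

/* Reset so the same encoder can take the next segment. */
static void toy_flush(ToyEncoder *e)
{
    e->pending  = 0;
    e->draining = 0;
}

/* Encode one segment, drain it, then flush. Returns packets produced,
 * or -1 if the encoder refused input. */
static int toy_encode_segment(ToyEncoder *e, int n_frames)
{
    int packets = 0;
    for (int i = 0; i < n_frames; i++) {
        if (toy_send_frame(e, 1) != TOY_OK)
            return -1;
        while (toy_receive_packet(e) == TOY_OK)
            packets++;
    }
    if (toy_send_frame(e, 0) != TOY_OK) /* signal end-of-stream */
        return -1;
    while (toy_receive_packet(e) == TOY_OK)
        packets++;
    toy_flush(e);
    return packets;
}
```

Without the `toy_flush()` call at the end of a segment, the next
`toy_send_frame()` would return `TOY_EOF`, which is the situation the
flush is meant to avoid.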

Let us know,

--phil
Josh Allmann Dec. 20, 2019, 11:57 p.m. UTC | #2
On Fri, 20 Dec 2019 at 15:36, Philip Langdale <philipl@overt.org> wrote:
>
> On 2019-11-18 17:13, Josh Allmann wrote:
> > This patch is meant to be an entry point for discussion around an
> > issue we are having with flushing the nvenc encoder while doing
> > segmented transcoding. Hopefully there will be a less kludgey
> > workaround than this.
>
> Hi Josh,
>
> I happened to see your email recently, and took a quick look into
> this. It seems that encoders are allowed to implement .flush() and
> then avcodec_flush_buffers() can be called on them like on a
> decoder. So I've posted a patch that does this (with the same impl
> that you had). If that works for you, then that's all it takes -
> no need for a new API call because there's already one you can use.

That would be perfect - thought .flush() was decode-only for some
reason. Thank you!

Josh
Dennis Mungai Dec. 21, 2019, 12:12 a.m. UTC | #3
On Fri, 20 Dec 2019 at 19:03, Josh Allmann <joshua.allmann@gmail.com> wrote:
>
> On Fri, 20 Dec 2019 at 15:36, Philip Langdale <philipl@overt.org> wrote:
> >
> > On 2019-11-18 17:13, Josh Allmann wrote:
> > > This patch is meant to be an entry point for discussion around an
> > > issue we are having with flushing the nvenc encoder while doing
> > > segmented transcoding. Hopefully there will be a less kludgey
> > > workaround than this.
> >
> > Hi Josh,
> >
> > I happened to see your email recently, and took a quick look into
> > this. It seems that encoders are allowed to implement .flush() and
> > then avcodec_flush_buffers() can be called on them like on a
> > decoder. So I've posted a patch that does this (with the same impl
> > that you had). If that works for you, then that's all it takes -
> > no need for a new API call because there's already one you can use.
>
> That would be perfect - thought .flush() was decode-only for some
> reason. Thank you!
>
> Josh

Related:

For CLI usage, does this affect the behavior of the global output option

-flush_packets 1

When the NVENC encoder is in use, in any way?
Philip Langdale Dec. 21, 2019, 4:12 a.m. UTC | #4
On Fri, 20 Dec 2019 19:12:00 -0500
Dennis Mungai <dmngaie@gmail.com> wrote:

> On Fri, 20 Dec 2019 at 19:03, Josh Allmann <joshua.allmann@gmail.com>
> 
> For CLI usage, does this affect the behavior of the global output
> option
> 
> -flush_packets 1
> 
> When the NVENC encoder is in use, in any way?

It's unrelated. This is a muxer level (avio specific, in fact) option.
It doesn't affect decoders or encoders.

--phil
Dennis Mungai Dec. 21, 2019, 2:01 p.m. UTC | #5
On Sat, 21 Dec 2019 at 07:12, Philip Langdale <philipl@overt.org> wrote:
>
> On Fri, 20 Dec 2019 19:12:00 -0500
> Dennis Mungai <dmngaie@gmail.com> wrote:
>
> > On Fri, 20 Dec 2019 at 19:03, Josh Allmann <joshua.allmann@gmail.com>
> >
> > For CLI usage, does this affect the behavior of the global output
> > option
> >
> > -flush_packets 1
> >
> > When the NVENC encoder is in use, in any way?
>
> It's unrelated. This is a muxer level (avio specific, in fact) option.
> It doesn't affect decoders or encoders.
>
> --phil

Thanks for the clarification.
Patch

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index bcb931f0dd..763a557d82 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -6232,6 +6232,8 @@  const AVCodecDescriptor *avcodec_descriptor_get_by_name(const char *name);
  */
 AVCPBProperties *av_cpb_properties_alloc(size_t *size);
 
+int av_nvenc_flush(AVCodecContext *avctx);
+
 /**
  * @}
  */
diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index 111048d043..36134fa6a9 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -2071,6 +2071,11 @@  static void reconfig_encoder(AVCodecContext *avctx, const AVFrame *frame)
     }
 }
 
+int attribute_align_arg av_nvenc_flush(AVCodecContext *avctx)
+{
+    return ff_nvenc_send_frame(avctx, NULL);
+}
+
 int ff_nvenc_send_frame(AVCodecContext *avctx, const AVFrame *frame)
 {
     NVENCSTATUS nv_status;