[FFmpeg-devel] lavc/decode: allow users to shrink the hw frames pool size, if they so desire.

Submitted by Timo Rothenpieler on Nov. 24, 2018, 1:44 a.m.

Details

Message ID 20181124014445.30148-1-timo@rothenpieler.org
State New
Headers show

Commit Message

Timo Rothenpieler Nov. 24, 2018, 1:44 a.m.
---
 libavcodec/decode.c        | 9 +++++----
 libavcodec/options_table.h | 2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

Comments

Moritz Barsnick Nov. 25, 2018, 9:15 a.m.
On Sat, Nov 24, 2018 at 02:44:45 +0100, Timo Rothenpieler wrote:
> +            // available then add/substract them here.
                                     ^ nit: subtract

> -{"extra_hw_frames", "Number of extra hardware frames to allocate for the user", OFFSET(extra_hw_frames), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, V|D },
> +{"extra_hw_frames", "Number of extra hardware frames to allocate for the user, negative values are supported at own risk", OFFSET(extra_hw_frames), AV_OPT_TYPE_INT, { .i64 = 0 }, INT_MIN, INT_MAX, V|D },

"at own risk" sounds hazy, but I guess the consequences can't be
explained here.

Moritz
Mark Thompson Nov. 25, 2018, 2:05 p.m.
On 24/11/18 01:44, Timo Rothenpieler wrote:
> ---
>  libavcodec/decode.c        | 9 +++++----
>  libavcodec/options_table.h | 2 +-
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/libavcodec/decode.c b/libavcodec/decode.c
> index c89c77c43a..08ae8788a2 100644
> --- a/libavcodec/decode.c
> +++ b/libavcodec/decode.c
> @@ -1256,10 +1256,11 @@ int avcodec_get_hw_frames_parameters(AVCodecContext *avctx,
>          AVHWFramesContext *frames_ctx = (AVHWFramesContext*)frames_ref->data;
>  
>          if (frames_ctx->initial_pool_size) {
> -            // If the user has requested that extra output surfaces be
> -            // available then add them here.
> -            if (avctx->extra_hw_frames > 0)
> -                frames_ctx->initial_pool_size += avctx->extra_hw_frames;
> +            // If the user has requested extra/fewer output surfaces be
> +            // available then add/substract them here.
> +            frames_ctx->initial_pool_size += avctx->extra_hw_frames;
> +            if (avctx->extra_hw_frames < 0)
> +                av_log(avctx, AV_LOG_WARNING, "Decreasing hwaccel frame pool size!\n");
>  
>              // If frame threading is enabled then an extra surface per thread
>              // is also required.
> diff --git a/libavcodec/options_table.h b/libavcodec/options_table.h
> index 099261e168..af0ab1cfe0 100644
> --- a/libavcodec/options_table.h
> +++ b/libavcodec/options_table.h
> @@ -478,7 +478,7 @@ static const AVOption avcodec_options[] = {
>  {"ignore_level", "ignore level even if the codec level used is unknown or higher than the maximum supported level reported by the hardware driver", 0, AV_OPT_TYPE_CONST, { .i64 = AV_HWACCEL_FLAG_IGNORE_LEVEL }, INT_MIN, INT_MAX, V | D, "hwaccel_flags" },
>  {"allow_high_depth", "allow to output YUV pixel formats with a different chroma sampling than 4:2:0 and/or other than 8 bits per component", 0, AV_OPT_TYPE_CONST, {.i64 = AV_HWACCEL_FLAG_ALLOW_HIGH_DEPTH }, INT_MIN, INT_MAX, V | D, "hwaccel_flags"},
>  {"allow_profile_mismatch", "attempt to decode anyway if HW accelerated decoder's supported profiles do not exactly match the stream", 0, AV_OPT_TYPE_CONST, {.i64 = AV_HWACCEL_FLAG_ALLOW_PROFILE_MISMATCH }, INT_MIN, INT_MAX, V | D, "hwaccel_flags"},
> -{"extra_hw_frames", "Number of extra hardware frames to allocate for the user", OFFSET(extra_hw_frames), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, V|D },
> +{"extra_hw_frames", "Number of extra hardware frames to allocate for the user, negative values are supported at own risk", OFFSET(extra_hw_frames), AV_OPT_TYPE_INT, { .i64 = 0 }, INT_MIN, INT_MAX, V|D },
>  {NULL},
>  };

I see what you're trying to do here, but I'm not sure that overloading this option is the best way to achieve it.  This option has a specific meaning in terms of the consumer of the frames, and is intended to be negotiated (any consumers of hardware frames should declare how many they will hold onto at most in a given configuration), but since that is very much configuration-dependent it hasn't yet been done.  Setting it to, for example, -3 ends up saying "I want the frame pool to be large enough for the consumer to be able to hold onto at most -2 frames at any given time", which is nonsensical in the meaning of the particular option.

Presumably the case you're actually imagining wanting this option is to reduce memory use when a stream doesn't actually use the maximum number of reference frames - given that this is generally signalled in the stream, can you instead propagate the real number of reference frames into the allocation so that it uses the correct number rather than the maximum?  Alternatively, if you have some particular program in mind then it could just use hw_frames_ctx rather than hw_device_ctx and do this allocation itself.

- Mark
Timo Rothenpieler Nov. 26, 2018, 9:45 a.m.
On 25/11/2018 15:05, Mark Thompson wrote:
> I see what you're trying to do here, but I'm not sure that overloading this option is the best way to achieve it.  This option has a specific meaning in terms of the consumer of the frames, and is intended to be negotiated (any consumers of hardware frames should declare how many they will hold onto at most in a given configuration), but since that is very much configuration-dependent it hasn't yet been done.  Setting it to, for example, -3 ends up saying "I want the frame pool to be large enough for the consumer to be able to hold onto at most -2 frames at any given time", which is nonsensical in the meaning of the particular option.
> 
> Presumably the case you're actually imagining wanting this option is to reduce memory use when a stream doesn't actually use the maximum number of reference frames - given that this is generally signalled in the stream, can you instead propagate the real number of reference frames into the allocation so that it uses the correct number rather than the maximum?  Alternatively, if you have some particular program in mind then it could just use hw_frames_ctx rather than hw_device_ctx and do this allocation itself.

The case I'm targeting is that for example the CUDA yadif filter needs 
to hold onto a buffer of up to 2 frames, which it in turn steals from 
the nvdec decoders buffer. But if you don't use yadif, that's 2 unused 
frames, which is potentially quite a bit of VRAM.

nvdec already uses some logic based on the stream, but that logic is 
sometimes wrong in both directions. So the idea here is to give the user 
an option to optimize a given commandline for memory use.

Patch hide | download patch | download mbox

diff --git a/libavcodec/decode.c b/libavcodec/decode.c
index c89c77c43a..08ae8788a2 100644
--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -1256,10 +1256,11 @@  int avcodec_get_hw_frames_parameters(AVCodecContext *avctx,
         AVHWFramesContext *frames_ctx = (AVHWFramesContext*)frames_ref->data;
 
         if (frames_ctx->initial_pool_size) {
-            // If the user has requested that extra output surfaces be
-            // available then add them here.
-            if (avctx->extra_hw_frames > 0)
-                frames_ctx->initial_pool_size += avctx->extra_hw_frames;
+            // If the user has requested extra/fewer output surfaces be
+            // available then add/substract them here.
+            frames_ctx->initial_pool_size += avctx->extra_hw_frames;
+            if (avctx->extra_hw_frames < 0)
+                av_log(avctx, AV_LOG_WARNING, "Decreasing hwaccel frame pool size!\n");
 
             // If frame threading is enabled then an extra surface per thread
             // is also required.
diff --git a/libavcodec/options_table.h b/libavcodec/options_table.h
index 099261e168..af0ab1cfe0 100644
--- a/libavcodec/options_table.h
+++ b/libavcodec/options_table.h
@@ -478,7 +478,7 @@  static const AVOption avcodec_options[] = {
 {"ignore_level", "ignore level even if the codec level used is unknown or higher than the maximum supported level reported by the hardware driver", 0, AV_OPT_TYPE_CONST, { .i64 = AV_HWACCEL_FLAG_IGNORE_LEVEL }, INT_MIN, INT_MAX, V | D, "hwaccel_flags" },
 {"allow_high_depth", "allow to output YUV pixel formats with a different chroma sampling than 4:2:0 and/or other than 8 bits per component", 0, AV_OPT_TYPE_CONST, {.i64 = AV_HWACCEL_FLAG_ALLOW_HIGH_DEPTH }, INT_MIN, INT_MAX, V | D, "hwaccel_flags"},
 {"allow_profile_mismatch", "attempt to decode anyway if HW accelerated decoder's supported profiles do not exactly match the stream", 0, AV_OPT_TYPE_CONST, {.i64 = AV_HWACCEL_FLAG_ALLOW_PROFILE_MISMATCH }, INT_MIN, INT_MAX, V | D, "hwaccel_flags"},
-{"extra_hw_frames", "Number of extra hardware frames to allocate for the user", OFFSET(extra_hw_frames), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, V|D },
+{"extra_hw_frames", "Number of extra hardware frames to allocate for the user, negative values are supported at own risk", OFFSET(extra_hw_frames), AV_OPT_TYPE_INT, { .i64 = 0 }, INT_MIN, INT_MAX, V|D },
 {NULL},
 };