diff mbox series

[FFmpeg-devel] lavfi/vaapi: Don't use fixed-size frame pools

Message ID 817dc1ea-495e-4c7f-8615-264e755ff6bd@jkqxz.net
State New
Headers show
Series [FFmpeg-devel] lavfi/vaapi: Don't use fixed-size frame pools | expand

Checks

Context Check Description
andriy/configure_x86 warning Failed to apply patch
yinshiyou/configure_loongarch64 warning Failed to apply patch

Commit Message

Mark Thompson Feb. 25, 2024, 7:51 p.m. UTC
Since e0da916b8f5b079a4865eef7f64863f50785463d the ffmpeg utility has
held multiple frames output by the filter graph in internal queues
without telling the filter which created the frames that it is going to
do so.  This broke many VAAPI filter->encode cases because a fixed-size
frame pool is used in each filter; this avoids the problem by changing
VAAPI filtering to always use a dynamically-sized pool.  (Note that
other cases with fixed-size pools my still be broken, since the ffmpeg
utility is still lying to libavfilter by not setting extra_hw_frames.)
Fixed-size frame pool support remains in the decoder, where it may still
be needed for compatibility.
---
On 08/02/2024 04:15, Xiang, Haihao wrote:
 > Is there any comment or concern about adding a quirk for workable drivers ? We
 > may use a dynamic frame pool in vaapi decoders and filters for workable drivers
 > only.
 >
 > Note since commit e0da916b, a command below doesn't work with the current fixed
 > frame pool used in vaapi filters.
 >
 > $ ffmpeg -hwaccel_output_format vaapi -hwaccel vaapi -i input.mp4 -vf
 > 'scale_vaapi=w=720:h=480' -c:v hevc_vaapi -f null -
 > [...]
 > [vf#0:0 @ 0x562847b01050] Error while filtering: Cannot allocate memory
 > [vf#0:0 @ 0x562847b01050] Task finished with error code: -12 (Cannot allocate
 > memory)
 > [vf#0:0 @ 0x562847b01050] Terminating thread with return code -12 (Cannot
 > allocate memory)
 > [...]

Having thought about this more carefully:

There is plenty of decoder hardware out there which effectively has one address register for the reference frames and therefore requires them in array form.  VAAPI originally enforced this, but more recent drivers avoid requiring it by various methods (updated hardware or dual output).  It is still required by DXVA/D3D11, where the restriction is enforced directly and can't be avoided.

Filters do not have the same problem with reference frames since none of the current forms read back in their own outputs, and therefore the restriction on array textures for output frames doesn't really make sense for them.  The one possible case I can see which could plausibly be relevant are filters with temporal inputs (the deinterlacer), but the inputs could always be allocated freely so this doesn't really change anything.

Hence these two changes:

* <https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2024-February/322125.html>, which fixes the ffmpeg utility to set extra_hw_frames properly to account for frames stored in queues.

* This patch, which removes the fixed-size pools for VAAPI filtering.

I believe this fixes all of the VAAPI problem cases (and also DXVA/D3D11 at the same time).  Any thoughts?

Thanks,

- Mark


  libavfilter/vaapi_vpp.c | 11 +++--------
  1 file changed, 3 insertions(+), 8 deletions(-)

Comments

Xiang, Haihao Feb. 26, 2024, 6:11 a.m. UTC | #1
On So, 2024-02-25 at 19:51 +0000, Mark Thompson wrote:
> Since e0da916b8f5b079a4865eef7f64863f50785463d the ffmpeg utility has
> held multiple frames output by the filter graph in internal queues
> without telling the filter which created the frames that it is going to
> do so.  This broke many VAAPI filter->encode cases because a fixed-size
> frame pool is used in each filter; this avoids the problem by changing
> VAAPI filtering to always use a dynamically-sized pool.  (Note that
> other cases with fixed-size pools my still be broken, since the ffmpeg
> utility is still lying to libavfilter by not setting extra_hw_frames.)
> Fixed-size frame pool support remains in the decoder, where it may still
> be needed for compatibility.
> ---
> On 08/02/2024 04:15, Xiang, Haihao wrote:
>  > Is there any comment or concern about adding a quirk for workable drivers ?
> We
>  > may use a dynamic frame pool in vaapi decoders and filters for workable
> drivers
>  > only.
>  >
>  > Note since commit e0da916b, a command below doesn't work with the current
> fixed
>  > frame pool used in vaapi filters.
>  >
>  > $ ffmpeg -hwaccel_output_format vaapi -hwaccel vaapi -i input.mp4 -vf
>  > 'scale_vaapi=w=720:h=480' -c:v hevc_vaapi -f null -
>  > [...]
>  > [vf#0:0 @ 0x562847b01050] Error while filtering: Cannot allocate memory
>  > [vf#0:0 @ 0x562847b01050] Task finished with error code: -12 (Cannot
> allocate
>  > memory)
>  > [vf#0:0 @ 0x562847b01050] Terminating thread with return code -12 (Cannot
>  > allocate memory)
>  > [...]
> 
> Having thought about this more carefully:
> 
> There is plenty of decoder hardware out there which effectively has one
> address register for the reference frames and therefore requires them in array
> form.  VAAPI originally enforced this, but more recent drivers avoid requiring
> it by various methods (updated hardware or dual output).  It is still required
> by DXVA/D3D11, where the restriction is enforced directly and can't be
> avoided.
> 
> Filters do not have the same problem with reference frames since none of the
> current forms read back in their own outputs, and therefore the restriction on
> array textures for output frames doesn't really make sense for them.  The one
> possible case I can see which could plausibly be relevant are filters with
> temporal inputs (the deinterlacer), but the inputs could always be allocated
> freely so this doesn't really change anything.
> 
> Hence these two changes:
> 
> * <https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2024-February/322125.html>,
> which fixes the ffmpeg utility to set extra_hw_frames properly to account for
> frames stored in queues.
> 
> * This patch, which removes the fixed-size pools for VAAPI filtering.
> 
> I believe this fixes all of the VAAPI problem cases (and also DXVA/D3D11 at
> the same time).  Any thoughts?

* There are still some issue after applying the above two changes manually. 

For example (you may find more cases from https://trac.ffmpeg.org/ticket/10856
):

$ ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i 4k.mp4 -c:v h264_vaapi -
f null -

This is because lots of vaapi frames are being used in the downstream elements
(h264_vaapi encoder in the above case). We may reduce the error with the change
in  https://github.com/intel-media-ci/ffmpeg/pull/709/files  

* Some command lines never work, for example:

$ ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i 1000frames.mp4 -vf
reverse -an -f null -

The above two issues disappear if using a dynamically-sized pool. So could we
use a dynamically-sized pool in vaapi decoders for all known backend drivers
(e.g. iHD, i965 ) ? For other backend drivers, we still use fixed-size frame
pools in vaapi decoders, and apply your two changes and
https://github.com/intel-media-ci/ffmpeg/pull/709/files 

BTW seems there are format errors in your patches, I failed to apply your
patches with 'git am'.

Thanks
Haihao


> 
> Thanks,
> 
> - Mark
> 
> 
>   libavfilter/vaapi_vpp.c | 11 +++--------
>   1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/libavfilter/vaapi_vpp.c b/libavfilter/vaapi_vpp.c
> index 59961bfa4a..19e7fdae97 100644
> --- a/libavfilter/vaapi_vpp.c
> +++ b/libavfilter/vaapi_vpp.c
> @@ -104,7 +104,6 @@ int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
>       AVVAAPIHWConfig *hwconfig = NULL;
>       AVHWFramesConstraints *constraints = NULL;
>       AVHWFramesContext *output_frames;
> -    AVVAAPIFramesContext *va_frames;
>       VAStatus vas;
>       int err, i;
> 
> @@ -203,9 +202,9 @@ int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
>       output_frames->width     = ctx->output_width;
>       output_frames->height    = ctx->output_height;
> 
> -    output_frames->initial_pool_size = 4;
> +    output_frames->initial_pool_size = 0;
> 
> -    err = ff_filter_init_hw_frames(avctx, outlink, 10);
> +    err = ff_filter_init_hw_frames(avctx, outlink, 0);
>       if (err < 0)
>           goto fail;
> 
> @@ -216,14 +215,10 @@ int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
>           goto fail;
>       }
> 
> -    va_frames = output_frames->hwctx;
> -
>       av_assert0(ctx->va_context == VA_INVALID_ID);
>       vas = vaCreateContext(ctx->hwctx->display, ctx->va_config,
>                             ctx->output_width, ctx->output_height,
> -                          VA_PROGRESSIVE,
> -                          va_frames->surface_ids, va_frames->nb_surfaces,
> -                          &ctx->va_context);
> +                          VA_PROGRESSIVE, NULL, 0, &ctx->va_context);
>       if (vas != VA_STATUS_SUCCESS) {
>           av_log(avctx, AV_LOG_ERROR, "Failed to create processing pipeline "
>                  "context: %d (%s).\n", vas, vaErrorStr(vas));
Xiang, Haihao Feb. 26, 2024, 8:24 a.m. UTC | #2
On Ma, 2024-02-26 at 06:11 +0000, Xiang, Haihao wrote:
> On So, 2024-02-25 at 19:51 +0000, Mark Thompson wrote:
> > Since e0da916b8f5b079a4865eef7f64863f50785463d the ffmpeg utility has
> > held multiple frames output by the filter graph in internal queues
> > without telling the filter which created the frames that it is going to
> > do so.  This broke many VAAPI filter->encode cases because a fixed-size
> > frame pool is used in each filter; this avoids the problem by changing
> > VAAPI filtering to always use a dynamically-sized pool.  (Note that
> > other cases with fixed-size pools my still be broken, since the ffmpeg
> > utility is still lying to libavfilter by not setting extra_hw_frames.)
> > Fixed-size frame pool support remains in the decoder, where it may still
> > be needed for compatibility.
> > ---
> > On 08/02/2024 04:15, Xiang, Haihao wrote:
> >  > Is there any comment or concern about adding a quirk for workable drivers
> > ?
> > We
> >  > may use a dynamic frame pool in vaapi decoders and filters for workable
> > drivers
> >  > only.
> >  >
> >  > Note since commit e0da916b, a command below doesn't work with the current
> > fixed
> >  > frame pool used in vaapi filters.
> >  >
> >  > $ ffmpeg -hwaccel_output_format vaapi -hwaccel vaapi -i input.mp4 -vf
> >  > 'scale_vaapi=w=720:h=480' -c:v hevc_vaapi -f null -
> >  > [...]
> >  > [vf#0:0 @ 0x562847b01050] Error while filtering: Cannot allocate memory
> >  > [vf#0:0 @ 0x562847b01050] Task finished with error code: -12 (Cannot
> > allocate
> >  > memory)
> >  > [vf#0:0 @ 0x562847b01050] Terminating thread with return code -12 (Cannot
> >  > allocate memory)
> >  > [...]
> > 
> > Having thought about this more carefully:
> > 
> > There is plenty of decoder hardware out there which effectively has one
> > address register for the reference frames and therefore requires them in
> > array
> > form.  VAAPI originally enforced this, but more recent drivers avoid
> > requiring
> > it by various methods (updated hardware or dual output).  It is still
> > required
> > by DXVA/D3D11, where the restriction is enforced directly and can't be
> > avoided.
> > 
> > Filters do not have the same problem with reference frames since none of the
> > current forms read back in their own outputs, and therefore the restriction
> > on
> > array textures for output frames doesn't really make sense for them.  The
> > one
> > possible case I can see which could plausibly be relevant are filters with
> > temporal inputs (the deinterlacer), but the inputs could always be allocated
> > freely so this doesn't really change anything.
> > 
> > Hence these two changes:
> > 
> > *
> > <https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2024-February/322125.html>,
> > which fixes the ffmpeg utility to set extra_hw_frames properly to account
> > for
> > frames stored in queues.
> > 
> > * This patch, which removes the fixed-size pools for VAAPI filtering.
> > 
> > I believe this fixes all of the VAAPI problem cases (and also DXVA/D3D11 at
> > the same time).  Any thoughts?
> 
> * There are still some issue after applying the above two changes manually. 
> 
> For example (you may find more cases from https://trac.ffmpeg.org/ticket/10856
> ):
> 
> $ ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i 4k.mp4 -c:v h264_vaapi
> -
> f null -
> 
> This is because lots of vaapi frames are being used in the downstream elements
> (h264_vaapi encoder in the above case). 

Note the above reason is wrong. libavcodec spawns lots of threads based on the
number of cpu, each thread needs a vaapi frame for decoding, so it is easy to
make the fixed pool exhausted. 

Thanks
Haihao

> We may reduce the error with the change
> in  https://github.com/intel-media-ci/ffmpeg/pull/709/files  
> 
> * Some command lines never work, for example:
> 
> $ ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i 1000frames.mp4 -vf
> reverse -an -f null -
> 
> The above two issues disappear if using a dynamically-sized pool. So could we
> use a dynamically-sized pool in vaapi decoders for all known backend drivers
> (e.g. iHD, i965 ) ? For other backend drivers, we still use fixed-size frame
> pools in vaapi decoders, and apply your two changes and
> https://github.com/intel-media-ci/ffmpeg/pull/709/files 
> 
> BTW seems there are format errors in your patches, I failed to apply your
> patches with 'git am'.
> 
> Thanks
> Haihao
> 
> 
> > 
> > Thanks,
> > 
> > - Mark
> > 
> > 
> >   libavfilter/vaapi_vpp.c | 11 +++--------
> >   1 file changed, 3 insertions(+), 8 deletions(-)
> > 
> > diff --git a/libavfilter/vaapi_vpp.c b/libavfilter/vaapi_vpp.c
> > index 59961bfa4a..19e7fdae97 100644
> > --- a/libavfilter/vaapi_vpp.c
> > +++ b/libavfilter/vaapi_vpp.c
> > @@ -104,7 +104,6 @@ int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
> >       AVVAAPIHWConfig *hwconfig = NULL;
> >       AVHWFramesConstraints *constraints = NULL;
> >       AVHWFramesContext *output_frames;
> > -    AVVAAPIFramesContext *va_frames;
> >       VAStatus vas;
> >       int err, i;
> > 
> > @@ -203,9 +202,9 @@ int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
> >       output_frames->width     = ctx->output_width;
> >       output_frames->height    = ctx->output_height;
> > 
> > -    output_frames->initial_pool_size = 4;
> > +    output_frames->initial_pool_size = 0;
> > 
> > -    err = ff_filter_init_hw_frames(avctx, outlink, 10);
> > +    err = ff_filter_init_hw_frames(avctx, outlink, 0);
> >       if (err < 0)
> >           goto fail;
> > 
> > @@ -216,14 +215,10 @@ int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
> >           goto fail;
> >       }
> > 
> > -    va_frames = output_frames->hwctx;
> > -
> >       av_assert0(ctx->va_context == VA_INVALID_ID);
> >       vas = vaCreateContext(ctx->hwctx->display, ctx->va_config,
> >                             ctx->output_width, ctx->output_height,
> > -                          VA_PROGRESSIVE,
> > -                          va_frames->surface_ids, va_frames->nb_surfaces,
> > -                          &ctx->va_context);
> > +                          VA_PROGRESSIVE, NULL, 0, &ctx->va_context);
> >       if (vas != VA_STATUS_SUCCESS) {
> >           av_log(avctx, AV_LOG_ERROR, "Failed to create processing pipeline
> > "
> >                  "context: %d (%s).\n", vas, vaErrorStr(vas));
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff mbox series

Patch

diff --git a/libavfilter/vaapi_vpp.c b/libavfilter/vaapi_vpp.c
index 59961bfa4a..19e7fdae97 100644
--- a/libavfilter/vaapi_vpp.c
+++ b/libavfilter/vaapi_vpp.c
@@ -104,7 +104,6 @@  int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
      AVVAAPIHWConfig *hwconfig = NULL;
      AVHWFramesConstraints *constraints = NULL;
      AVHWFramesContext *output_frames;
-    AVVAAPIFramesContext *va_frames;
      VAStatus vas;
      int err, i;

@@ -203,9 +202,9 @@  int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
      output_frames->width     = ctx->output_width;
      output_frames->height    = ctx->output_height;

-    output_frames->initial_pool_size = 4;
+    output_frames->initial_pool_size = 0;

-    err = ff_filter_init_hw_frames(avctx, outlink, 10);
+    err = ff_filter_init_hw_frames(avctx, outlink, 0);
      if (err < 0)
          goto fail;

@@ -216,14 +215,10 @@  int ff_vaapi_vpp_config_output(AVFilterLink *outlink)
          goto fail;
      }

-    va_frames = output_frames->hwctx;
-
      av_assert0(ctx->va_context == VA_INVALID_ID);
      vas = vaCreateContext(ctx->hwctx->display, ctx->va_config,
                            ctx->output_width, ctx->output_height,
-                          VA_PROGRESSIVE,
-                          va_frames->surface_ids, va_frames->nb_surfaces,
-                          &ctx->va_context);
+                          VA_PROGRESSIVE, NULL, 0, &ctx->va_context);
      if (vas != VA_STATUS_SUCCESS) {
          av_log(avctx, AV_LOG_ERROR, "Failed to create processing pipeline "
                 "context: %d (%s).\n", vas, vaErrorStr(vas));