diff mbox series

[FFmpeg-devel,v3] lavc/libvpx: remove thread limit

Message ID 20230105104151.2055-1-Dmitriy.Ovchinnikov@amd.com
State New
Headers show
Series [FFmpeg-devel,v3] lavc/libvpx: remove thread limit | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Dmitrii Ovchinnikov Jan. 5, 2023, 10:41 a.m. UTC
From: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>

This change improves the performance and multicore
 scalability of the vp9 codec for streaming single-pass encoded videos. The
 current thread limit for ffmpeg codecs is 16 (MAX_AUTO_THREADS in
 pthread_internal.h) due to a limitation in H.264 codec that prevents more
 than 16 threads being used.

Increasing the thread limit to 64 for vp9 improves
the performance for encoding 4K raw videos for streaming by up to 47%
compared to 16 threads, and from 20-30% for 32 threads, with the same quality
as measured by the VMAF score.

Did not need to add a check for limit in libvpx as it is already present 
in libvpx/vp9/vp9_cx_iface.c:
  RANGE_CHECK_HI(cfg, g_threads, 64);
As demonstrated by following message when -threads is set to anything more than 64
[libvpx-vp9 @ 0x30ed380]   Additional information: g_threads out of range [..64]

---
 libavcodec/libvpxdec.c | 2 +-
 libavcodec/libvpxenc.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Tomas Härdin Jan. 5, 2023, 5:14 p.m. UTC | #1
tor 2023-01-05 klockan 11:41 +0100 skrev Dmitrii Ovchinnikov:
> From: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>
> 
> This change improves the performance and multicore
>  scalability of the vp9 codec for streaming single-pass encoded
> videos. The
>  current thread limit for ffmpeg codecs is 16 (MAX_AUTO_THREADS in
>  pthread_internal.h) due to a limitation in H.264 codec that prevents
> more
>  than 16 threads being used.

This limitation should be restricted to H.264 IMO, not applied to all
codecs wholesale. I ran into this issue with jpeg2000dec.

> 
> Increasing the thread limit to 64 for vp9 improves
> the performance for encoding 4K raw videos for streaming by up to 47%
> compared to 16 threads, and from 20-30% for 32 threads, with the same
> quality
> as measured by the VMAF score.
> 
> Did not need to add a check for limit in libvpx as it is already
> present 
> in libvpx/vp9/vp9_cx_iface.c:
>   RANGE_CHECK_HI(cfg, g_threads, 64);

Perhaps we should have a table of known max number of threads per
codec?

/Tomas
James Zern Jan. 10, 2023, 3:06 a.m. UTC | #2
On Thu, Jan 5, 2023 at 2:42 AM Dmitrii Ovchinnikov
<ovchinnikov.dmitrii@gmail.com> wrote:
>
> From: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>
>
> This change improves the performance and multicore
>  scalability of the vp9 codec for streaming single-pass encoded videos. The
>  current thread limit for ffmpeg codecs is 16 (MAX_AUTO_THREADS in
>  pthread_internal.h) due to a limitation in H.264 codec that prevents more
>  than 16 threads being used.
>
> Increasing the thread limit to 64 for vp9 improves
> the performance for encoding 4K raw videos for streaming by up to 47%
> compared to 16 threads, and from 20-30% for 32 threads, with the same quality
> as measured by the VMAF score.
>
> Did not need to add a check for limit in libvpx as it is already present
> in libvpx/vp9/vp9_cx_iface.c:
>   RANGE_CHECK_HI(cfg, g_threads, 64);
> As demonstrated by following message when -threads is set to anything more than 64
> [libvpx-vp9 @ 0x30ed380]   Additional information: g_threads out of range [..64]
>

The difference is that it will cause a failure rather than silently
capping the value which might break some workflows. libaomenc caps at
64, so it may be best to match that implementation for now. I'll make
that change before submitting unless there are other comments.

With this change there will still be the misleading output from
validate_thread_parameters() in pthread.c, though it's not a unique
problem to libvpxenc:
[libvpx-vp9 @ 0x5639aa4c0200] Application has requested 64 threads.
Using a thread count greater than 16 is not recommended.
Jun Zhao Jan. 10, 2023, 10:47 a.m. UTC | #3
On Thu, Jan 5, 2023 at 6:42 PM Dmitrii Ovchinnikov
<ovchinnikov.dmitrii@gmail.com> wrote:
>
> From: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>
>
> This change improves the performance and multicore
>  scalability of the vp9 codec for streaming single-pass encoded videos. The
>  current thread limit for ffmpeg codecs is 16 (MAX_AUTO_THREADS in
>  pthread_internal.h) due to a limitation in H.264 codec that prevents more
>  than 16 threads being used.
>
> Increasing the thread limit to 64 for vp9 improves
> the performance for encoding 4K raw videos for streaming by up to 47%
> compared to 16 threads, and from 20-30% for 32 threads, with the same quality
> as measured by the VMAF score.
>
> Did not need to add a check for limit in libvpx as it is already present
> in libvpx/vp9/vp9_cx_iface.c:
>   RANGE_CHECK_HI(cfg, g_threads, 64);
> As demonstrated by following message when -threads is set to anything more than 64
> [libvpx-vp9 @ 0x30ed380]   Additional information: g_threads out of range [..64]
>
> ---
>  libavcodec/libvpxdec.c | 2 +-
>  libavcodec/libvpxenc.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/libvpxdec.c b/libavcodec/libvpxdec.c
> index 9cd2c56caf..19407092d0 100644
> --- a/libavcodec/libvpxdec.c
> +++ b/libavcodec/libvpxdec.c
> @@ -88,7 +88,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
>                              const struct vpx_codec_iface *iface)
>  {
>      struct vpx_codec_dec_cfg deccfg = {
> -        .threads = FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16)
> +        .threads = avctx->thread_count ? avctx->thread_count : av_cpu_count()
>      };
>
>      av_log(avctx, AV_LOG_INFO, "%s\n", vpx_codec_version_str());
> diff --git a/libavcodec/libvpxenc.c b/libavcodec/libvpxenc.c
> index 9aa5510c28..0627e13973 100644
> --- a/libavcodec/libvpxenc.c
> +++ b/libavcodec/libvpxenc.c
> @@ -942,7 +942,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
>      enccfg.g_timebase.num = avctx->time_base.num;
>      enccfg.g_timebase.den = avctx->time_base.den;
>      enccfg.g_threads      =
> -        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16);
> +        avctx->thread_count ? avctx->thread_count : av_cpu_count();
>      enccfg.g_lag_in_frames= ctx->lag_in_frames;
>
Do you check the change with the old version libvpx?  as I know, older
versions libvpx setting the number of threads higher than 16 will
cause a crash, so I think at least a version check needs to be added

>      if (avctx->flags & AV_CODEC_FLAG_PASS1)
> --
> 2.38.1.windows.1
James Zern Jan. 11, 2023, 1:23 a.m. UTC | #4
On Tue, Jan 10, 2023 at 2:47 AM mypopy@gmail.com <mypopy@gmail.com> wrote:
>
> On Thu, Jan 5, 2023 at 6:42 PM Dmitrii Ovchinnikov
> <ovchinnikov.dmitrii@gmail.com> wrote:
> [...]
> > diff --git a/libavcodec/libvpxenc.c b/libavcodec/libvpxenc.c
> > index 9aa5510c28..0627e13973 100644
> > --- a/libavcodec/libvpxenc.c
> > +++ b/libavcodec/libvpxenc.c
> > @@ -942,7 +942,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
> >      enccfg.g_timebase.num = avctx->time_base.num;
> >      enccfg.g_timebase.den = avctx->time_base.den;
> >      enccfg.g_threads      =
> > -        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16);
> > +        avctx->thread_count ? avctx->thread_count : av_cpu_count();
> >      enccfg.g_lag_in_frames= ctx->lag_in_frames;
> >
> Do you check the change with the old version libvpx?  as I know, older
> versions libvpx setting the number of threads higher than 16 will
> cause a crash, so I think at least a version check needs to be added
>

Do you have a bug or version in mind? There were some performance
regressions [1] over the releases and some issues with changing the
number of tiles, but I don't remember a crash for a high thread count
(though there have been plenty of crashes related to threads in
general [2]). The range check predates 1.4.0, which is the minimum
required by ffmpeg.

[1] https://crbug.com/webm/1574
[2] https://crbug.com/webm/851
James Zern Jan. 17, 2023, 10:06 p.m. UTC | #5
On Tue, Jan 10, 2023 at 5:23 PM James Zern <jzern@google.com> wrote:
>
> On Tue, Jan 10, 2023 at 2:47 AM mypopy@gmail.com <mypopy@gmail.com> wrote:
> >
> > On Thu, Jan 5, 2023 at 6:42 PM Dmitrii Ovchinnikov
> > <ovchinnikov.dmitrii@gmail.com> wrote:
> > [...]
> > > diff --git a/libavcodec/libvpxenc.c b/libavcodec/libvpxenc.c
> > > index 9aa5510c28..0627e13973 100644
> > > --- a/libavcodec/libvpxenc.c
> > > +++ b/libavcodec/libvpxenc.c
> > > @@ -942,7 +942,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
> > >      enccfg.g_timebase.num = avctx->time_base.num;
> > >      enccfg.g_timebase.den = avctx->time_base.den;
> > >      enccfg.g_threads      =
> > > -        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16);
> > > +        avctx->thread_count ? avctx->thread_count : av_cpu_count();
> > >      enccfg.g_lag_in_frames= ctx->lag_in_frames;
> > >
> > Do you check the change with the old version libvpx?  as I know, older
> > versions libvpx setting the number of threads higher than 16 will
> > cause a crash, so I think at least a version check needs to be added
> >
>
> Do you have a bug or version in mind? There were some performance
> regressions [1] over the releases and some issues with changing the
> number of tiles, but I don't remember a crash for a high thread count
> (though there have been plenty of crashes related to threads in
> general [2]). The range check predates 1.4.0, which is the minimum
> required by ffmpeg.
>

I merged the changes without a check. If there is a problem with a
specific version we can add a check.

> [1] https://crbug.com/webm/1574
> [2] https://crbug.com/webm/851
diff mbox series

Patch

diff --git a/libavcodec/libvpxdec.c b/libavcodec/libvpxdec.c
index 9cd2c56caf..19407092d0 100644
--- a/libavcodec/libvpxdec.c
+++ b/libavcodec/libvpxdec.c
@@ -88,7 +88,7 @@  static av_cold int vpx_init(AVCodecContext *avctx,
                             const struct vpx_codec_iface *iface)
 {
     struct vpx_codec_dec_cfg deccfg = {
-        .threads = FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16)
+        .threads = avctx->thread_count ? avctx->thread_count : av_cpu_count()
     };
 
     av_log(avctx, AV_LOG_INFO, "%s\n", vpx_codec_version_str());
diff --git a/libavcodec/libvpxenc.c b/libavcodec/libvpxenc.c
index 9aa5510c28..0627e13973 100644
--- a/libavcodec/libvpxenc.c
+++ b/libavcodec/libvpxenc.c
@@ -942,7 +942,7 @@  static av_cold int vpx_init(AVCodecContext *avctx,
     enccfg.g_timebase.num = avctx->time_base.num;
     enccfg.g_timebase.den = avctx->time_base.den;
     enccfg.g_threads      =
-        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16);
+        avctx->thread_count ? avctx->thread_count : av_cpu_count();
     enccfg.g_lag_in_frames= ctx->lag_in_frames;
 
     if (avctx->flags & AV_CODEC_FLAG_PASS1)