From patchwork Thu Oct 6 13:49:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dmitrii Ovchinnikov
X-Patchwork-Id: 38584
From: OvchinnikovDmitrii
To: ffmpeg-devel@ffmpeg.org
Cc: OvchinnikovDmitrii
Date: Thu, 6 Oct 2022 15:49:59 +0200
Message-Id: <20221006134959.771-1-ovchinnikov.dmitrii@gmail.com>
X-Mailer: git-send-email 2.30.0.windows.2
Subject: [FFmpeg-devel] [PATCH V2] lavc/libvpx: increase thread limit to 64
List-Id: FFmpeg development discussions and patches

This change improves the performance and multicore scalability of the VP9
codec for streaming single-pass encoded videos by taking advantage of
up to 64 cores in the system. The current thread limit for ffmpeg codecs
is 16 (MAX_AUTO_THREADS in pthread_internal.h) due to a limitation in the
H.264 codec that prevents more than 16 threads from being used.

Experiments show that increasing the thread limit to 64 for VP9 improves
the performance of encoding 4K raw videos for streaming by up to 47%
compared to 16 threads, and by 20-30% with 32 threads, with the same
quality as measured by the VMAF score.

Rationale for this change:
VP9 uses tiling to split the video frame into multiple columns; tiles
must be at least 256 pixels wide, so there is a limit to how many tiles
can be used. The tiles can be processed in parallel, and more tiles mean
more CPU threads can be used. 4K videos can make use of 16 threads, and
8K videos can use 32. Row-mt can double the number of threads, so 64
threads can be used.
---
 libavcodec/libvpx.h    | 2 ++
 libavcodec/libvpxdec.c | 2 +-
 libavcodec/libvpxenc.c | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/libavcodec/libvpx.h b/libavcodec/libvpx.h
index 0caed8cdcb..331feb8745 100644
--- a/libavcodec/libvpx.h
+++ b/libavcodec/libvpx.h
@@ -25,6 +25,8 @@
 
 #include "codec_internal.h"
 
+#define MAX_VPX_THREADS 64
+
 void ff_vp9_init_static(FFCodec *codec);
 #if 0
 enum AVPixelFormat ff_vpx_imgfmt_to_pixfmt(vpx_img_fmt_t img);
diff --git a/libavcodec/libvpxdec.c b/libavcodec/libvpxdec.c
index 9cd2c56caf..0ae19c3f72 100644
--- a/libavcodec/libvpxdec.c
+++ b/libavcodec/libvpxdec.c
@@ -88,7 +88,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
                             const struct vpx_codec_iface *iface)
 {
     struct vpx_codec_dec_cfg deccfg = {
-        .threads = FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16)
+        .threads = FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), MAX_VPX_THREADS)
     };
 
     av_log(avctx, AV_LOG_INFO, "%s\n", vpx_codec_version_str());
diff --git a/libavcodec/libvpxenc.c b/libavcodec/libvpxenc.c
index 667cffc200..3ff86ad08d 100644
--- a/libavcodec/libvpxenc.c
+++ b/libavcodec/libvpxenc.c
@@ -942,7 +942,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
     enccfg.g_timebase.num = avctx->time_base.num;
     enccfg.g_timebase.den = avctx->time_base.den;
     enccfg.g_threads =
-        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16);
+        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), MAX_VPX_THREADS);
     enccfg.g_lag_in_frames= ctx->lag_in_frames;
 
     if (avctx->flags & AV_CODEC_FLAG_PASS1)
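
Illustration only, not part of the patch: the thread arithmetic from the
commit message can be sketched as a small standalone C fragment. The helper
names pick_vpx_threads() and useful_threads() and the MIN_TILE_WIDTH macro
are hypothetical and do not exist in FFmpeg or libvpx. The second helper
only reproduces the rough estimate given in the rationale (one thread per
256-pixel tile column, doubled with row-mt); the tile layout libvpx actually
chooses may be subject to further constraints.

```c
#include <assert.h>

/* Cap introduced by the patch. */
#define MAX_VPX_THREADS 64

/* Assumption: minimum tile width in pixels, taken from the commit message. */
#define MIN_TILE_WIDTH 256

/* Hypothetical helper mirroring the FFMIN(...) expression in the patch.
 * requested == 0 means "auto", in which case the CPU count is used. */
static int pick_vpx_threads(int requested, int cpu_count)
{
    int n = requested ? requested : cpu_count;
    return n < MAX_VPX_THREADS ? n : MAX_VPX_THREADS;
}

/* Hypothetical helper: rough upper bound on useful threads for a given
 * frame width, one per tile column, doubled when row-mt is enabled. */
static int useful_threads(int width, int row_mt)
{
    int tiles = width / MIN_TILE_WIDTH;
    if (tiles < 1)
        tiles = 1;
    return row_mt ? 2 * tiles : tiles;
}

/* Usage example: the figures quoted in the commit message. */
static void check_examples(void)
{
    assert(useful_threads(4096, 0) == 16);   /* 4K, tiles only   */
    assert(useful_threads(8192, 0) == 32);   /* 8K, tiles only   */
    assert(useful_threads(8192, 1) == 64);   /* 8K with row-mt   */
    assert(pick_vpx_threads(0, 128) == 64);  /* auto, capped     */
    assert(pick_vpx_threads(8, 128) == 8);   /* explicit request */
}
```

With the old cap of 16, the last estimate above (64 threads for 8K with
row-mt) could never be reached, which is the motivation given for raising
the limit.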