From patchwork Wed Sep 28 01:51:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dmitrii Ovchinnikov
X-Patchwork-Id: 38393
From: OvchinnikovDmitrii
To: ffmpeg-devel@ffmpeg.org
Cc: OvchinnikovDmitrii
Date: Wed, 28 Sep 2022 03:51:49 +0200
Message-Id: <20220928015149.1830-1-ovchinnikov.dmitrii@gmail.com>
X-Mailer: git-send-email 2.30.0.windows.2
Subject: [FFmpeg-devel] [PATCH] lavc/libvpx: increase thread limit to 64
List-Id: FFmpeg development discussions and patches

This change improves the performance and multicore scalability of the
VP9 codec for streaming single-pass encoded videos by taking advantage
of up to 64 cores in the system. The current thread limit for ffmpeg
codecs is 16 (MAX_AUTO_THREADS in pthread_internal.h) due to a
limitation in the H.264 codec that prevents more than 16 threads from
being used.

Our experiments show that increasing the thread limit to 64 for VP9
improves the performance of encoding 4K raw videos for streaming by up
to 47% compared to 16 threads, and by 20-30% for 32 threads, with the
same quality as measured by the VMAF score.

Rationale for our change:

VP9 uses tiling to split the video frame into multiple columns; tiles
must be at least 256 pixels wide, so there is a limit to how many tiles
can be used. The tiles can be processed in parallel, and more tiles
mean more CPU threads can be used. 4K videos can make use of 16
threads, and 8K videos can use 32. Row-mt can double the number of
threads, so 64 threads can be used.
---
 libavcodec/libvpx.h    | 3 +++
 libavcodec/libvpxdec.c | 2 +-
 libavcodec/libvpxenc.c | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavcodec/libvpx.h b/libavcodec/libvpx.h
index 0caed8cdcb..c0615e96a4 100644
--- a/libavcodec/libvpx.h
+++ b/libavcodec/libvpx.h
@@ -25,6 +25,9 @@
 
 #include "codec_internal.h"
 
+/* Increase max threads for libvpx from 16 to 64 to benefit 4K/8K video encoding. */
+#define MAX_VPX_THREADS 64
+
 void ff_vp9_init_static(FFCodec *codec);
 #if 0
 enum AVPixelFormat ff_vpx_imgfmt_to_pixfmt(vpx_img_fmt_t img);
diff --git a/libavcodec/libvpxdec.c b/libavcodec/libvpxdec.c
index c5b95332d3..819af47505 100644
--- a/libavcodec/libvpxdec.c
+++ b/libavcodec/libvpxdec.c
@@ -89,7 +89,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
                             const struct vpx_codec_iface *iface)
 {
     struct vpx_codec_dec_cfg deccfg = {
-        .threads = FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16)
+        .threads = FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), MAX_VPX_THREADS)
     };
 
     av_log(avctx, AV_LOG_INFO, "%s\n", vpx_codec_version_str());
diff --git a/libavcodec/libvpxenc.c b/libavcodec/libvpxenc.c
index 5b7c7735a1..baa90957c5 100644
--- a/libavcodec/libvpxenc.c
+++ b/libavcodec/libvpxenc.c
@@ -939,7 +939,7 @@ static av_cold int vpx_init(AVCodecContext *avctx,
     enccfg.g_timebase.num = avctx->time_base.num;
     enccfg.g_timebase.den = avctx->time_base.den;
     enccfg.g_threads =
-        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), 16);
+        FFMIN(avctx->thread_count ? avctx->thread_count : av_cpu_count(), MAX_VPX_THREADS);
     enccfg.g_lag_in_frames= ctx->lag_in_frames;
 
     if (avctx->flags & AV_CODEC_FLAG_PASS1)
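
Note on the thread arithmetic in the rationale: the rough sketch below
reproduces the numbers quoted in the commit message (16 threads for 4K,
32 for 8K, doubled by row-mt). It is only an estimate under stated
assumptions. The helper estimate_useful_threads() is hypothetical and is
not part of FFmpeg or libvpx, and the frame widths 4096 and 8192 are
assumed values for "4K" and "8K". The sketch applies only the 256-pixel
minimum tile width and ignores any further constraints libvpx places on
tile layout (for example, VP9 tile column counts are powers of two).

```c
#include <assert.h>

#define MIN_TILE_WIDTH  256  /* minimum VP9 tile width, per the commit message */
#define MAX_VPX_THREADS 64   /* cap introduced by this patch */

/*
 * Hypothetical helper (not an FFmpeg or libvpx function): estimate how
 * many threads tile-column parallelism can keep busy for a given frame
 * width. row_mt != 0 models the commit message's claim that row-based
 * multithreading can double the usable thread count.
 */
static int estimate_useful_threads(int frame_width, int row_mt)
{
    int tile_cols, threads;

    assert(frame_width > 0);

    /* One tile column per full 256 pixels of width, at least one tile. */
    tile_cols = frame_width / MIN_TILE_WIDTH;
    if (tile_cols < 1)
        tile_cols = 1;

    threads = row_mt ? 2 * tile_cols : tile_cols;

    /* Never exceed the limit the patch passes to libvpx. */
    return threads < MAX_VPX_THREADS ? threads : MAX_VPX_THREADS;
}
```

Under these assumptions, a 4096-wide frame gives 16 threads without
row-mt and 32 with it, and an 8192-wide frame gives 32 and 64, which is
where the previous cap of 16 becomes the bottleneck.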