From patchwork Mon Aug 19 14:23:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Araz
X-Patchwork-Id: 51079
From: Araz Iusubov
To: ffmpeg-devel@ffmpeg.org
Cc: Evgeny Pavlov
Date: Mon, 19 Aug 2024 16:23:55 +0200
Message-ID: <20240819142355.1967-1-Primeadvice@gmail.com>
X-Mailer: git-send-email 2.45.2.windows.1
Subject: [FFmpeg-devel] [PATCH, v2] avcodec/amfenc: increase precision of Sleep() on Windows

From: Evgeny Pavlov

This commit increases the precision of the Sleep() function on Windows.
The fix reduces the effective sleep time on Windows, which improves AMF
encoding performance on low-resolution input videos.

Fix for issue #10622

We evaluated CreateWaitableTimerExW with the
CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag. In fact, this function has
the same precision level as the Sleep() function.

Changing the timer resolution usually affects only the current process
and does not impact other processes, so it does not have a global effect
on the system. Here is the relevant passage from the documentation on
timeBeginPeriod
https://learn.microsoft.com/en-us/windows/win32/api/timeapi/nf-timeapi-timebeginperiod

"Prior to Windows 10, version 2004, this function affects a global
Windows setting. For all processes Windows uses the lowest value (that
is, highest resolution) requested by any process. Starting with Windows
10, version 2004, this function no longer affects global timer
resolution. For processes which call this function, Windows uses the
lowest value (that is, highest resolution) requested by any process. For
processes which have not called this function, Windows does not
guarantee a higher resolution than the default system resolution."

We provide the following measurements to show the performance
improvement with this patch.

1. Performance tests show that the high-precision sleep improves
performance, especially for low-resolution sequences, where the
improvement reaches about 20%.

Frames per second (FPS) encoded by the hardware encoder
(Navi 31 RX7900XT):

Source Type: H.264, Output Type: H.264

No. | Sequence Resolution | No. of Frames | FPS before patch | FPS after patch | Difference | Improvement %
----|---------------------|---------------|------------------|-----------------|------------|--------------
1   | 480x360             | 8290          | 2030             | 2365            | 335        | 16.5%
2   | 720x576             | 8290          | 1440             | 1790            | 350        | 24.3%
3   | 1280x720            | 8290          | 1120             | 1190            | 70         | 6.3%
4   | 1920x1080           | 8290          | 692              | 714             | 22         | 3.2%
5   | 3840x2160           | 8290          | 200              | 200             | 0          | 0.0%

The sample ffmpeg command line:
$ ffmpeg.exe -y -hwaccel d3d11va -hwaccel_output_format d3d11 -i input.mp4 -c:v h264_amf out.mp4
where input.mp4 should be replaced by an H.264 bitstream of the
corresponding resolution.

2. The power tests show that the increase in power consumption stays
within a limited range.

The purpose of the power test is to examine the increase in CPU power
consumption caused by the higher timer resolution that this patch
requests. We tested an AMD product called Phoenix, which we refer to as
an APU. It combines a general-purpose AMD CPU and a 3D integrated
graphics processing unit (IGPU) on a single die. Only the APU has a DAP
connector to the board's power rails.

We got the power test data shown below:

|                        | 480x360 | 720x576 | 1280x720 | 1920x1080 | 3840x2160 | average
|------------------------|---------|---------|----------|-----------|-----------|--------
| CPU power change       | 1.93%   | 2.43%   | -1.69%   | 3.49%     | 2.92%     | 1.82%
| APU power total change | 0.86%   | 1.34%   | -0.62%   | 1.54%     | -0.58%    | 0.51%

With the high-precision clock enabled by this patch, the average CPU
power consumption increases by 1.82% and the APU total increases by
0.51%. The increase in power consumption is therefore not very
significant.

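
The derived columns in the two tables above can be rechecked with a few
lines of C. This is only a sketch for verifying the arithmetic: the
helper names (improvement_pct, mean, print_derived_columns) are
hypothetical and not part of the patch, and the input figures are copied
from the message as quoted.

```c
#include <assert.h>
#include <stdio.h>

/* Relative gain in percent: (after - before) / before * 100. */
static double improvement_pct(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Arithmetic mean of n values. */
static double mean(const double *v, int n)
{
    double sum = 0.0;

    assert(n > 0);
    for (int i = 0; i < n; i++)
        sum += v[i];
    return sum / n;
}

/* Recompute the derived columns from the figures quoted in the message. */
static void print_derived_columns(void)
{
    /* { FPS before patch, FPS after patch } per resolution */
    static const double fps[][2] = {
        { 2030, 2365 }, { 1440, 1790 }, { 1120, 1190 }, { 692, 714 }, { 200, 200 },
    };
    /* Per-resolution power changes in percent */
    static const double cpu[] = { 1.93, 2.43, -1.69, 3.49, 2.92 };
    static const double apu[] = { 0.86, 1.34, -0.62, 1.54, -0.58 };

    for (int i = 0; i < 5; i++)
        printf("row %d: %+.1f%%\n", i + 1, improvement_pct(fps[i][0], fps[i][1]));
    printf("CPU average: %.2f%%\n", mean(cpu, 5));
    printf("APU average: %.2f%%\n", mean(apu, 5));
}
```

Calling print_derived_columns() from a small main() prints the
recomputed "Improvement %" column and the two power averages for
comparison with the tables.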
Signed-off-by: Evgeny Pavlov
---
 libavcodec/amfenc.c | 31 +++++++++++++++++++++++++++++++
 libavcodec/amfenc.h |  3 +++
 2 files changed, 34 insertions(+)

diff --git a/libavcodec/amfenc.c b/libavcodec/amfenc.c
index 061859f85c..55e24856e8 100644
--- a/libavcodec/amfenc.c
+++ b/libavcodec/amfenc.c
@@ -42,7 +42,12 @@
 #endif
 
 #ifdef _WIN32
+#include
 #include "compat/w32dlfcn.h"
+
+typedef MMRESULT (*timeapi_fun)(UINT uPeriod);
+#define WINMM_DLL "winmm.dll"
+
 #else
 #include
 #endif
@@ -113,6 +118,9 @@ static int amf_load_library(AVCodecContext *avctx)
     AMFInit_Fn init_fun;
     AMFQueryVersion_Fn version_fun;
     AMF_RESULT res;
+#ifdef _WIN32
+    timeapi_fun time_begin_fun;
+#endif
 
     ctx->delayed_frame = av_frame_alloc();
     if (!ctx->delayed_frame) {
@@ -145,6 +153,16 @@ static int amf_load_library(AVCodecContext *avctx)
     AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN, "GetTrace() failed with error %d\n", res);
     res = ctx->factory->pVtbl->GetDebug(ctx->factory, &ctx->debug);
     AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN, "GetDebug() failed with error %d\n", res);
+
+#ifdef _WIN32
+    // Increase precision of Sleep() function on Windows platform
+    ctx->winmm_lib = dlopen(WINMM_DLL, RTLD_NOW | RTLD_LOCAL);
+    AMF_RETURN_IF_FALSE(ctx, ctx->winmm_lib != NULL, 0, "DLL %s failed to open\n", WINMM_DLL);
+    time_begin_fun = (timeapi_fun)dlsym(ctx->winmm_lib, "timeBeginPeriod");
+    AMF_RETURN_IF_FALSE(ctx, time_begin_fun != NULL, 0, "DLL %s failed to find function %s\n", WINMM_DLL, "timeBeginPeriod");
+    time_begin_fun(1);
+#endif //_WIN32
+
     return 0;
 }
 
@@ -375,6 +393,9 @@ static int amf_init_encoder(AVCodecContext *avctx)
 int av_cold ff_amf_encode_close(AVCodecContext *avctx)
 {
     AmfContext *ctx = avctx->priv_data;
+#ifdef _WIN32
+    timeapi_fun time_end_fun;
+#endif //_WIN32
 
     if (ctx->delayed_surface) {
         ctx->delayed_surface->pVtbl->Release(ctx->delayed_surface);
@@ -410,6 +431,16 @@ int av_cold ff_amf_encode_close(AVCodecContext *avctx)
     av_frame_free(&ctx->delayed_frame);
     av_fifo_freep2(&ctx->timestamp_list);
 
+#ifdef _WIN32
+    if (ctx->winmm_lib) {
+        time_end_fun = (timeapi_fun)dlsym(ctx->winmm_lib, "timeEndPeriod");
+        AMF_RETURN_IF_FALSE(ctx, time_end_fun != NULL, 0, "DLL %s failed to find function %s\n", WINMM_DLL, "timeEndPeriod");
+        time_end_fun(1);
+        dlclose(ctx->winmm_lib);
+        ctx->winmm_lib = NULL;
+    }
+#endif //_WIN32
+
     return 0;
 }
 
diff --git a/libavcodec/amfenc.h b/libavcodec/amfenc.h
index 2dbd378ef8..35bcf1dfe3 100644
--- a/libavcodec/amfenc.h
+++ b/libavcodec/amfenc.h
@@ -50,6 +50,9 @@ typedef struct AmfContext {
     AVClass *avclass;
     // access to AMF runtime
     amf_handle library; ///< handle to DLL library
+#ifdef _WIN32
+    amf_handle winmm_lib; ///< handle to winmm DLL library
+#endif //_WIN32
     AMFFactory *factory; ///< pointer to AMF factory
     AMFDebug *debug; ///< pointer to AMF debug interface
     AMFTrace *trace; ///< pointer to AMF trace interface
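
For readers who want to observe the effect outside of FFmpeg, the
following is a minimal standalone sketch of the same idea: resolve
timeBeginPeriod/timeEndPeriod from winmm.dll at run time, request a 1 ms
period, and time a series of Sleep(1) calls. It is not the code from the
patch. The names avg_sleep_us, period_fn, now_us and sleep_1ms are
hypothetical, the Win32 loader calls are used directly instead of
FFmpeg's dlopen wrappers, and the non-Windows branch is only an assumed
fallback so that the sketch builds elsewhere.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 200809L /* for nanosleep() and clock_gettime() */
#endif
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* timeBeginPeriod() and timeEndPeriod() share this signature. */
typedef MMRESULT (WINAPI *period_fn)(UINT);

static double now_us(void)
{
    LARGE_INTEGER freq, t;

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t);
    return (double)t.QuadPart * 1e6 / (double)freq.QuadPart;
}

static void sleep_1ms(void)
{
    Sleep(1);
}
#else
#include <time.h>

static double now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e6 + (double)ts.tv_nsec / 1e3;
}

static void sleep_1ms(void)
{
    struct timespec req = { 0, 1000000L }; /* 1 ms */

    nanosleep(&req, NULL);
}
#endif

/*
 * Average duration, in microseconds, of `iters` one-millisecond sleeps.
 * On Windows, a non-zero `raise` requests a 1 ms timer period for the
 * duration of the measurement; if the request cannot be made, the
 * default resolution is measured instead. `raise` is ignored elsewhere.
 */
static double avg_sleep_us(int iters, int raise)
{
    double start, total;

    assert(iters > 0);
#ifdef _WIN32
    HMODULE winmm = NULL;
    period_fn begin = NULL, end = NULL;
    int raised = 0;

    if (raise) {
        winmm = LoadLibraryA("winmm.dll");
        if (winmm) {
            begin = (period_fn)GetProcAddress(winmm, "timeBeginPeriod");
            end   = (period_fn)GetProcAddress(winmm, "timeEndPeriod");
        }
        /* Only raise the resolution if it can be restored afterwards. */
        if (begin && end && begin(1) == TIMERR_NOERROR)
            raised = 1;
    }
#else
    (void)raise;
#endif
    start = now_us();
    for (int i = 0; i < iters; i++)
        sleep_1ms();
    total = now_us() - start;
#ifdef _WIN32
    if (raised)
        end(1); /* each successful begin is paired with one end */
    if (winmm)
        FreeLibrary(winmm);
#endif
    return total / iters;
}
```

Comparing avg_sleep_us(100, 0) with avg_sleep_us(100, 1) on a Windows
machine gives a rough picture of how much shorter each Sleep(1) becomes
once the 1 ms period is requested. The sketch resolves both functions
before calling timeBeginPeriod so that the request is never left
without a matching timeEndPeriod.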