From patchwork Tue Feb 8 03:05:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chen, Wenbin" X-Patchwork-Id: 34159 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6602:2c4e:0:0:0:0 with SMTP id x14csp415857iov; Mon, 7 Feb 2022 19:06:07 -0800 (PST) X-Google-Smtp-Source: ABdhPJxqbwoqHnD+4TNczNR9iiIBDpPlbTJVQP6ufvEXcZzdP2LnsHGwm2IsQt0ji4wmwwTERkhR X-Received: by 2002:a17:907:60cf:: with SMTP id hv15mr1931296ejc.488.1644289567433; Mon, 07 Feb 2022 19:06:07 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644289567; cv=none; d=google.com; s=arc-20160816; b=Qc7DZJNCSrUlQRq5nzGSm+O9JLXADMCUJZk2hUJmHOf9NYotFu0f6+JAJOX07WjgMt Liylah8NkFpclccObYEFx/hTURmfovXbSm5hUg+a4LwA4Hfcc1DYO2jEvpVXbSRmFud1 3wztfM5jSW3MYybYXXU4Ts59umAeELyoRT9LqfjhV52rMUFwByHPs4y9Y7bQ62YSPsLd sZRaWQw4yuL++U+2j9I53u+yOBq0vV+Nt0L8qToMVfuocTGwALgIUMO7hcRJ/YcLe+gL 0fsOUZz1Uv0ZTIm8oeqWWH8RL3s0BtM6+ZAh7FcxFADje3mxq55iV1jDlc63P6oAzPHF n+TA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :dkim-signature:delivered-to; bh=IRUc/qPOHoZuCtUoSyDJ/nygSFEiLE39MBBMV8ZCtPc=; b=UdUgMr7lIVwpidfxkGUTW0dIUgZSAE1qlq58/HV43REdcgn2c/2BozVUDdCsQ9nBl3 uSnsagMdvZgXfcaMRgUBuozZRNaZtFsxY6aIEs2jPfZfE5SB92IyZ1Hzx4A17jsOVzpw CHluJyXLGTOdAuimxEBw63OHmMzdRmor2YsAeBO0KoPo5ffhSTFRrALgXVj2vLZeCYWX bK+CiXSoHqX9a/bdrspAbnhx3gbjuTqX0L3I5TYzblUoEwfPPcCKJ8nESqxAZYsTv5xW Y3oWmKawJZZhrsyF1xISdEo56Wj5wbb8kpmGLxlpgFu41mBfUXQ5y0gGM0yLga66LTeo eSpQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=E8LbtOFv; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id en11si7112050edb.551.2022.02.07.19.06.06; Mon, 07 Feb 2022 19:06:07 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=E8LbtOFv; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1331668B115; Tue, 8 Feb 2022 05:06:02 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 7B39868ADD4 for ; Tue, 8 Feb 2022 05:05:54 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1644289559; x=1675825559; h=from:to:subject:date:message-id:mime-version: content-transfer-encoding; bh=QiCNCp/gDTlUbYKcz3Uzdh5pokNLv3YzLCRiV5eancc=; b=E8LbtOFvn+bydCVDeUtewVagV/dRsG5rhCxdmxux6oMqiSzdsgTTtC49 qHPYkvBK8wai/0uvjdDfy0W73JAYme/SpUmXq1lGxAwI9F0T6hLEhUdXV MvADxIOR4teMeYCIxFg0tPko/q+61tVJFSKlfjDrte6sr08g5kyjQKJa2 qQaD4LSb1JLXsBQgJjPp5WViJnIvxjKDkRBQvOCm4E1Tf++hzityemJ8p w5M8vwiGd+Cik6mBefgm2QTz5PdqWvMFZgCk4Vi8zZIvMhhjv/owmXM9I +N/dxaW011la4dwATF1VHuBwS27q0Mp31ybKZeQ90xJxquGmx33kTw2nu w==; X-IronPort-AV: E=McAfee;i="6200,9189,10251"; a="228829619" X-IronPort-AV: E=Sophos;i="5.88,351,1635231600"; d="scan'208";a="228829619" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Feb 2022 19:05:52 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,351,1635231600"; 
d="scan'208";a="628742453"
Received: from wenbin-z390-aorus-ultra.sh.intel.com ([10.239.35.110]) by fmsmga002.fm.intel.com with ESMTP; 07 Feb 2022 19:05:51 -0800
From: Wenbin Chen
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 8 Feb 2022 11:05:47 +0800
Message-Id: <20220208030549.340748-1-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.32.0
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH V3 1/3] libavcodec/vaapi_encode: Add new API adaption to vaapi_encode
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: OxeaixE3uDl8

Add vaSyncBuffer() support to the VAAPI encoder. The older API,
vaSyncSurface(), waits for a surface to complete. When a surface is used
for multiple operations, it waits for all of them to finish.
vaSyncBuffer() waits only for one channel (the operation producing the
given output buffer) to finish.

Add a wait parameter to vaapi_encode_wait() to prepare for the
async_depth option. "wait=1" means block until the operation is complete.
"wait=0" means only query the operation's status: return 0 if it is
complete, or AVERROR(EAGAIN) if it is still in progress.
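The wait semantics described above can be illustrated with a minimal, self-contained sketch. It does not call libva or FFmpeg: SketchOp, sketch_op_ready() and sketch_wait() are hypothetical names invented for this example, and -EAGAIN only stands in for AVERROR(EAGAIN).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for one pending hardware operation. It becomes
 * ready after `polls_left` status queries. Not part of libva or FFmpeg. */
typedef struct SketchOp {
    int polls_left;
} SketchOp;

/* Non-blocking status query: returns 1 once the operation is ready. */
static int sketch_op_ready(SketchOp *op)
{
    if (op->polls_left > 0) {
        op->polls_left--;
        return 0;
    }
    return 1;
}

/* Mirrors the semantics of the new wait parameter:
 *   wait != 0: block until the operation is ready, then return 0
 *   wait == 0: query once, return 0 if ready, -EAGAIN otherwise */
static int sketch_wait(SketchOp *op, int wait)
{
    assert(op);
    if (!wait)
        return sketch_op_ready(op) ? 0 : -EAGAIN;
    while (!sketch_op_ready(op))
        ; /* a real driver call would block here instead of spinning */
    return 0;
}
```

A caller that gets -EAGAIN from the non-blocking form is expected to do other work (for example, accept more input) and query again later.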
Signed-off-by: Wenbin Chen
---
 libavcodec/vaapi_encode.c | 47 +++++++++++++++++++++++++++++++++------
 1 file changed, 40 insertions(+), 7 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index 3bf379b1a0..b87b58a42b 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -134,7 +134,8 @@ static int vaapi_encode_make_misc_param_buffer(AVCodecContext *avctx,
 }
 
 static int vaapi_encode_wait(AVCodecContext *avctx,
-                             VAAPIEncodePicture *pic)
+                             VAAPIEncodePicture *pic,
+                             uint8_t wait)
 {
     VAAPIEncodeContext *ctx = avctx->priv_data;
     VAStatus vas;
@@ -150,11 +151,43 @@ static int vaapi_encode_wait(AVCodecContext *avctx,
            "(input surface %#x).\n", pic->display_order,
            pic->encode_order, pic->input_surface);
 
-    vas = vaSyncSurface(ctx->hwctx->display, pic->input_surface);
-    if (vas != VA_STATUS_SUCCESS) {
-        av_log(avctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
-               "%d (%s).\n", vas, vaErrorStr(vas));
+#if VA_CHECK_VERSION(1, 9, 0)
+    // Try vaSyncBuffer.
+    vas = vaSyncBuffer(ctx->hwctx->display,
+                       pic->output_buffer,
+                       wait ? VA_TIMEOUT_INFINITE : 0);
+    if (vas == VA_STATUS_ERROR_TIMEDOUT) {
+        return AVERROR(EAGAIN);
+    } else if (vas != VA_STATUS_SUCCESS && vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to sync to output buffer completion: "
+               "%d (%s).\n", vas, vaErrorStr(vas));
         return AVERROR(EIO);
+    } else if (vas == VA_STATUS_ERROR_UNIMPLEMENTED)
+        // If vaSyncBuffer is not implemented, try old version API.
+#endif
+    {
+        if (!wait) {
+            VASurfaceStatus surface_status;
+            vas = vaQuerySurfaceStatus(ctx->hwctx->display,
+                                       pic->input_surface,
+                                       &surface_status);
+            if (vas == VA_STATUS_SUCCESS &&
+                surface_status != VASurfaceReady &&
+                surface_status != VASurfaceSkipped) {
+                return AVERROR(EAGAIN);
+            } else if (vas != VA_STATUS_SUCCESS) {
+                av_log(avctx, AV_LOG_ERROR, "Failed to query surface status: "
+                       "%d (%s).\n", vas, vaErrorStr(vas));
+                return AVERROR(EIO);
+            }
+        } else {
+            vas = vaSyncSurface(ctx->hwctx->display, pic->input_surface);
+            if (vas != VA_STATUS_SUCCESS) {
+                av_log(avctx, AV_LOG_ERROR, "Failed to sync to picture completion: "
+                       "%d (%s).\n", vas, vaErrorStr(vas));
+                return AVERROR(EIO);
+            }
+        }
     }
 
     // Input is definitely finished with now.
@@ -633,7 +666,7 @@ static int vaapi_encode_output(AVCodecContext *avctx,
     uint8_t *ptr;
     int err;
 
-    err = vaapi_encode_wait(avctx, pic);
+    err = vaapi_encode_wait(avctx, pic, 1);
     if (err < 0)
         return err;
 
@@ -695,7 +728,7 @@ fail:
 static int vaapi_encode_discard(AVCodecContext *avctx,
                                 VAAPIEncodePicture *pic)
 {
-    vaapi_encode_wait(avctx, pic);
+    vaapi_encode_wait(avctx, pic, 1);
 
     if (pic->output_buffer_ref) {
         av_log(avctx, AV_LOG_DEBUG, "Discard output for pic "

From patchwork Tue Feb 8 03:05:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Chen, Wenbin"
X-Patchwork-Id: 34160
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6602:2c4e:0:0:0:0 with SMTP id x14csp415967iov; Mon, 7 Feb 2022 19:06:18 -0800 (PST)
X-Google-Smtp-Source: ABdhPJyG/itDdQ+hofy1NX76xKRdGEd6GbfounWFmdbTch9cqdvQi/vUwH/xjxPselSTdWSg5HLV
X-Received: by 2002:a17:906:9b8d:: with SMTP id dd13mr1928036ejc.121.1644289578627; Mon, 07 Feb 2022 19:06:18 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1644289578; cv=none; d=google.com; s=arc-20160816; b=pYC1CTTLiwQzXogAh69DWcHsxkukTeAFJIomXTCGSKzfpFr+wgCAXNcH/Icxv8oj7O 
oUYxNmlcmuygKnQ1xWzHKsIn2j7D5NHAnGg0KIhjLN0aSTWOYLqaD9twx8TaDE1rl6U4 feetpaCZt97bZVc4exYFnTYcWG4GWrlbHCVB8jG6wYEes2iWovv/D+jhaeRYmfE7COg2 SsyV+fpRza9EJtyvqUJLWDplaQOhHp5Emz+8N60NlZqrK2NnUo+J1nC+hdkA/2yHatqK f+VokW5C7R6P7nEbX0e/zUpEZxhf4B8AWllNe/8u4JJ9piI/CmC7K4iNCnbDTe+HxCE2 AwhQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=1yTSVExclD86ZDzoLcogjwD8F5MtiispGaaRPHmaCXA=; b=IvLvKeeBrlH6EUJ9jfhvno0yWHmm7nlp9Ra266++e99MtraQmyqZCfQUsJSZdtSjZO ilxh2Si/3vScqME/+DZ6Lah3fOVLlGM5QxviraJP1p7r6lAYb4O+ozMj8oTzMTvgY5Z4 9xs5XGskdw9JxnbrdHPK21FtE64hwXdl2VEdjDU6LVqWnG9bHdlHQ5ixp10jCKqm3arA h5EwXSk9pD/Is9k3Mp8fDDES1Hv3Zw9g+iB1jGYtafF/pbwpeXoZED6gL4bAfxrNzGRR h5RHJU1B29WgZwM/VIfVIiRrFUvLfOIQd3fNqyWTggUlxYVnpFoB65Q3zIl+AYlU4ac4 0YPQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=LGrt9Crg; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id hr4si10313821ejc.321.2022.02.07.19.06.18; Mon, 07 Feb 2022 19:06:18 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=LGrt9Crg; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 183D768AE13; Tue, 8 Feb 2022 05:06:03 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id F281D68AE13 for ; Tue, 8 Feb 2022 05:05:55 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1644289561; x=1675825561; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=5hSoa3nl8Elgb09nf3PNqO/X0BcGgr5aNWUF7RkJ+Cw=; b=LGrt9Crgk65yGLuj36Bv4dH5rKY1tVuSOhrql34tJ0cTDROAcMwfFQ2A zW3CO+IlBRhRxk/pioAu7Jou5njACj87NGRVub6cLAXnaJ15pwsTUwtrK WyoczVkPXgXAKB7J7u8x/nu++MEWyfYKCWwT1hgMVU9GvpWLPx191OGpa 2FhtZ3tEBHobfatEjtbIcDBD+sl1A/+OTvxqiD5ebQWp/Xv2+Zl69cVbi Xd6xHAVGmeMczMya3uLbPxBy9aOLAHjtKL0PmClfgNVgHjx6dALtnhN0A MWnTjhs2zZQYJddUQnn1IX+y4/OWdFg6bkEz/Kkw+AQJvwRTTQQ+RZwkt g==; X-IronPort-AV: E=McAfee;i="6200,9189,10251"; a="228829621" X-IronPort-AV: E=Sophos;i="5.88,351,1635231600"; d="scan'208";a="228829621" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Feb 2022 19:05:53 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,351,1635231600"; d="scan'208";a="628742459" Received: from 
wenbin-z390-aorus-ultra.sh.intel.com ([10.239.35.110]) by fmsmga002.fm.intel.com with ESMTP; 07 Feb 2022 19:05:52 -0800
From: Wenbin Chen
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 8 Feb 2022 11:05:48 +0800
Message-Id: <20220208030549.340748-2-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220208030549.340748-1-wenbin.chen@intel.com>
References: <20220208030549.340748-1-wenbin.chen@intel.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH V3 2/3] libavcodec/vaapi_encode: Change the way to call async to increase performance
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: vgK6yy4tTjlt

Fix: #7706. After commit 5fdcf85bbffe7451c2, the VAAPI encoder's
performance decreased. The reason is that vaRenderPicture() and
vaSyncBuffer() are called back to back (each vaRenderPicture() is always
followed by a vaSyncBuffer()), so every frame is waited on before the
next one is submitted. When we encode a stream with B-frames, frames have
to be buffered for reordering anyway, so we can send several frames to
the hardware at once to increase performance. The two calls are now made
asynchronously, which makes better use of the hardware. 1080p transcoding
gains about 17% fps in my environment.

This change relies on vaSyncBuffer(), so if the driver does not support
vaSyncBuffer(), the previous behavior is kept.
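The idea of issuing several frames before syncing the oldest one can be sketched as follows. This is only an illustration under stated assumptions, not the encoder code: SketchQueue, SKETCH_DEPTH, sketch_push(), sketch_pop() and sketch_receive() are hypothetical names, frames are plain integers, and no hardware or libva call is involved.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_DEPTH 4 /* hypothetical number of frames allowed in flight */

/* Fixed-size ring buffer of issued frame numbers (oldest at `head`). */
typedef struct SketchQueue {
    int    frames[SKETCH_DEPTH];
    size_t head, count;
} SketchQueue;

static int sketch_push(SketchQueue *q, int frame)
{
    assert(q);
    if (q->count == SKETCH_DEPTH)
        return -1; /* queue full */
    q->frames[(q->head + q->count) % SKETCH_DEPTH] = frame;
    q->count++;
    return 0;
}

static int sketch_pop(SketchQueue *q, int *frame)
{
    assert(q && frame);
    if (!q->count)
        return -1; /* nothing in flight */
    *frame = q->frames[q->head];
    q->head = (q->head + 1) % SKETCH_DEPTH;
    q->count--;
    return 0;
}

/* Issue every pending input frame (up to the queue capacity) first, and
 * only then take the oldest issued frame for output. Returns that frame
 * number, or -1 if nothing is in flight. `next_input` and `num_inputs`
 * model the frames still waiting to be issued. */
static int sketch_receive(SketchQueue *q, int *next_input, int num_inputs)
{
    int frame;

    while (q->count < SKETCH_DEPTH && *next_input < num_inputs) {
        sketch_push(q, *next_input); /* stands in for submitting to HW */
        (*next_input)++;
    }
    if (sketch_pop(q, &frame) < 0)
        return -1;
    return frame; /* stands in for syncing and outputting the frame */
}
```

With this ordering the hardware always has work queued while the caller waits on the oldest frame, instead of idling between a submit and its matching sync.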
Signed-off-by: Wenbin Chen
---
 libavcodec/vaapi_encode.c | 64 ++++++++++++++++++++++++++++++++-------
 libavcodec/vaapi_encode.h |  5 +++
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index b87b58a42b..15ddbbaa4a 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -984,8 +984,10 @@ static int vaapi_encode_pick_next(AVCodecContext *avctx,
     if (!pic && ctx->end_of_stream) {
         --b_counter;
         pic = ctx->pic_end;
-        if (pic->encode_issued)
+        if (pic->encode_complete)
             return AVERROR_EOF;
+        else if (pic->encode_issued)
+            return AVERROR(EAGAIN);
     }
 
     if (!pic) {
@@ -1210,18 +1212,44 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
         return AVERROR(EAGAIN);
     }
 
-    pic = NULL;
-    err = vaapi_encode_pick_next(avctx, &pic);
-    if (err < 0)
-        return err;
-    av_assert0(pic);
+#if VA_CHECK_VERSION(1, 9, 0)
+    if (ctx->has_sync_buffer_func) {
+        while (av_fifo_can_read(ctx->encode_fifo) <= MAX_PICTURE_REFERENCES) {
+            pic = NULL;
+            err = vaapi_encode_pick_next(avctx, &pic);
+            if (err < 0)
+                break;
+
+            av_assert0(pic);
+            pic->encode_order = ctx->encode_order +
+                                av_fifo_can_read(ctx->encode_fifo);
+            err = vaapi_encode_issue(avctx, pic);
+            if (err < 0) {
+                av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
+                return err;
+            }
+            av_fifo_write(ctx->encode_fifo, &pic, 1);
+        }
+        if (!av_fifo_can_read(ctx->encode_fifo))
+            return err;
+        av_fifo_read(ctx->encode_fifo, &pic, 1);
+        ctx->encode_order = pic->encode_order + 1;
+    } else
+#endif
+    {
+        pic = NULL;
+        err = vaapi_encode_pick_next(avctx, &pic);
+        if (err < 0)
+            return err;
+        av_assert0(pic);
 
-    pic->encode_order = ctx->encode_order++;
+        pic->encode_order = ctx->encode_order++;
 
-    err = vaapi_encode_issue(avctx, pic);
-    if (err < 0) {
-        av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
-        return err;
+        err = vaapi_encode_issue(avctx, pic);
+        if (err < 0) {
+            av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
+            return err;
+        }
     }
 
     err = vaapi_encode_output(avctx, pic, pkt);
@@ -2555,6 +2583,19 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
         }
     }
 
+#if VA_CHECK_VERSION(1, 9, 0)
+    //check vaSyncBuffer function
+    vas = vaSyncBuffer(ctx->hwctx->display, 0, 0);
+    if (vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
+        ctx->has_sync_buffer_func = 1;
+        ctx->encode_fifo = av_fifo_alloc2(MAX_PICTURE_REFERENCES + 1,
+                                          sizeof(VAAPIEncodePicture *),
+                                          0);
+        if (!ctx->encode_fifo)
+            return AVERROR(ENOMEM);
+    }
+#endif
+
     return 0;
 
 fail:
@@ -2592,6 +2633,7 @@ av_cold int ff_vaapi_encode_close(AVCodecContext *avctx)
 
     av_freep(&ctx->codec_sequence_params);
     av_freep(&ctx->codec_picture_params);
+    av_fifo_freep2(&ctx->encode_fifo);
 
     av_buffer_unref(&ctx->recon_frames_ref);
     av_buffer_unref(&ctx->input_frames_ref);
diff --git a/libavcodec/vaapi_encode.h b/libavcodec/vaapi_encode.h
index b41604a883..d33a486cb8 100644
--- a/libavcodec/vaapi_encode.h
+++ b/libavcodec/vaapi_encode.h
@@ -29,6 +29,7 @@
 
 #include "libavutil/hwcontext.h"
 #include "libavutil/hwcontext_vaapi.h"
+#include "libavutil/fifo.h"
 
 #include "avcodec.h"
 #include "hwconfig.h"
@@ -345,6 +346,10 @@ typedef struct VAAPIEncodeContext {
     int             roi_warned;
 
     AVFrame         *frame;
+    //Store buffered pic
+    AVFifo          *encode_fifo;
+    //Whether the driver support vaSyncBuffer
+    int             has_sync_buffer_func;
 } VAAPIEncodeContext;
 
 enum {

From patchwork Tue Feb 8 03:05:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Chen, Wenbin"
X-Patchwork-Id: 34161
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6602:2c4e:0:0:0:0 with SMTP id x14csp416101iov; Mon, 7 Feb 2022 19:06:28 -0800 (PST)
X-Google-Smtp-Source: ABdhPJy/PsJYpjAQQJ3HM6nBd+nhPLed0SsqNPkneTFclMN/jVsDLT8sEadNDK+1/SDKRneqcqiG
X-Received: by 2002:aa7:d299:: with SMTP id w25mr2354824edq.21.1644289588751; Mon, 07 Feb 2022 19:06:28 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1644289588; cv=none; d=google.com; s=arc-20160816; 
b=dRAik8q6wrIoMfItCfA0BNw1zjkSNaHUCc9rv9q3kavFiTG8qtGZFbn3fzV9I03sn7 uaWDf0ofcNXDJDEjywWLYVZxfavQI8mF4wBr9266GM1GyydtpNEaonmy7aM2McGaH/+p gIk5skmEbEp7piM2bCit0Ys90TYR++vECdq814yPMhZ1pJttEeTKpue+Ocxoe4bTksWm Pygds2QCGPYPzOAtdC0K1gmWVHte53GLcdkjxdm4l0LQU5THS84z1iqWNWHsqQ1+fOIt 90c2shm3fxTfvbGSbxm4xbNum6hGwJILBoIF/1R9mU/3JZAgibL0CaxLdRzvMgQazD9v v3iw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=im4TQ2Og9E914vMqOfRaM2w7xCTMVG4KyvOsxuGrVHA=; b=Tzz4hIKQYUnKlEWW0DRGufWKvs7Zi8oMSyLeiKk5wda/VhdgSRdi6viFBGYLnwToMZ Cbm+7Q3cQ1o8r4NyBbYpeXwhUV2KqVQkbSPnqBUSWRmaBaVYflFXl+L+J1fRE0et/BS1 VMyLV2HuI6mIHWN7Idbrtvwr318GWyJSI7jlslQYPD0HcUMVz6eyTYfo3Dg+tmcIXi3C Q3gkE5noRiU6DiIjZBSFZqSoFrliawnHr/w0rje8F2MvpHNi0/o3R4T1hSrZqiW0lgpF +2AZYQ1GFL9Mw1IqNPJkpj7+nECUzvfWHYvmA9FyNIwps4oAdSyTEo1QDjbGazS+tIXq EQyQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b="Jh/7tmm+"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id f9si9511807edf.226.2022.02.07.19.06.28; Mon, 07 Feb 2022 19:06:28 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b="Jh/7tmm+"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 34D0B68B185; Tue, 8 Feb 2022 05:06:08 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 60EFB68A897 for ; Tue, 8 Feb 2022 05:06:00 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1644289565; x=1675825565; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=sM93SyByF+ptb3y3h/mwNpnZgXn8Sum4YMOJBA2DOSA=; b=Jh/7tmm+kAUcGjUNUqFcFj0qOywRdatTqSdhWahisweob32Uz5m5JXub q2F3jFuadEBIjVFg/+CWoHROok8dWq5FmxUk/IS/lHMSGbmWLXtX5K9sQ jg6Z+D8x9C42mAEyYaGIorfYHMq4x5U5prfa13C//t4ItCz7VMzs1JVuT /GAOAaX0MaAGr0vlI3wVw9IbGt2on6RKQ2M2mCUYrPJ9b/FV2MjH0YNC+ w5QMLSz/dIZjDifIvY5kzvViHDKM7QS4c3QmWfHoYUIZYO9fe0CHe3NJW y6vYUlTJXbvqKRehNO/2Tw9vQWGFQambWbU95P+y9djoAjbJ5MDGtj7WH w==; X-IronPort-AV: E=McAfee;i="6200,9189,10251"; a="228829623" X-IronPort-AV: E=Sophos;i="5.88,351,1635231600"; d="scan'208";a="228829623" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Feb 2022 19:05:54 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,351,1635231600"; d="scan'208";a="628742460" Received: from 
wenbin-z390-aorus-ultra.sh.intel.com ([10.239.35.110]) by fmsmga002.fm.intel.com with ESMTP; 07 Feb 2022 19:05:53 -0800
From: Wenbin Chen
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 8 Feb 2022 11:05:49 +0800
Message-Id: <20220208030549.340748-3-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220208030549.340748-1-wenbin.chen@intel.com>
References: <20220208030549.340748-1-wenbin.chen@intel.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH V3 3/3] libavcodec/vaapi_encode: Add async_depth to vaapi_encoder to increase performance
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: YMcVSMXpiwBn

Add an async_depth option to increase the encoder's performance. Reuse
encode_fifo as the async buffer. The encoder submits all reordered frames
to the hardware and then checks the fifo size. If the fifo holds fewer
than async_depth frames and the frame at the head of the fifo is not
ready, it returns AVERROR(EAGAIN) to request more frames.

1080p transcoding (no B-frames) with -async_depth 4 can increase
performance by 20% in my environment. Asynchronous operation increases
performance but also introduces frame delay.
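The output decision described above can be written as a small predicate. This is a sketch under assumptions, not the code in the patch: sketch_should_output() and its parameters are hypothetical names, and -EAGAIN stands in for AVERROR(EAGAIN).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Decide whether the frame at the head of the fifo should be output now.
 *
 *   in_flight     number of frames currently held in the fifo (> 0)
 *   async_depth   user-selected depth, 0 disables the early return
 *   head_ready    result of a non-blocking status query on the head frame
 *   end_of_stream non-zero once no more input will arrive
 *
 * Returns 0 to output the head frame (blocking on it if needed), or
 * -EAGAIN to ask the caller for more input first. */
static int sketch_should_output(size_t in_flight, size_t async_depth,
                                int head_ready, int end_of_stream)
{
    assert(in_flight > 0); /* the empty-fifo case is handled before this */

    if (in_flight < async_depth && !end_of_stream && !head_ready)
        return -EAGAIN;
    return 0;
}
```

Once the fifo is full up to async_depth, or the stream has ended, the head frame is always taken, which bounds the extra delay to at most async_depth frames in this model.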
Signed-off-by: Wenbin Chen
---
 libavcodec/vaapi_encode.c | 16 ++++++++++++----
 libavcodec/vaapi_encode.h | 12 ++++++++++--
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index 15ddbbaa4a..432abf31f7 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -1158,7 +1158,8 @@ static int vaapi_encode_send_frame(AVCodecContext *avctx, AVFrame *frame)
         if (ctx->input_order == ctx->decode_delay)
             ctx->dts_pts_diff = pic->pts - ctx->first_pts;
         if (ctx->output_delay > 0)
-            ctx->ts_ring[ctx->input_order % (3 * ctx->output_delay)] = pic->pts;
+            ctx->ts_ring[ctx->input_order %
+                         (3 * ctx->output_delay + ctx->async_depth)] = pic->pts;
 
         pic->display_order = ctx->input_order;
         ++ctx->input_order;
@@ -1214,7 +1215,7 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
 
 #if VA_CHECK_VERSION(1, 9, 0)
     if (ctx->has_sync_buffer_func) {
-        while (av_fifo_can_read(ctx->encode_fifo) <= MAX_PICTURE_REFERENCES) {
+        while (av_fifo_can_read(ctx->encode_fifo) <= MAX_ASYNC_DEPTH) {
             pic = NULL;
             err = vaapi_encode_pick_next(avctx, &pic);
             if (err < 0)
@@ -1232,6 +1233,13 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
         }
         if (!av_fifo_can_read(ctx->encode_fifo))
             return err;
+        if (av_fifo_can_read(ctx->encode_fifo) < ctx->async_depth &&
+            !ctx->end_of_stream) {
+            av_fifo_peek(ctx->encode_fifo, &pic, 1, 0);
+            err = vaapi_encode_wait(avctx, pic, 0);
+            if (err < 0)
+                return err;
+        }
         av_fifo_read(ctx->encode_fifo, &pic, 1);
         ctx->encode_order = pic->encode_order + 1;
     } else
@@ -1267,7 +1275,7 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
             pkt->dts = ctx->ts_ring[pic->encode_order] - ctx->dts_pts_diff;
     } else {
         pkt->dts = ctx->ts_ring[(pic->encode_order - ctx->decode_delay) %
-                                (3 * ctx->output_delay)];
+                                (3 * ctx->output_delay + ctx->async_depth)];
     }
     av_log(avctx, AV_LOG_DEBUG, "Output packet: pts %"PRId64" dts %"PRId64".\n",
            pkt->pts, pkt->dts);
@@ -2588,7 +2596,7 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
     vas = vaSyncBuffer(ctx->hwctx->display, 0, 0);
     if (vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
         ctx->has_sync_buffer_func = 1;
-        ctx->encode_fifo = av_fifo_alloc2(MAX_PICTURE_REFERENCES + 1,
+        ctx->encode_fifo = av_fifo_alloc2(MAX_ASYNC_DEPTH,
                                           sizeof(VAAPIEncodePicture *),
                                           0);
         if (!ctx->encode_fifo)
diff --git a/libavcodec/vaapi_encode.h b/libavcodec/vaapi_encode.h
index d33a486cb8..691521387d 100644
--- a/libavcodec/vaapi_encode.h
+++ b/libavcodec/vaapi_encode.h
@@ -48,6 +48,7 @@ enum {
     MAX_TILE_ROWS          = 22,
     // A.4.1: table A.6 allows at most 20 tile columns for any level.
     MAX_TILE_COLS          = 20,
+    MAX_ASYNC_DEPTH        = 64,
 };
 
 extern const AVCodecHWConfigInternal *const ff_vaapi_encode_hw_configs[];
@@ -298,7 +299,8 @@ typedef struct VAAPIEncodeContext {
     // Timestamp handling.
     int64_t         first_pts;
     int64_t         dts_pts_diff;
-    int64_t         ts_ring[MAX_REORDER_DELAY * 3];
+    int64_t         ts_ring[MAX_REORDER_DELAY * 3 +
+                            MAX_ASYNC_DEPTH];
 
     // Slice structure.
     int slice_block_rows;
@@ -350,6 +352,8 @@ typedef struct VAAPIEncodeContext {
     AVFifo          *encode_fifo;
     //Whether the driver support vaSyncBuffer
     int             has_sync_buffer_func;
+    //Max number of frame buffered in encoder.
+    int             async_depth;
 } VAAPIEncodeContext;
 
 enum {
@@ -460,7 +464,11 @@ int ff_vaapi_encode_close(AVCodecContext *avctx);
     { "b_depth", \
       "Maximum B-frame reference depth", \
       OFFSET(common.desired_b_depth), AV_OPT_TYPE_INT, \
-      { .i64 = 1 }, 1, INT_MAX, FLAGS }
+      { .i64 = 1 }, 1, INT_MAX, FLAGS }, \
+    { "async_depth", "Maximum processing parallelism. " \
+      "Increase this to improve single channel performance", \
+      OFFSET(common.async_depth), AV_OPT_TYPE_INT, \
+      { .i64 = 4 }, 0, MAX_ASYNC_DEPTH, FLAGS }
 
 #define VAAPI_ENCODE_RC_MODE(name, desc) \
     { #name, desc, 0, AV_OPT_TYPE_CONST, { .i64 = RC_MODE_ ## name }, \