From patchwork Wed Jan 5 02:48:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Chen, Wenbin"
X-Patchwork-Id: 33074
From: Wenbin Chen
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 5 Jan 2022 10:48:09 +0800
Message-Id: <20220105024810.435597-2-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220105024810.435597-1-wenbin.chen@intel.com>
References: <20220105024810.435597-1-wenbin.chen@intel.com>
Subject: [FFmpeg-devel] [PATCH V2 2/3] libavcodec/vaapi_encode: Change the way to call async to increase performance
List-Id: FFmpeg development discussions and patches

Fix: #7706. After commit 5fdcf85bbffe7451c2, the VAAPI encoder's
performance decreased. The reason is that vaRenderPicture() and
vaSyncBuffer() are called back to back (every vaRenderPicture() is
immediately followed by a vaSyncBuffer()), so each frame is waited on
before the next one is submitted. When encoding a stream with B-frames,
frames are already buffered for reordering, so several frames can be
sent to the hardware at once to increase performance. This patch issues
them asynchronously, which makes better use of the hardware. 1080p
transcoding gains about 17% fps in my environment.

This change relies on vaSyncBuffer(), so if the driver does not support
vaSyncBuffer(), the previous behaviour is kept.

Signed-off-by: Wenbin Chen
---
 libavcodec/vaapi_encode.c | 64 ++++++++++++++++++++++++++++++++-------
 libavcodec/vaapi_encode.h |  5 +++
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
index b87b58a42b..9a3b3ba4ad 100644
--- a/libavcodec/vaapi_encode.c
+++ b/libavcodec/vaapi_encode.c
@@ -984,8 +984,10 @@ static int vaapi_encode_pick_next(AVCodecContext *avctx,
     if (!pic && ctx->end_of_stream) {
         --b_counter;
         pic = ctx->pic_end;
-        if (pic->encode_issued)
+        if (pic->encode_complete)
             return AVERROR_EOF;
+        else if (pic->encode_issued)
+            return AVERROR(EAGAIN);
     }
 
     if (!pic) {
@@ -1210,18 +1212,45 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
         return AVERROR(EAGAIN);
     }
 
-    pic = NULL;
-    err = vaapi_encode_pick_next(avctx, &pic);
-    if (err < 0)
-        return err;
-    av_assert0(pic);
+#if VA_CHECK_VERSION(1, 9, 0)
+    if (ctx->has_sync_buffer_func) {
+        while (av_fifo_size(ctx->encode_fifo) <=
+               MAX_PICTURE_REFERENCES * sizeof(VAAPIEncodePicture *)) {
+            pic = NULL;
+            err = vaapi_encode_pick_next(avctx, &pic);
+            if (err < 0)
+                break;
+
+            av_assert0(pic);
+            pic->encode_order = ctx->encode_order +
+                (av_fifo_size(ctx->encode_fifo) / sizeof(VAAPIEncodePicture *));
+            err = vaapi_encode_issue(avctx, pic);
+            if (err < 0) {
+                av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
+                return err;
+            }
+            av_fifo_generic_write(ctx->encode_fifo, &pic, sizeof(pic), NULL);
+        }
+        if (!av_fifo_size(ctx->encode_fifo))
+            return err;
+        av_fifo_generic_read(ctx->encode_fifo, &pic, sizeof(pic), NULL);
+        ctx->encode_order = pic->encode_order + 1;
+    } else
+#endif
+    {
+        pic = NULL;
+        err = vaapi_encode_pick_next(avctx, &pic);
+        if (err < 0)
+            return err;
+        av_assert0(pic);
 
-    pic->encode_order = ctx->encode_order++;
+        pic->encode_order = ctx->encode_order++;
 
-    err = vaapi_encode_issue(avctx, pic);
-    if (err < 0) {
-        av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
-        return err;
+        err = vaapi_encode_issue(avctx, pic);
+        if (err < 0) {
+            av_log(avctx, AV_LOG_ERROR, "Encode failed: %d.\n", err);
+            return err;
+        }
     }
 
     err = vaapi_encode_output(avctx, pic, pkt);
@@ -2555,6 +2584,18 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
         }
     }
 
+#if VA_CHECK_VERSION(1, 9, 0)
+    //check vaSyncBuffer function
+    vas = vaSyncBuffer(ctx->hwctx->display, 0, 0);
+    if (vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
+        ctx->has_sync_buffer_func = 1;
+        ctx->encode_fifo = av_fifo_alloc((MAX_PICTURE_REFERENCES + 1) *
+                                         sizeof(VAAPIEncodePicture *));
+        if (!ctx->encode_fifo)
+            return AVERROR(ENOMEM);
+    }
+#endif
+
     return 0;
 
 fail:
@@ -2592,6 +2633,7 @@ av_cold int ff_vaapi_encode_close(AVCodecContext *avctx)
 
     av_freep(&ctx->codec_sequence_params);
     av_freep(&ctx->codec_picture_params);
+    av_fifo_freep(&ctx->encode_fifo);
 
     av_buffer_unref(&ctx->recon_frames_ref);
     av_buffer_unref(&ctx->input_frames_ref);
diff --git a/libavcodec/vaapi_encode.h b/libavcodec/vaapi_encode.h
index b41604a883..560a1c42a9 100644
--- a/libavcodec/vaapi_encode.h
+++ b/libavcodec/vaapi_encode.h
@@ -29,6 +29,7 @@
 
 #include "libavutil/hwcontext.h"
 #include "libavutil/hwcontext_vaapi.h"
+#include "libavutil/fifo.h"
 
 #include "avcodec.h"
 #include "hwconfig.h"
@@ -345,6 +346,10 @@ typedef struct VAAPIEncodeContext {
     int roi_warned;
 
     AVFrame *frame;
+    //Store buffered pic
+    AVFifoBuffer *encode_fifo;
+    //Whether the driver support vaSyncBuffer
+    int has_sync_buffer_func;
 } VAAPIEncodeContext;
 
 enum {
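
Note for reviewers: the issue-then-drain pattern in ff_vaapi_encode_receive_packet() may be easier to follow in a stand-alone model. The sketch below is only an illustration under stated assumptions. It does not call libva or FFmpeg, and every name in it (Picture, Pipeline, issue, sync_oldest, receive, PIPELINE_DEPTH) is hypothetical. PIPELINE_DEPTH stands in for MAX_PICTURE_REFERENCES, and a fixed ring buffer stands in for the AVFifoBuffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from FFmpeg or libva. */
#define PIPELINE_DEPTH 2                  /* plays the role of MAX_PICTURE_REFERENCES */
#define FIFO_SLOTS (PIPELINE_DEPTH + 1)   /* the patch allocates one slot more */

typedef struct Picture {
    int id;
    int issued;    /* submitted to the (simulated) hardware */
    int complete;  /* output has been synced and collected */
} Picture;

typedef struct Pipeline {
    Picture *slots[FIFO_SLOTS];
    size_t head;        /* index of the oldest pending picture */
    size_t count;       /* number of pending pictures */
    int max_in_flight;  /* high-water mark, kept only for illustration */
} Pipeline;

/* Submit a picture without waiting for it and remember it in the FIFO. */
static void issue(Pipeline *p, Picture *pic)
{
    pic->issued = 1;
    p->slots[(p->head + p->count) % FIFO_SLOTS] = pic;
    p->count++;
    if ((int)p->count > p->max_in_flight)
        p->max_in_flight = (int)p->count;
}

/* Wait for (here: simply mark) the oldest pending picture and pop it. */
static Picture *sync_oldest(Pipeline *p)
{
    Picture *pic;

    if (p->count == 0)
        return NULL;
    pic = p->slots[p->head];
    p->head = (p->head + 1) % FIFO_SLOTS;
    p->count--;
    pic->complete = 1;
    return pic;
}

/* One "receive packet" step: fill the FIFO from the input, then drain one. */
static Picture *receive(Pipeline *p, Picture *input, size_t n, size_t *next)
{
    while (p->count < FIFO_SLOTS && *next < n)
        issue(p, &input[(*next)++]);
    return sync_oldest(p);
}
```

Calling receive() repeatedly returns the pictures in issue order while keeping up to FIFO_SLOTS of them pending. The non-FIFO branch of the patch corresponds to a depth of one, where each picture is issued and synced before the next is picked.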