From patchwork Wed Nov  6 04:53:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Fu, Linjie" <linjie.fu@intel.com>
X-Patchwork-Id: 16139
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id 6A7D3447E9D
	for <patchwork@ffaux-bg.ffmpeg.org>;
	Wed,  6 Nov 2019 06:55:12 +0200 (EET)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4566A68B046;
	Wed,  6 Nov 2019 06:55:12 +0200 (EET)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 4527B68AD38
	for <ffmpeg-devel@ffmpeg.org>; Wed,  6 Nov 2019 06:55:04 +0200 (EET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
	by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
	05 Nov 2019 20:55:02 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.68,272,1569308400"; d="scan'208";a="212687150"
Received: from media_lj_kbl.sh.intel.com ([10.239.13.13])
	by fmsmga001.fm.intel.com with ESMTP; 05 Nov 2019 20:55:01 -0800
From: Linjie Fu <linjie.fu@intel.com>
To: ffmpeg-devel@ffmpeg.org
Date: Wed,  6 Nov 2019 12:53:57 +0800
Message-Id: <20191106045357.21011-1-linjie.fu@intel.com>
X-Mailer: git-send-email 2.17.1
Subject: [FFmpeg-devel] [PATCH] lavc/vaapi_decode: Set surfaces reference
	pool size according to SPS for H.264/HEVC
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Cc: Linjie Fu <linjie.fu@intel.com>
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Size the surface pool used for storing reference frames dynamically
according to the SPS (reference frame count, number of reorder
frames, etc.).

Compared to the previous fixed pool size for H.264 and HEVC
(16 reference surfaces plus one), this reduces GPU memory usage
whenever the stream needs fewer surfaces.

CMD:
    ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 \
        -i bbb_sunflower_1080p_30fps_normal.mp4 -f null -
Source:
    https://download.blender.org/demo/movies/BBB/

GEM Object memory usage:
    watch cat /sys/kernel/debug/dri/0/i915_gem_objects
Before:
    ~112M
After:
    ~80M

Signed-off-by: Linjie Fu <linjie.fu@intel.com>
---
 libavcodec/vaapi_decode.c | 39 ++++++++++++++++++---------------------
 libavcodec/vaapi_decode.h |  3 +++
 libavcodec/vaapi_h264.c   | 11 ++++++++++-
 libavcodec/vaapi_hevc.c   | 11 ++++++++++-
 libavcodec/vaapi_vp8.c    |  8 +++++++-
 libavcodec/vaapi_vp9.c    |  8 +++++++-
 6 files changed, 55 insertions(+), 25 deletions(-)
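
For readers who want the sizing rule in isolation, the following is a
minimal sketch of what the diff below computes, not the code being
committed. The Demo* structs and demo_* functions are hypothetical
stand-ins for the FFmpeg SPS types and the patched helpers; only the
arithmetic is taken from the patch. The reading of the extra "+ 1"
surface as the picture currently being decoded is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-ins for the SPS fields the patch reads.
 * These are NOT the real FFmpeg SPS / HEVCSPS definitions. */
typedef struct {
    int ref_frame_count;     /* reference frames signalled in the SPS */
    int num_reorder_frames;  /* frames held back for output reordering */
} DemoH264SPS;

typedef struct {
    int max_sub_layers;            /* 1..7 */
    int max_dec_pic_buffering[7];  /* one entry per temporal layer */
} DemoHEVCSPS;

/* H.264: references plus reorder depth, as in the vaapi_h264.c hunk. */
static int demo_h264_dpb_size(const DemoH264SPS *sps)
{
    return sps->ref_frame_count + sps->num_reorder_frames;
}

/* HEVC: DPB size of the highest temporal layer, as in the
 * vaapi_hevc.c hunk. */
static int demo_hevc_dpb_size(const DemoHEVCSPS *sps)
{
    assert(sps->max_sub_layers >= 1 && sps->max_sub_layers <= 7);
    return sps->max_dec_pic_buffering[sps->max_sub_layers - 1];
}

/* Mirrors the new branch in vaapi_decode_make_config(): a positive
 * dpb_size yields dpb_size + 1 surfaces (the extra one is assumed to
 * be the picture being decoded); otherwise the current value is kept. */
static int demo_pool_size(int current_pool_size, int dpb_size)
{
    return dpb_size > 0 ? dpb_size + 1 : current_pool_size;
}
```

With this rule, a stream whose SPS signals 4 reference frames and 2
reorder frames would get a pool of 7 surfaces instead of the fixed 17.
Passing 0, as ff_vaapi_decode_init() does in this patch, leaves an
already configured pool size unchanged.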

diff --git a/libavcodec/vaapi_decode.c b/libavcodec/vaapi_decode.c
index 69512e1d45..5fc9767802 100644
--- a/libavcodec/vaapi_decode.c
+++ b/libavcodec/vaapi_decode.c
@@ -408,7 +408,8 @@ static const struct {
 static int vaapi_decode_make_config(AVCodecContext *avctx,
                                     AVBufferRef *device_ref,
                                     VAConfigID *va_config,
-                                    AVBufferRef *frames_ref)
+                                    AVBufferRef *frames_ref,
+                                    int dpb_size)
 {
     AVVAAPIHWConfig       *hwconfig    = NULL;
     AVHWFramesConstraints *constraints = NULL;
@@ -549,22 +550,8 @@ static int vaapi_decode_make_config(AVCodecContext *avctx,
         if (err < 0)
             goto fail;
 
-        frames->initial_pool_size = 1;
-        // Add per-codec number of surfaces used for storing reference frames.
-        switch (avctx->codec_id) {
-        case AV_CODEC_ID_H264:
-        case AV_CODEC_ID_HEVC:
-            frames->initial_pool_size += 16;
-            break;
-        case AV_CODEC_ID_VP9:
-            frames->initial_pool_size += 8;
-            break;
-        case AV_CODEC_ID_VP8:
-            frames->initial_pool_size += 3;
-            break;
-        default:
-            frames->initial_pool_size += 2;
-        }
+        if (dpb_size > 0)
+            frames->initial_pool_size = dpb_size + 1;
     }
 
     av_hwframe_constraints_free(&constraints);
@@ -583,8 +570,9 @@ fail:
     return err;
 }
 
-int ff_vaapi_common_frame_params(AVCodecContext *avctx,
-                                 AVBufferRef *hw_frames_ctx)
+int ff_vaapi_frame_params_with_dpb_size(AVCodecContext *avctx,
+                                        AVBufferRef *hw_frames_ctx,
+                                        int dpb_size)
 {
     AVHWFramesContext *hw_frames = (AVHWFramesContext *)hw_frames_ctx->data;
     AVHWDeviceContext *device_ctx = hw_frames->device_ctx;
@@ -597,7 +585,7 @@ int ff_vaapi_common_frame_params(AVCodecContext *avctx,
     hwctx = device_ctx->hwctx;
 
     err = vaapi_decode_make_config(avctx, hw_frames->device_ref, &va_config,
-                                   hw_frames_ctx);
+                                   hw_frames_ctx, dpb_size);
     if (err)
         return err;
 
@@ -607,6 +595,13 @@ int ff_vaapi_common_frame_params(AVCodecContext *avctx,
     return 0;
 }
 
+int ff_vaapi_common_frame_params(AVCodecContext *avctx,
+                                 AVBufferRef *hw_frames_ctx)
+{
+    // Set common dpb_size for vc1/mjpeg/mpeg2/mpeg4.
+    return ff_vaapi_frame_params_with_dpb_size(avctx, hw_frames_ctx, 2);
+}
+
 int ff_vaapi_decode_init(AVCodecContext *avctx)
 {
     VAAPIDecodeContext *ctx = avctx->internal->hwaccel_priv_data;
@@ -666,7 +661,9 @@ int ff_vaapi_decode_init(AVCodecContext *avctx)
     ctx->hwctx  = ctx->device->hwctx;
 
     err = vaapi_decode_make_config(avctx, ctx->frames->device_ref,
-                                   &ctx->va_config, avctx->hw_frames_ctx);
+                                   &ctx->va_config, avctx->hw_frames_ctx,
+                                   0);
+
     if (err)
         goto fail;
 
diff --git a/libavcodec/vaapi_decode.h b/libavcodec/vaapi_decode.h
index 6b415dd1d3..c3e74bf9c7 100644
--- a/libavcodec/vaapi_decode.h
+++ b/libavcodec/vaapi_decode.h
@@ -98,6 +98,9 @@ int ff_vaapi_decode_cancel(AVCodecContext *avctx,
 int ff_vaapi_decode_init(AVCodecContext *avctx);
 int ff_vaapi_decode_uninit(AVCodecContext *avctx);
 
+int ff_vaapi_frame_params_with_dpb_size(AVCodecContext *avctx,
+                                        AVBufferRef *hw_frames_ctx,
+                                        int dpb_size);
 int ff_vaapi_common_frame_params(AVCodecContext *avctx,
                                  AVBufferRef *hw_frames_ctx);
 
diff --git a/libavcodec/vaapi_h264.c b/libavcodec/vaapi_h264.c
index dd2a657160..8d7f5c2004 100644
--- a/libavcodec/vaapi_h264.c
+++ b/libavcodec/vaapi_h264.c
@@ -385,6 +385,15 @@ static int vaapi_h264_decode_slice(AVCodecContext *avctx,
     return 0;
 }
 
+static int ff_vaapi_h264_frame_params(AVCodecContext *avctx,
+                                      AVBufferRef *hw_frames_ctx)
+{
+    const H264Context *h = avctx->priv_data;
+    const SPS       *sps = h->ps.sps;
+    return ff_vaapi_frame_params_with_dpb_size(avctx, hw_frames_ctx,
+                                               sps->ref_frame_count + sps->num_reorder_frames);
+}
+
 const AVHWAccel ff_h264_vaapi_hwaccel = {
     .name                 = "h264_vaapi",
     .type                 = AVMEDIA_TYPE_VIDEO,
@@ -396,7 +405,7 @@ const AVHWAccel ff_h264_vaapi_hwaccel = {
     .frame_priv_data_size = sizeof(VAAPIDecodePicture),
     .init                 = &ff_vaapi_decode_init,
     .uninit               = &ff_vaapi_decode_uninit,
-    .frame_params         = &ff_vaapi_common_frame_params,
+    .frame_params         = &ff_vaapi_h264_frame_params,
     .priv_data_size       = sizeof(VAAPIDecodeContext),
     .caps_internal        = HWACCEL_CAP_ASYNC_SAFE,
 };
diff --git a/libavcodec/vaapi_hevc.c b/libavcodec/vaapi_hevc.c
index c69d63d8ec..41e973626c 100644
--- a/libavcodec/vaapi_hevc.c
+++ b/libavcodec/vaapi_hevc.c
@@ -421,6 +421,15 @@ static int vaapi_hevc_decode_slice(AVCodecContext *avctx,
     return 0;
 }
 
+static int ff_vaapi_hevc_frame_params(AVCodecContext *avctx,
+                                      AVBufferRef *hw_frames_ctx)
+{
+    const HEVCContext *s = avctx->priv_data;
+    const HEVCSPS   *sps = s->ps.sps;
+    return ff_vaapi_frame_params_with_dpb_size(avctx, hw_frames_ctx,
+                                               sps->temporal_layer[sps->max_sub_layers - 1].max_dec_pic_buffering);
+}
+
 const AVHWAccel ff_hevc_vaapi_hwaccel = {
     .name                 = "hevc_vaapi",
     .type                 = AVMEDIA_TYPE_VIDEO,
@@ -432,7 +441,7 @@ const AVHWAccel ff_hevc_vaapi_hwaccel = {
     .frame_priv_data_size = sizeof(VAAPIDecodePictureHEVC),
     .init                 = ff_vaapi_decode_init,
     .uninit               = ff_vaapi_decode_uninit,
-    .frame_params         = ff_vaapi_common_frame_params,
+    .frame_params         = ff_vaapi_hevc_frame_params,
     .priv_data_size       = sizeof(VAAPIDecodeContext),
     .caps_internal        = HWACCEL_CAP_ASYNC_SAFE,
 };
diff --git a/libavcodec/vaapi_vp8.c b/libavcodec/vaapi_vp8.c
index 2426b30f13..4b74b9e0ad 100644
--- a/libavcodec/vaapi_vp8.c
+++ b/libavcodec/vaapi_vp8.c
@@ -220,6 +220,12 @@ fail:
     return err;
 }
 
+static int ff_vaapi_vp8_frame_params(AVCodecContext *avctx,
+                                     AVBufferRef *hw_frames_ctx)
+{
+    return ff_vaapi_frame_params_with_dpb_size(avctx, hw_frames_ctx, 3);
+}
+
 const AVHWAccel ff_vp8_vaapi_hwaccel = {
     .name                 = "vp8_vaapi",
     .type                 = AVMEDIA_TYPE_VIDEO,
@@ -231,7 +237,7 @@ const AVHWAccel ff_vp8_vaapi_hwaccel = {
     .frame_priv_data_size = sizeof(VAAPIDecodePicture),
     .init                 = &ff_vaapi_decode_init,
     .uninit               = &ff_vaapi_decode_uninit,
-    .frame_params         = &ff_vaapi_common_frame_params,
+    .frame_params         = &ff_vaapi_vp8_frame_params,
     .priv_data_size       = sizeof(VAAPIDecodeContext),
     .caps_internal        = HWACCEL_CAP_ASYNC_SAFE,
 };
diff --git a/libavcodec/vaapi_vp9.c b/libavcodec/vaapi_vp9.c
index f384ba7873..ce15701405 100644
--- a/libavcodec/vaapi_vp9.c
+++ b/libavcodec/vaapi_vp9.c
@@ -168,6 +168,12 @@ static int vaapi_vp9_decode_slice(AVCodecContext *avctx,
     return 0;
 }
 
+static int ff_vaapi_vp9_frame_params(AVCodecContext *avctx,
+                                     AVBufferRef *hw_frames_ctx)
+{
+    return ff_vaapi_frame_params_with_dpb_size(avctx, hw_frames_ctx, 8);
+}
+
 const AVHWAccel ff_vp9_vaapi_hwaccel = {
     .name                 = "vp9_vaapi",
     .type                 = AVMEDIA_TYPE_VIDEO,
@@ -179,7 +185,7 @@ const AVHWAccel ff_vp9_vaapi_hwaccel = {
     .frame_priv_data_size = sizeof(VAAPIDecodePicture),
     .init                 = ff_vaapi_decode_init,
     .uninit               = ff_vaapi_decode_uninit,
-    .frame_params         = ff_vaapi_common_frame_params,
+    .frame_params         = ff_vaapi_vp9_frame_params,
     .priv_data_size       = sizeof(VAAPIDecodeContext),
     .caps_internal        = HWACCEL_CAP_ASYNC_SAFE,
 };