From patchwork Wed Dec  5 09:58:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guo, Yejun" <yejun.guo@intel.com>
X-Patchwork-Id: 11280
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id 82C8044CF4D
	for <patchwork@ffaux-bg.ffmpeg.org>;
	Wed,  5 Dec 2018 04:06:27 +0200 (EET)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 265EF68A5DF;
	Wed,  5 Dec 2018 04:06:19 +0200 (EET)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mga18.intel.com (mga18.intel.com [134.134.136.126])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 4EF9068A595
	for <ffmpeg-devel@ffmpeg.org>; Wed,  5 Dec 2018 04:06:12 +0200 (EET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga007.fm.intel.com ([10.253.24.52])
	by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
	04 Dec 2018 18:06:18 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.56,316,1539673200"; d="scan'208";a="104909604"
Received: from yguo18-skl-u1604.sh.intel.com ([10.239.13.25])
	by fmsmga007.fm.intel.com with ESMTP; 04 Dec 2018 18:06:17 -0800
From: "Guo, Yejun" <yejun.guo@intel.com>
To: ffmpeg-devel@ffmpeg.org
Date: Wed,  5 Dec 2018 17:58:58 +0800
Message-Id: <1544003938-2219-1-git-send-email-yejun.guo@intel.com>
X-Mailer: git-send-email 2.7.4
Subject: [FFmpeg-devel] [PATCH] add support for ROI-based encoding
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <http://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <http://ffmpeg.org/pipermail/ffmpeg-devel/>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <http://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

This patch is not a request to merge; it is meant to collect feedback on
the feature.

Encoders such as libx264 support a separate QP offset for each macroblock
(MB), which makes ROI-based encoding possible. It makes sense to add
support within FFmpeg for generating or accepting ROI info and passing it
to the encoders.

Typical usage: after an AVFrame is decoded, an FFmpeg filter or the
user's code generates ROI info for that frame, and the encoder then does
the ROI-based encoding. For this reason I chose to keep the ROI info in
the AVFrame struct.

TODO:
- remove the code in vf_scale.c; it is just an example that generates ROI info
- use AVBufferRef instead of the current implementation within the AVFrame struct
- add support for other encoders

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
---
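Note for reviewers: the pixel-to-macroblock mapping that the libx264 hunk
performs can be tried outside FFmpeg with the rough sketch below. It is
only an illustration under stated assumptions, not the code in the patch:
`ROI`, `mb_count`, `min_size` and `build_qoffsets` are hypothetical names,
plain calloc() stands in for the FFmpeg allocator, and `bottom`/`right` are
assumed to be exclusive pixel coordinates (which is how the vf_scale
example fills them).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MB_SIZE 16  /* macroblock edge length in pixels, as in the patch */

/* Hypothetical stand-in for the AVFrameROI struct added by this patch. */
typedef struct ROI {
    size_t top, bottom, left, right;  /* pixels; bottom/right assumed exclusive */
    float  qoffset;                   /* QP offset applied inside the region */
} ROI;

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Number of macroblocks needed to cover `pixels`, rounding up. */
static size_t mb_count(size_t pixels) { return (pixels + MB_SIZE - 1) / MB_SIZE; }

/*
 * Build a zero-initialised, raster-order map with one offset per macroblock
 * and fill in each ROI. Later ROIs overwrite earlier ones where they overlap.
 * Returns NULL on allocation failure; the caller releases the map with free().
 */
static float *build_qoffsets(size_t width, size_t height,
                             const ROI *rois, size_t nb_rois)
{
    size_t mbx = mb_count(width);
    size_t mby = mb_count(height);
    float *map;

    assert(width > 0 && height > 0);
    map = calloc(mbx * mby, sizeof(*map));
    if (!map)
        return NULL;

    for (size_t i = 0; i < nb_rois; i++) {
        /* Round the start down and the end up, then clamp to the frame. */
        size_t y0 = min_size(mby, rois[i].top / MB_SIZE);
        size_t y1 = min_size(mby, mb_count(rois[i].bottom));
        size_t x0 = min_size(mbx, rois[i].left / MB_SIZE);
        size_t x1 = min_size(mbx, mb_count(rois[i].right));

        for (size_t y = y0; y < y1; y++)
            for (size_t x = x0; x < x1; x++)
                map[y * mbx + x] = rois[i].qoffset;
    }
    return map;
}
```

In the patch itself the resulting array is handed to x264 through
x4->pic.prop.quant_offsets, with quant_offsets_free set so that the
encoder releases it.
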
 libavcodec/libx264.c   | 35 +++++++++++++++++++++++++++++++++++
 libavfilter/vf_scale.c |  8 ++++++++
 libavutil/frame.c      |  9 +++++++++
 libavutil/frame.h      | 14 ++++++++++++++
 4 files changed, 66 insertions(+)

diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
index a68d0a7..d8cc327 100644
--- a/libavcodec/libx264.c
+++ b/libavcodec/libx264.c
@@ -26,6 +26,7 @@
 #include "libavutil/pixdesc.h"
 #include "libavutil/stereo3d.h"
 #include "libavutil/intreadwrite.h"
+#include "libavutil/avassert.h"
 #include "avcodec.h"
 #include "internal.h"
 
@@ -345,6 +346,40 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, const AVFrame *frame,
                 }
             }
         }
+
+        if (frame->nb_rois > 0) {
+            if (x4->params.rc.i_aq_mode == X264_AQ_NONE) {
+                av_log(ctx, AV_LOG_ERROR, "Adaptive quantization must be enabled "
+                       "to use ROI encoding, skipping ROI.\n");
+            } else if (frame->interlaced_frame == 0) {
+                static const int MBSIZE = 16;
+                size_t mbx = (frame->width + MBSIZE - 1) / MBSIZE;
+                size_t mby = (frame->height + MBSIZE - 1) / MBSIZE;
+                float *qoffsets = av_mallocz_array(mbx * mby, sizeof(*qoffsets));
+                if (!qoffsets)
+                    return AVERROR(ENOMEM);
+                for (size_t roi = 0; roi < frame->nb_rois; ++roi) {
+                    int starty = FFMIN(mby, frame->rois[roi].top / MBSIZE);
+                    int endy = FFMIN(mby, (frame->rois[roi].bottom + MBSIZE - 1)/ MBSIZE);
+                    int startx = FFMIN(mbx, frame->rois[roi].left / MBSIZE);
+                    int endx = FFMIN(mbx, (frame->rois[roi].right + MBSIZE - 1)/ MBSIZE);
+                    for (int y = starty; y < endy; ++y) {
+                        for (int x = startx; x < endx; ++x) {
+                            qoffsets[x + y*mbx] = frame->rois[roi].qoffset;
+                        }
+                    }
+                }
+
+                x4->pic.prop.quant_offsets = qoffsets;
+                x4->pic.prop.quant_offsets_free = av_free;
+            } else {
+                av_log(ctx, AV_LOG_ERROR, "interlaced_frame not supported for ROI encoding, skipping ROI.\n");
+            }
+        } else {
+            //to be removed in the final code, it is just for debug usage now.
+            printf("ooooops, frame %p with rois %zu\n", (const void *)frame, frame->nb_rois);
+            av_assert0(!"should not reach here");
+        }
     }
 
     do {
diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c
index f741419..71def72 100644
--- a/libavfilter/vf_scale.c
+++ b/libavfilter/vf_scale.c
@@ -437,6 +437,14 @@ static int filter_frame(AVFilterLink *link, AVFrame *in)
             return ret;
     }
 
+    // to be removed, just for debug usage temporarily
+    in->nb_rois = 1;
+    in->rois[0].top = 0;
+    in->rois[0].left = 0;
+    in->rois[0].bottom = in->height;
+    in->rois[0].right = in->width/2;
+    in->rois[0].qoffset = 15.0f; // 15.0f, +-5.0f, +-25.0f
+
     if (!scale->sws)
         return ff_filter_frame(outlink, in);
 
diff --git a/libavutil/frame.c b/libavutil/frame.c
index 9b3fb13..9c38bdd 100644
--- a/libavutil/frame.c
+++ b/libavutil/frame.c
@@ -425,6 +425,15 @@ FF_DISABLE_DEPRECATION_WARNINGS
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
 
+    dst->nb_rois = src->nb_rois;
+    for (size_t i = 0; i < dst->nb_rois; ++i) {
+        dst->rois[i].top = src->rois[i].top;
+        dst->rois[i].bottom = src->rois[i].bottom;
+        dst->rois[i].left = src->rois[i].left;
+        dst->rois[i].right = src->rois[i].right;
+        dst->rois[i].qoffset = src->rois[i].qoffset;
+    }
+
     av_buffer_unref(&dst->opaque_ref);
     av_buffer_unref(&dst->private_ref);
     if (src->opaque_ref) {
diff --git a/libavutil/frame.h b/libavutil/frame.h
index 66f27f4..b245a90 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -193,6 +193,15 @@ typedef struct AVFrameSideData {
     AVBufferRef *buf;
 } AVFrameSideData;
 
+
+typedef struct AVFrameROI {
+    size_t top;
+    size_t bottom;
+    size_t left;
+    size_t right;
+    float qoffset;
+} AVFrameROI;
+
 /**
  * This structure describes decoded (raw) audio or video data.
  *
@@ -556,6 +565,11 @@ typedef struct AVFrame {
     attribute_deprecated
     AVBufferRef *qp_table_buf;
 #endif
+
+    //TODO: AVBufferRef*
+    AVFrameROI rois[2];
+    size_t nb_rois;
+
     /**
      * For hwaccel-format frames, this should be a reference to the
      * AVHWFramesContext describing the frame.