From: "Guo, Yejun"
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 5 Dec 2018 17:58:58 +0800
Message-Id: <1544003938-2219-1-git-send-email-yejun.guo@intel.com>
Subject: [FFmpeg-devel] [PATCH] add support for ROI-based encoding
X-Patchwork-Submitter: "Guo, Yejun"
X-Patchwork-Id: 11280
List-Id: FFmpeg development discussions and patches

This patch is not a request for merge; it is meant to gather feedback on the feature.

Encoders such as libx264 support per-macroblock (MB) QP offsets, which
makes ROI-based encoding possible. It makes sense to add support within
FFmpeg to generate or accept ROI info and pass it to the encoders.

Typical usage: after an AVFrame is decoded, an FFmpeg filter or the user's
code generates ROI info for that frame, and the encoder then performs the
ROI-based encoding. For that reason I chose to keep the ROI info in the
AVFrame struct.

TODO:
- remove the code in vf_scale.c; it is only an example of generating ROI info
- use AVBufferRef instead of the current implementation within the AVFrame struct
- add support for other encoders

Signed-off-by: Guo, Yejun
---
 libavcodec/libx264.c   | 35 +++++++++++++++++++++++++++++++++++
 libavfilter/vf_scale.c |  8 ++++++++
 libavutil/frame.c      |  9 +++++++++
 libavutil/frame.h      | 14 ++++++++++++++
 4 files changed, 66 insertions(+)

diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
index a68d0a7..d8cc327 100644
--- a/libavcodec/libx264.c
+++ b/libavcodec/libx264.c
@@ -26,6 +26,7 @@
 #include "libavutil/pixdesc.h"
 #include "libavutil/stereo3d.h"
 #include "libavutil/intreadwrite.h"
+#include "libavutil/avassert.h"
 
 #include "avcodec.h"
 #include "internal.h"
@@ -345,6 +346,40 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, const AVFrame *frame,
                 }
             }
         }
+
+        if (frame->nb_rois > 0) {
+            if (x4->params.rc.i_aq_mode == X264_AQ_NONE) {
+                av_log(ctx, AV_LOG_ERROR, "Adaptive quantization must be enabled to use ROI encoding, skipping ROI.\n");
+            }
+            if (frame->interlaced_frame == 0) {
+                const static int MBSIZE = 16;
+                size_t mbx = (frame->width + MBSIZE - 1) / MBSIZE;
+                size_t mby = (frame->height + MBSIZE - 1) / MBSIZE;
+                float* qoffsets = (float*)av_malloc(sizeof(float) * mbx * mby);
+                memset(qoffsets, 0, sizeof(float) * mbx * mby);
+
+                for (size_t roi = 0; roi < frame->nb_rois; ++roi) {
+                    int starty = FFMIN(mby, frame->rois[roi].top / MBSIZE);
+                    int endy = FFMIN(mby, (frame->rois[roi].bottom + MBSIZE - 1)/ MBSIZE);
+                    int startx = FFMIN(mbx, frame->rois[roi].left / MBSIZE);
+                    int endx = FFMIN(mbx, (frame->rois[roi].right + MBSIZE - 1)/ MBSIZE);
+                    for (int y = starty; y < endy; ++y) {
+                        for (int x = startx; x < endx; ++x) {
+                            qoffsets[x + y*mbx] = frame->rois[roi].qoffset;
+                        }
+                    }
+                }
+
+                x4->pic.prop.quant_offsets = qoffsets;
+                x4->pic.prop.quant_offsets_free = av_free;
+            } else {
+                av_log(ctx, AV_LOG_ERROR, "interlaced_frame not supported for ROI encoding, skipping ROI.\n");
+            }
+        } else {
+            //to be removed in the final code, it is just for debug usage now.
+            printf("ooooops, frame 0x%p with rois %ld\n", frame, frame->nb_rois);
+            av_assert0(!"should not reach here");
+        }
     }
 
     do {
diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c
index f741419..71def72 100644
--- a/libavfilter/vf_scale.c
+++ b/libavfilter/vf_scale.c
@@ -437,6 +437,14 @@ static int filter_frame(AVFilterLink *link, AVFrame *in)
             return ret;
     }
 
+    // to be removed, just for debug usage temporarily
+    in->nb_rois = 1;
+    in->rois[0].top = 0;
+    in->rois[0].left = 0;
+    in->rois[0].bottom = in->height;
+    in->rois[0].right = in->width/2;
+    in->rois[0].qoffset = 15.0f; // 15.0f, +-5.0f, +-25.0f
+
     if (!scale->sws)
         return ff_filter_frame(outlink, in);
 
diff --git a/libavutil/frame.c b/libavutil/frame.c
index 9b3fb13..9c38bdd 100644
--- a/libavutil/frame.c
+++ b/libavutil/frame.c
@@ -425,6 +425,15 @@ FF_DISABLE_DEPRECATION_WARNINGS
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
 
+    dst->nb_rois = src->nb_rois;
+    for (int i = 0; i < dst->nb_rois; ++i) {
+        dst->rois[i].top = src->rois[i].top;
+        dst->rois[i].bottom = src->rois[i].bottom;
+        dst->rois[i].left = src->rois[i].left;
+        dst->rois[i].right = src->rois[i].right;
+        dst->rois[i].qoffset = src->rois[i].qoffset;
+    }
+
     av_buffer_unref(&dst->opaque_ref);
     av_buffer_unref(&dst->private_ref);
     if (src->opaque_ref) {
diff --git a/libavutil/frame.h b/libavutil/frame.h
index 66f27f4..b245a90 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -193,6 +193,15 @@ typedef struct AVFrameSideData {
     AVBufferRef *buf;
 } AVFrameSideData;
 
+
+typedef struct AVFrameROI {
+    size_t top;
+    size_t bottom;
+    size_t left;
+    size_t right;
+    float qoffset;
+} AVFrameROI;
+
 /**
  * This structure describes decoded (raw) audio or video data.
  *
@@ -556,6 +565,11 @@ typedef struct AVFrame {
     attribute_deprecated
     AVBufferRef *qp_table_buf;
 #endif
+
+    //TODO: AVBufferRef*
+    AVFrameROI rois[2];
+    size_t nb_rois;
+
     /**
      * For hwaccel-format frames, this should be a reference to the
      * AVHWFramesContext describing the frame.
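
For anyone who wants to try the ROI-to-macroblock mapping from the libx264.c hunk outside of FFmpeg, here is a minimal standalone sketch. It is not part of the patch. The `Roi` struct (mirroring `AVFrameROI`) and the helpers `mb_count`, `min_size` and `build_qoffsets` are hypothetical names used only for illustration. It differs from the patch in that it uses plain `calloc` instead of `av_malloc` plus `memset`, and it checks the allocation result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MB_SIZE 16 /* H.264 macroblock edge length in pixels */

/* Hypothetical stand-in for the patch's AVFrameROI (pixel coordinates). */
typedef struct Roi {
    size_t top, bottom, left, right;
    float  qoffset;
} Roi;

/* Number of macroblocks needed to cover `pixels`, rounding up. */
static size_t mb_count(size_t pixels)
{
    return (pixels + MB_SIZE - 1) / MB_SIZE;
}

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/*
 * Hypothetical helper: returns an mbx*mby array, initialised to 0.0f
 * (all-bits-zero, assuming IEEE 754 floats), with each ROI's qoffset
 * written into every macroblock the ROI overlaps. Later ROIs overwrite
 * earlier ones, as in the patch. Returns NULL on allocation failure.
 * The caller releases the array with free().
 */
static float *build_qoffsets(size_t width, size_t height,
                             const Roi *rois, size_t nb_rois,
                             size_t *mbx_out, size_t *mby_out)
{
    assert(width > 0 && height > 0);

    size_t mbx = mb_count(width);
    size_t mby = mb_count(height);
    float *q = calloc(mbx * mby, sizeof(*q));
    if (!q)
        return NULL;

    for (size_t i = 0; i < nb_rois; i++) {
        /* Clamp the ROI to the macroblock grid, rounding outwards. */
        size_t y0 = min_size(mby, rois[i].top / MB_SIZE);
        size_t y1 = min_size(mby, mb_count(rois[i].bottom));
        size_t x0 = min_size(mbx, rois[i].left / MB_SIZE);
        size_t x1 = min_size(mbx, mb_count(rois[i].right));

        for (size_t y = y0; y < y1; y++)
            for (size_t x = x0; x < x1; x++)
                q[x + y * mbx] = rois[i].qoffset;
    }

    *mbx_out = mbx;
    *mby_out = mby;
    return q;
}
```

With a 64x48 frame and one ROI covering the left half (the same shape the vf_scale.c debug code produces), the grid is 4x3 macroblocks and only the two left columns receive the offset. In the encoder, the resulting array would be handed to x264 through `pic.prop.quant_offsets`, as the patch does.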