From patchwork Fri Nov 25 19:11:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Langdale <philipl@overt.org>
X-Patchwork-Id: 1563
Delivered-To: ffmpegpatchwork@gmail.com
Received: by 10.103.90.1 with SMTP id o1csp538283vsb;
	Fri, 25 Nov 2016 11:12:01 -0800 (PST)
X-Received: by 10.194.141.141 with SMTP id ro13mr8845439wjb.76.1480101121573;
	Fri, 25 Nov 2016 11:12:01 -0800 (PST)
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
	by mx.google.com with ESMTP id
	kr2si44048438wjc.288.2016.11.25.11.12.00;
	Fri, 25 Nov 2016 11:12:01 -0800 (PST)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
	designates 79.124.17.100 as permitted sender)
	client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
	dkim=neutral (body hash did not verify) header.i=@overt.org;
	dkim=neutral (body hash did not verify) header.i=@overt.org;
	spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
	designates 79.124.17.100 as permitted sender)
	smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3D852689B12;
	Fri, 25 Nov 2016 21:11:54 +0200 (EET)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from so254-29.mailgun.net (so254-29.mailgun.net [198.61.254.29])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3E03F68973A
	for <ffmpeg-devel@ffmpeg.org>; Fri, 25 Nov 2016 21:11:47 +0200 (EET)
DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=overt.org; q=dns/txt;
	s=k1;
	t=1480101110; h=Message-Id: Date: Subject: Cc: To: From: Sender;
	bh=BgV9fSZJ28O+xxkkR1wJV2ev4sKAAzfjd9EF8714Zkc=;
	b=UWzCvlUGgsgJBQON4spFH7bdKsW2JfQLProLTM1QMHNlVhzC7iIfcRwMC6uaTvbP5LP9r3kp
	a95eiR5SquMNoyR7dS5DRd+4jSuZv2YzYilpfXgIO6vxR4KLmS3iQVJZXyxWoRNkX2rCpXXk
	5SHMFS6YhVcQ22wjXhTsPsdhEjU=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=overt.org; s=k1; q=dns;
	h=Sender: From: To: Cc: Subject: Date: Message-Id;
	b=Txk4ewrmyK2wMcFTyc93EqQ/Unntw3e4lc6ilY7WlJ8sdRMnUmepIK/5Tkx6ll6f5H+pSu
	R/YM6j2qC9pC5oKO966huN8q5iL1MN6Xg7/6fzoVBNWqd8IGOftKo7MyrUmXuuYJUQmMsG7Q
	Po8t9vPqzVgVBB2k5qscmInOdsKCY=
X-Mailgun-Sending-Ip: 198.61.254.29
X-Mailgun-Sid: 
 WyIyM2Q3MCIsICJmZm1wZWctZGV2ZWxAZmZtcGVnLm9yZyIsICI0YTg5NjEiXQ==
Received: from mail.overt.org (155.208.178.107.bc.googleusercontent.com
	[107.178.208.155])
	by mxa.mailgun.org with ESMTP id 58388cf6.7f6ad2514d88-smtp-out-n01;
	Fri, 25 Nov 2016 19:11:50 -0000 (UTC)
Received: from authenticated-user (mail.overt.org [107.178.208.155])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128
	bits)) (No client certificate requested)
	by mail.overt.org (Postfix) with ESMTPSA id A9B9761777;
	Fri, 25 Nov 2016 19:11:49 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=overt.org; s=mail;
	t=1480101109; bh=Mw1gEotTvbFcVc2HiElMg3clYUPvzlVvqJtTHdRziJY=;
	h=From:To:Cc:Subject:Date:From;
	b=V2TqetzQ1RjyXPcyRBS2C28IAR5mzDHtApcsadzJH4nbLXQxVIpwMuIewZzWvOi70
	Rz1wMA3oRxbHhHQqsCJO6mG08MINiHWMNE63nTrGFYGfdImHEcGlNXYYnHez/Tp/vg
	ht+wxn72hN7Aog0HtaFsFMcJABLlD0iyTqxDYqG+5Tc3oc7rFZ17YxcPxkPFQ1dOh8
	97m4TK4w63RC7O2+ulBwvcMNjCZv2Pg0vSD4HyDi/d0L7oOBMEOHHiJ6NAo7cztFUy
	aA0P/DbD7PV7D8Jbnr5V9beMjzqjl3yL+J0p2GNmL/6c0AUX/dm98C7Z65+5YaJGeG
	kWsdyaEqm7ztA==
From: Philip Langdale <philipl@overt.org>
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 25 Nov 2016 11:11:45 -0800
Message-Id: <20161125191145.32597-1-philipl@overt.org>
Subject: [FFmpeg-devel] [PATCH] avcodec/nvenc: Delay identification of
	underlying format of cuda frames
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <http://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <http://ffmpeg.org/pipermail/ffmpeg-devel/>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <http://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Cc: Philip Langdale <philipl@overt.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

When the input surfaces are CUDA frames, the actual underlying format
(nv12, p010, etc.) is not known at surface allocation time.

It is known, however, by the time the input frames are actually
registered and associated with a surface.

So delay format discovery until registration time, which is already
how other frame properties, such as dimensions, are handled.

By itself, this change doesn't enable transcoding of 10-bit content
from cuvid, but it reduces the remaining problem to the hardcoded
sw format in ffmpeg_cuvid.c.

Signed-off-by: Philip Langdale <philipl@overt.org>
---
 libavcodec/nvenc.c | 71 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 37 insertions(+), 34 deletions(-)
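
Note for reviewers: the control flow this patch aims for can be
sketched in isolation as below. This is only an illustration, not the
code in nvenc.c. Every sketch_* / SK_* name is a hypothetical stand-in
(for enum AVPixelFormat, NV_ENC_BUFFER_FORMAT and the nvenc functions),
and only two formats are mapped to keep it short.

```c
#include <stdio.h>

/* Hypothetical stand-ins for enum AVPixelFormat and NV_ENC_BUFFER_FORMAT. */
enum sketch_pix_fmt { SK_PIX_NONE, SK_PIX_NV12, SK_PIX_P010, SK_PIX_CUDA };
enum sketch_buf_fmt { SK_BUF_UNDEFINED, SK_BUF_NV12, SK_BUF_YUV420_10BIT };

struct sketch_surface { enum sketch_buf_fmt format; };

/* Pure mapping: no logging, no context; unknown input yields UNDEFINED. */
static enum sketch_buf_fmt sketch_map_buffer_format(enum sketch_pix_fmt fmt)
{
    switch (fmt) {
    case SK_PIX_NV12: return SK_BUF_NV12;
    case SK_PIX_P010: return SK_BUF_YUV420_10BIT;
    default:          return SK_BUF_UNDEFINED;
    }
}

/* Allocation time: the format is resolved only for system-memory input.
 * For CUDA input the underlying format is not known yet, so it is left
 * undefined here. Returns 0 on success, -1 on an unsupported format. */
static int sketch_alloc_surface(struct sketch_surface *s,
                                enum sketch_pix_fmt pix_fmt,
                                enum sketch_pix_fmt data_pix_fmt)
{
    s->format = SK_BUF_UNDEFINED;
    if (pix_fmt == SK_PIX_CUDA)
        return 0;

    s->format = sketch_map_buffer_format(data_pix_fmt);
    return s->format == SK_BUF_UNDEFINED ? -1 : 0;
}

/* Registration time: the frame's software format is available, so the
 * buffer format is resolved here, next to width, height and pitch. */
static int sketch_register_frame(enum sketch_pix_fmt sw_format,
                                 enum sketch_buf_fmt *buffer_format)
{
    *buffer_format = sketch_map_buffer_format(sw_format);
    if (*buffer_format == SK_BUF_UNDEFINED) {
        fprintf(stderr, "Invalid input pixel format: %d\n", (int)sw_format);
        return -1;
    }
    return 0;
}
```

Keeping the mapping helper free of logging lets each caller report the
error with the pixel format it actually looked at (data_pix_fmt at
allocation, sw_format at registration).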

diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index d24d278..2353161 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -28,6 +28,7 @@
 #include "libavutil/imgutils.h"
 #include "libavutil/avassert.h"
 #include "libavutil/mem.h"
+#include "libavutil/pixdesc.h"
 #include "internal.h"
 
 #define NVENC_CAP 0x30
@@ -1009,49 +1010,37 @@ static av_cold int nvenc_setup_encoder(AVCodecContext *avctx)
     return 0;
 }
 
-static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
+static NV_ENC_BUFFER_FORMAT nvenc_map_buffer_format(enum AVPixelFormat pix_fmt)
 {
-    NvencContext *ctx = avctx->priv_data;
-    NvencDynLoadFunctions *dl_fn = &ctx->nvenc_dload_funcs;
-    NV_ENCODE_API_FUNCTION_LIST *p_nvenc = &dl_fn->nvenc_funcs;
-
-    NVENCSTATUS nv_status;
-    NV_ENC_CREATE_BITSTREAM_BUFFER allocOut = { 0 };
-    allocOut.version = NV_ENC_CREATE_BITSTREAM_BUFFER_VER;
-
-    switch (ctx->data_pix_fmt) {
+    switch (pix_fmt) {
     case AV_PIX_FMT_YUV420P:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YV12_PL;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YV12_PL;
     case AV_PIX_FMT_NV12:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_NV12_PL;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_NV12_PL;
     case AV_PIX_FMT_P010:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YUV420_10BIT;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YUV420_10BIT;
     case AV_PIX_FMT_YUV444P:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YUV444_PL;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YUV444_PL;
     case AV_PIX_FMT_YUV444P16:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YUV444_10BIT;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YUV444_10BIT;
     case AV_PIX_FMT_0RGB32:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_ARGB;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_ARGB;
     case AV_PIX_FMT_0BGR32:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_ABGR;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_ABGR;
     default:
-        av_log(avctx, AV_LOG_FATAL, "Invalid input pixel format\n");
-        return AVERROR(EINVAL);
+        return NV_ENC_BUFFER_FORMAT_UNDEFINED;
     }
+}
+
+static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
+{
+    NvencContext *ctx = avctx->priv_data;
+    NvencDynLoadFunctions *dl_fn = &ctx->nvenc_dload_funcs;
+    NV_ENCODE_API_FUNCTION_LIST *p_nvenc = &dl_fn->nvenc_funcs;
+
+    NVENCSTATUS nv_status;
+    NV_ENC_CREATE_BITSTREAM_BUFFER allocOut = { 0 };
+    allocOut.version = NV_ENC_CREATE_BITSTREAM_BUFFER_VER;
 
     if (avctx->pix_fmt == AV_PIX_FMT_CUDA) {
         ctx->surfaces[idx].in_ref = av_frame_alloc();
@@ -1059,6 +1048,14 @@ static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
             return AVERROR(ENOMEM);
     } else {
         NV_ENC_CREATE_INPUT_BUFFER allocSurf = { 0 };
+
+        ctx->surfaces[idx].format = nvenc_map_buffer_format(ctx->data_pix_fmt);
+        if (ctx->surfaces[idx].format == NV_ENC_BUFFER_FORMAT_UNDEFINED) {
+            av_log(avctx, AV_LOG_FATAL, "Invalid input pixel format: %s\n",
+                   av_get_pix_fmt_name(ctx->data_pix_fmt));
+            return AVERROR(EINVAL);
+        }
+
         allocSurf.version = NV_ENC_CREATE_INPUT_BUFFER_VER;
         allocSurf.width = (avctx->width + 31) & ~31;
         allocSurf.height = (avctx->height + 31) & ~31;
@@ -1351,10 +1348,16 @@ static int nvenc_register_frame(AVCodecContext *avctx, const AVFrame *frame)
     reg.resourceType       = NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR;
     reg.width              = frames_ctx->width;
     reg.height             = frames_ctx->height;
-    reg.bufferFormat       = ctx->surfaces[0].format;
     reg.pitch              = frame->linesize[0];
     reg.resourceToRegister = frame->data[0];
 
+    reg.bufferFormat       = nvenc_map_buffer_format(frames_ctx->sw_format);
+    if (reg.bufferFormat == NV_ENC_BUFFER_FORMAT_UNDEFINED) {
+        av_log(avctx, AV_LOG_FATAL, "Invalid input pixel format: %s\n",
+               av_get_pix_fmt_name(frames_ctx->sw_format));
+        return AVERROR(EINVAL);
+    }
+
     ret = p_nvenc->nvEncRegisterResource(ctx->nvencoder, &reg);
     if (ret != NV_ENC_SUCCESS) {
         nvenc_print_error(avctx, ret, "Error registering an input resource");