From patchwork Fri Nov 25 19:11:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Langdale
X-Patchwork-Id: 1563
From: Philip Langdale
To: ffmpeg-devel@ffmpeg.org
Cc: Philip Langdale
Date: Fri, 25 Nov 2016 11:11:45 -0800
Message-Id: <20161125191145.32597-1-philipl@overt.org>
Subject: [FFmpeg-devel] [PATCH] avcodec/nvenc: Delay identification of underlying format of cuda frames

When input surfaces are cuda frames, we will not know what the actual
underlying format (nv12,
p010, etc) is at surface allocation time.

On the other hand, we will know when the input frames are actually
registered and associated with a surface.

So, let's delay format discovery until registration time, which is
actually how we handle other frame properties, such as dimensions.

By itself, this change doesn't allow for transcoding of 10bit content
from cuvid, but it reduces the problem to the hardcoding of the sw
format in ffmpeg_cuvid.c

Signed-off-by: Philip Langdale
---
 libavcodec/nvenc.c | 71 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index d24d278..2353161 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -28,6 +28,7 @@
 #include "libavutil/imgutils.h"
 #include "libavutil/avassert.h"
 #include "libavutil/mem.h"
+#include "libavutil/pixdesc.h"
 #include "internal.h"
 
 #define NVENC_CAP 0x30
@@ -1009,49 +1010,37 @@ static av_cold int nvenc_setup_encoder(AVCodecContext *avctx)
     return 0;
 }
 
-static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
+static NV_ENC_BUFFER_FORMAT nvenc_map_buffer_format(enum AVPixelFormat pix_fmt)
 {
-    NvencContext *ctx = avctx->priv_data;
-    NvencDynLoadFunctions *dl_fn = &ctx->nvenc_dload_funcs;
-    NV_ENCODE_API_FUNCTION_LIST *p_nvenc = &dl_fn->nvenc_funcs;
-
-    NVENCSTATUS nv_status;
-    NV_ENC_CREATE_BITSTREAM_BUFFER allocOut = { 0 };
-    allocOut.version = NV_ENC_CREATE_BITSTREAM_BUFFER_VER;
-
-    switch (ctx->data_pix_fmt) {
+    switch (pix_fmt) {
     case AV_PIX_FMT_YUV420P:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YV12_PL;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YV12_PL;
     case AV_PIX_FMT_NV12:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_NV12_PL;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_NV12_PL;
     case AV_PIX_FMT_P010:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YUV420_10BIT;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YUV420_10BIT;
     case AV_PIX_FMT_YUV444P:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YUV444_PL;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YUV444_PL;
     case AV_PIX_FMT_YUV444P16:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_YUV444_10BIT;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_YUV444_10BIT;
     case AV_PIX_FMT_0RGB32:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_ARGB;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_ARGB;
     case AV_PIX_FMT_0BGR32:
-        ctx->surfaces[idx].format = NV_ENC_BUFFER_FORMAT_ABGR;
-        break;
-
+        return NV_ENC_BUFFER_FORMAT_ABGR;
     default:
-        av_log(avctx, AV_LOG_FATAL, "Invalid input pixel format\n");
-        return AVERROR(EINVAL);
+        return NV_ENC_BUFFER_FORMAT_UNDEFINED;
     }
+}
+
+static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
+{
+    NvencContext *ctx = avctx->priv_data;
+    NvencDynLoadFunctions *dl_fn = &ctx->nvenc_dload_funcs;
+    NV_ENCODE_API_FUNCTION_LIST *p_nvenc = &dl_fn->nvenc_funcs;
+
+    NVENCSTATUS nv_status;
+    NV_ENC_CREATE_BITSTREAM_BUFFER allocOut = { 0 };
+    allocOut.version = NV_ENC_CREATE_BITSTREAM_BUFFER_VER;
 
     if (avctx->pix_fmt == AV_PIX_FMT_CUDA) {
         ctx->surfaces[idx].in_ref = av_frame_alloc();
@@ -1059,6 +1048,14 @@ static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
             return AVERROR(ENOMEM);
     } else {
         NV_ENC_CREATE_INPUT_BUFFER allocSurf = { 0 };
+
+        ctx->surfaces[idx].format = nvenc_map_buffer_format(ctx->data_pix_fmt);
+        if (ctx->surfaces[idx].format == NV_ENC_BUFFER_FORMAT_UNDEFINED) {
+            av_log(avctx, AV_LOG_FATAL, "Invalid input pixel format: %s\n",
+                   av_get_pix_fmt_name(ctx->data_pix_fmt));
+            return AVERROR(EINVAL);
+        }
+
         allocSurf.version = NV_ENC_CREATE_INPUT_BUFFER_VER;
         allocSurf.width = (avctx->width + 31) & ~31;
         allocSurf.height = (avctx->height + 31) & ~31;
@@ -1351,10 +1348,16 @@ static int nvenc_register_frame(AVCodecContext *avctx, const AVFrame *frame)
     reg.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR;
     reg.width = frames_ctx->width;
     reg.height = frames_ctx->height;
-    reg.bufferFormat = ctx->surfaces[0].format;
     reg.pitch = frame->linesize[0];
     reg.resourceToRegister = frame->data[0];
 
+    reg.bufferFormat = nvenc_map_buffer_format(frames_ctx->sw_format);
+    if (reg.bufferFormat == NV_ENC_BUFFER_FORMAT_UNDEFINED) {
+        av_log(avctx, AV_LOG_FATAL, "Invalid input pixel format: %s\n",
+               av_get_pix_fmt_name(frames_ctx->sw_format));
+        return AVERROR(EINVAL);
+    }
+
     ret = p_nvenc->nvEncRegisterResource(ctx->nvencoder, &reg);
     if (ret != NV_ENC_SUCCESS) {
         nvenc_print_error(avctx, ret, "Error registering an input resource");