From patchwork Sun Nov 12 09:30:21 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hendrik Leppkes
X-Patchwork-Id: 6007
From: Hendrik Leppkes
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 12 Nov 2017 10:30:21 +0100
Message-Id: <20171112093021.23484-2-h.leppkes@gmail.com>
X-Mailer: git-send-email 2.13.2.windows.1
In-Reply-To: <20171112093021.23484-1-h.leppkes@gmail.com>
References: <20171112093021.23484-1-h.leppkes@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/2] nvenc: support d3d11 surface input
List-Id: FFmpeg development discussions and patches

---
 libavcodec/nvenc.c | 106 ++++++++++++++++++++++++++++++++++++++++++-----------
 libavcodec/nvenc.h |  11 +++++-
 2 files changed, 95 insertions(+), 22 deletions(-)

diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index c685d973c1..eba59634f6 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -45,6 +45,9 @@ const enum AVPixelFormat ff_nvenc_pix_fmts[] = {
     AV_PIX_FMT_0RGB32,
     AV_PIX_FMT_0BGR32,
     AV_PIX_FMT_CUDA,
+#if CONFIG_D3D11VA
+    AV_PIX_FMT_D3D11,
+#endif
     AV_PIX_FMT_NONE
 };
 
@@ -172,6 +175,9 @@ static int nvenc_push_context(AVCodecContext *avctx)
     NvencDynLoadFunctions *dl_fn = &ctx->nvenc_dload_funcs;
     CUresult cu_res;
 
+    if (ctx->d3d11_device)
+        return 0;
+
     cu_res = dl_fn->cuda_dl->cuCtxPushCurrent(ctx->cu_context);
     if (cu_res != CUDA_SUCCESS) {
         av_log(avctx, AV_LOG_ERROR, "cuCtxPushCurrent failed\n");
@@ -188,6 +194,9 @@ static int nvenc_pop_context(AVCodecContext *avctx)
     CUresult cu_res;
     CUcontext dummy;
 
+    if (ctx->d3d11_device)
+        return 0;
+
     cu_res = dl_fn->cuda_dl->cuCtxPopCurrent(&dummy);
     if (cu_res != CUDA_SUCCESS) {
         av_log(avctx, AV_LOG_ERROR, "cuCtxPopCurrent failed\n");
@@ -206,8 +215,16 @@ static av_cold int nvenc_open_session(AVCodecContext *avctx)
 
     params.version = NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER;
     params.apiVersion = NVENCAPI_VERSION;
-    params.device = ctx->cu_context;
-    params.deviceType = NV_ENC_DEVICE_TYPE_CUDA;
+    if (ctx->d3d11_device)
+    {
+        params.device = ctx->d3d11_device;
+        params.deviceType = NV_ENC_DEVICE_TYPE_DIRECTX;
+    }
+    else
+    {
+        params.device = ctx->cu_context;
+        params.deviceType = NV_ENC_DEVICE_TYPE_CUDA;
+    }
 
     ret = p_nvenc->nvEncOpenEncodeSessionEx(&params, &ctx->nvencoder);
     if (ret != NV_ENC_SUCCESS) {
@@ -458,23 +475,48 @@ static av_cold int nvenc_setup_device(AVCodecContext *avctx)
         return AVERROR_BUG;
     }
 
-    if (avctx->pix_fmt == AV_PIX_FMT_CUDA || avctx->hw_frames_ctx || avctx->hw_device_ctx) {
+    if (avctx->pix_fmt == AV_PIX_FMT_CUDA || avctx->pix_fmt == AV_PIX_FMT_D3D11 || avctx->hw_frames_ctx || avctx->hw_device_ctx) {
         AVHWFramesContext *frames_ctx;
         AVHWDeviceContext *hwdev_ctx;
-        AVCUDADeviceContext *device_hwctx;
+        AVCUDADeviceContext *cuda_device_hwctx = NULL;
+#if CONFIG_D3D11VA
+        AVD3D11VADeviceContext *d3d11_device_hwctx = NULL;
+#endif
         int ret;
 
         if (avctx->hw_frames_ctx) {
             frames_ctx = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
-            device_hwctx = frames_ctx->device_ctx->hwctx;
+            if (frames_ctx->format == AV_PIX_FMT_CUDA)
+                cuda_device_hwctx = frames_ctx->device_ctx->hwctx;
+#if CONFIG_D3D11VA
+            else if (frames_ctx->format == AV_PIX_FMT_D3D11)
+                d3d11_device_hwctx = frames_ctx->device_ctx->hwctx;
+#endif
+            else
+                return AVERROR(EINVAL);
         } else if (avctx->hw_device_ctx) {
             hwdev_ctx = (AVHWDeviceContext*)avctx->hw_device_ctx->data;
-            device_hwctx = hwdev_ctx->hwctx;
+            if (hwdev_ctx->type == AV_HWDEVICE_TYPE_CUDA)
+                cuda_device_hwctx = hwdev_ctx->hwctx;
+#if CONFIG_D3D11VA
+            else if (hwdev_ctx->type == AV_HWDEVICE_TYPE_D3D11VA)
+                d3d11_device_hwctx = hwdev_ctx->hwctx;
+#endif
+            else
+                return AVERROR(EINVAL);
         } else {
             return AVERROR(EINVAL);
         }
 
-        ctx->cu_context = device_hwctx->cuda_ctx;
+        if (cuda_device_hwctx) {
+            ctx->cu_context = cuda_device_hwctx->cuda_ctx;
+        }
+#if CONFIG_D3D11VA
+        else if (d3d11_device_hwctx) {
+            ctx->d3d11_device = d3d11_device_hwctx->device;
+            ID3D11Device_AddRef(ctx->d3d11_device);
+        }
+#endif
 
         ret = nvenc_open_session(avctx);
         if (ret < 0)
@@ -1205,7 +1247,7 @@ static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
     NV_ENC_CREATE_BITSTREAM_BUFFER allocOut = { 0 };
     allocOut.version = NV_ENC_CREATE_BITSTREAM_BUFFER_VER;
 
-    if (avctx->pix_fmt == AV_PIX_FMT_CUDA) {
+    if (avctx->pix_fmt == AV_PIX_FMT_CUDA || avctx->pix_fmt == AV_PIX_FMT_D3D11) {
         ctx->surfaces[idx].in_ref = av_frame_alloc();
         if (!ctx->surfaces[idx].in_ref)
             return AVERROR(ENOMEM);
@@ -1237,7 +1279,7 @@ static av_cold int nvenc_alloc_surface(AVCodecContext *avctx, int idx)
     nv_status = p_nvenc->nvEncCreateBitstreamBuffer(ctx->nvencoder, &allocOut);
     if (nv_status != NV_ENC_SUCCESS) {
         int err = nvenc_print_error(avctx, nv_status, "CreateBitstreamBuffer failed");
-        if (avctx->pix_fmt != AV_PIX_FMT_CUDA)
+        if (avctx->pix_fmt != AV_PIX_FMT_CUDA && avctx->pix_fmt != AV_PIX_FMT_D3D11)
             p_nvenc->nvEncDestroyInputBuffer(ctx->nvencoder, ctx->surfaces[idx].input_surface);
         av_frame_free(&ctx->surfaces[idx].in_ref);
         return err;
@@ -1351,7 +1393,7 @@ av_cold int ff_nvenc_encode_close(AVCodecContext *avctx)
     av_fifo_freep(&ctx->output_surface_queue);
     av_fifo_freep(&ctx->unused_surface_queue);
 
-    if (ctx->surfaces && avctx->pix_fmt == AV_PIX_FMT_CUDA) {
+    if (ctx->surfaces && (avctx->pix_fmt == AV_PIX_FMT_CUDA || avctx->pix_fmt == AV_PIX_FMT_D3D11)) {
         for (i = 0; i < ctx->nb_surfaces; ++i) {
             if (ctx->surfaces[i].input_surface) {
                 p_nvenc->nvEncUnmapInputResource(ctx->nvencoder, ctx->surfaces[i].in_map.mappedResource);
@@ -1366,7 +1408,7 @@ av_cold int ff_nvenc_encode_close(AVCodecContext *avctx)
 
     if (ctx->surfaces) {
         for (i = 0; i < ctx->nb_surfaces; ++i) {
-            if (avctx->pix_fmt != AV_PIX_FMT_CUDA)
+            if (avctx->pix_fmt != AV_PIX_FMT_CUDA && avctx->pix_fmt != AV_PIX_FMT_D3D11)
                 p_nvenc->nvEncDestroyInputBuffer(ctx->nvencoder, ctx->surfaces[i].input_surface);
             av_frame_free(&ctx->surfaces[i].in_ref);
             p_nvenc->nvEncDestroyBitstreamBuffer(ctx->nvencoder, ctx->surfaces[i].output_surface);
@@ -1388,6 +1430,13 @@ av_cold int ff_nvenc_encode_close(AVCodecContext *avctx)
         dl_fn->cuda_dl->cuCtxDestroy(ctx->cu_context_internal);
     ctx->cu_context = ctx->cu_context_internal = NULL;
 
+#if CONFIG_D3D11VA
+    if (ctx->d3d11_device) {
+        ID3D11Device_Release(ctx->d3d11_device);
+        ctx->d3d11_device = NULL;
+    }
+#endif
+
     nvenc_free_functions(&dl_fn->nvenc_dl);
     cuda_free_functions(&dl_fn->cuda_dl);
 
@@ -1403,7 +1452,7 @@ av_cold int ff_nvenc_encode_init(AVCodecContext *avctx)
     NvencContext *ctx = avctx->priv_data;
     int ret;
 
-    if (avctx->pix_fmt == AV_PIX_FMT_CUDA) {
+    if (avctx->pix_fmt == AV_PIX_FMT_CUDA || avctx->pix_fmt == AV_PIX_FMT_D3D11) {
         AVHWFramesContext *frames_ctx;
         if (!avctx->hw_frames_ctx) {
             av_log(avctx, AV_LOG_ERROR,
@@ -1411,6 +1460,11 @@ av_cold int ff_nvenc_encode_init(AVCodecContext *avctx)
             return AVERROR(EINVAL);
         }
         frames_ctx = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
+        if (frames_ctx->format != avctx->pix_fmt) {
+            av_log(avctx, AV_LOG_ERROR,
+                   "hw_frames_ctx must match the GPU frame type\n");
+            return AVERROR(EINVAL);
+        }
         ctx->data_pix_fmt = frames_ctx->sw_format;
     } else {
         ctx->data_pix_fmt = avctx->pix_fmt;
@@ -1516,7 +1570,9 @@ static int nvenc_register_frame(AVCodecContext *avctx, const AVFrame *frame)
     int i, idx, ret;
 
     for (i = 0; i < ctx->nb_registered_frames; i++) {
-        if (ctx->registered_frames[i].ptr == (CUdeviceptr)frame->data[0])
+        if (avctx->pix_fmt == AV_PIX_FMT_CUDA && ctx->registered_frames[i].ptr == frame->data[0])
+            return i;
+        else if (avctx->pix_fmt == AV_PIX_FMT_D3D11 && ctx->registered_frames[i].ptr == frame->data[0] && ctx->registered_frames[i].ptr_index == (intptr_t)frame->data[1])
             return i;
     }
 
@@ -1525,12 +1581,19 @@ static int nvenc_register_frame(AVCodecContext *avctx, const AVFrame *frame)
         return idx;
 
     reg.version = NV_ENC_REGISTER_RESOURCE_VER;
-    reg.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR;
     reg.width = frames_ctx->width;
     reg.height = frames_ctx->height;
     reg.pitch = frame->linesize[0];
     reg.resourceToRegister = frame->data[0];
 
+    if (avctx->pix_fmt == AV_PIX_FMT_CUDA) {
+        reg.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR;
+    }
+    else if (avctx->pix_fmt == AV_PIX_FMT_D3D11) {
+        reg.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX;
+        reg.subResourceIndex = (intptr_t)frame->data[1];
+    }
+
     reg.bufferFormat = nvenc_map_buffer_format(frames_ctx->sw_format);
     if (reg.bufferFormat == NV_ENC_BUFFER_FORMAT_UNDEFINED) {
         av_log(avctx, AV_LOG_FATAL, "Invalid input pixel format: %s\n",
@@ -1544,8 +1607,9 @@ static int nvenc_register_frame(AVCodecContext *avctx, const AVFrame *frame)
         return AVERROR_UNKNOWN;
     }
 
-    ctx->registered_frames[idx].ptr = (CUdeviceptr)frame->data[0];
-    ctx->registered_frames[idx].regptr = reg.registeredResource;
+    ctx->registered_frames[idx].ptr = frame->data[0];
+    ctx->registered_frames[idx].ptr_index = reg.subResourceIndex;
+    ctx->registered_frames[idx].regptr = reg.registeredResource;
     return idx;
 }
 
@@ -1559,10 +1623,10 @@ static int nvenc_upload_frame(AVCodecContext *avctx, const AVFrame *frame,
     int res;
     NVENCSTATUS nv_status;
 
-    if (avctx->pix_fmt == AV_PIX_FMT_CUDA) {
+    if (avctx->pix_fmt == AV_PIX_FMT_CUDA || avctx->pix_fmt == AV_PIX_FMT_D3D11) {
         int reg_idx = nvenc_register_frame(avctx, frame);
         if (reg_idx < 0) {
-            av_log(avctx, AV_LOG_ERROR, "Could not register an input CUDA frame\n");
+            av_log(avctx, AV_LOG_ERROR, "Could not register an input HW frame\n");
             return reg_idx;
         }
 
@@ -1731,7 +1795,7 @@ static int process_output_surface(AVCodecContext *avctx, AVPacket *pkt, NvencSur
         nvenc_print_error(avctx, nv_status, "Failed unlocking bitstream buffer, expect the gates of mordor to open");
 
 
-    if (avctx->pix_fmt == AV_PIX_FMT_CUDA) {
+    if (avctx->pix_fmt == AV_PIX_FMT_CUDA || avctx->pix_fmt == AV_PIX_FMT_D3D11) {
         p_nvenc->nvEncUnmapInputResource(ctx->nvencoder, tmpoutsurf->in_map.mappedResource);
         av_frame_unref(tmpoutsurf->in_ref);
         ctx->registered_frames[tmpoutsurf->reg_idx].mapped = 0;
@@ -1818,7 +1882,7 @@ int ff_nvenc_send_frame(AVCodecContext *avctx, const AVFrame *frame)
     NV_ENC_PIC_PARAMS pic_params = { 0 };
     pic_params.version = NV_ENC_PIC_PARAMS_VER;
 
-    if (!ctx->cu_context || !ctx->nvencoder)
+    if ((!ctx->cu_context && !ctx->d3d11_device) || !ctx->nvencoder)
         return AVERROR(EINVAL);
 
     if (ctx->encoder_flushing)
@@ -1915,7 +1979,7 @@ int ff_nvenc_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
 
     NvencContext *ctx = avctx->priv_data;
 
-    if (!ctx->cu_context || !ctx->nvencoder)
+    if ((!ctx->cu_context && !ctx->d3d11_device) || !ctx->nvencoder)
         return AVERROR(EINVAL);
 
     if (output_ready(avctx, ctx->encoder_flushing)) {
diff --git a/libavcodec/nvenc.h b/libavcodec/nvenc.h
index afb93cc22c..55ac5f220d 100644
--- a/libavcodec/nvenc.h
+++ b/libavcodec/nvenc.h
@@ -27,6 +27,13 @@
 #include "libavutil/fifo.h"
 #include "libavutil/opt.h"
 
+#if CONFIG_D3D11VA
+#define COBJMACROS
+#include "libavutil/hwcontext_d3d11va.h"
+#else
+typedef void ID3D11Device;
+#endif
+
 #include "avcodec.h"
 
 #define MAX_REGISTERED_FRAMES 64
@@ -107,6 +114,7 @@ typedef struct NvencContext
     NV_ENC_CONFIG encode_config;
     CUcontext cu_context;
     CUcontext cu_context_internal;
+    ID3D11Device *d3d11_device;
 
     int nb_surfaces;
     NvencSurface *surfaces;
@@ -119,7 +127,8 @@ typedef struct NvencContext
     int encoder_flushing;
 
     struct {
-        CUdeviceptr ptr;
+        void *ptr;
+        int ptr_index;
         NV_ENC_REGISTERED_PTR regptr;
         int mapped;
     } registered_frames[MAX_REGISTERED_FRAMES];
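
Note on the lookup change in nvenc_register_frame(): for AV_PIX_FMT_D3D11 frames, data[0] holds the ID3D11Texture2D pointer and data[1] holds the texture array index, so the pointer alone no longer identifies a surface. The standalone sketch below illustrates that matching rule outside of FFmpeg. It is not code from the patch: struct reg_entry, enum frame_kind and find_registered() are hypothetical names introduced only for illustration, and the table is assumed to hold n valid entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two hardware pixel formats in the patch. */
enum frame_kind { FRAME_KIND_CUDA, FRAME_KIND_D3D11 };

/* Hypothetical mirror of one registered_frames[] entry from nvenc.h. */
struct reg_entry {
    void *ptr;       /* CUDA device pointer or D3D11 texture pointer */
    int   ptr_index; /* texture array index; only meaningful for D3D11 */
};

/*
 * Return the index of the entry matching the frame, or -1 if none matches.
 * data0 and data1 play the role of frame->data[0] and frame->data[1].
 */
static int find_registered(const struct reg_entry *tab, size_t n,
                           enum frame_kind kind, void *data0, void *data1)
{
    assert(tab != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (tab[i].ptr != data0)
            continue;
        /* CUDA: the device pointer alone identifies the surface. */
        if (kind == FRAME_KIND_CUDA)
            return (int)i;
        /* D3D11: several frames can live in one texture array, so the
         * array index carried in data[1] has to match as well. */
        if (tab[i].ptr_index == (int)(intptr_t)data1)
            return (int)i;
    }
    return -1;
}
```

With two entries that share one texture pointer but use indices 0 and 3, a D3D11 lookup for index 3 returns the second entry, while a CUDA-style lookup on the same pointer stops at the first. That is the reason the patch stores ptr_index next to ptr.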