From patchwork Mon Nov 18 09:26:25 2019
X-Patchwork-Submitter: Oleg Dobkin
X-Patchwork-Id: 16315
From: Oleg Dobkin
To: ffmpeg-devel@ffmpeg.org
Cc: Oleg Dobkin
Date: Mon, 18 Nov 2019 11:26:25 +0200
Message-Id: <20191118092625.24877-1-olegd@anyvision.co>
In-Reply-To: <66af27d8-0f8d-02b9-c8df-93fe55088439@rothenpieler.org>
References: <66af27d8-0f8d-02b9-c8df-93fe55088439@rothenpieler.org>
Subject: [FFmpeg-devel] [PATCH] Allow using primary CUDA device context

Add AVCUDADeviceContextFlags to control the creation of CUDA device
context for the hardware CUDA decoder. The current values are 0 (default
behavior) - new context will be created for each decoder, and 1 - primary
CUDA context will be used.

There are several reasons for using primary device context instead of
creating a new one:
- This is the recommended way to handle device contexts (see
  https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf)
- Memory allocations, kernels and other state are associated with the
  current device context. Currently, the context is not accessible from
  FFmpeg API, so, technically, the memory created by the hardware decoder
  (the video frame) can't be safely read.

Signed-off-by: Oleg Dobkin
---
 libavutil/hwcontext_cuda.c          | 22 +++++++++++++++++-----
 libavutil/hwcontext_cuda_internal.h |  2 ++
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/libavutil/hwcontext_cuda.c b/libavutil/hwcontext_cuda.c
index cca39e9fc7..e72efbe5af 100644
--- a/libavutil/hwcontext_cuda.c
+++ b/libavutil/hwcontext_cuda.c
@@ -29,6 +29,8 @@
 
 #define CUDA_FRAME_ALIGNMENT 256
 
+#define USE_PRIMARY_CONTEXT 1
+
 typedef struct CUDAFramesContext {
     int shift_width, shift_height;
 } CUDAFramesContext;
@@ -281,8 +283,12 @@ static void cuda_device_uninit(AVHWDeviceContext *device_ctx)
     if (hwctx->internal) {
         CudaFunctions *cu = hwctx->internal->cuda_dl;
         if (hwctx->internal->is_allocated && hwctx->cuda_ctx) {
-            CHECK_CU(cu->cuCtxDestroy(hwctx->cuda_ctx));
+            if (hwctx->internal->flags & USE_PRIMARY_CONTEXT)
+                CHECK_CU(cu->cuDevicePrimaryCtxRelease(hwctx->internal->cuda_device));
+            else
+                CHECK_CU(cu->cuCtxDestroy(hwctx->cuda_ctx));
             hwctx->cuda_ctx = NULL;
+            hwctx->internal->cuda_device = NULL;
         }
         cuda_free_functions(&hwctx->internal->cuda_dl);
     }
@@ -322,7 +328,6 @@ static int cuda_device_create(AVHWDeviceContext *device_ctx,
 {
     AVCUDADeviceContext *hwctx = device_ctx->hwctx;
     CudaFunctions *cu;
-    CUdevice cu_device;
     CUcontext dummy;
     int ret, device_idx = 0;
 
@@ -338,18 +343,25 @@ static int cuda_device_create(AVHWDeviceContext *device_ctx,
     if (ret < 0)
         goto error;
 
-    ret = CHECK_CU(cu->cuDeviceGet(&cu_device, device_idx));
+    ret = CHECK_CU(cu->cuDeviceGet(&hwctx->internal->cuda_device, device_idx));
     if (ret < 0)
         goto error;
 
-    ret = CHECK_CU(cu->cuCtxCreate(&hwctx->cuda_ctx, CU_CTX_SCHED_BLOCKING_SYNC, cu_device));
+    hwctx->internal->flags = flags;
+
+    if (flags & USE_PRIMARY_CONTEXT)
+        ret = CHECK_CU(cu->cuDevicePrimaryCtxRetain(&hwctx->cuda_ctx, hwctx->internal->cuda_device));
+    else
+        ret = CHECK_CU(cu->cuCtxCreate(&hwctx->cuda_ctx, CU_CTX_SCHED_BLOCKING_SYNC, hwctx->internal->cuda_device));
+
     if (ret < 0)
         goto error;
 
     // Setting stream to NULL will make functions automatically use the default CUstream
     hwctx->stream = NULL;
 
-    CHECK_CU(cu->cuCtxPopCurrent(&dummy));
+    if (!(flags & USE_PRIMARY_CONTEXT))
+        CHECK_CU(cu->cuCtxPopCurrent(&dummy));
 
     hwctx->internal->is_allocated = 1;
 
diff --git a/libavutil/hwcontext_cuda_internal.h b/libavutil/hwcontext_cuda_internal.h
index e1bc6ff350..d5633c58d5 100644
--- a/libavutil/hwcontext_cuda_internal.h
+++ b/libavutil/hwcontext_cuda_internal.h
@@ -31,6 +31,8 @@
 struct AVCUDADeviceContextInternal {
     CudaFunctions *cuda_dl;
     int is_allocated;
+    CUdevice cuda_device;
+    int flags;
 };
 
 #endif /* AVUTIL_HWCONTEXT_CUDA_INTERNAL_H */
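
The retain/release versus create/destroy branching in this patch can be illustrated with a small standalone model. This is only a sketch under stated assumptions: every `Sketch*`/`sketch_*`/`SKETCH_*` name below is hypothetical and exists purely for illustration, nothing here calls CUDA or FFmpeg, and the reference count merely imitates the sharing behavior that a primary context is expected to have (one context per device, handed to every holder) as opposed to a standalone context per decoder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not CUDA or FFmpeg symbols. */
#define SKETCH_USE_PRIMARY_CONTEXT 1
#define SKETCH_POOL_SIZE 4

typedef struct SketchContext {
    int refcount;   /* number of holders currently using this context */
    int is_primary; /* 1 for the per-device shared context */
} SketchContext;

typedef struct SketchDevice {
    SketchContext primary;                /* single shared context per device */
    SketchContext pool[SKETCH_POOL_SIZE]; /* storage for standalone contexts */
    int pool_used;                        /* slots handed out so far (never reused) */
} SketchDevice;

typedef struct SketchHWCtx {
    SketchDevice *device;
    SketchContext *ctx;
    int flags;
} SketchHWCtx;

/* Mirrors the shape of the patched create path: the flag decides whether the
 * shared primary context is retained or a fresh standalone one is created. */
static int sketch_device_create(SketchHWCtx *hw, SketchDevice *dev, int flags)
{
    hw->device = dev;
    hw->flags  = flags;

    if (flags & SKETCH_USE_PRIMARY_CONTEXT) {
        /* "Retain": every caller gets the same context, refcount grows. */
        dev->primary.is_primary = 1;
        dev->primary.refcount++;
        hw->ctx = &dev->primary;
    } else {
        /* "Create": every caller gets its own context. */
        if (dev->pool_used >= SKETCH_POOL_SIZE)
            return -1;
        hw->ctx = &dev->pool[dev->pool_used++];
        hw->ctx->refcount   = 1;
        hw->ctx->is_primary = 0;
    }
    return 0;
}

/* Mirrors the shape of the patched uninit path: dropping a holder decrements
 * the count in both modes; a standalone context reaches zero immediately,
 * the primary one only after its last holder is gone. */
static void sketch_device_uninit(SketchHWCtx *hw)
{
    if (!hw->ctx)
        return;
    assert(hw->ctx->refcount > 0);
    hw->ctx->refcount--;
    hw->ctx = NULL;
}
```

In this model, two holders created with `SKETCH_USE_PRIMARY_CONTEXT` end up pointing at the same context, which is the property the commit message relies on when it argues that memory produced by the decoder becomes reachable from the application's own context.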