From patchwork Sun Nov 17 14:58:04 2019
X-Patchwork-Submitter: Oleg Dobkin
X-Patchwork-Id: 16305
From: Oleg Dobkin
To: ffmpeg-devel@ffmpeg.org
Cc: Oleg Dobkin
Date: Sun, 17 Nov 2019 16:58:04 +0200
Message-Id: <20191117145804.32642-1-olegd@anyvision.co>
X-Mailer: git-send-email 2.17.1
Subject: [FFmpeg-devel] [PATCH] Allow using primary CUDA device context

Add AVCUDADeviceContextFlags to control how the CUDA device context is
created for the CUDA hardware decoder. Two values are currently defined:
0 (DCF_CREATE_CONTEXT, the default behavior), where a new context is
created for each decoder, and 1 (DCF_USE_PRIMARY_CONTEXT), where the
primary CUDA context is used.

There are several reasons for using the primary device context instead of
creating a new one:
- This is the recommended way to handle device contexts (see
  https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf)
- Memory allocations, kernels and other state are associated with the
  current device context.
  Currently, the context is not accessible through the FFmpeg API, so,
  technically, the memory created by the hardware decoder (the video
  frame) can't be safely read.

Signed-off-by: Oleg Dobkin
---
 libavutil/hwcontext_cuda.c | 20 +++++++++++++++-----
 libavutil/hwcontext_cuda.h |  7 +++++++
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/libavutil/hwcontext_cuda.c b/libavutil/hwcontext_cuda.c
index cca39e9fc7..608ea57569 100644
--- a/libavutil/hwcontext_cuda.c
+++ b/libavutil/hwcontext_cuda.c
@@ -281,8 +281,12 @@ static void cuda_device_uninit(AVHWDeviceContext *device_ctx)
     if (hwctx->internal) {
         CudaFunctions *cu = hwctx->internal->cuda_dl;
         if (hwctx->internal->is_allocated && hwctx->cuda_ctx) {
-            CHECK_CU(cu->cuCtxDestroy(hwctx->cuda_ctx));
+            if (hwctx->flags == DCF_CREATE_CONTEXT)
+                CHECK_CU(cu->cuCtxDestroy(hwctx->cuda_ctx));
+            else
+                CHECK_CU(cu->cuDevicePrimaryCtxRelease(hwctx->cuda_device));
             hwctx->cuda_ctx = NULL;
+            hwctx->cuda_device = NULL;
         }
         cuda_free_functions(&hwctx->internal->cuda_dl);
     }
@@ -322,7 +326,6 @@ static int cuda_device_create(AVHWDeviceContext *device_ctx,
 {
     AVCUDADeviceContext *hwctx = device_ctx->hwctx;
     CudaFunctions *cu;
-    CUdevice cu_device;
     CUcontext dummy;
     int ret, device_idx = 0;
 
@@ -338,18 +341,25 @@ static int cuda_device_create(AVHWDeviceContext *device_ctx,
     if (ret < 0)
         goto error;
 
-    ret = CHECK_CU(cu->cuDeviceGet(&cu_device, device_idx));
+    ret = CHECK_CU(cu->cuDeviceGet(&hwctx->cuda_device, device_idx));
     if (ret < 0)
         goto error;
 
-    ret = CHECK_CU(cu->cuCtxCreate(&hwctx->cuda_ctx, CU_CTX_SCHED_BLOCKING_SYNC, cu_device));
+    hwctx->flags = flags;
+
+    if (flags == DCF_CREATE_CONTEXT)
+        ret = CHECK_CU(cu->cuCtxCreate(&hwctx->cuda_ctx, CU_CTX_SCHED_BLOCKING_SYNC, hwctx->cuda_device));
+    else
+        ret = CHECK_CU(cu->cuDevicePrimaryCtxRetain(&hwctx->cuda_ctx, hwctx->cuda_device));
+
     if (ret < 0)
         goto error;
 
     // Setting stream to NULL will make functions automatically use the default CUstream
     hwctx->stream = NULL;
 
-    CHECK_CU(cu->cuCtxPopCurrent(&dummy));
+    if (flags == DCF_CREATE_CONTEXT)
+        CHECK_CU(cu->cuCtxPopCurrent(&dummy));
 
     hwctx->internal->is_allocated = 1;
 
diff --git a/libavutil/hwcontext_cuda.h b/libavutil/hwcontext_cuda.h
index 81a0552cab..bab5eefe54 100644
--- a/libavutil/hwcontext_cuda.h
+++ b/libavutil/hwcontext_cuda.h
@@ -34,6 +34,11 @@
  * AVBufferRefs whose data pointer is a CUdeviceptr.
  */
 
+enum AVCUDADeviceContextFlags {
+    DCF_CREATE_CONTEXT = 0,
+    DCF_USE_PRIMARY_CONTEXT = 1
+};
+
 typedef struct AVCUDADeviceContextInternal AVCUDADeviceContextInternal;
 
 /**
@@ -43,6 +48,8 @@ typedef struct AVCUDADeviceContext {
     CUcontext cuda_ctx;
     CUstream stream;
     AVCUDADeviceContextInternal *internal;
+    CUdevice cuda_device;
+    enum AVCUDADeviceContextFlags flags;
 } AVCUDADeviceContext;
 
 /**
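
Note on the pairing rule in this patch: the flag chosen at creation time is stored so that uninit can pair cuCtxCreate with cuCtxDestroy and cuDevicePrimaryCtxRetain with cuDevicePrimaryCtxRelease. The sketch below only models that bookkeeping with plain counters so it can be compiled without a GPU. MockDriver, MockDeviceContext, mock_device_create and mock_device_uninit are hypothetical names invented for illustration; they are not part of FFmpeg or the CUDA driver API, and the real calls are named only in comments.

```c
#include <assert.h>

/* Mirrors the enum added by the patch. */
enum AVCUDADeviceContextFlags {
    DCF_CREATE_CONTEXT      = 0,
    DCF_USE_PRIMARY_CONTEXT = 1
};

/* Hypothetical stand-in for driver-side state (not a CUDA type). */
typedef struct MockDriver {
    int primary_refs;   /* outstanding retains on the primary context */
    int owned_contexts; /* contexts created and not yet destroyed */
} MockDriver;

/* Hypothetical stand-in for the relevant AVCUDADeviceContext fields. */
typedef struct MockDeviceContext {
    int has_ctx;                         /* models hwctx->cuda_ctx != NULL */
    enum AVCUDADeviceContextFlags flags; /* remembered for uninit */
} MockDeviceContext;

/* Models cuda_device_create(): acquire a context according to flags. */
static void mock_device_create(MockDriver *drv, MockDeviceContext *dc,
                               enum AVCUDADeviceContextFlags flags)
{
    dc->flags = flags;
    if (flags == DCF_CREATE_CONTEXT)
        drv->owned_contexts++; /* stands in for cuCtxCreate */
    else
        drv->primary_refs++;   /* stands in for cuDevicePrimaryCtxRetain */
    dc->has_ctx = 1;
}

/* Models cuda_device_uninit(): release with the call matching the flag. */
static void mock_device_uninit(MockDriver *drv, MockDeviceContext *dc)
{
    if (!dc->has_ctx)
        return; /* nothing to release; a second uninit is a no-op */
    if (dc->flags == DCF_CREATE_CONTEXT) {
        assert(drv->owned_contexts > 0);
        drv->owned_contexts--; /* stands in for cuCtxDestroy */
    } else {
        assert(drv->primary_refs > 0);
        drv->primary_refs--;   /* stands in for cuDevicePrimaryCtxRelease */
    }
    dc->has_ctx = 0;
}
```

In this model, two device contexts opened with DCF_USE_PRIMARY_CONTEXT take two references on the same primary context and each uninit drops exactly one, while DCF_CREATE_CONTEXT keeps the previous one-context-per-decoder behavior.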