From patchwork Tue May 8 13:31:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 8870
Delivered-To: ffmpegpatchwork@gmail.com
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 8 May 2018 15:31:28 +0200
Message-Id: <20180508133132.28940-2-timo@rothenpieler.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180508133132.28940-1-timo@rothenpieler.org>
References: <20180508133132.28940-1-timo@rothenpieler.org>
Subject:
 [FFmpeg-devel] [PATCH 2/6] avcodec/nvdec: avoid needless copy of output frame
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Timo Rothenpieler
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

Replaces the data pointers with the mapped cuvid ones.
Adds buffer_refs to the frame to ensure the needed contexts stay alive
and the cuvid idx stays allocated.
Adds another buffer_ref to unmap the frame when it's unreferenced itself.
---
 libavcodec/nvdec.c | 83 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 60 insertions(+), 23 deletions(-)

diff --git a/libavcodec/nvdec.c b/libavcodec/nvdec.c
index ab3cb88b27..d98f9dd95e 100644
--- a/libavcodec/nvdec.c
+++ b/libavcodec/nvdec.c
@@ -308,7 +308,7 @@ int ff_nvdec_decode_init(AVCodecContext *avctx)
     params.CodecType = cuvid_codec_type;
     params.ChromaFormat = cuvid_chroma_format;
     params.ulNumDecodeSurfaces = frames_ctx->initial_pool_size;
-    params.ulNumOutputSurfaces = 1;
+    params.ulNumOutputSurfaces = frames_ctx->initial_pool_size;
 
     ret = nvdec_decoder_create(&ctx->decoder_ref, frames_ctx->device_ref, &params, avctx);
     if (ret < 0) {
@@ -354,6 +354,32 @@ static void nvdec_fdd_priv_free(void *priv)
     av_freep(&priv);
 }
 
+static void nvdec_unmap_mapped_frame(void *opaque, uint8_t *data)
+{
+    NVDECFrame *unmap_data = (NVDECFrame*)data;
+    NVDECDecoder *decoder = (NVDECDecoder*)unmap_data->decoder_ref->data;
+    CUdeviceptr devptr = (CUdeviceptr)opaque;
+    CUresult err;
+    CUcontext dummy;
+
+    err = decoder->cudl->cuCtxPushCurrent(decoder->cuda_ctx);
+    if (err != CUDA_SUCCESS) {
+        av_log(NULL, AV_LOG_ERROR, "cuCtxPushCurrent failed\n");
+        goto finish;
+    }
+
+    err = decoder->cvdl->cuvidUnmapVideoFrame(decoder->decoder, devptr);
+    if (err != CUDA_SUCCESS)
+        av_log(NULL, AV_LOG_ERROR, "cuvidUnmapVideoFrame failed\n");
+
+    decoder->cudl->cuCtxPopCurrent(&dummy);
+
+finish:
+    av_buffer_unref(&unmap_data->idx_ref);
+    av_buffer_unref(&unmap_data->decoder_ref);
+    av_free(unmap_data);
+}
+
 static int nvdec_retrieve_data(void *logctx, AVFrame *frame)
 {
     FrameDecodeData *fdd = (FrameDecodeData*)frame->private_ref->data;
@@ -361,6 +387,7 @@ static int nvdec_retrieve_data(void *logctx, AVFrame *frame)
     NVDECDecoder *decoder = (NVDECDecoder*)cf->decoder_ref->data;
 
     CUVIDPROCPARAMS vpp = { .progressive_frame = 1 };
+    NVDECFrame *unmap_data = NULL;
 
     CUresult err;
     CUcontext dummy;
@@ -383,32 +410,39 @@ static int nvdec_retrieve_data(void *logctx, AVFrame *frame)
         goto finish;
     }
 
-    for (i = 0; frame->data[i]; i++) {
-        CUDA_MEMCPY2D cpy = {
-            .srcMemoryType = CU_MEMORYTYPE_DEVICE,
-            .dstMemoryType = CU_MEMORYTYPE_DEVICE,
-            .srcDevice = devptr,
-            .dstDevice = (CUdeviceptr)frame->data[i],
-            .srcPitch = pitch,
-            .dstPitch = frame->linesize[i],
-            .srcY = offset,
-            .WidthInBytes = FFMIN(pitch, frame->linesize[i]),
-            .Height = frame->height >> (i ? 1 : 0),
-        };
-
-        err = decoder->cudl->cuMemcpy2D(&cpy);
-        if (err != CUDA_SUCCESS) {
-            av_log(logctx, AV_LOG_ERROR, "Error copying decoded frame: %d\n",
-                   err);
-            ret = AVERROR_UNKNOWN;
-            goto copy_fail;
-        }
+    unmap_data = av_mallocz(sizeof(*unmap_data));
+    if (!unmap_data) {
+        ret = AVERROR(ENOMEM);
+        goto copy_fail;
+    }
 
-        offset += cpy.Height;
+    frame->buf[1] = av_buffer_create((uint8_t *)unmap_data, sizeof(*unmap_data),
+                                     nvdec_unmap_mapped_frame, (void*)devptr,
+                                     AV_BUFFER_FLAG_READONLY);
+    if (!frame->buf[1]) {
+        ret = AVERROR(ENOMEM);
+        goto copy_fail;
     }
 
+    unmap_data->idx = cf->idx;
+    unmap_data->idx_ref = av_buffer_ref(cf->idx_ref);
+    unmap_data->decoder_ref = av_buffer_ref(cf->decoder_ref);
+
+    for (i = 0; frame->linesize[i]; i++) {
+        frame->data[i] = (uint8_t*)(devptr + offset);
+        frame->linesize[i] = pitch;
+        offset += pitch * (frame->height >> (i ? 1 : 0));
+    }
+
+    goto finish;
+
 copy_fail:
-    decoder->cvdl->cuvidUnmapVideoFrame(decoder->decoder, devptr);
+    if (!frame->buf[1]) {
+        decoder->cvdl->cuvidUnmapVideoFrame(decoder->decoder, devptr);
+        av_freep(&unmap_data);
+    } else {
+        av_buffer_unref(&frame->buf[1]);
+    }
 
 finish:
     decoder->cudl->cuCtxPopCurrent(&dummy);
@@ -526,6 +560,7 @@ int ff_nvdec_frame_params(AVCodecContext *avctx,
                           int dpb_size)
 {
     AVHWFramesContext *frames_ctx = (AVHWFramesContext*)hw_frames_ctx->data;
+    AVCUDAFramesContext *hwctx = (AVCUDAFramesContext*)frames_ctx->hwctx;
     const AVPixFmtDescriptor *sw_desc;
     int cuvid_codec_type, cuvid_chroma_format;
 
@@ -550,6 +585,8 @@ int ff_nvdec_frame_params(AVCodecContext *avctx,
     frames_ctx->height = (avctx->coded_height + 1) & ~1;
     frames_ctx->initial_pool_size = dpb_size;
 
+    hwctx->flags = AV_CUDA_HWFRAMES_DUMMY_MODE;
+
     switch (sw_desc->comp[0].depth) {
     case 8:
         frames_ctx->sw_format = AV_PIX_FMT_NV12;