From patchwork Sun Oct 7 17:50:56 2018
X-Patchwork-Submitter: Philip Langdale
X-Patchwork-Id: 10556
From: Philip Langdale
To: ffmpeg-devel@ffmpeg.org, Timo Rothenpieler
Cc: Philip Langdale
Date: Sun, 7 Oct 2018 10:50:56 -0700
Message-Id: <20181007175057.31070-5-philipl@overt.org>
In-Reply-To: <20181007175057.31070-1-philipl@overt.org>
References: <20181007175057.31070-1-philipl@overt.org>
Subject: [FFmpeg-devel] [PATCH 4/5] avcodec/cuviddec: Add support for decoding HEVC 4:4:4 content

This is the equivalent change for cuviddec after the previous
change for nvdec. I made similar changes to the copying routines to
handle pixel formats in a more generic way.

Note that unlike with nvdec, there is no confusion about the ability
of a codec to output 444 formats. This is because the cuvid parser is
used, meaning that 444 JPEG content is still indicated as using a 420
output format.

Signed-off-by: Philip Langdale
---
 libavcodec/cuviddec.c | 59 +++++++++++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 19 deletions(-)

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index 4d3caf924e..595249475d 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -35,6 +35,9 @@
 #include "hwaccel.h"
 #include "internal.h"
 
+#define CUVID_FORMAT_YUV444P 2
+#define CUVID_FORMAT_YUV444P16 3
+
 typedef struct CuvidContext
 {
     AVClass *avclass;
@@ -127,6 +130,7 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* form
     CUVIDDECODECAPS *caps = NULL;
     CUVIDDECODECREATEINFO cuinfo;
     int surface_fmt;
+    int chroma_444;
 
     int old_width = avctx->width;
     int old_height = avctx->height;
@@ -169,17 +173,19 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* form
     cuinfo.target_rect.right = cuinfo.ulTargetWidth;
     cuinfo.target_rect.bottom = cuinfo.ulTargetHeight;
 
+    chroma_444 = format->chroma_format == cudaVideoChromaFormat_444;
+
     switch (format->bit_depth_luma_minus8) {
     case 0: // 8-bit
-        pix_fmts[1] = AV_PIX_FMT_NV12;
+        pix_fmts[1] = chroma_444 ? AV_PIX_FMT_YUV444P : AV_PIX_FMT_NV12;
         caps = &ctx->caps8;
         break;
     case 2: // 10-bit
-        pix_fmts[1] = AV_PIX_FMT_P010;
+        pix_fmts[1] = chroma_444 ? AV_PIX_FMT_YUV444P10_LSB : AV_PIX_FMT_P010;
         caps = &ctx->caps10;
         break;
     case 4: // 12-bit
-        pix_fmts[1] = AV_PIX_FMT_P016;
+        pix_fmts[1] = chroma_444 ? AV_PIX_FMT_YUV444P12_LSB : AV_PIX_FMT_P016;
         caps = &ctx->caps12;
         break;
     default:
@@ -282,12 +288,6 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* form
         return 0;
     }
 
-    if (format->chroma_format != cudaVideoChromaFormat_420) {
-        av_log(avctx, AV_LOG_ERROR, "Chroma formats other than 420 are not supported\n");
-        ctx->internal_error = AVERROR(EINVAL);
-        return 0;
-    }
-
     ctx->chroma_format = format->chroma_format;
 
     cuinfo.CodecType = ctx->codec_type = format->codec;
@@ -301,6 +301,14 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* form
     case AV_PIX_FMT_P016:
         cuinfo.OutputFormat = cudaVideoSurfaceFormat_P016;
         break;
+    case AV_PIX_FMT_YUV444P:
+        cuinfo.OutputFormat = CUVID_FORMAT_YUV444P;
+        break;
+    case AV_PIX_FMT_YUV444P10_LSB:
+    case AV_PIX_FMT_YUV444P12_LSB:
+    case AV_PIX_FMT_YUV444P16:
+        cuinfo.OutputFormat = CUVID_FORMAT_YUV444P16;
+        break;
     default:
         av_log(avctx, AV_LOG_ERROR, "Output formats other than NV12, P010 or P016 are not supported\n");
         ctx->internal_error = AVERROR(EINVAL);
@@ -507,6 +515,7 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
         return ret;
 
     if (av_fifo_size(ctx->frame_queue)) {
+        const AVPixFmtDescriptor *pixdesc;
         CuvidParsedFrame parsed_frame;
         CUVIDPROCPARAMS params;
         unsigned int pitch = 0;
@@ -537,7 +546,10 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
                 goto error;
             }
 
-            for (i = 0; i < 2; i++) {
+            pixdesc = av_pix_fmt_desc_get(avctx->sw_pix_fmt);
+
+            for (i = 0; i < pixdesc->nb_components; i++) {
+                size_t height = avctx->height >> (i ? pixdesc->log2_chroma_h : 0);
                 CUDA_MEMCPY2D cpy = {
                     .srcMemoryType = CU_MEMORYTYPE_DEVICE,
                     .dstMemoryType = CU_MEMORYTYPE_DEVICE,
@@ -547,22 +559,27 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
                     .dstPitch = frame->linesize[i],
                     .srcY = offset,
                     .WidthInBytes = FFMIN(pitch, frame->linesize[i]),
-                    .Height = avctx->height >> (i ? 1 : 0),
+                    .Height = height,
                 };
 
                 ret = CHECK_CU(ctx->cudl->cuMemcpy2DAsync(&cpy, device_hwctx->stream));
                 if (ret < 0)
                     goto error;
 
-                offset += avctx->height;
+                offset += height;
             }
 
             ret = CHECK_CU(ctx->cudl->cuStreamSynchronize(device_hwctx->stream));
             if (ret < 0)
                 goto error;
-        } else if (avctx->pix_fmt == AV_PIX_FMT_NV12 ||
-                   avctx->pix_fmt == AV_PIX_FMT_P010 ||
-                   avctx->pix_fmt == AV_PIX_FMT_P016) {
+        } else if (avctx->pix_fmt == AV_PIX_FMT_NV12          ||
+                   avctx->pix_fmt == AV_PIX_FMT_P010          ||
+                   avctx->pix_fmt == AV_PIX_FMT_P016          ||
+                   avctx->pix_fmt == AV_PIX_FMT_YUV444P       ||
+                   avctx->pix_fmt == AV_PIX_FMT_YUV444P10_LSB ||
+                   avctx->pix_fmt == AV_PIX_FMT_YUV444P12_LSB ||
+                   avctx->pix_fmt == AV_PIX_FMT_YUV444P16) {
+            size_t offset = 0;
             AVFrame *tmp_frame = av_frame_alloc();
             if (!tmp_frame) {
                 av_log(avctx, AV_LOG_ERROR, "av_frame_alloc failed\n");
@@ -570,15 +587,19 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
                 goto error;
             }
 
+            pixdesc = av_pix_fmt_desc_get(avctx->sw_pix_fmt);
+
             tmp_frame->format = AV_PIX_FMT_CUDA;
             tmp_frame->hw_frames_ctx = av_buffer_ref(ctx->hwframe);
-            tmp_frame->data[0] = (uint8_t*)mapped_frame;
-            tmp_frame->linesize[0] = pitch;
-            tmp_frame->data[1] = (uint8_t*)(mapped_frame + avctx->height * pitch);
-            tmp_frame->linesize[1] = pitch;
             tmp_frame->width = avctx->width;
             tmp_frame->height = avctx->height;
 
+            for (i = 0; i < pixdesc->nb_components; i++) {
+                tmp_frame->data[i] = (uint8_t*)mapped_frame + offset;
+                tmp_frame->linesize[i] = pitch;
+                offset += avctx->height >> (i ? pixdesc->log2_chroma_h : 0);
+            }
+
             ret = ff_get_buffer(avctx, frame, 0);
             if (ret < 0) {
                 av_log(avctx, AV_LOG_ERROR, "ff_get_buffer failed\n");
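
For reference, the generic plane walk used by the copying routines can be sketched in isolation. This is a minimal standalone illustration, not the code in cuviddec.c: PlaneLayoutDesc, PlaneInfo and plane_layout() are hypothetical names, and the descriptor only mimics the vertical chroma shift that the patch reads from AVPixFmtDescriptor. It assumes the planes are stacked vertically in one mapped surface that shares a single pitch, and that the caller supplies the plane count. It reports each plane's start both as a row index (the unit CUDA_MEMCPY2D.srcY is given in the patch) and as a byte offset (rows times pitch, the unit needed when adding to a base pointer).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the descriptor fields this sketch needs. */
typedef struct PlaneLayoutDesc {
    int nb_planes;      /* planes stacked in the mapped surface (assumed, 1..4) */
    int log2_chroma_h;  /* vertical chroma shift: 1 for 4:2:0, 0 for 4:4:4 */
} PlaneLayoutDesc;

/* Hypothetical per-plane result. */
typedef struct PlaneInfo {
    size_t first_row;   /* row where the plane starts (row units) */
    size_t byte_offset; /* first_row * pitch (byte units) */
    size_t rows;        /* number of rows in this plane */
} PlaneInfo;

/*
 * Fill out[0..nb_planes-1] and return the total number of rows covered.
 * Plane 0 is luma and uses the full height; every later plane is treated
 * as chroma and uses the height shifted by log2_chroma_h.
 */
static size_t plane_layout(const PlaneLayoutDesc *desc, size_t luma_height,
                           size_t pitch, PlaneInfo out[4])
{
    size_t row = 0;

    assert(desc->nb_planes >= 1 && desc->nb_planes <= 4);

    for (int i = 0; i < desc->nb_planes; i++) {
        size_t rows = luma_height >> (i ? desc->log2_chroma_h : 0);

        out[i].first_row   = row;
        out[i].byte_offset = row * pitch;
        out[i].rows        = rows;

        row += rows;
    }

    return row;
}
```

Under these assumptions, a two-plane 4:2:0 layout with a luma height of 1080 places the chroma plane at row 1080 with 540 rows, while a three-plane 4:4:4 layout places its planes at rows 0, 1080 and 2160 with 1080 rows each.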