From patchwork Tue Nov 14 15:24:14 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Langdale
X-Patchwork-Id: 6046
From: Philip Langdale
To: ffmpeg-devel@ffmpeg.org
Cc: Philip Langdale
Date: Tue, 14 Nov 2017 07:24:14 -0800
Message-Id: <20171114152414.18478-3-philipl@overt.org>
In-Reply-To: <20171114152414.18478-1-philipl@overt.org>
References: <20171114152414.18478-1-philipl@overt.org>
Subject: [FFmpeg-devel] [PATCH 2/2] avcodec: Implement vc1 nvdec hwaccel

This hwaccel is interesting because it also works for wmv3/9 content,
which is not supported by the nvidia parser
used by cuviddec.

Signed-off-by: Philip Langdale
---
 configure              |   3 +
 libavcodec/Makefile    |   1 +
 libavcodec/allcodecs.c |   2 +
 libavcodec/nvdec.c     |   2 +
 libavcodec/nvdec_vc1.c | 184 +++++++++++++++++++++++++++++++++++++++++++++++++
 libavcodec/vc1dec.c    |   3 +
 6 files changed, 195 insertions(+)
 create mode 100644 libavcodec/nvdec_vc1.c

diff --git a/configure b/configure
index 3788f26956..934ac3abfd 100755
--- a/configure
+++ b/configure
@@ -2740,6 +2740,8 @@ vc1_d3d11va2_hwaccel_select="vc1_decoder"
 vc1_dxva2_hwaccel_deps="dxva2"
 vc1_dxva2_hwaccel_select="vc1_decoder"
 vc1_mmal_hwaccel_deps="mmal"
+vc1_nvdec_hwaccel_deps="nvdec"
+vc1_nvdec_hwaccel_select="vc1_decoder"
 vc1_qsv_hwaccel_deps="libmfx"
 vc1_vaapi_hwaccel_deps="vaapi"
 vc1_vaapi_hwaccel_select="vc1_decoder"
@@ -2763,6 +2765,7 @@ vp9_vaapi_hwaccel_select="vp9_decoder"
 wmv3_d3d11va_hwaccel_select="vc1_d3d11va_hwaccel"
 wmv3_d3d11va2_hwaccel_select="vc1_d3d11va2_hwaccel"
 wmv3_dxva2_hwaccel_select="vc1_dxva2_hwaccel"
+wmv3_nvdec_hwaccel_select="vc1_nvdec_hwaccel"
 wmv3_vaapi_hwaccel_select="vc1_vaapi_hwaccel"
 wmv3_vdpau_hwaccel_select="vc1_vdpau_hwaccel"
 
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 2476aecc40..6315672573 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -864,6 +864,7 @@ OBJS-$(CONFIG_MPEG4_VDPAU_HWACCEL) += vdpau_mpeg4.o
 OBJS-$(CONFIG_MPEG4_VIDEOTOOLBOX_HWACCEL) += videotoolbox.o
 OBJS-$(CONFIG_VC1_D3D11VA_HWACCEL)        += dxva2_vc1.o
 OBJS-$(CONFIG_VC1_DXVA2_HWACCEL)          += dxva2_vc1.o
+OBJS-$(CONFIG_VC1_NVDEC_HWACCEL)          += nvdec_vc1.o
 OBJS-$(CONFIG_VC1_QSV_HWACCEL)            += qsvdec_other.o
 OBJS-$(CONFIG_VC1_VAAPI_HWACCEL)          += vaapi_vc1.o
 OBJS-$(CONFIG_VC1_VDPAU_HWACCEL)          += vdpau_vc1.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 0781862de5..e213f3757c 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -111,6 +111,7 @@ static void register_all(void)
     REGISTER_HWACCEL(VC1_D3D11VA,       vc1_d3d11va);
     REGISTER_HWACCEL(VC1_D3D11VA2,      vc1_d3d11va2);
     REGISTER_HWACCEL(VC1_DXVA2,         vc1_dxva2);
+    REGISTER_HWACCEL(VC1_NVDEC,         vc1_nvdec);
     REGISTER_HWACCEL(VC1_VAAPI,         vc1_vaapi);
     REGISTER_HWACCEL(VC1_VDPAU,         vc1_vdpau);
     REGISTER_HWACCEL(VC1_MMAL,          vc1_mmal);
@@ -128,6 +129,7 @@ static void register_all(void)
     REGISTER_HWACCEL(WMV3_D3D11VA,      wmv3_d3d11va);
     REGISTER_HWACCEL(WMV3_D3D11VA2,     wmv3_d3d11va2);
     REGISTER_HWACCEL(WMV3_DXVA2,        wmv3_dxva2);
+    REGISTER_HWACCEL(WMV3_NVDEC,        wmv3_nvdec);
     REGISTER_HWACCEL(WMV3_VAAPI,        wmv3_vaapi);
     REGISTER_HWACCEL(WMV3_VDPAU,        wmv3_vdpau);
 
diff --git a/libavcodec/nvdec.c b/libavcodec/nvdec.c
index ac68faca99..20d7c3db27 100644
--- a/libavcodec/nvdec.c
+++ b/libavcodec/nvdec.c
@@ -54,7 +54,9 @@ static int map_avcodec_id(enum AVCodecID id)
     switch (id) {
     case AV_CODEC_ID_H264: return cudaVideoCodec_H264;
     case AV_CODEC_ID_HEVC: return cudaVideoCodec_HEVC;
+    case AV_CODEC_ID_VC1:  return cudaVideoCodec_VC1;
     case AV_CODEC_ID_VP9:  return cudaVideoCodec_VP9;
+    case AV_CODEC_ID_WMV3: return cudaVideoCodec_VC1;
     }
     return -1;
 }
diff --git a/libavcodec/nvdec_vc1.c b/libavcodec/nvdec_vc1.c
new file mode 100644
index 0000000000..cf75ba5aca
--- /dev/null
+++ b/libavcodec/nvdec_vc1.c
@@ -0,0 +1,184 @@
+/*
+ * VC1 HW decode acceleration through NVDEC
+ *
+ * Copyright (c) 2017 Philip Langdale
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "avcodec.h"
+#include "nvdec.h"
+#include "decode.h"
+#include "vc1.h"
+
+static unsigned char get_ref_idx(AVFrame *frame)
+{
+    FrameDecodeData *fdd;
+    NVDECFrame *cf;
+
+    if (!frame || !frame->private_ref)
+        return 255;
+
+    fdd = (FrameDecodeData*)frame->private_ref->data;
+    cf  = (NVDECFrame*)fdd->hwaccel_priv;
+
+    return cf->idx;
+}
+
+static int nvdec_vc1_start_frame(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size)
+{
+    VC1Context *v = avctx->priv_data;
+    MpegEncContext *s = &v->s;
+
+    NVDECContext *ctx = avctx->internal->hwaccel_priv_data;
+    CUVIDPICPARAMS *pp = &ctx->pic_params;
+    FrameDecodeData *fdd;
+    NVDECFrame *cf;
+    AVFrame *cur_frame = s->current_picture.f;
+
+    int ret;
+
+    ret = ff_nvdec_start_frame(avctx, cur_frame);
+    if (ret < 0)
+        return ret;
+
+    fdd = (FrameDecodeData*)cur_frame->private_ref->data;
+    cf  = (NVDECFrame*)fdd->hwaccel_priv;
+
+    *pp = (CUVIDPICPARAMS) {
+        .PicWidthInMbs     = (cur_frame->width + 15) / 16,
+        .FrameHeightInMbs  = (cur_frame->height + 15) / 16,
+        .CurrPicIdx        = cf->idx,
+        .field_pic_flag    = v->field_mode,
+        .bottom_field_flag = v->cur_field_type,
+        .second_field      = v->second_field,
+
+        .intra_pic_flag    = s->pict_type == AV_PICTURE_TYPE_I ||
+                             s->pict_type == AV_PICTURE_TYPE_BI,
+        .ref_pic_flag      = s->pict_type == AV_PICTURE_TYPE_I ||
+                             s->pict_type == AV_PICTURE_TYPE_P,
+
+        .CodecSpecific.vc1 = {
+            .ForwardRefIdx     = get_ref_idx(s->last_picture.f),
+            .BackwardRefIdx    = get_ref_idx(s->next_picture.f),
+            .FrameWidth        = cur_frame->width,
+            .FrameHeight       = cur_frame->height,
+
+            .intra_pic_flag    = s->pict_type == AV_PICTURE_TYPE_I ||
+                                 s->pict_type == AV_PICTURE_TYPE_BI,
+            .ref_pic_flag      = s->pict_type == AV_PICTURE_TYPE_I ||
+                                 s->pict_type == AV_PICTURE_TYPE_P,
+            .progressive_fcm   = v->fcm == 0,
+
+            .profile           = v->profile,
+            .postprocflag      = v->postprocflag,
+            .pulldown          = v->broadcast,
+            .interlace         = v->interlace,
+            .tfcntrflag        = v->tfcntrflag,
+            .finterpflag       = v->finterpflag,
+            .psf               = v->psf,
+            .multires          = v->multires,
+            .syncmarker        = v->resync_marker,
+            .rangered          = v->rangered,
+            .maxbframes        = s->max_b_frames,
+
+            .panscan_flag      = v->panscanflag,
+            .refdist_flag      = v->refdist_flag,
+            .extended_mv       = v->extended_mv,
+            .dquant            = v->dquant,
+            .vstransform       = v->vstransform,
+            .loopfilter        = v->s.loop_filter,
+            .fastuvmc          = v->fastuvmc,
+            .overlap           = v->overlap,
+            .quantizer         = v->quantizer_mode,
+            .extended_dmv      = v->extended_dmv,
+            .range_mapy_flag   = v->range_mapy_flag,
+            .range_mapy        = v->range_mapy,
+            .range_mapuv_flag  = v->range_mapuv_flag,
+            .range_mapuv       = v->range_mapuv,
+            .rangeredfrm       = v->rangeredfrm,
+        }
+    };
+
+    return 0;
+}
+
+static int nvdec_vc1_end_frame(AVCodecContext *avctx)
+{
+    NVDECContext *ctx = avctx->internal->hwaccel_priv_data;
+    int ret = ff_nvdec_end_frame(avctx);
+    ctx->bitstream = NULL;
+    return ret;
+}
+
+static int nvdec_vc1_decode_slice(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size)
+{
+    NVDECContext *ctx = avctx->internal->hwaccel_priv_data;
+    void *tmp;
+
+    tmp = av_fast_realloc(ctx->slice_offsets, &ctx->slice_offsets_allocated,
+                          (ctx->nb_slices + 1) * sizeof(*ctx->slice_offsets));
+    if (!tmp)
+        return AVERROR(ENOMEM);
+    ctx->slice_offsets = tmp;
+
+    if (!ctx->bitstream)
+        ctx->bitstream = (uint8_t*)buffer;
+
+    ctx->slice_offsets[ctx->nb_slices] = buffer - ctx->bitstream;
+    ctx->bitstream_len += size;
+    ctx->nb_slices++;
+
+    return 0;
+}
+
+static int nvdec_vc1_frame_params(AVCodecContext *avctx,
+                                  AVBufferRef *hw_frames_ctx)
+{
+    // Each frame can at most have one P and one B reference
+    return ff_nvdec_frame_params(avctx, hw_frames_ctx, 2);
+}
+
+AVHWAccel ff_vc1_nvdec_hwaccel = {
+    .name           = "vc1_nvdec",
+    .type           = AVMEDIA_TYPE_VIDEO,
+    .id             = AV_CODEC_ID_VC1,
+    .pix_fmt        = AV_PIX_FMT_CUDA,
+    .start_frame    = nvdec_vc1_start_frame,
+    .end_frame      = nvdec_vc1_end_frame,
+    .decode_slice   = nvdec_vc1_decode_slice,
+    .frame_params   = nvdec_vc1_frame_params,
+    .init           = ff_nvdec_decode_init,
+    .uninit         = ff_nvdec_decode_uninit,
+    .priv_data_size = sizeof(NVDECContext),
+};
+
+#if CONFIG_WMV3_NVDEC_HWACCEL
+AVHWAccel ff_wmv3_nvdec_hwaccel = {
+    .name           = "wmv3_nvdec",
+    .type           = AVMEDIA_TYPE_VIDEO,
+    .id             = AV_CODEC_ID_WMV3,
+    .pix_fmt        = AV_PIX_FMT_CUDA,
+    .start_frame    = nvdec_vc1_start_frame,
+    .end_frame      = nvdec_vc1_end_frame,
+    .decode_slice   = nvdec_vc1_decode_slice,
+    .frame_params   = nvdec_vc1_frame_params,
+    .init           = ff_nvdec_decode_init,
+    .uninit         = ff_nvdec_decode_uninit,
+    .priv_data_size = sizeof(NVDECContext),
+};
+#endif
diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c
index 6bdaeca98e..96b8bb5364 100644
--- a/libavcodec/vc1dec.c
+++ b/libavcodec/vc1dec.c
@@ -1119,6 +1119,9 @@ static const enum AVPixelFormat vc1_hwaccel_pixfmt_list_420[] = {
     AV_PIX_FMT_D3D11VA_VLD,
     AV_PIX_FMT_D3D11,
 #endif
+#if CONFIG_VC1_NVDEC_HWACCEL
+    AV_PIX_FMT_CUDA,
+#endif
 #if CONFIG_VC1_VAAPI_HWACCEL
     AV_PIX_FMT_VAAPI,
 #endif
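
Note on the slice bookkeeping in nvdec_vc1_decode_slice(): the first slice
of a frame fixes the base pointer, and every later slice is recorded as a
byte offset from that base while the total length accumulates. The sketch
below illustrates that pattern in isolation. It is not part of the patch
and does not use FFmpeg: SliceTracker, tracker_add_slice() and
tracker_end_frame() are hypothetical names, plain realloc() with capacity
doubling stands in for av_fast_realloc(), and resetting the counters in
tracker_end_frame() is an assumption made so the example is self-contained
(the patch itself only clears ctx->bitstream in its end_frame callback).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the slice-tracking fields of NVDECContext. */
typedef struct SliceTracker {
    const uint8_t *bitstream;     /* start of the first slice of the frame */
    size_t         bitstream_len; /* total bytes over all recorded slices  */
    unsigned      *slice_offsets; /* offset of each slice from bitstream   */
    size_t         nb_slices;     /* number of slices recorded so far      */
    size_t         allocated;     /* capacity of slice_offsets, in entries */
} SliceTracker;

/* Record one slice. Returns 0 on success, -1 if the offset table cannot grow. */
static int tracker_add_slice(SliceTracker *t, const uint8_t *buffer, size_t size)
{
    if (t->nb_slices == t->allocated) {
        /* Assumption: simple doubling instead of av_fast_realloc(). */
        size_t new_cap = t->allocated ? t->allocated * 2 : 4;
        unsigned *tmp = realloc(t->slice_offsets, new_cap * sizeof(*tmp));
        if (!tmp)
            return -1;
        t->slice_offsets = tmp;
        t->allocated     = new_cap;
    }

    /* The first slice of a frame defines the base of the bitstream. */
    if (!t->bitstream)
        t->bitstream = buffer;

    /* Later slices are expected to live in the same buffer, after the base. */
    assert(buffer >= t->bitstream);

    t->slice_offsets[t->nb_slices++] = (unsigned)(buffer - t->bitstream);
    t->bitstream_len += size;
    return 0;
}

/* Forget the base pointer so the next frame starts from a clean state. */
static void tracker_end_frame(SliceTracker *t)
{
    t->bitstream     = NULL;
    t->bitstream_len = 0;
    t->nb_slices     = 0;
}
```

Clearing the base pointer at the end of each frame matters because the
"first slice" test relies on it being NULL; without the reset, offsets for
the next frame would be computed against a stale buffer.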