From patchwork Fri Jun 2 08:06:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41949 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:06:53 +0800 Message-Id: <20230602080701.1754-1-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 Subject: [FFmpeg-devel] [PATCH v3 1/9] libavutil: add hwcontext_d3d12va and AV_PIX_FMT_D3D12 List-Id: FFmpeg development discussions and patches Reply-To: FFmpeg development discussions and patches Cc: Tong Wu , Wu Jianhua From: Wu Jianhua Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- configure | 5 + doc/APIchanges | 7 + libavutil/Makefile | 3 + libavutil/hwcontext.c | 4 + libavutil/hwcontext.h | 1 + libavutil/hwcontext_d3d12va.c | 699 +++++++++++++++++++++++++ libavutil/hwcontext_d3d12va.h | 167 ++++++ libavutil/hwcontext_d3d12va_internal.h | 63 +++ libavutil/hwcontext_internal.h | 1 + libavutil/pixdesc.c | 4 + libavutil/pixfmt.h | 9 + libavutil/tests/hwdevice.c | 2 + 12 files changed, 965 insertions(+) create mode 100644 libavutil/hwcontext_d3d12va.c create mode 100644 libavutil/hwcontext_d3d12va.h create mode 100644 libavutil/hwcontext_d3d12va_internal.h diff --git a/configure b/configure index 495493aa0e..b86064e36f 100755 --- a/configure +++ b/configure @@ -337,6 +337,7 @@ External library support: --disable-cuda-llvm disable CUDA compilation using clang [autodetect] --disable-cuvid disable Nvidia CUVID support [autodetect] --disable-d3d11va disable Microsoft
Direct3D 11 video acceleration code [autodetect] + --disable-d3d12va disable Microsoft Direct3D 12 video acceleration code [autodetect] --disable-dxva2 disable Microsoft DirectX 9 video acceleration code [autodetect] --disable-ffnvcodec disable dynamically linked Nvidia code [autodetect] --enable-libdrm enable DRM code (Linux) [no] @@ -1885,6 +1886,7 @@ HWACCEL_AUTODETECT_LIBRARY_LIST=" cuda_llvm cuvid d3d11va + d3d12va dxva2 ffnvcodec nvdec @@ -2999,6 +3001,7 @@ crystalhd_deps="libcrystalhd_libcrystalhd_if_h" cuda_deps="ffnvcodec" cuvid_deps="ffnvcodec" d3d11va_deps="dxva_h ID3D11VideoDecoder ID3D11VideoContext" +d3d12va_deps="dxva_h ID3D12Device ID3D12VideoDecoder" dxva2_deps="dxva2api_h DXVA2_ConfigPictureDecode ole32 user32" ffnvcodec_deps_any="libdl LoadLibrary" mediacodec_deps="android" @@ -6449,6 +6452,8 @@ check_type "windows.h dxgi1_2.h" "IDXGIOutput1" check_type "windows.h dxgi1_5.h" "IDXGIOutput5" check_type "windows.h d3d11.h" "ID3D11VideoDecoder" check_type "windows.h d3d11.h" "ID3D11VideoContext" +check_type "windows.h d3d12.h" "ID3D12Device" +check_type "windows.h d3d12video.h" "ID3D12VideoDecoder" check_type "windows.h" "DPI_AWARENESS_CONTEXT" -D_WIN32_WINNT=0x0A00 check_type "d3d9.h dxva2api.h" DXVA2_ConfigPictureDecode -D_WIN32_WINNT=0x0602 check_func_headers mfapi.h MFCreateAlignedMemoryBuffer -lmfplat diff --git a/doc/APIchanges b/doc/APIchanges index f040211f7d..b64ed73a64 100644 --- a/doc/APIchanges +++ b/doc/APIchanges @@ -2,6 +2,13 @@ The last version increases of all libraries were on 2023-02-09 API changes, most recent first: +2023-05-xx - xxxxxxxxxx - lavu 58.7.100 - pixfmt.h hwcontext.h hwcontext_d3d12va.h + Add AV_HWDEVICE_TYPE_D3D12VA and AV_PIX_FMT_D3D12. + Add AVD3D12VADeviceContext, AVD3D12VASyncContext, AVD3D12FrameDescriptor, + and AVD3D12VAFramesContext. + Add av_d3d12va_map_sw_to_hw_format, av_d3d12va_sync_context_alloc, + av_d3d12va_sync_context_free, av_d3d12va_wait_idle, and av_d3d12va_wait_queue_idle.
+ 2023-05-29 - xxxxxxxxxx - lavc 60.16.100 - avcodec.h codec_id.h Add AV_CODEC_ID_EVC, FF_PROFILE_EVC_BASELINE, and FF_PROFILE_EVC_MAIN. diff --git a/libavutil/Makefile b/libavutil/Makefile index bd9c6f9e32..40d49d76dd 100644 --- a/libavutil/Makefile +++ b/libavutil/Makefile @@ -41,6 +41,7 @@ HEADERS = adler32.h \ hwcontext.h \ hwcontext_cuda.h \ hwcontext_d3d11va.h \ + hwcontext_d3d12va.h \ hwcontext_drm.h \ hwcontext_dxva2.h \ hwcontext_qsv.h \ @@ -186,6 +187,7 @@ OBJS = adler32.o \ OBJS-$(CONFIG_CUDA) += hwcontext_cuda.o OBJS-$(CONFIG_D3D11VA) += hwcontext_d3d11va.o +OBJS-$(CONFIG_D3D12VA) += hwcontext_d3d12va.o OBJS-$(CONFIG_DXVA2) += hwcontext_dxva2.o OBJS-$(CONFIG_LIBDRM) += hwcontext_drm.o OBJS-$(CONFIG_MACOS_KPERF) += macos_kperf.o @@ -209,6 +211,7 @@ SKIPHEADERS-$(HAVE_CUDA_H) += hwcontext_cuda.h SKIPHEADERS-$(CONFIG_CUDA) += hwcontext_cuda_internal.h \ cuda_check.h SKIPHEADERS-$(CONFIG_D3D11VA) += hwcontext_d3d11va.h +SKIPHEADERS-$(CONFIG_D3D12VA) += hwcontext_d3d12va.h SKIPHEADERS-$(CONFIG_DXVA2) += hwcontext_dxva2.h SKIPHEADERS-$(CONFIG_QSV) += hwcontext_qsv.h SKIPHEADERS-$(CONFIG_OPENCL) += hwcontext_opencl.h diff --git a/libavutil/hwcontext.c b/libavutil/hwcontext.c index 3396598269..04070bc3c3 100644 --- a/libavutil/hwcontext.c +++ b/libavutil/hwcontext.c @@ -36,6 +36,9 @@ static const HWContextType * const hw_table[] = { #if CONFIG_D3D11VA &ff_hwcontext_type_d3d11va, #endif +#if CONFIG_D3D12VA + &ff_hwcontext_type_d3d12va, +#endif #if CONFIG_LIBDRM &ff_hwcontext_type_drm, #endif @@ -71,6 +74,7 @@ static const char *const hw_type_names[] = { [AV_HWDEVICE_TYPE_DRM] = "drm", [AV_HWDEVICE_TYPE_DXVA2] = "dxva2", [AV_HWDEVICE_TYPE_D3D11VA] = "d3d11va", + [AV_HWDEVICE_TYPE_D3D12VA] = "d3d12va", [AV_HWDEVICE_TYPE_OPENCL] = "opencl", [AV_HWDEVICE_TYPE_QSV] = "qsv", [AV_HWDEVICE_TYPE_VAAPI] = "vaapi", diff --git a/libavutil/hwcontext.h b/libavutil/hwcontext.h index 7ff08c8608..2b33721a97 100644 --- a/libavutil/hwcontext.h +++ b/libavutil/hwcontext.h @@ -37,6 
+37,7 @@ enum AVHWDeviceType { AV_HWDEVICE_TYPE_OPENCL, AV_HWDEVICE_TYPE_MEDIACODEC, AV_HWDEVICE_TYPE_VULKAN, + AV_HWDEVICE_TYPE_D3D12VA, }; typedef struct AVHWDeviceInternal AVHWDeviceInternal; diff --git a/libavutil/hwcontext_d3d12va.c b/libavutil/hwcontext_d3d12va.c new file mode 100644 index 0000000000..d159aef08b --- /dev/null +++ b/libavutil/hwcontext_d3d12va.c @@ -0,0 +1,699 @@ +/* + * Direct3D 12 HW acceleration. + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "common.h" +#include "hwcontext.h" +#include "hwcontext_internal.h" +#include "hwcontext_d3d12va_internal.h" +#include "hwcontext_d3d12va.h" +#include "imgutils.h" +#include "pixdesc.h" +#include "pixfmt.h" +#include "thread.h" +#include "compat/w32dlfcn.h" +#include + +typedef HRESULT(WINAPI *PFN_CREATE_DXGI_FACTORY2)(UINT Flags, REFIID riid, void **ppFactory); + +static AVOnce functions_loaded = AV_ONCE_INIT; + +static PFN_CREATE_DXGI_FACTORY2 d3d12va_create_dxgi_factory2; +static PFN_D3D12_CREATE_DEVICE d3d12va_create_device; +static PFN_D3D12_GET_DEBUG_INTERFACE d3d12va_get_debug_interface; + +static av_cold void load_functions(void) +{ + HANDLE d3dlib, dxgilib; + + d3dlib = dlopen("d3d12.dll", 0); + dxgilib = dlopen("dxgi.dll", 0); + if (!d3dlib || !dxgilib) + return; + + d3d12va_create_device = (PFN_D3D12_CREATE_DEVICE)GetProcAddress(d3dlib, "D3D12CreateDevice"); + d3d12va_create_dxgi_factory2 = (PFN_CREATE_DXGI_FACTORY2)GetProcAddress(dxgilib, "CreateDXGIFactory2"); + d3d12va_get_debug_interface = (PFN_D3D12_GET_DEBUG_INTERFACE)GetProcAddress(d3dlib, "D3D12GetDebugInterface"); +} + +typedef struct D3D12VAFramesContext { + ID3D12Resource *staging_buffer; + ID3D12CommandQueue *command_queue; + ID3D12CommandAllocator *command_allocator; + ID3D12GraphicsCommandList *command_list; + AVD3D12VASyncContext *sync_ctx; + int nb_surfaces; + int nb_surfaces_used; + DXGI_FORMAT format; + UINT luma_component_size; +} D3D12VAFramesContext; + +static const struct { + DXGI_FORMAT d3d_format; + enum AVPixelFormat pix_fmt; +} supported_formats[] = { + { DXGI_FORMAT_NV12, AV_PIX_FMT_NV12 }, + { DXGI_FORMAT_P010, AV_PIX_FMT_P010 }, +}; + +DXGI_FORMAT av_d3d12va_map_sw_to_hw_format(enum AVPixelFormat pix_fmt) +{ + switch 
(pix_fmt) { + case AV_PIX_FMT_NV12:return DXGI_FORMAT_NV12; + case AV_PIX_FMT_P010:return DXGI_FORMAT_P010; + default: return DXGI_FORMAT_UNKNOWN; + } +} + +int av_d3d12va_sync_context_alloc(AVD3D12VADeviceContext *ctx, AVD3D12VASyncContext **psync_ctx) +{ + AVD3D12VASyncContext *sync_ctx; + + sync_ctx = av_mallocz(sizeof(AVD3D12VASyncContext)); + if (!sync_ctx) + return AVERROR(ENOMEM); + + DX_CHECK(ID3D12Device_CreateFence(ctx->device, sync_ctx->fence_value, D3D12_FENCE_FLAG_NONE, &IID_ID3D12Fence, &sync_ctx->fence)); + + sync_ctx->event = CreateEvent(NULL, FALSE, FALSE, NULL); + if (!sync_ctx->event) + goto fail; + + *psync_ctx = sync_ctx; + + return 0; + +fail: + D3D12_OBJECT_RELEASE(sync_ctx->fence); + av_freep(&sync_ctx); + return AVERROR(EINVAL); +} + +void av_d3d12va_sync_context_free(AVD3D12VASyncContext **psync_ctx) +{ + AVD3D12VASyncContext *sync_ctx = psync_ctx ? *psync_ctx : NULL; + if (!sync_ctx) + return; + + av_d3d12va_wait_idle(sync_ctx); + + D3D12_OBJECT_RELEASE(sync_ctx->fence); + + if (sync_ctx->event) + CloseHandle(sync_ctx->event); + + av_freep(psync_ctx); +} + +static int av_d3d12va_wait_for_fence_value(AVD3D12VASyncContext *sync_ctx, uint64_t fence_value) +{ + uint64_t completion = ID3D12Fence_GetCompletedValue(sync_ctx->fence); + if (completion < fence_value) { + if (FAILED(ID3D12Fence_SetEventOnCompletion(sync_ctx->fence, fence_value, sync_ctx->event))) + return AVERROR(EINVAL); + + WaitForSingleObjectEx(sync_ctx->event, INFINITE, FALSE); + } + + return 0; +} + +int av_d3d12va_wait_idle(AVD3D12VASyncContext *ctx) +{ + return av_d3d12va_wait_for_fence_value(ctx, ctx->fence_value); +} + +int av_d3d12va_wait_queue_idle(AVD3D12VASyncContext *sync_ctx, ID3D12CommandQueue *command_queue) +{ + DX_CHECK(ID3D12CommandQueue_Signal(command_queue, sync_ctx->fence, ++sync_ctx->fence_value)); + return av_d3d12va_wait_idle(sync_ctx); + +fail: + return AVERROR(EINVAL); +} + +static inline int create_resource(ID3D12Device *device, const D3D12_RESOURCE_DESC
*desc, D3D12_RESOURCE_STATES states, ID3D12Resource **ppResource, int is_read_back) +{ + D3D12_HEAP_PROPERTIES props = { + .Type = is_read_back ? D3D12_HEAP_TYPE_READBACK : D3D12_HEAP_TYPE_DEFAULT, + .CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN, + .MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN, + .CreationNodeMask = 0, + .VisibleNodeMask = 0, + }; + + if (FAILED(ID3D12Device_CreateCommittedResource(device, &props, D3D12_HEAP_FLAG_NONE, desc, + states, NULL, &IID_ID3D12Resource, ppResource))) + return AVERROR(EINVAL); + + return 0; +} + +static int d3d12va_create_staging_buffer_resource(AVHWFramesContext *ctx) +{ + AVD3D12VADeviceContext *device_hwctx = ctx->device_ctx->hwctx; + D3D12VAFramesContext *s = ctx->internal->priv; + + D3D12_RESOURCE_DESC desc = { + .Dimension = D3D12_RESOURCE_DIMENSION_BUFFER, + .Alignment = 0, + .Width = 0, + .Height = 1, + .DepthOrArraySize = 1, + .MipLevels = 1, + .Format = DXGI_FORMAT_UNKNOWN, + .SampleDesc = { .Count = 1, .Quality = 0 }, + .Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR, + .Flags = D3D12_RESOURCE_FLAG_NONE, + }; + + s->luma_component_size = FFALIGN(ctx->width * (s->format == DXGI_FORMAT_P010 ? 
2 : 1), D3D12_TEXTURE_DATA_PITCH_ALIGNMENT) * ctx->height; + desc.Width = s->luma_component_size + (s->luma_component_size >> 1); + + return create_resource(device_hwctx->device, &desc, D3D12_RESOURCE_STATE_COPY_DEST, &s->staging_buffer, 1); +} + +static int d3d12va_create_helper_objects(AVHWFramesContext *ctx) +{ + AVD3D12VADeviceContext *device_hwctx = ctx->device_ctx->hwctx; + D3D12VAFramesContext *s = ctx->internal->priv; + + D3D12_COMMAND_QUEUE_DESC queue_desc = { + .Type = D3D12_COMMAND_LIST_TYPE_COPY, + .Priority = 0, + .NodeMask = 0, + }; + + int ret = d3d12va_create_staging_buffer_resource(ctx); + if (ret < 0) + return ret; + + ret = av_d3d12va_sync_context_alloc(device_hwctx, &s->sync_ctx); + if (ret < 0) + return ret; + + DX_CHECK(ID3D12Device_CreateCommandQueue(device_hwctx->device, &queue_desc, + &IID_ID3D12CommandQueue, &s->command_queue)); + + DX_CHECK(ID3D12Device_CreateCommandAllocator(device_hwctx->device, queue_desc.Type, + &IID_ID3D12CommandAllocator, &s->command_allocator)); + + DX_CHECK(ID3D12Device_CreateCommandList(device_hwctx->device, 0, queue_desc.Type, + s->command_allocator, NULL, &IID_ID3D12GraphicsCommandList, &s->command_list)); + + DX_CHECK(ID3D12GraphicsCommandList_Close(s->command_list)); + + ID3D12CommandQueue_ExecuteCommandLists(s->command_queue, 1, (ID3D12CommandList **)&s->command_list); + + return av_d3d12va_wait_queue_idle(s->sync_ctx, s->command_queue); + +fail: + return AVERROR(EINVAL); +} + +static void d3d12va_frames_uninit(AVHWFramesContext *ctx) +{ + AVD3D12VAFramesContext *frames_hwctx = ctx->hwctx; + D3D12VAFramesContext *s = ctx->internal->priv; + + av_d3d12va_sync_context_free(&s->sync_ctx); + + D3D12_OBJECT_RELEASE(s->staging_buffer); + D3D12_OBJECT_RELEASE(s->command_allocator); + D3D12_OBJECT_RELEASE(s->command_list); + D3D12_OBJECT_RELEASE(s->command_queue); + + av_freep(&frames_hwctx->texture_infos); +} + +static int d3d12va_frames_get_constraints(AVHWDeviceContext *ctx, const void *hwconfig, 
AVHWFramesConstraints *constraints) +{ + HRESULT hr; + int nb_sw_formats = 0; + AVD3D12VADeviceContext *device_hwctx = ctx->hwctx; + + constraints->valid_sw_formats = av_malloc_array(FF_ARRAY_ELEMS(supported_formats) + 1, + sizeof(*constraints->valid_sw_formats)); + if (!constraints->valid_sw_formats) + return AVERROR(ENOMEM); + + for (int i = 0; i < FF_ARRAY_ELEMS(supported_formats); i++) { + D3D12_FEATURE_DATA_FORMAT_SUPPORT format_support = { supported_formats[i].d3d_format }; + hr = ID3D12Device_CheckFeatureSupport(device_hwctx->device, D3D12_FEATURE_FORMAT_SUPPORT, &format_support, sizeof(format_support)); + if (SUCCEEDED(hr) && (format_support.Support1 & D3D12_FORMAT_SUPPORT1_TEXTURE2D)) + constraints->valid_sw_formats[nb_sw_formats++] = supported_formats[i].pix_fmt; + } + constraints->valid_sw_formats[nb_sw_formats] = AV_PIX_FMT_NONE; + + constraints->valid_hw_formats = av_malloc_array(2, sizeof(*constraints->valid_hw_formats)); + if (!constraints->valid_hw_formats) + return AVERROR(ENOMEM); + + constraints->valid_hw_formats[0] = AV_PIX_FMT_D3D12; + constraints->valid_hw_formats[1] = AV_PIX_FMT_NONE; + + return 0; +} + +static void free_texture(void *opaque, uint8_t *data) +{ + AVD3D12FrameDescriptor *desc = (AVD3D12FrameDescriptor *)data; + + if (desc->sync_ctx) + av_d3d12va_sync_context_free(&desc->sync_ctx); + + D3D12_OBJECT_RELEASE(desc->texture); + av_freep(&data); +} + +static AVBufferRef *wrap_texture_buf(AVHWFramesContext *ctx, ID3D12Resource *texture, AVD3D12VASyncContext *sync_ctx) +{ + AVBufferRef *buf; + D3D12VAFramesContext *s = ctx->internal->priv; + AVD3D12VAFramesContext *frames_hwctx = ctx->hwctx; + + AVD3D12FrameDescriptor *desc = av_mallocz(sizeof(*desc)); + if (!desc) + goto fail; + + if (s->nb_surfaces <= s->nb_surfaces_used) { + frames_hwctx->texture_infos = av_realloc_f(frames_hwctx->texture_infos, + s->nb_surfaces_used + 1, + sizeof(*frames_hwctx->texture_infos)); + if (!frames_hwctx->texture_infos) + goto fail; + s->nb_surfaces = 
s->nb_surfaces_used + 1; + } + + desc->texture = texture; + desc->index = s->nb_surfaces_used; + desc->sync_ctx = sync_ctx; + + frames_hwctx->texture_infos[s->nb_surfaces_used].texture = texture; + frames_hwctx->texture_infos[s->nb_surfaces_used].index = desc->index; + frames_hwctx->texture_infos[s->nb_surfaces_used].sync_ctx = sync_ctx; + s->nb_surfaces_used++; + + buf = av_buffer_create((uint8_t *)desc, sizeof(*desc), free_texture, texture, 0); + if (!buf) { + D3D12_OBJECT_RELEASE(texture); + av_freep(&desc); + return NULL; + } + + return buf; + +fail: + D3D12_OBJECT_RELEASE(texture); + av_d3d12va_sync_context_free(&sync_ctx); + return NULL; +} + +static AVBufferRef *d3d12va_pool_alloc(void *opaque, size_t size) +{ + AVHWFramesContext *ctx = (AVHWFramesContext *)opaque; + D3D12VAFramesContext *s = ctx->internal->priv; + AVD3D12VAFramesContext *hwctx = ctx->hwctx; + AVD3D12VADeviceContext *device_hwctx = ctx->device_ctx->hwctx; + + int ret; + ID3D12Resource *texture; + AVD3D12VASyncContext *sync_ctx; + + D3D12_RESOURCE_DESC desc = { + .Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D, + .Alignment = 0, + .Width = ctx->width, + .Height = ctx->height, + .DepthOrArraySize = 1, + .MipLevels = 1, + .Format = s->format, + .SampleDesc = { .Count = 1, .Quality = 0 }, + .Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN, + .Flags = D3D12_RESOURCE_FLAG_NONE, + }; + + if (s->nb_surfaces_used >= ctx->initial_pool_size) { + av_log(ctx, AV_LOG_ERROR, "Static surface pool size exceeded.\n"); + return NULL; + } + + ret = create_resource(device_hwctx->device, &desc, D3D12_RESOURCE_STATE_COMMON, &texture, 0); + if (ret < 0) + return NULL; + + ret = av_d3d12va_sync_context_alloc(device_hwctx, &sync_ctx); + if (ret < 0) { + D3D12_OBJECT_RELEASE(texture); + return NULL; + } + + return wrap_texture_buf(ctx, texture, sync_ctx); +} + +static int d3d12va_frames_init(AVHWFramesContext *ctx) +{ + AVD3D12VAFramesContext *hwctx = ctx->hwctx; + AVD3D12VADeviceContext *device_hwctx = ctx->device_ctx->hwctx; +
D3D12VAFramesContext *s = ctx->internal->priv; + + int i; + + if (ctx->initial_pool_size > D3D12VA_MAX_SURFACES) { + av_log(ctx, AV_LOG_WARNING, "Too big initial pool size(%d) for surfaces. " + "The size will be limited to %d automatically\n", ctx->initial_pool_size, D3D12VA_MAX_SURFACES); + ctx->initial_pool_size = D3D12VA_MAX_SURFACES; + } + + for (i = 0; i < FF_ARRAY_ELEMS(supported_formats); i++) { + if (ctx->sw_format == supported_formats[i].pix_fmt) { + s->format = supported_formats[i].d3d_format; + break; + } + } + if (i == FF_ARRAY_ELEMS(supported_formats)) { + av_log(ctx, AV_LOG_ERROR, "Unsupported pixel format: %s\n", + av_get_pix_fmt_name(ctx->sw_format)); + return AVERROR(EINVAL); + } + + hwctx->texture_infos = av_realloc_f(NULL, ctx->initial_pool_size, sizeof(*hwctx->texture_infos)); + if (!hwctx->texture_infos) + return AVERROR(ENOMEM); + + memset(hwctx->texture_infos, 0, ctx->initial_pool_size * sizeof(*hwctx->texture_infos)); + s->nb_surfaces = ctx->initial_pool_size; + + ctx->internal->pool_internal = av_buffer_pool_init2(sizeof(AVD3D12FrameDescriptor), + ctx, d3d12va_pool_alloc, NULL); + + if (!ctx->internal->pool_internal) + return AVERROR(ENOMEM); + + return 0; +} + +static int d3d12va_get_buffer(AVHWFramesContext *ctx, AVFrame *frame) +{ + int ret; + AVD3D12FrameDescriptor *desc; + + frame->buf[0] = av_buffer_pool_get(ctx->pool); + if (!frame->buf[0]) + return AVERROR(ENOMEM); + + ret = av_image_fill_arrays(frame->data, frame->linesize, frame->buf[0]->data, + ctx->sw_format, ctx->width, ctx->height, D3D12_TEXTURE_DATA_PITCH_ALIGNMENT); + if (ret < 0) + return ret; + + desc = (AVD3D12FrameDescriptor *)frame->buf[0]->data; + frame->data[0] = (uint8_t *)desc->texture; + frame->data[1] = (uint8_t *)desc->index; + frame->data[2] = (uint8_t *)desc->sync_ctx; + + frame->format = AV_PIX_FMT_D3D12; + frame->width = ctx->width; + frame->height = ctx->height; + + return 0; +} + +static int d3d12va_transfer_get_formats(AVHWFramesContext *ctx, + enum 
AVHWFrameTransferDirection dir, + enum AVPixelFormat **formats) +{ + D3D12VAFramesContext *s = ctx->internal->priv; + enum AVPixelFormat *fmts; + + fmts = av_malloc_array(2, sizeof(*fmts)); + if (!fmts) + return AVERROR(ENOMEM); + + fmts[0] = ctx->sw_format; + fmts[1] = AV_PIX_FMT_NONE; + + *formats = fmts; + + return 0; +} + +static int d3d12va_transfer_data(AVHWFramesContext *ctx, AVFrame *dst, + const AVFrame *src) +{ + AVD3D12VADeviceContext *hwctx = ctx->device_ctx->hwctx; + AVD3D12VAFramesContext *frames_hwctx = ctx->hwctx; + D3D12VAFramesContext *s = ctx->internal->priv; + + int ret; + int download = src->format == AV_PIX_FMT_D3D12; + const AVFrame *frame = download ? src : dst; + const AVFrame *other = download ? dst : src; + + ID3D12Resource *texture = (ID3D12Resource *) frame->data[0]; + int index = (intptr_t) frame->data[1]; + AVD3D12VASyncContext *sync_ctx = (AVD3D12VASyncContext *)frame->data[2]; + + uint8_t *mapped_data; + uint8_t *data[4]; + int linesizes[4]; + + D3D12_TEXTURE_COPY_LOCATION staging_y_location; + D3D12_TEXTURE_COPY_LOCATION staging_uv_location; + + D3D12_TEXTURE_COPY_LOCATION texture_y_location = { + .pResource = texture, + .Type = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX, + .SubresourceIndex = 0, + }; + + D3D12_TEXTURE_COPY_LOCATION texture_uv_location = { + .pResource = texture, + .Type = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX, + .SubresourceIndex = 1, + }; + + D3D12_RESOURCE_BARRIER barrier = { + .Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION, + .Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE, + .Transition = { + .pResource = texture, + .StateBefore = D3D12_RESOURCE_STATE_COMMON, + .StateAfter = D3D12_RESOURCE_STATE_COPY_SOURCE, + .Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES, + } + }; + + s->format = av_d3d12va_map_sw_to_hw_format(ctx->sw_format); + + if (frame->hw_frames_ctx->data != (uint8_t *)ctx || other->format != ctx->sw_format) + return AVERROR(EINVAL); + + if (!s->command_queue) { + ret = 
d3d12va_create_helper_objects(ctx); + if (ret < 0) + return ret; + } + + for (int i = 0; i < 4; i++) + linesizes[i] = FFALIGN(frame->width * (s->format == DXGI_FORMAT_P010 ? 2 : 1), D3D12_TEXTURE_DATA_PITCH_ALIGNMENT); + + staging_y_location = (D3D12_TEXTURE_COPY_LOCATION) { + .pResource = s->staging_buffer, + .Type = D3D12_TEXTURE_COPY_TYPE_PLACED_FOOTPRINT, + .PlacedFootprint = { + .Offset = 0, + .Footprint = { + .Format = s->format == DXGI_FORMAT_P010 ? DXGI_FORMAT_R16_UNORM : DXGI_FORMAT_R8_UNORM, + .Width = ctx->width, + .Height = ctx->height, + .Depth = 1, + .RowPitch = linesizes[0], + }, + }, + }; + + staging_uv_location = (D3D12_TEXTURE_COPY_LOCATION) { + .pResource = s->staging_buffer, + .Type = D3D12_TEXTURE_COPY_TYPE_PLACED_FOOTPRINT, + .PlacedFootprint = { + .Offset = s->luma_component_size, + .Footprint = { + .Format = s->format == DXGI_FORMAT_P010 ? DXGI_FORMAT_R16G16_UNORM : DXGI_FORMAT_R8G8_UNORM, + .Width = ctx->width >> 1, + .Height = ctx->height >> 1, + .Depth = 1, + .RowPitch = linesizes[0], + }, + }, + }; + + DX_CHECK(ID3D12CommandAllocator_Reset(s->command_allocator)); + + DX_CHECK(ID3D12GraphicsCommandList_Reset(s->command_list, s->command_allocator, NULL)); + + if (download) { + ID3D12GraphicsCommandList_ResourceBarrier(s->command_list, 1, &barrier); + + ID3D12GraphicsCommandList_CopyTextureRegion(s->command_list, + &staging_y_location, 0, 0, 0, &texture_y_location, NULL); + + ID3D12GraphicsCommandList_CopyTextureRegion(s->command_list, + &staging_uv_location, 0, 0, 0, &texture_uv_location, NULL); + + barrier.Transition.StateBefore = barrier.Transition.StateAfter; + barrier.Transition.StateAfter = D3D12_RESOURCE_STATE_COMMON; + ID3D12GraphicsCommandList_ResourceBarrier(s->command_list, 1, &barrier); + + DX_CHECK(ID3D12GraphicsCommandList_Close(s->command_list)); + + if (!hwctx->sync) + DX_CHECK(ID3D12CommandQueue_Wait(s->command_queue, sync_ctx->fence, sync_ctx->fence_value)); + + ID3D12CommandQueue_ExecuteCommandLists(s->command_queue, 1, 
(ID3D12CommandList **)&s->command_list); + + ret = av_d3d12va_wait_queue_idle(s->sync_ctx, s->command_queue); + if (ret) + return ret; + + DX_CHECK(ID3D12Resource_Map(s->staging_buffer, 0, NULL, &mapped_data)); + av_image_fill_pointers(data, ctx->sw_format, ctx->height, mapped_data, linesizes); + + av_image_copy(dst->data, dst->linesize, data, linesizes, + ctx->sw_format, ctx->width, ctx->height); + + ID3D12Resource_Unmap(s->staging_buffer, 0, NULL); + } else { + av_log(ctx, AV_LOG_ERROR, "Transfer data to AV_PIX_FMT_D3D12 is not supported yet!\n"); + return AVERROR(EINVAL); + } + + return 0; + +fail: + return AVERROR(EINVAL); +} + +static int d3d12va_device_init(AVHWDeviceContext *hwdev) +{ + AVD3D12VADeviceContext *ctx = hwdev->hwctx; + + if (!ctx->video_device) + DX_CHECK(ID3D12Device_QueryInterface(ctx->device, &IID_ID3D12VideoDevice, (void **)&ctx->video_device)); + + return 0; + +fail: + return AVERROR(EINVAL); +} + +static void d3d12va_device_uninit(AVHWDeviceContext *hwdev) +{ + AVD3D12VADeviceContext *device_hwctx = hwdev->hwctx; + + D3D12_OBJECT_RELEASE(device_hwctx->video_device); + D3D12_OBJECT_RELEASE(device_hwctx->device); +} + +static int d3d12va_device_create(AVHWDeviceContext *ctx, const char *device, + AVDictionary *opts, int flags) +{ + AVD3D12VADeviceContext *device_hwctx = ctx->hwctx; + + HRESULT hr; + UINT create_flags = 0; + IDXGIAdapter *pAdapter = NULL; + + int ret; + int is_debug = !!av_dict_get(opts, "debug", NULL, 0); + device_hwctx->sync = !!av_dict_get(opts, "sync", NULL, 0); + + if ((ret = ff_thread_once(&functions_loaded, load_functions)) != 0) + return AVERROR_UNKNOWN; + + if (is_debug) { + ID3D12Debug *pDebug; + if (d3d12va_get_debug_interface && SUCCEEDED(d3d12va_get_debug_interface(&IID_ID3D12Debug, &pDebug))) { + create_flags |= DXGI_CREATE_FACTORY_DEBUG; + ID3D12Debug_EnableDebugLayer(pDebug); + D3D12_OBJECT_RELEASE(pDebug); + av_log(ctx, AV_LOG_INFO, "D3D12 debug layer is enabled!\n"); + } + } + + if (!device_hwctx->device) { 
+ IDXGIFactory2 *pDXGIFactory = NULL; + + if (!d3d12va_create_device || !d3d12va_create_dxgi_factory2) { + av_log(ctx, AV_LOG_ERROR, "Failed to load D3D12 library or its functions\n"); + return AVERROR_UNKNOWN; + } + + hr = d3d12va_create_dxgi_factory2(create_flags, &IID_IDXGIFactory2, (void **)&pDXGIFactory); + if (SUCCEEDED(hr)) { + int adapter = device ? atoi(device) : 0; + if (FAILED(IDXGIFactory2_EnumAdapters(pDXGIFactory, adapter, &pAdapter))) + pAdapter = NULL; + IDXGIFactory2_Release(pDXGIFactory); + } + + if (pAdapter) { + DXGI_ADAPTER_DESC desc; + hr = IDXGIAdapter_GetDesc(pAdapter, &desc); + if (SUCCEEDED(hr)) { + av_log(ctx, AV_LOG_INFO, "Using device %04x:%04x (%ls).\n", + desc.VendorId, desc.DeviceId, desc.Description); + } + } + + hr = d3d12va_create_device((IUnknown *)pAdapter, D3D_FEATURE_LEVEL_12_0, &IID_ID3D12Device, &device_hwctx->device); + D3D12_OBJECT_RELEASE(pAdapter); + if (FAILED(hr)) { + av_log(ctx, AV_LOG_ERROR, "Failed to create Direct3D 12 device (%lx)\n", (long)hr); + return AVERROR_UNKNOWN; + } + } + + return 0; +} + +const HWContextType ff_hwcontext_type_d3d12va = { + .type = AV_HWDEVICE_TYPE_D3D12VA, + .name = "D3D12VA", + + .device_hwctx_size = sizeof(AVD3D12VADeviceContext), + .frames_hwctx_size = sizeof(AVD3D12VAFramesContext), + .frames_priv_size = sizeof(D3D12VAFramesContext), + + .device_create = d3d12va_device_create, + .device_init = d3d12va_device_init, + .device_uninit = d3d12va_device_uninit, + .frames_get_constraints = d3d12va_frames_get_constraints, + .frames_init = d3d12va_frames_init, + .frames_uninit = d3d12va_frames_uninit, + .frames_get_buffer = d3d12va_get_buffer, + .transfer_get_formats = d3d12va_transfer_get_formats, + .transfer_data_to = d3d12va_transfer_data, + .transfer_data_from = d3d12va_transfer_data, + + .pix_fmts = (const enum AVPixelFormat[]){ AV_PIX_FMT_D3D12, AV_PIX_FMT_NONE }, +}; diff --git a/libavutil/hwcontext_d3d12va.h b/libavutil/hwcontext_d3d12va.h new file mode 100644 index
0000000000..060f692d11 --- /dev/null +++ b/libavutil/hwcontext_d3d12va.h @@ -0,0 +1,167 @@ +/* + * Direct3D 12 HW acceleration. + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_HWCONTEXT_D3D12VA_H +#define AVUTIL_HWCONTEXT_D3D12VA_H + +/** + * @file + * An API-specific header for AV_HWDEVICE_TYPE_D3D12VA. + * + */ +#include +#include +#include +#include + +/** + * @brief This struct is allocated as AVHWDeviceContext.hwctx + * + */ +typedef struct AVD3D12VADeviceContext { + /** + * Device used for objects creation and access. This can also be + * used to set the libavcodec decoding device. + * + * Can be set by the user. This is the only mandatory field - the other + * device context fields are set from this and are available for convenience. + * + * Deallocating the AVHWDeviceContext will always release this interface, + * and it does not matter whether it was user-allocated. + */ + ID3D12Device *device; + + /** + * If unset, this will be set from the device field on init. + * + * Deallocating the AVHWDeviceContext will always release this interface, + * and it does not matter whether it was user-allocated. 
+ */ + ID3D12VideoDevice *video_device; + + /** + * Specified by sync=1 when initializing d3d12va + * + * Execute commands in sync mode + */ + int sync; +} AVD3D12VADeviceContext; + +/** + * @brief This struct is used to sync d3d12 execution + * + */ +typedef struct AVD3D12VASyncContext { + /** + * D3D12 fence object + */ + ID3D12Fence *fence; + + /** + * A handle to the event object + */ + HANDLE event; + + /** + * The fence value used for sync + */ + uint64_t fence_value; +} AVD3D12VASyncContext; + +/** + * @brief D3D12 frame descriptor for pool allocation. + * + */ +typedef struct AVD3D12FrameDescriptor { + /** + * The texture in which the frame is located. The reference count is + * managed by the AVBufferRef, and destroying the reference will release + * the interface. + * + * Normally stored in AVFrame.data[0]. + */ + ID3D12Resource *texture; + + /** + * The index into the array texture element representing the frame + * + * Normally stored in AVFrame.data[1] (cast from intptr_t). + */ + intptr_t index; + + /** + * The sync context for the texture + * + * Use av_d3d12va_wait_idle(sync_ctx) to ensure that decoding or encoding has finished + * @see https://learn.microsoft.com/en-us/windows/win32/medfound/direct3d-12-video-overview#directx-12-fences + * + * Normally stored in AVFrame.data[2]. + */ + AVD3D12VASyncContext *sync_ctx; +} AVD3D12FrameDescriptor; + +/** + * @brief This struct is allocated as AVHWFramesContext.hwctx + * + */ +typedef struct AVD3D12VAFramesContext { + /** + * The same implementation as d3d11va + * This field cannot be user-allocated at present. 
+ */ + AVD3D12FrameDescriptor *texture_infos; +} AVD3D12VAFramesContext; + +/** + * @brief Map a software pixel format to the corresponding D3D12 (DXGI) format + * + * @return The matching DXGI format + */ +DXGI_FORMAT av_d3d12va_map_sw_to_hw_format(enum AVPixelFormat pix_fmt); + +/** + * @brief Allocate an AVD3D12VASyncContext + * + * @return Error code (ret < 0 if failed) + */ +int av_d3d12va_sync_context_alloc(AVD3D12VADeviceContext *ctx, AVD3D12VASyncContext **sync_ctx); + +/** + * @brief Free an AVD3D12VASyncContext + */ +void av_d3d12va_sync_context_free(AVD3D12VASyncContext **sync_ctx); + +/** + * @brief Wait for the sync context to reach the idle state + * + * @return Error code (ret < 0 if failed) + */ +int av_d3d12va_wait_idle(AVD3D12VASyncContext *sync_ctx); + +/** + * @brief Wait for the specified command queue to reach the idle state + * + * @return Error code (ret < 0 if failed) + */ +int av_d3d12va_wait_queue_idle(AVD3D12VASyncContext *sync_ctx, ID3D12CommandQueue *command_queue); + +#endif /* AVUTIL_HWCONTEXT_D3D12VA_H */ diff --git a/libavutil/hwcontext_d3d12va_internal.h b/libavutil/hwcontext_d3d12va_internal.h new file mode 100644 index 0000000000..e118a579aa --- /dev/null +++ b/libavutil/hwcontext_d3d12va_internal.h @@ -0,0 +1,63 @@ +/* + * Direct3D 12 HW acceleration. + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_HWCONTEXT_D3D12VA_INTERNAL_H +#define AVUTIL_HWCONTEXT_D3D12VA_INTERNAL_H + +/** + * @def COBJMACROS + * + * @brief Enable C style interface for D3D12 + */ +#ifndef COBJMACROS +#define COBJMACROS +#endif + +/** + * @def DX_CHECK + * + * @brief A check macro frequently used with D3D12 function calls + */ +#define DX_CHECK(hr) if (FAILED(hr)) { \ + goto fail; \ +} + +/** + * @def D3D12_OBJECT_RELEASE + * + * @brief A release macro frequently used for D3D12 objects + */ +#define D3D12_OBJECT_RELEASE(pInterface) if (pInterface) { \ + IUnknown_Release((IUnknown *)pInterface); \ + pInterface = NULL; \ +} + +/** + * @def D3D12VA_MAX_SURFACES + * + * @brief Maximum number of surfaces + * The reference processing over decoding will be corrupted on some drivers + * if the max number of reference frames exceeds this. 
+ */ +#define D3D12VA_MAX_SURFACES 32 + +#endif /* AVUTIL_HWCONTEXT_D3D12VA_INTERNAL_H */ \ No newline at end of file diff --git a/libavutil/hwcontext_internal.h b/libavutil/hwcontext_internal.h index e6266494ac..4df516ee6a 100644 --- a/libavutil/hwcontext_internal.h +++ b/libavutil/hwcontext_internal.h @@ -165,6 +165,7 @@ int ff_hwframe_map_replace(AVFrame *dst, const AVFrame *src); extern const HWContextType ff_hwcontext_type_cuda; extern const HWContextType ff_hwcontext_type_d3d11va; +extern const HWContextType ff_hwcontext_type_d3d12va; extern const HWContextType ff_hwcontext_type_drm; extern const HWContextType ff_hwcontext_type_dxva2; extern const HWContextType ff_hwcontext_type_opencl; diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c index e1e0dd2a9e..09fca5ef19 100644 --- a/libavutil/pixdesc.c +++ b/libavutil/pixdesc.c @@ -2283,6 +2283,10 @@ static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = { .name = "d3d11", .flags = AV_PIX_FMT_FLAG_HWACCEL, }, + [AV_PIX_FMT_D3D12] = { + .name = "d3d12", + .flags = AV_PIX_FMT_FLAG_HWACCEL, + }, [AV_PIX_FMT_GBRPF32BE] = { .name = "gbrpf32be", .nb_components = 3, diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h index 63e07ba64f..1f0ab2a23c 100644 --- a/libavutil/pixfmt.h +++ b/libavutil/pixfmt.h @@ -426,6 +426,15 @@ enum AVPixelFormat { AV_PIX_FMT_P412BE, ///< interleaved chroma YUV 4:4:4, 36bpp, data in the high bits, big-endian AV_PIX_FMT_P412LE, ///< interleaved chroma YUV 4:4:4, 36bpp, data in the high bits, little-endian + /** + * Hardware surfaces for Direct3D 12. + * + * data[0] contains a ID3D12Resource pointer. + * data[1] contains the resource array index of the frame as intptr_t. 
+ * data[2] contains the sync context for current resource + */ + AV_PIX_FMT_D3D12, + AV_PIX_FMT_NB ///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions }; diff --git a/libavutil/tests/hwdevice.c b/libavutil/tests/hwdevice.c index c57586613a..9d7964f9ee 100644 --- a/libavutil/tests/hwdevice.c +++ b/libavutil/tests/hwdevice.c @@ -137,6 +137,8 @@ static const struct { { "0", "1", "2" } }, { AV_HWDEVICE_TYPE_D3D11VA, { "0", "1", "2" } }, + { AV_HWDEVICE_TYPE_D3D12VA, + { "0", "1", "2" } }, { AV_HWDEVICE_TYPE_OPENCL, { "0.0", "0.1", "1.0", "1.1" } }, { AV_HWDEVICE_TYPE_VAAPI, 
From patchwork Fri Jun 2 08:06:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41950 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:06:54 +0800 Message-Id: <20230602080701.1754-2-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> Subject: [FFmpeg-devel] [PATCH v3 2/9] avcodec: add D3D12VA hardware accelerated H264 decoding Cc: Tong Wu, Wu Jianhua 
From: Wu Jianhua The implementation is based on: https://learn.microsoft.com/en-us/windows/win32/medfound/direct3d-12-video-overview With the Direct3D 12 video decoding support, we 
can render or process the decoded images by the pixel shaders or compute shaders directly without the extra copy overhead, which is beneficial especially if you are trying to render or post-process a 4K or 8K video. The command below is how to enable d3d12va: ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4 Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- configure | 2 + libavcodec/Makefile | 3 + libavcodec/d3d11va.h | 3 - libavcodec/d3d12va.c | 552 ++++++++++++++++++++++++++++++++++++ libavcodec/d3d12va.h | 184 ++++++++++++ libavcodec/d3d12va_h264.c | 210 ++++++++++++++ libavcodec/dxva2.c | 24 ++ libavcodec/dxva2.h | 3 - libavcodec/dxva2_h264.c | 12 +- libavcodec/dxva2_internal.h | 67 +++-- libavcodec/h264_slice.c | 4 + libavcodec/h264dec.c | 3 + libavcodec/hwaccels.h | 1 + libavcodec/hwconfig.h | 2 + 14 files changed, 1028 insertions(+), 42 deletions(-) create mode 100644 libavcodec/d3d12va.c create mode 100644 libavcodec/d3d12va.h create mode 100644 libavcodec/d3d12va_h264.c diff --git a/configure b/configure index b86064e36f..f5dad4653f 100755 --- a/configure +++ b/configure @@ -3033,6 +3033,8 @@ h264_d3d11va_hwaccel_deps="d3d11va" h264_d3d11va_hwaccel_select="h264_decoder" h264_d3d11va2_hwaccel_deps="d3d11va" h264_d3d11va2_hwaccel_select="h264_decoder" +h264_d3d12va_hwaccel_deps="d3d12va" +h264_d3d12va_hwaccel_select="h264_decoder" h264_dxva2_hwaccel_deps="dxva2" h264_dxva2_hwaccel_select="h264_decoder" h264_nvdec_hwaccel_deps="nvdec" diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 9aacc1d477..ae143d8821 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -977,6 +977,7 @@ OBJS-$(CONFIG_ADPCM_ZORK_DECODER) += adpcm.o adpcm_data.o # hardware accelerators OBJS-$(CONFIG_D3D11VA) += dxva2.o +OBJS-$(CONFIG_D3D12VA) += dxva2.o d3d12va.o OBJS-$(CONFIG_DXVA2) += dxva2.o OBJS-$(CONFIG_NVDEC) += nvdec.o OBJS-$(CONFIG_VAAPI) += vaapi_decode.o @@ -994,6 +995,7 @@ OBJS-$(CONFIG_H263_VAAPI_HWACCEL) += vaapi_mpeg4.o 
OBJS-$(CONFIG_H263_VIDEOTOOLBOX_HWACCEL) += videotoolbox.o OBJS-$(CONFIG_H264_D3D11VA_HWACCEL) += dxva2_h264.o OBJS-$(CONFIG_H264_DXVA2_HWACCEL) += dxva2_h264.o +OBJS-$(CONFIG_H264_D3D12VA_HWACCEL) += dxva2_h264.o d3d12va_h264.o OBJS-$(CONFIG_H264_NVDEC_HWACCEL) += nvdec_h264.o OBJS-$(CONFIG_H264_QSV_HWACCEL) += qsvdec.o OBJS-$(CONFIG_H264_VAAPI_HWACCEL) += vaapi_h264.o @@ -1277,6 +1279,7 @@ SKIPHEADERS += %_tablegen.h \ SKIPHEADERS-$(CONFIG_AMF) += amfenc.h SKIPHEADERS-$(CONFIG_D3D11VA) += d3d11va.h dxva2_internal.h +SKIPHEADERS-$(CONFIG_D3D12VA) += d3d12va.h SKIPHEADERS-$(CONFIG_DXVA2) += dxva2.h dxva2_internal.h SKIPHEADERS-$(CONFIG_JNI) += ffjni.h SKIPHEADERS-$(CONFIG_LCMS2) += fflcms2.h diff --git a/libavcodec/d3d11va.h b/libavcodec/d3d11va.h index 6816b6c1e6..27f40e5519 100644 --- a/libavcodec/d3d11va.h +++ b/libavcodec/d3d11va.h @@ -45,9 +45,6 @@ * @{ */ -#define FF_DXVA2_WORKAROUND_SCALING_LIST_ZIGZAG 1 ///< Work around for Direct3D11 and old UVD/UVD+ ATI video cards -#define FF_DXVA2_WORKAROUND_INTEL_CLEARVIDEO 2 ///< Work around for Direct3D11 and old Intel GPUs with ClearVideo interface - /** * This structure is used to provides the necessary configurations and data * to the Direct3D11 FFmpeg HWAccel implementation. diff --git a/libavcodec/d3d12va.c b/libavcodec/d3d12va.c new file mode 100644 index 0000000000..7f1fab7251 --- /dev/null +++ b/libavcodec/d3d12va.c @@ -0,0 +1,552 @@ +/* + * Direct3D 12 HW acceleration video decoder + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include +#include + +#include "libavutil/common.h" +#include "libavutil/log.h" +#include "libavutil/time.h" +#include "libavutil/imgutils.h" +#include "libavutil/hwcontext_d3d12va_internal.h" +#include "libavutil/hwcontext_d3d12va.h" +#include "avcodec.h" +#include "decode.h" +#include "d3d12va.h" + +typedef struct CommandAllocator { + ID3D12CommandAllocator *command_allocator; + uint64_t fence_value; +} CommandAllocator; + +int ff_d3d12va_get_suitable_max_bitstream_size(AVCodecContext *avctx) +{ + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + return av_image_get_buffer_size(frames_ctx->sw_format, avctx->coded_width, avctx->coded_height, 1); +} + +static int d3d12va_get_valid_command_allocator(AVCodecContext *avctx, ID3D12CommandAllocator **ppAllocator) +{ + HRESULT hr; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + CommandAllocator allocator; + + if (av_fifo_peek(ctx->allocator_queue, &allocator, 1, 0) >= 0) { + uint64_t completion = ID3D12Fence_GetCompletedValue(ctx->sync_ctx->fence); + if (completion >= allocator.fence_value) { + *ppAllocator = allocator.command_allocator; + av_fifo_read(ctx->allocator_queue, &allocator, 1); + return 0; + } + } + + hr = ID3D12Device_CreateCommandAllocator(ctx->device_ctx->device, D3D12_COMMAND_LIST_TYPE_VIDEO_DECODE, + &IID_ID3D12CommandAllocator, ppAllocator); + if (FAILED(hr)) { + av_log(avctx, AV_LOG_ERROR, "Failed to create a new command allocator!\n"); + return AVERROR(EINVAL); + } + + return 0; +} + +static int d3d12va_discard_command_allocator(AVCodecContext *avctx, ID3D12CommandAllocator *pAllocator, uint64_t fence_value) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + + 
CommandAllocator allocator = { + .command_allocator = pAllocator, + .fence_value = fence_value + }; + + if (av_fifo_write(ctx->allocator_queue, &allocator, 1) < 0) { + D3D12_OBJECT_RELEASE(pAllocator); + return AVERROR(ENOMEM); + } + + return 0; +} + +static void bufref_free_interface(void *opaque, uint8_t *data) +{ + D3D12_OBJECT_RELEASE(opaque); +} + +static AVBufferRef *bufref_wrap_interface(IUnknown *iface) +{ + return av_buffer_create((uint8_t*)iface, 1, bufref_free_interface, iface, 0); +} + +static int d3d12va_create_buffer(AVCodecContext *avctx, UINT size, ID3D12Resource **ppResouce) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + + D3D12_HEAP_PROPERTIES heap_props = { .Type = D3D12_HEAP_TYPE_UPLOAD }; + + D3D12_RESOURCE_DESC desc = { + .Dimension = D3D12_RESOURCE_DIMENSION_BUFFER, + .Alignment = D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT, + .Width = size, + .Height = 1, + .DepthOrArraySize = 1, + .MipLevels = 1, + .Format = DXGI_FORMAT_UNKNOWN, + .SampleDesc = { .Count = 1, .Quality = 0 }, + .Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR, + .Flags = D3D12_RESOURCE_FLAG_NONE, + }; + + HRESULT hr = ID3D12Device_CreateCommittedResource(ctx->device_ctx->device, &heap_props, D3D12_HEAP_FLAG_NONE, + &desc, D3D12_RESOURCE_STATE_GENERIC_READ, NULL, &IID_ID3D12Resource, ppResouce); + + if (FAILED(hr)) { + av_log(avctx, AV_LOG_ERROR, "Failed to create d3d12 buffer.\n"); + return AVERROR(EINVAL); + } + + return 0; +} + +static int d3d12va_wait_for_gpu(AVCodecContext *avctx) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVD3D12VASyncContext *sync_ctx = ctx->sync_ctx; + + return av_d3d12va_wait_queue_idle(sync_ctx, ctx->command_queue); +} + +static int d3d12va_create_decoder_heap(AVCodecContext *avctx) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VADeviceContext *hwctx = ctx->device_ctx; + + D3D12_VIDEO_DECODER_HEAP_DESC desc = { + .NodeMask = 0, 
+ .Configuration = ctx->cfg, + .DecodeWidth = frames_ctx->width, + .DecodeHeight = frames_ctx->height, + .Format = av_d3d12va_map_sw_to_hw_format(frames_ctx->sw_format), + .FrameRate = { avctx->framerate.num, avctx->framerate.den }, + .BitRate = avctx->bit_rate, + .MaxDecodePictureBufferCount = frames_ctx->initial_pool_size, + }; + + DX_CHECK(ID3D12VideoDevice_CreateVideoDecoderHeap(hwctx->video_device, &desc, + &IID_ID3D12VideoDecoderHeap, &ctx->decoder_heap)); + + return 0; + +fail: + if (ctx->decoder) { + av_log(avctx, AV_LOG_ERROR, "D3D12 doesn't support decoding frames with an extent " + "[width(%d), height(%d)] on your device!\n", frames_ctx->width, frames_ctx->height); + } + + return AVERROR(EINVAL); +} + +static int d3d12va_create_decoder(AVCodecContext *avctx) +{ + D3D12_VIDEO_DECODER_DESC desc; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VADeviceContext *hwctx = ctx->device_ctx; + + D3D12_FEATURE_DATA_VIDEO_DECODE_SUPPORT feature = { + .NodeIndex = 0, + .Configuration = ctx->cfg, + .Width = frames_ctx->width, + .Height = frames_ctx->height, + .DecodeFormat = av_d3d12va_map_sw_to_hw_format(frames_ctx->sw_format), + .FrameRate = { avctx->framerate.num, avctx->framerate.den }, + .BitRate = avctx->bit_rate, + }; + + DX_CHECK(ID3D12VideoDevice_CheckFeatureSupport(hwctx->video_device, D3D12_FEATURE_VIDEO_DECODE_SUPPORT, &feature, sizeof(feature))); + if (!(feature.SupportFlags & D3D12_VIDEO_DECODE_SUPPORT_FLAG_SUPPORTED) || + !(feature.DecodeTier >= D3D12_VIDEO_DECODE_TIER_2)) { + av_log(avctx, AV_LOG_ERROR, "D3D12 video decoding is not supported on this device\n"); + return AVERROR(EINVAL); + } + + desc = (D3D12_VIDEO_DECODER_DESC) { + .NodeMask = 0, + .Configuration = ctx->cfg, + }; + + DX_CHECK(ID3D12VideoDevice_CreateVideoDecoder(hwctx->video_device, &desc, &IID_ID3D12VideoDecoder, &ctx->decoder)); + + ctx->decoder_ref = bufref_wrap_interface((IUnknown *)ctx->decoder); + if 
(!ctx->decoder_ref) + return AVERROR(ENOMEM); + + return 0; + +fail: + return AVERROR(EINVAL); +} + +static inline int d3d12va_get_num_surfaces(enum AVCodecID codec_id) +{ + int num_surfaces = 1; + switch (codec_id) { + case AV_CODEC_ID_H264: + case AV_CODEC_ID_HEVC: + num_surfaces += 16; + break; + + case AV_CODEC_ID_AV1: + num_surfaces += 12; + break; + + case AV_CODEC_ID_VP9: + num_surfaces += 8; + break; + + default: + num_surfaces += 2; + } + + return num_surfaces; +} + +int ff_d3d12va_common_frame_params(AVCodecContext *avctx, AVBufferRef *hw_frames_ctx) +{ + AVHWFramesContext *frames_ctx = (AVHWFramesContext *)hw_frames_ctx->data; + AVHWDeviceContext *device_ctx = frames_ctx->device_ctx; + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + + frames_ctx->format = AV_PIX_FMT_D3D12; + frames_ctx->sw_format = avctx->sw_pix_fmt == AV_PIX_FMT_YUV420P10 ? AV_PIX_FMT_P010 : AV_PIX_FMT_NV12; + frames_ctx->width = avctx->width; + frames_ctx->height = avctx->height; + + frames_ctx->initial_pool_size = d3d12va_get_num_surfaces(avctx->codec_id); + + return 0; +} + +int ff_d3d12va_decode_init(AVCodecContext *avctx) +{ + int ret; + UINT bitstream_size; + AVHWFramesContext *frames_ctx; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + + ID3D12CommandAllocator *command_allocator = NULL; + D3D12_COMMAND_QUEUE_DESC queue_desc = { + .Type = D3D12_COMMAND_LIST_TYPE_VIDEO_DECODE, + .Priority = 0, + .Flags = D3D12_COMMAND_QUEUE_FLAG_NONE, + .NodeMask = 0 + }; + + ctx->pix_fmt = avctx->hwaccel->pix_fmt; + + ret = ff_decode_get_hw_frames_ctx(avctx, AV_HWDEVICE_TYPE_D3D12VA); + if (ret < 0) + return ret; + + frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + ctx->device_ctx = (AVD3D12VADeviceContext *)frames_ctx->device_ctx->hwctx; + + if (frames_ctx->format != ctx->pix_fmt) { + av_log(avctx, AV_LOG_ERROR, "Invalid pixfmt for hwaccel!\n"); + goto fail; + } + + ret = d3d12va_create_decoder(avctx); + if (ret < 0) + goto fail; + + ret = 
d3d12va_create_decoder_heap(avctx); + if (ret < 0) + goto fail; + + ctx->max_num_ref = frames_ctx->initial_pool_size; + + bitstream_size = ff_d3d12va_get_suitable_max_bitstream_size(avctx); + ctx->buffers = av_calloc(ctx->max_num_ref, sizeof(ID3D12Resource *)); + if (!ctx->buffers) + return AVERROR(ENOMEM); + + for (int i = 0; i < ctx->max_num_ref; i++) { + ret = d3d12va_create_buffer(avctx, bitstream_size, &ctx->buffers[i]); + if (ret < 0) + goto fail; + } + + ctx->ref_resources = av_calloc(ctx->max_num_ref, sizeof(ID3D12Resource *)); + if (!ctx->ref_resources) + return AVERROR(ENOMEM); + + ctx->ref_subresources = av_calloc(ctx->max_num_ref, sizeof(UINT)); + if (!ctx->ref_subresources) + return AVERROR(ENOMEM); + + ctx->allocator_queue = av_fifo_alloc2(ctx->max_num_ref, sizeof(CommandAllocator), AV_FIFO_FLAG_AUTO_GROW); + if (!ctx->allocator_queue) + return AVERROR(ENOMEM); + + ret = av_d3d12va_sync_context_alloc(ctx->device_ctx, &ctx->sync_ctx); + if (ret < 0) + goto fail; + + ret = d3d12va_get_valid_command_allocator(avctx, &command_allocator); + if (ret < 0) + goto fail; + + DX_CHECK(ID3D12Device_CreateCommandQueue(ctx->device_ctx->device, &queue_desc, + &IID_ID3D12CommandQueue, &ctx->command_queue)); + + DX_CHECK(ID3D12Device_CreateCommandList(ctx->device_ctx->device, 0, queue_desc.Type, + command_allocator, NULL, &IID_ID3D12CommandList, &ctx->command_list)); + + DX_CHECK(ID3D12VideoDecodeCommandList_Close(ctx->command_list)); + + ID3D12CommandQueue_ExecuteCommandLists(ctx->command_queue, 1, (ID3D12CommandList **)&ctx->command_list); + + d3d12va_wait_for_gpu(avctx); + + d3d12va_discard_command_allocator(avctx, command_allocator, ctx->sync_ctx->fence_value); + + return 0; + +fail: + D3D12_OBJECT_RELEASE(command_allocator); + ff_d3d12va_decode_uninit(avctx); + + return AVERROR(EINVAL); +} + +int ff_d3d12va_decode_uninit(AVCodecContext *avctx) +{ + int i, num_allocator = 0; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + CommandAllocator allocator; + + if 
(ctx->sync_ctx) + 
d3d12va_wait_for_gpu(avctx); + + av_freep(&ctx->ref_resources); + + av_freep(&ctx->ref_subresources); + + for (i = 0; i < ctx->max_num_ref; i++) + D3D12_OBJECT_RELEASE(ctx->buffers[i]); + + av_freep(&ctx->buffers); + + D3D12_OBJECT_RELEASE(ctx->command_list); + + D3D12_OBJECT_RELEASE(ctx->command_queue); + + if (ctx->allocator_queue) { + while (av_fifo_read(ctx->allocator_queue, &allocator, 1) >= 0) { + num_allocator++; + D3D12_OBJECT_RELEASE(allocator.command_allocator); + } + + av_log(avctx, AV_LOG_VERBOSE, "Total number of command allocators reused: %d\n", num_allocator); + } + + av_fifo_freep2(&ctx->allocator_queue); + + av_d3d12va_sync_context_free(&ctx->sync_ctx); + + D3D12_OBJECT_RELEASE(ctx->decoder_heap); + + av_buffer_unref(&ctx->decoder_ref); + + return 0; +} + +static ID3D12Resource *get_surface(const AVFrame *frame) +{ + return (ID3D12Resource *)frame->data[0]; +} + +intptr_t ff_d3d12va_get_surface_index(AVCodecContext *ctx, const AVFrame* frame) +{ + return (intptr_t)frame->data[1]; +} + +static AVD3D12VASyncContext *d3d12va_get_sync_context(const AVFrame *frame) +{ + return (AVD3D12VASyncContext *)frame->data[2]; +} + +static int d3d12va_begin_update_reference_frames(AVCodecContext *avctx, D3D12_RESOURCE_BARRIER *barriers, int index) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + + int num_barrier = 0; + + for (int i = 0; i < ctx->max_num_ref; i++) { + if (ctx->ref_resources[i] && ctx->ref_resources[i] != frames_hwctx->texture_infos[index].texture) { + barriers[num_barrier].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION; + barriers[num_barrier].Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE; + barriers[num_barrier].Transition = (D3D12_RESOURCE_TRANSITION_BARRIER){ + .pResource = ctx->ref_resources[i], + .Subresource = 0, + .StateBefore = D3D12_RESOURCE_STATE_COMMON, + .StateAfter = 
D3D12_RESOURCE_STATE_VIDEO_DECODE_READ, + }; + num_barrier++; + } + } + + return num_barrier; +} + +static void d3d12va_end_update_reference_frames(AVCodecContext *avctx, D3D12_RESOURCE_BARRIER *barriers, int index) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + int num_barrier = 0; + + for (int i = 0; i < ctx->max_num_ref; i++) { + if (ctx->ref_resources[i] && ctx->ref_resources[i] != frames_hwctx->texture_infos[index].texture) { + barriers[num_barrier].Transition.pResource = ctx->ref_resources[i]; + barriers[num_barrier].Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE; + barriers[num_barrier].Transition.StateBefore = D3D12_RESOURCE_STATE_VIDEO_DECODE_READ; + barriers[num_barrier].Transition.StateAfter = D3D12_RESOURCE_STATE_COMMON; + num_barrier++; + } + } +} + +int ff_d3d12va_common_end_frame(AVCodecContext *avctx, AVFrame *frame, + const void *pp, unsigned pp_size, + const void *qm, unsigned qm_size, + int(*update_input_arguments)(AVCodecContext *, D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *, ID3D12Resource *)) +{ + int ret; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + ID3D12CommandAllocator *command_allocator = NULL; + + ID3D12Resource *resource = get_surface(frame); + UINT index = ff_d3d12va_get_surface_index(avctx, frame); + AVD3D12VASyncContext *sync_ctx = d3d12va_get_sync_context(frame); + + ID3D12VideoDecodeCommandList *cmd_list = ctx->command_list; + D3D12_RESOURCE_BARRIER barriers[D3D12VA_MAX_SURFACES] = { 0 }; + + D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS input_args = { + .NumFrameArguments = 2, + .FrameArguments = { + [0] = { + .Type = D3D12_VIDEO_DECODE_ARGUMENT_TYPE_PICTURE_PARAMETERS, + .Size = pp_size, + .pData = (void *)pp, + }, + [1] = { + .Type = 
D3D12_VIDEO_DECODE_ARGUMENT_TYPE_INVERSE_QUANTIZATION_MATRIX, + .Size = qm_size, + .pData = (void *)qm, + }, + }, + .pHeap = ctx->decoder_heap, + }; + + D3D12_VIDEO_DECODE_OUTPUT_STREAM_ARGUMENTS output_args = { + .ConversionArguments = 0, + .OutputSubresource = 0, + .pOutputTexture2D = resource, + }; + + UINT num_barrier = 1; + barriers[0] = (D3D12_RESOURCE_BARRIER) { + .Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION, + .Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE, + .Transition = { + .pResource = resource, + .Subresource = 0, + .StateBefore = D3D12_RESOURCE_STATE_COMMON, + .StateAfter = D3D12_RESOURCE_STATE_VIDEO_DECODE_WRITE, + }, + }; + + memset(ctx->ref_resources, 0, sizeof(ID3D12Resource *) * ctx->max_num_ref); + memset(ctx->ref_subresources, 0, sizeof(UINT) * ctx->max_num_ref); + input_args.ReferenceFrames.NumTexture2Ds = ctx->max_num_ref; + input_args.ReferenceFrames.ppTexture2Ds = ctx->ref_resources; + input_args.ReferenceFrames.pSubresources = ctx->ref_subresources; + + av_d3d12va_wait_idle(sync_ctx); + + if (!qm) + input_args.NumFrameArguments = 1; + + ret = update_input_arguments(avctx, &input_args, ctx->buffers[index]); + if (ret < 0) + return ret; + + ret = d3d12va_get_valid_command_allocator(avctx, &command_allocator); + if (ret < 0) + goto fail; + + DX_CHECK(ID3D12CommandAllocator_Reset(command_allocator)); + + DX_CHECK(ID3D12VideoDecodeCommandList_Reset(cmd_list, command_allocator)); + + num_barrier += d3d12va_begin_update_reference_frames(avctx, &barriers[1], index); + + ID3D12VideoDecodeCommandList_ResourceBarrier(cmd_list, num_barrier, barriers); + + ID3D12VideoDecodeCommandList_DecodeFrame(cmd_list, ctx->decoder, &output_args, &input_args); + + barriers[0].Transition.StateBefore = barriers[0].Transition.StateAfter; + barriers[0].Transition.StateAfter = D3D12_RESOURCE_STATE_COMMON; + d3d12va_end_update_reference_frames(avctx, &barriers[1], index); + + ID3D12VideoDecodeCommandList_ResourceBarrier(cmd_list, num_barrier, barriers); + + 
DX_CHECK(ID3D12VideoDecodeCommandList_Close(cmd_list)); + + ID3D12CommandQueue_ExecuteCommandLists(ctx->command_queue, 1, (ID3D12CommandList **)&ctx->command_list); + + DX_CHECK(ID3D12CommandQueue_Signal(ctx->command_queue, sync_ctx->fence, ++sync_ctx->fence_value)); + + DX_CHECK(ID3D12CommandQueue_Signal(ctx->command_queue, ctx->sync_ctx->fence, ++ctx->sync_ctx->fence_value)); + + ret = d3d12va_discard_command_allocator(avctx, command_allocator, ctx->sync_ctx->fence_value); + if (ret < 0) + return ret; + + if (ctx->device_ctx->sync) { + ret = av_d3d12va_wait_idle(ctx->sync_ctx); + if (ret < 0) + return ret; + } + + return 0; + +fail: + if (command_allocator) + d3d12va_discard_command_allocator(avctx, command_allocator, ctx->sync_ctx->fence_value); + return AVERROR(EINVAL); +} diff --git a/libavcodec/d3d12va.h b/libavcodec/d3d12va.h new file mode 100644 index 0000000000..da3e7b7ab9 --- /dev/null +++ b/libavcodec/d3d12va.h @@ -0,0 +1,184 @@ +/* + * Direct3D 12 HW acceleration video decoder + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVCODEC_D3D12VA_H +#define AVCODEC_D3D12VA_H + +#include "libavutil/fifo.h" +#include "libavutil/hwcontext.h" +#include "libavutil/hwcontext_d3d12va.h" +#include "avcodec.h" +#include "internal.h" + +/** + * @brief This structure is used to provide the necessary configurations and data + * to the FFmpeg Direct3D 12 HWAccel implementation for video decoder. + * + * The application must make it available as AVCodecContext.hwaccel_context. + */ +typedef struct D3D12VADecodeContext { + AVBufferRef *decoder_ref; + + /** + * D3D12 video decoder + */ + ID3D12VideoDecoder *decoder; + + /** + * D3D12 video decoder heap + */ + ID3D12VideoDecoderHeap *decoder_heap; + + /** + * D3D12 configuration used to create the decoder + * + * Specified by decoders + */ + D3D12_VIDEO_DECODE_CONFIGURATION cfg; + + /** + * A cached queue for reusing the D3D12 command allocators + * + * @see https://learn.microsoft.com/en-us/windows/win32/direct3d12/recording-command-lists-and-bundles#id3d12commandallocator + */ + AVFifo *allocator_queue; + + /** + * D3D12 command queue + */ + ID3D12CommandQueue *command_queue; + + /** + * D3D12 video decode command list + */ + ID3D12VideoDecodeCommandList *command_list; + + /** + * The array of buffer resources used to upload compressed bitstream + * + * The buffers.length is the same as D3D12VADecodeContext.max_num_ref + */ + ID3D12Resource **buffers; + + /** + * The array of resources used for reference frames + * + * The ref_resources.length is the same as D3D12VADecodeContext.max_num_ref + */ + ID3D12Resource **ref_resources; + + /** + * The array of subresources used for reference frames + * + * The ref_subresources.length is the same as D3D12VADecodeContext.max_num_ref + */ + UINT *ref_subresources; + + /** + * Maximum
number of reference frames + */ + UINT max_num_ref; + + /** + * The sync context used to sync command queue + */ + AVD3D12VASyncContext *sync_ctx; + + /** + * A pointer to AVD3D12VADeviceContext used to create D3D12 objects + */ + AVD3D12VADeviceContext *device_ctx; + + /** + * Pixel format + */ + enum AVPixelFormat pix_fmt; + + /** + * Private to the FFmpeg AVHWAccel implementation + */ + unsigned report_id; +} D3D12VADecodeContext; + +/** + * @} + */ + +#define D3D12VA_DECODE_CONTEXT(avctx) ((D3D12VADecodeContext *)((avctx)->internal->hwaccel_priv_data)) +#define D3D12VA_FRAMES_CONTEXT(avctx) ((AVHWFramesContext *)(avctx)->hw_frames_ctx->data) + +/** + * @brief Get a suitable maximum bitstream size + * + * Creating and destroying a resource on d3d12 needs sync and reallocation, so use this function + * to help allocate a big enough bitstream buffer to avoid recreating resources when decoding. + * + * @return the suitable size + */ +int ff_d3d12va_get_suitable_max_bitstream_size(AVCodecContext *avctx); + +/** + * @brief init D3D12VADecodeContext + * + * @return Error code (ret < 0 if failed) + */ +int ff_d3d12va_decode_init(AVCodecContext *avctx); + +/** + * @brief uninit D3D12VADecodeContext + * + * @return Error code (ret < 0 if failed) + */ +int ff_d3d12va_decode_uninit(AVCodecContext *avctx); + +/** + * @brief d3d12va common frame params + * + * @return Error code (ret < 0 if failed) + */ +int ff_d3d12va_common_frame_params(AVCodecContext *avctx, AVBufferRef *hw_frames_ctx); + +/** + * @brief d3d12va common end frame + * + * @param avctx codec context + * @param frame current output frame + * @param pp picture parameters + * @param pp_size the size of the picture parameters + * @param qm quantization matrix + * @param qm_size the size of the quantization matrix + * @param callback update decoder-specified input stream arguments + * @return Error code (ret < 0 if failed) + */ +int ff_d3d12va_common_end_frame(AVCodecContext *avctx, AVFrame *frame, + const void 
*pp, unsigned pp_size, + const void *qm, unsigned qm_size, + int(*)(AVCodecContext *, D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *, ID3D12Resource *)); + +/** + * @brief get surface index + * + * @return index + */ +intptr_t ff_d3d12va_get_surface_index(AVCodecContext *avctx, const AVFrame *frame); + +#endif /* AVCODEC_D3D12VA_H */ diff --git a/libavcodec/d3d12va_h264.c b/libavcodec/d3d12va_h264.c new file mode 100644 index 0000000000..0810a034b4 --- /dev/null +++ b/libavcodec/d3d12va_h264.c @@ -0,0 +1,210 @@ +/* + * Direct3D 12 h264 HW acceleration + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" +#include "libavutil/avassert.h" +#include "h264dec.h" +#include "h264data.h" +#include "h264_ps.h" +#include "mpegutils.h" +#include "dxva2_internal.h" +#include "d3d12va.h" +#include "libavutil/hwcontext_d3d12va_internal.h" +#include + +typedef struct H264DecodePictureContext { + DXVA_PicParams_H264 pp; + DXVA_Qmatrix_H264 qm; + unsigned slice_count; + DXVA_Slice_H264_Short slice_short[MAX_SLICES]; + const uint8_t *bitstream; + unsigned bitstream_size; +} H264DecodePictureContext; + +static void fill_slice_short(DXVA_Slice_H264_Short *slice, + unsigned position, unsigned size) +{ + memset(slice, 0, sizeof(*slice)); + slice->BSNALunitDataLocation = position; + slice->SliceBytesInBuffer = size; + slice->wBadSliceChopping = 0; +} + +static int d3d12va_h264_start_frame(AVCodecContext *avctx, + av_unused const uint8_t *buffer, + av_unused uint32_t size) +{ + const H264Context *h = avctx->priv_data; + H264DecodePictureContext *ctx_pic = h->cur_pic_ptr->hwaccel_picture_private; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + + if (!ctx) + return -1; + + assert(ctx_pic); + + ff_dxva2_h264_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->pp); + + ff_dxva2_h264_fill_scaling_lists(avctx, (AVDXVAContext *)ctx, &ctx_pic->qm); + + ctx_pic->slice_count = 0; + ctx_pic->bitstream_size = 0; + ctx_pic->bitstream = NULL; + + return 0; +} + +static int d3d12va_h264_decode_slice(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size) +{ + unsigned position; + const H264Context *h = avctx->priv_data; + const H264SliceContext *sl = &h->slice_ctx[0]; + const H264Picture *current_picture = h->cur_pic_ptr; + H264DecodePictureContext *ctx_pic = current_picture->hwaccel_picture_private; + + if 
(ctx_pic->slice_count >= MAX_SLICES) + return AVERROR(ERANGE); + + if (!ctx_pic->bitstream) + ctx_pic->bitstream = buffer; + ctx_pic->bitstream_size += size; + + position = buffer - ctx_pic->bitstream; + fill_slice_short(&ctx_pic->slice_short[ctx_pic->slice_count], position, size); + ctx_pic->slice_count++; + + if (sl->slice_type != AV_PICTURE_TYPE_I && sl->slice_type != AV_PICTURE_TYPE_SI) + ctx_pic->pp.wBitFields &= ~(1 << 15); /* Set IntraPicFlag to 0 */ + + return 0; +} + +#define START_CODE 65536 +#define START_CODE_SIZE 3 +static int update_input_arguments(AVCodecContext *avctx, D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *input_args, ID3D12Resource *buffer) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + + const H264Context *h = avctx->priv_data; + const H264Picture *current_picture = h->cur_pic_ptr; + H264DecodePictureContext *ctx_pic = current_picture->hwaccel_picture_private; + + int i, index; + uint8_t *mapped_data, *mapped_ptr; + DXVA_Slice_H264_Short *slice; + D3D12_VIDEO_DECODE_FRAME_ARGUMENT *args; + + if (FAILED(ID3D12Resource_Map(buffer, 0, NULL, &mapped_data))) { + av_log(avctx, AV_LOG_ERROR, "Failed to map D3D12 Buffer resource!\n"); + return AVERROR(EINVAL); + } + + mapped_ptr = mapped_data; + for (i = 0; i < ctx_pic->slice_count; i++) { + UINT position, size; + slice = &ctx_pic->slice_short[i]; + + position = slice->BSNALunitDataLocation; + size = slice->SliceBytesInBuffer; + + slice->SliceBytesInBuffer += START_CODE_SIZE; + slice->BSNALunitDataLocation = mapped_ptr - mapped_data; + + *(uint32_t *)mapped_ptr = START_CODE; + mapped_ptr += START_CODE_SIZE; + + memcpy(mapped_ptr, &ctx_pic->bitstream[position], size); + mapped_ptr += size; + } + + ID3D12Resource_Unmap(buffer, 0, NULL); + + input_args->CompressedBitstream = (D3D12_VIDEO_DECODE_COMPRESSED_BITSTREAM){ + .pBuffer = buffer, + .Offset = 0, + .Size = 
mapped_ptr - mapped_data, + }; + + args = &input_args->FrameArguments[input_args->NumFrameArguments++]; + args->Type = D3D12_VIDEO_DECODE_ARGUMENT_TYPE_SLICE_CONTROL; + args->Size = sizeof(DXVA_Slice_H264_Short) * ctx_pic->slice_count; + args->pData = ctx_pic->slice_short; + + index = ctx_pic->pp.CurrPic.Index7Bits; + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + for (i = 0; i < FF_ARRAY_ELEMS(ctx_pic->pp.RefFrameList); i++) { + index = ctx_pic->pp.RefFrameList[i].Index7Bits; + if (index != 0x7f) + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + } + + return 0; +} + +static int d3d12va_h264_end_frame(AVCodecContext *avctx) +{ + H264Context *h = avctx->priv_data; + H264DecodePictureContext *ctx_pic = h->cur_pic_ptr->hwaccel_picture_private; + H264SliceContext *sl = &h->slice_ctx[0]; + + int ret; + + if (ctx_pic->slice_count <= 0 || ctx_pic->bitstream_size <= 0) + return -1; + + ret = ff_d3d12va_common_end_frame(avctx, h->cur_pic_ptr->f, + &ctx_pic->pp, sizeof(ctx_pic->pp), + &ctx_pic->qm, sizeof(ctx_pic->qm), + update_input_arguments); + if (!ret) + ff_h264_draw_horiz_band(h, sl, 0, h->avctx->height); + + return ret; +} + +static int d3d12va_h264_decode_init(AVCodecContext *avctx) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_H264; + + return ff_d3d12va_decode_init(avctx); +} + +#if CONFIG_H264_D3D12VA_HWACCEL +const AVHWAccel ff_h264_d3d12va_hwaccel = { + .name = "h264_d3d12va", + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_H264, + .pix_fmt = AV_PIX_FMT_D3D12, + .init = d3d12va_h264_decode_init, + .uninit = ff_d3d12va_decode_uninit, + .start_frame = d3d12va_h264_start_frame, + .decode_slice = d3d12va_h264_decode_slice, + .end_frame = d3d12va_h264_end_frame, + .frame_params = ff_d3d12va_common_frame_params, + .frame_priv_data_size = sizeof(H264DecodePictureContext), + .priv_data_size = sizeof(D3D12VADecodeContext), +}; +#endif diff 
--git a/libavcodec/dxva2.c b/libavcodec/dxva2.c index 568d686f39..b22ea3e8f2 100644 --- a/libavcodec/dxva2.c +++ b/libavcodec/dxva2.c @@ -774,6 +774,10 @@ unsigned ff_dxva2_get_surface_index(const AVCodecContext *avctx, void *surface = get_surface(avctx, frame); unsigned i; +#if CONFIG_D3D12VA + if (avctx->pix_fmt == AV_PIX_FMT_D3D12) + return (intptr_t)frame->data[1]; +#endif #if CONFIG_D3D11VA if (avctx->pix_fmt == AV_PIX_FMT_D3D11) return (intptr_t)frame->data[1]; @@ -1056,3 +1060,23 @@ int ff_dxva2_is_d3d11(const AVCodecContext *avctx) else return 0; } + +unsigned *ff_dxva2_get_report_id(const AVCodecContext *avctx, AVDXVAContext *ctx) +{ + unsigned *report_id = NULL; + +#if CONFIG_D3D12VA + if (avctx->pix_fmt == AV_PIX_FMT_D3D12) + report_id = &ctx->d3d12va.report_id; +#endif +#if CONFIG_D3D11VA + if (ff_dxva2_is_d3d11(avctx)) + report_id = &ctx->d3d11va.report_id; +#endif +#if CONFIG_DXVA2 + if (avctx->pix_fmt == AV_PIX_FMT_DXVA2_VLD) + report_id = &ctx->dxva2.report_id; +#endif + + return report_id; +} diff --git a/libavcodec/dxva2.h b/libavcodec/dxva2.h index 22c93992f2..bdec6112e9 100644 --- a/libavcodec/dxva2.h +++ b/libavcodec/dxva2.h @@ -45,9 +45,6 @@ * @{ */ -#define FF_DXVA2_WORKAROUND_SCALING_LIST_ZIGZAG 1 ///< Work around for DXVA2 and old UVD/UVD+ ATI video cards -#define FF_DXVA2_WORKAROUND_INTEL_CLEARVIDEO 2 ///< Work around for DXVA2 and old Intel GPUs with ClearVideo interface - /** * This structure is used to provides the necessary configurations and data * to the DXVA2 FFmpeg HWAccel implementation. 
diff --git a/libavcodec/dxva2_h264.c b/libavcodec/dxva2_h264.c index 6300b1418d..7a076ea981 100644 --- a/libavcodec/dxva2_h264.c +++ b/libavcodec/dxva2_h264.c @@ -47,9 +47,10 @@ static void fill_picture_entry(DXVA_PicEntry_H264 *pic, pic->bPicEntry = index | (flag << 7); } -static void fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, const H264Context *h, +void ff_dxva2_h264_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_H264 *pp) { + const H264Context *h = avctx->priv_data; const H264Picture *current_picture = h->cur_pic_ptr; const SPS *sps = h->ps.sps; const PPS *pps = h->ps.pps; @@ -163,9 +164,10 @@ static void fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext * //pp->SliceGroupMap[810]; /* XXX not implemented by FFmpeg */ } -static void fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext *ctx, const H264Context *h, DXVA_Qmatrix_H264 *qm) +void ff_dxva2_h264_fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_Qmatrix_H264 *qm) { - const PPS *pps = h->ps.pps; + const H264Context *h = avctx->priv_data; + const PPS *pps = h->ps.pps; unsigned i, j; memset(qm, 0, sizeof(*qm)); if (DXVA_CONTEXT_WORKAROUND(avctx, ctx) & FF_DXVA2_WORKAROUND_SCALING_LIST_ZIGZAG) { @@ -453,10 +455,10 @@ static int dxva2_h264_start_frame(AVCodecContext *avctx, assert(ctx_pic); /* Fill up DXVA_PicParams_H264 */ - fill_picture_parameters(avctx, ctx, h, &ctx_pic->pp); + ff_dxva2_h264_fill_picture_parameters(avctx, ctx, &ctx_pic->pp); /* Fill up DXVA_Qmatrix_H264 */ - fill_scaling_lists(avctx, ctx, h, &ctx_pic->qm); + ff_dxva2_h264_fill_scaling_lists(avctx, ctx, &ctx_pic->qm); ctx_pic->slice_count = 0; ctx_pic->bitstream_size = 0; diff --git a/libavcodec/dxva2_internal.h b/libavcodec/dxva2_internal.h index b822af59cd..a9a1fc090e 100644 --- a/libavcodec/dxva2_internal.h +++ b/libavcodec/dxva2_internal.h @@ -26,18 +26,34 @@ #define COBJMACROS #include "config.h" +#include 
"config_components.h" /* define the proper COM entries before forcing desktop APIs */ #include +#define FF_DXVA2_WORKAROUND_SCALING_LIST_ZIGZAG 1 ///< Work around for DXVA2/Direct3D11 and old UVD/UVD+ ATI video cards +#define FF_DXVA2_WORKAROUND_INTEL_CLEARVIDEO 2 ///< Work around for DXVA2/Direct3D11 and old Intel GPUs with ClearVideo interface + #if CONFIG_DXVA2 #include "dxva2.h" #include "libavutil/hwcontext_dxva2.h" +#define DXVA2_VAR(ctx, var) ctx->dxva2.var +#else +#define DXVA2_VAR(ctx, var) 0 #endif + #if CONFIG_D3D11VA #include "d3d11va.h" #include "libavutil/hwcontext_d3d11va.h" +#define D3D11VA_VAR(ctx, var) ctx->d3d11va.var +#else +#define D3D11VA_VAR(ctx, var) 0 +#endif + +#if CONFIG_D3D12VA +#include "d3d12va.h" #endif + #if HAVE_DXVA_H /* When targeting WINAPI_FAMILY_PHONE_APP or WINAPI_FAMILY_APP, dxva.h * defines nothing. Force the struct definitions to be visible. */ @@ -62,6 +78,9 @@ typedef union { #if CONFIG_DXVA2 struct dxva_context dxva2; #endif +#if CONFIG_D3D12VA + struct D3D12VADecodeContext d3d12va; +#endif } AVDXVAContext; typedef struct FFDXVASharedContext { @@ -101,39 +120,19 @@ typedef struct FFDXVASharedContext { #define D3D11VA_CONTEXT(ctx) (&ctx->d3d11va) #define DXVA2_CONTEXT(ctx) (&ctx->dxva2) -#if CONFIG_D3D11VA && CONFIG_DXVA2 -#define DXVA_CONTEXT_WORKAROUND(avctx, ctx) (ff_dxva2_is_d3d11(avctx) ? ctx->d3d11va.workaround : ctx->dxva2.workaround) -#define DXVA_CONTEXT_COUNT(avctx, ctx) (ff_dxva2_is_d3d11(avctx) ? ctx->d3d11va.surface_count : ctx->dxva2.surface_count) -#define DXVA_CONTEXT_DECODER(avctx, ctx) (ff_dxva2_is_d3d11(avctx) ? (void *)ctx->d3d11va.decoder : (void *)ctx->dxva2.decoder) -#define DXVA_CONTEXT_REPORT_ID(avctx, ctx) (*(ff_dxva2_is_d3d11(avctx) ? &ctx->d3d11va.report_id : &ctx->dxva2.report_id)) -#define DXVA_CONTEXT_CFG(avctx, ctx) (ff_dxva2_is_d3d11(avctx) ? (void *)ctx->d3d11va.cfg : (void *)ctx->dxva2.cfg) -#define DXVA_CONTEXT_CFG_BITSTREAM(avctx, ctx) (ff_dxva2_is_d3d11(avctx) ? 
ctx->d3d11va.cfg->ConfigBitstreamRaw : ctx->dxva2.cfg->ConfigBitstreamRaw) -#define DXVA_CONTEXT_CFG_INTRARESID(avctx, ctx) (ff_dxva2_is_d3d11(avctx) ? ctx->d3d11va.cfg->ConfigIntraResidUnsigned : ctx->dxva2.cfg->ConfigIntraResidUnsigned) -#define DXVA_CONTEXT_CFG_RESIDACCEL(avctx, ctx) (ff_dxva2_is_d3d11(avctx) ? ctx->d3d11va.cfg->ConfigResidDiffAccelerator : ctx->dxva2.cfg->ConfigResidDiffAccelerator) +#define DXVA2_CONTEXT_VAR(avctx, ctx, var) (avctx->pix_fmt == AV_PIX_FMT_D3D12 ? 0 : (ff_dxva2_is_d3d11(avctx) ? D3D11VA_VAR(ctx, var) : DXVA2_VAR(ctx, var))) + +#define DXVA_CONTEXT_REPORT_ID(avctx, ctx) (*ff_dxva2_get_report_id(avctx, ctx)) +#define DXVA_CONTEXT_WORKAROUND(avctx, ctx) DXVA2_CONTEXT_VAR(avctx, ctx, workaround) +#define DXVA_CONTEXT_COUNT(avctx, ctx) DXVA2_CONTEXT_VAR(avctx, ctx, surface_count) +#define DXVA_CONTEXT_DECODER(avctx, ctx) (avctx->pix_fmt == AV_PIX_FMT_D3D12 ? 0 : (ff_dxva2_is_d3d11(avctx) ? (void *)D3D11VA_VAR(ctx, decoder) : (void *)DXVA2_VAR(ctx, decoder))) +#define DXVA_CONTEXT_CFG(avctx, ctx) (avctx->pix_fmt == AV_PIX_FMT_D3D12 ? 0 : (ff_dxva2_is_d3d11(avctx) ? 
(void *)D3D11VA_VAR(ctx, cfg) : (void *)DXVA2_VAR(ctx, cfg))) +#define DXVA_CONTEXT_CFG_BITSTREAM(avctx, ctx) DXVA2_CONTEXT_VAR(avctx, ctx, cfg->ConfigBitstreamRaw) +#define DXVA_CONTEXT_CFG_INTRARESID(avctx, ctx) DXVA2_CONTEXT_VAR(avctx, ctx, cfg->ConfigIntraResidUnsigned) +#define DXVA_CONTEXT_CFG_RESIDACCEL(avctx, ctx) DXVA2_CONTEXT_VAR(avctx, ctx, cfg->ConfigResidDiffAccelerator) #define DXVA_CONTEXT_VALID(avctx, ctx) (DXVA_CONTEXT_DECODER(avctx, ctx) && \ DXVA_CONTEXT_CFG(avctx, ctx) && \ - (ff_dxva2_is_d3d11(avctx) || ctx->dxva2.surface_count)) -#elif CONFIG_DXVA2 -#define DXVA_CONTEXT_WORKAROUND(avctx, ctx) (ctx->dxva2.workaround) -#define DXVA_CONTEXT_COUNT(avctx, ctx) (ctx->dxva2.surface_count) -#define DXVA_CONTEXT_DECODER(avctx, ctx) (ctx->dxva2.decoder) -#define DXVA_CONTEXT_REPORT_ID(avctx, ctx) (*(&ctx->dxva2.report_id)) -#define DXVA_CONTEXT_CFG(avctx, ctx) (ctx->dxva2.cfg) -#define DXVA_CONTEXT_CFG_BITSTREAM(avctx, ctx) (ctx->dxva2.cfg->ConfigBitstreamRaw) -#define DXVA_CONTEXT_CFG_INTRARESID(avctx, ctx) (ctx->dxva2.cfg->ConfigIntraResidUnsigned) -#define DXVA_CONTEXT_CFG_RESIDACCEL(avctx, ctx) (ctx->dxva2.cfg->ConfigResidDiffAccelerator) -#define DXVA_CONTEXT_VALID(avctx, ctx) (ctx->dxva2.decoder && ctx->dxva2.cfg && ctx->dxva2.surface_count) -#elif CONFIG_D3D11VA -#define DXVA_CONTEXT_WORKAROUND(avctx, ctx) (ctx->d3d11va.workaround) -#define DXVA_CONTEXT_COUNT(avctx, ctx) (ctx->d3d11va.surface_count) -#define DXVA_CONTEXT_DECODER(avctx, ctx) (ctx->d3d11va.decoder) -#define DXVA_CONTEXT_REPORT_ID(avctx, ctx) (*(&ctx->d3d11va.report_id)) -#define DXVA_CONTEXT_CFG(avctx, ctx) (ctx->d3d11va.cfg) -#define DXVA_CONTEXT_CFG_BITSTREAM(avctx, ctx) (ctx->d3d11va.cfg->ConfigBitstreamRaw) -#define DXVA_CONTEXT_CFG_INTRARESID(avctx, ctx) (ctx->d3d11va.cfg->ConfigIntraResidUnsigned) -#define DXVA_CONTEXT_CFG_RESIDACCEL(avctx, ctx) (ctx->d3d11va.cfg->ConfigResidDiffAccelerator) -#define DXVA_CONTEXT_VALID(avctx, ctx) (ctx->d3d11va.decoder && ctx->d3d11va.cfg) 
-#endif + (ff_dxva2_is_d3d11(avctx) || DXVA2_VAR(ctx, surface_count))) unsigned ff_dxva2_get_surface_index(const AVCodecContext *avctx, const AVDXVAContext *, @@ -161,4 +160,10 @@ int ff_dxva2_common_frame_params(AVCodecContext *avctx, int ff_dxva2_is_d3d11(const AVCodecContext *avctx); +unsigned *ff_dxva2_get_report_id(const AVCodecContext *avctx, AVDXVAContext *ctx); + +void ff_dxva2_h264_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_H264 *pp); + +void ff_dxva2_h264_fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_Qmatrix_H264 *qm); + #endif /* AVCODEC_DXVA2_INTERNAL_H */ diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c index 41bf30eefc..df70ad8a2f 100644 --- a/libavcodec/h264_slice.c +++ b/libavcodec/h264_slice.c @@ -778,6 +778,7 @@ static enum AVPixelFormat get_pixel_format(H264Context *h, int force_callback) { #define HWACCEL_MAX (CONFIG_H264_DXVA2_HWACCEL + \ (CONFIG_H264_D3D11VA_HWACCEL * 2) + \ + CONFIG_H264_D3D12VA_HWACCEL + \ CONFIG_H264_NVDEC_HWACCEL + \ CONFIG_H264_VAAPI_HWACCEL + \ CONFIG_H264_VIDEOTOOLBOX_HWACCEL + \ @@ -883,6 +884,9 @@ static enum AVPixelFormat get_pixel_format(H264Context *h, int force_callback) *fmt++ = AV_PIX_FMT_D3D11VA_VLD; *fmt++ = AV_PIX_FMT_D3D11; #endif +#if CONFIG_H264_D3D12VA_HWACCEL + *fmt++ = AV_PIX_FMT_D3D12; +#endif #if CONFIG_H264_VAAPI_HWACCEL *fmt++ = AV_PIX_FMT_VAAPI; #endif diff --git a/libavcodec/h264dec.c b/libavcodec/h264dec.c index 19f8dba131..853d3262f7 100644 --- a/libavcodec/h264dec.c +++ b/libavcodec/h264dec.c @@ -1089,6 +1089,9 @@ const FFCodec ff_h264_decoder = { #if CONFIG_H264_D3D11VA2_HWACCEL HWACCEL_D3D11VA2(h264), #endif +#if CONFIG_H264_D3D12VA_HWACCEL + HWACCEL_D3D12VA(h264), +#endif #if CONFIG_H264_NVDEC_HWACCEL HWACCEL_NVDEC(h264), #endif diff --git a/libavcodec/hwaccels.h b/libavcodec/hwaccels.h index 48dfc17f72..be54604b81 100644 --- a/libavcodec/hwaccels.h +++ b/libavcodec/hwaccels.h @@ -32,6 +32,7 @@ extern const 
AVHWAccel ff_h263_vaapi_hwaccel; extern const AVHWAccel ff_h263_videotoolbox_hwaccel; extern const AVHWAccel ff_h264_d3d11va_hwaccel; extern const AVHWAccel ff_h264_d3d11va2_hwaccel; +extern const AVHWAccel ff_h264_d3d12va_hwaccel; extern const AVHWAccel ff_h264_dxva2_hwaccel; extern const AVHWAccel ff_h264_nvdec_hwaccel; extern const AVHWAccel ff_h264_vaapi_hwaccel; diff --git a/libavcodec/hwconfig.h b/libavcodec/hwconfig.h index e8c6186151..e20118c096 100644 --- a/libavcodec/hwconfig.h +++ b/libavcodec/hwconfig.h @@ -82,6 +82,8 @@ void ff_hwaccel_uninit(AVCodecContext *avctx); HW_CONFIG_HWACCEL(1, 1, 1, VULKAN, VULKAN, ff_ ## codec ## _vulkan_hwaccel) #define HWACCEL_D3D11VA(codec) \ HW_CONFIG_HWACCEL(0, 0, 1, D3D11VA_VLD, NONE, ff_ ## codec ## _d3d11va_hwaccel) +#define HWACCEL_D3D12VA(codec) \ + HW_CONFIG_HWACCEL(1, 1, 0, D3D12, D3D12VA, ff_ ## codec ## _d3d12va_hwaccel) #define HW_CONFIG_ENCODER(device, frames, ad_hoc, format, device_type_) \ &(const AVCodecHWConfigInternal) { \ From patchwork Fri Jun 2 08:06:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41951 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp1055194pzb; Fri, 2 Jun 2023 01:12:08 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ737qTRG65cNez/q08nSdMA+NnLeT5ChBvtABiUxqhyaidLv9uVyr17GSC5vjjIs9xG/jdo X-Received: by 2002:a17:906:4fc3:b0:974:2169:5f81 with SMTP id i3-20020a1709064fc300b0097421695f81mr11445094ejw.22.1685693527752; Fri, 02 Jun 2023 01:12:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685693527; cv=none; d=google.com; s=arc-20160816; b=GqhRaA7M/jyxHr4rMtVIVziWZBVmj8XIyzs8LNNXWiYQCSMkJEIDuzolq7OlCRmnEC +7iMlqiE+NPpLAFzUpbv93cZ/4sjdcBoIAAcMMghA/+vO2nSBk5stY1p/EP08uvlMHR4 ufWzaFlH72xiU5zlmnQouA6acp7tJ/vH2ArmXWDuA09r/ylZ3bcaW4G6sqrWnNov4EHR ldYBhGuMdKGYBitIRpCerCWAwXGRDZ5ZcriOUp+lNSSJGPSKTH43tnk+yfoIOvOFFzrP 
MsAw/uFCXrKZZv0E4p7eRC1f/BDuctu6N5RaGTSKulp4A3b6dW3nIaCLCW0BChwyss9B j5oQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=ZGwl4QWtZwD8Xj2A6MgI4rH3jUjefIWJOPrp2Tq7t9s=; b=xqAFvMvf4oI5TLEEVa0XsbDTGvM4sQGFWf2EymZV7L225LLSNxMLxzndHNV+BK3IFZ kO73L3PtoA6+W7KrJtei+vole2Zik127fd5gSUoQKg3PKmCxw5SQwouQcSotZDc6Cd03 EyErPwgawT53qFrYWiO0o6aMitSItqbcl+o+2g4Y3zAkNeM3ZiFzxp9GxhSbMYEwLEdW 4TgmmpkhVUHX43EnkTbhBJdtXbZMEyK/GBTepJtTFU+CjOf9fgHoBL0ROi+AFNWvH1qf s0GLMU3NMqH+OaIWEsuBcbAyI84E+jiJMfbKvrapFhi+cOSy9RvKXC7xJk1H7Ihp1keB XsLQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=SWFf1XiY; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id n7-20020a170906700700b0094a86ccb630si446342ejj.893.2023.06.02.01.12.07; Fri, 02 Jun 2023 01:12:07 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=SWFf1XiY; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 93BC968C329; Fri, 2 Jun 2023 11:11:45 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 7300868C31A for ; Fri, 2 Jun 2023 11:11:37 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685693502; x=1717229502; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7xsawuJEMa1kODF9OZQzU4hPSwfBtZgZdtt1ALHRO+o=; b=SWFf1XiYN/tsso8oKvdW+LP54EwKKapdkFhFYE1U+so9j+yyJNvpKQjB /OPYyzU/6fDGqDATJzM8U3CvC0atVItm5/8VqRA2sPoCGgmkJu9IgzgNs kOPpWtRr63Rj5W2wQGSfXJ/L8JdZUaUkiFOGpLuYCM68KIygmgXtVHwKB u1cCJAfxWaeMQQAeyzEWRJApIWSL1viWHwZAo5nhxJHkQEZWdaD2380NE 6/wYmHX+/OYPBr1TkOvVSpUNiXrRZStMoc6PTVLWhYf5pPUpjFySD76sD +gySF6nbXp4yOEF6MaInd0YhYWV/eJz62OPuM3NWI42aeo+1B4YVBzks8 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="421629856" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="421629856" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2023 01:11:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="852060671" 
X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="852060671" Received: from desktop-qn7n0nf.sh.intel.com (HELO localhost.localdomain) ([10.239.160.59]) by fmsmga001.fm.intel.com with ESMTP; 02 Jun 2023 01:11:21 -0700 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:06:55 +0800 Message-Id: <20230602080701.1754-3-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 3/9] avcodec: add D3D12VA hardware accelerated HEVC decoding X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Tong Wu , Wu Jianhua Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 28X8Xpr5+m+8 From: Wu Jianhua The command below is how to enable d3d12va: ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4 Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- configure | 2 + libavcodec/Makefile | 1 + libavcodec/d3d12va_hevc.c | 211 ++++++++++++++++++++++++++++++++++++ libavcodec/dxva2_hevc.c | 10 +- libavcodec/dxva2_internal.h | 4 + libavcodec/hevcdec.c | 10 ++ libavcodec/hwaccels.h | 1 + 7 files changed, 235 insertions(+), 4 deletions(-) create mode 100644 libavcodec/d3d12va_hevc.c diff --git a/configure b/configure index f5dad4653f..3d25e5fdea 100755 --- a/configure +++ b/configure @@ -3051,6 +3051,8 @@ hevc_d3d11va_hwaccel_deps="d3d11va DXVA_PicParams_HEVC" hevc_d3d11va_hwaccel_select="hevc_decoder" hevc_d3d11va2_hwaccel_deps="d3d11va DXVA_PicParams_HEVC" hevc_d3d11va2_hwaccel_select="hevc_decoder" +hevc_d3d12va_hwaccel_deps="d3d12va DXVA_PicParams_HEVC" +hevc_d3d12va_hwaccel_select="hevc_decoder" hevc_dxva2_hwaccel_deps="dxva2 DXVA_PicParams_HEVC" 
hevc_dxva2_hwaccel_select="hevc_decoder" hevc_nvdec_hwaccel_deps="nvdec" diff --git a/libavcodec/Makefile b/libavcodec/Makefile index ae143d8821..6cc28f2fd0 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -1004,6 +1004,7 @@ OBJS-$(CONFIG_H264_VIDEOTOOLBOX_HWACCEL) += videotoolbox.o OBJS-$(CONFIG_H264_VULKAN_HWACCEL) += vulkan_decode.o vulkan_h264.o OBJS-$(CONFIG_HEVC_D3D11VA_HWACCEL) += dxva2_hevc.o OBJS-$(CONFIG_HEVC_DXVA2_HWACCEL) += dxva2_hevc.o +OBJS-$(CONFIG_HEVC_D3D12VA_HWACCEL) += dxva2_hevc.o d3d12va_hevc.o OBJS-$(CONFIG_HEVC_NVDEC_HWACCEL) += nvdec_hevc.o OBJS-$(CONFIG_HEVC_QSV_HWACCEL) += qsvdec.o OBJS-$(CONFIG_HEVC_VAAPI_HWACCEL) += vaapi_hevc.o h265_profile_level.o diff --git a/libavcodec/d3d12va_hevc.c b/libavcodec/d3d12va_hevc.c new file mode 100644 index 0000000000..1d94831e01 --- /dev/null +++ b/libavcodec/d3d12va_hevc.c @@ -0,0 +1,211 @@ +/* + * Direct3D 12 HEVC HW acceleration + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" + +#include "libavutil/avassert.h" +#include "libavutil/hwcontext_d3d12va_internal.h" +#include "hevc_data.h" +#include "hevcdec.h" +#include "dxva2_internal.h" +#include "d3d12va.h" +#include + +#define MAX_SLICES 256 + +typedef struct HEVCDecodePictureContext { + DXVA_PicParams_HEVC pp; + DXVA_Qmatrix_HEVC qm; + unsigned slice_count; + DXVA_Slice_HEVC_Short slice_short[MAX_SLICES]; + const uint8_t *bitstream; + unsigned bitstream_size; +} HEVCDecodePictureContext; + +static void fill_slice_short(DXVA_Slice_HEVC_Short *slice, unsigned position, unsigned size) +{ + memset(slice, 0, sizeof(*slice)); + slice->BSNALunitDataLocation = position; + slice->SliceBytesInBuffer = size; + slice->wBadSliceChopping = 0; +} + +static int d3d12va_hevc_start_frame(AVCodecContext *avctx, av_unused const uint8_t *buffer, av_unused uint32_t size) +{ + const HEVCContext *h = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + HEVCDecodePictureContext *ctx_pic = h->ref->hwaccel_picture_private; + + if (!ctx) + return -1; + + av_assert0(ctx_pic); + + ff_dxva2_hevc_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->pp); + + ff_dxva2_hevc_fill_scaling_lists(avctx, (AVDXVAContext *)ctx, &ctx_pic->qm); + + ctx_pic->slice_count = 0; + ctx_pic->bitstream_size = 0; + ctx_pic->bitstream = NULL; + + return 0; +} + +static int d3d12va_hevc_decode_slice(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size) +{ + const HEVCContext *h = avctx->priv_data; + const HEVCFrame *current_picture = h->ref; + HEVCDecodePictureContext *ctx_pic = current_picture->hwaccel_picture_private; + unsigned position; + + if (ctx_pic->slice_count >= MAX_SLICES) + return AVERROR(ERANGE); + + if (!ctx_pic->bitstream) + 
ctx_pic->bitstream = buffer; + ctx_pic->bitstream_size += size; + + position = buffer - ctx_pic->bitstream; + fill_slice_short(&ctx_pic->slice_short[ctx_pic->slice_count], position, size); + ctx_pic->slice_count++; + + return 0; +} + +#define START_CODE 65536 +#define START_CODE_SIZE 3 +static int update_input_arguments(AVCodecContext *avctx, D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *input_args, ID3D12Resource *buffer) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + + const HEVCContext *h = avctx->priv_data; + const HEVCFrame *current_picture = h->ref; + HEVCDecodePictureContext *ctx_pic = current_picture->hwaccel_picture_private; + + int i, index; + uint8_t *mapped_data, *mapped_ptr; + DXVA_Slice_HEVC_Short *slice; + D3D12_VIDEO_DECODE_FRAME_ARGUMENT *args; + + if (FAILED(ID3D12Resource_Map(buffer, 0, NULL, &mapped_data))) { + av_log(avctx, AV_LOG_ERROR, "Failed to map D3D12 Buffer resource!\n"); + return AVERROR(EINVAL); + } + + mapped_ptr = mapped_data; + for (i = 0; i < ctx_pic->slice_count; i++) { + UINT position, size; + slice = &ctx_pic->slice_short[i]; + + position = slice->BSNALunitDataLocation; + size = slice->SliceBytesInBuffer; + + slice->SliceBytesInBuffer += START_CODE_SIZE; + slice->BSNALunitDataLocation = mapped_ptr - mapped_data; + + *(uint32_t *)mapped_ptr = START_CODE; + mapped_ptr += START_CODE_SIZE; + + memcpy(mapped_ptr, &ctx_pic->bitstream[position], size); + mapped_ptr += size; + } + + ID3D12Resource_Unmap(buffer, 0, NULL); + + input_args->CompressedBitstream = (D3D12_VIDEO_DECODE_COMPRESSED_BITSTREAM){ + .pBuffer = buffer, + .Offset = 0, + .Size = mapped_ptr - mapped_data, + }; + + args = &input_args->FrameArguments[input_args->NumFrameArguments++]; + args->Type = D3D12_VIDEO_DECODE_ARGUMENT_TYPE_SLICE_CONTROL; + args->Size = sizeof(DXVA_Slice_HEVC_Short) * ctx_pic->slice_count; + args->pData = 
ctx_pic->slice_short; + + index = ctx_pic->pp.CurrPic.Index7Bits; + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + for (i = 0; i < FF_ARRAY_ELEMS(ctx_pic->pp.RefPicList); i++) { + index = ctx_pic->pp.RefPicList[i].Index7Bits; + if (index != 0x7f) + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + } + + return 0; +} + +static int d3d12va_hevc_end_frame(AVCodecContext *avctx) +{ + HEVCContext *h = avctx->priv_data; + HEVCDecodePictureContext *ctx_pic = h->ref->hwaccel_picture_private; + + int scale = ctx_pic->pp.dwCodingParamToolFlags & 1; + + if (ctx_pic->slice_count <= 0 || ctx_pic->bitstream_size <= 0) + return -1; + + return ff_d3d12va_common_end_frame(avctx, h->ref->frame, &ctx_pic->pp, sizeof(ctx_pic->pp), + scale ? &ctx_pic->qm : NULL, scale ? sizeof(ctx_pic->qm) : 0, update_input_arguments); +} + +static int d3d12va_hevc_decode_init(AVCodecContext *avctx) +{ + HEVCContext *h = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + + switch (avctx->profile) { + case FF_PROFILE_HEVC_MAIN_10: + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_HEVC_MAIN10; + break; + + case FF_PROFILE_HEVC_MAIN_STILL_PICTURE: + av_log(avctx, AV_LOG_ERROR, "D3D12 doesn't support PROFILE_HEVC_MAIN_STILL_PICTURE!\n"); + return AVERROR(EINVAL); + + case FF_PROFILE_HEVC_MAIN: + default: + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_HEVC_MAIN; + break; + }; + + return ff_d3d12va_decode_init(avctx); +} + +#if CONFIG_HEVC_D3D12VA_HWACCEL +const AVHWAccel ff_hevc_d3d12va_hwaccel = { + .name = "hevc_d3d12va", + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_HEVC, + .pix_fmt = AV_PIX_FMT_D3D12, + .init = d3d12va_hevc_decode_init, + .uninit = ff_d3d12va_decode_uninit, + .start_frame = d3d12va_hevc_start_frame, + .decode_slice = d3d12va_hevc_decode_slice, + .end_frame = d3d12va_hevc_end_frame, + .frame_params = ff_d3d12va_common_frame_params, + .frame_priv_data_size = sizeof(HEVCDecodePictureContext), + 
.priv_data_size = sizeof(D3D12VADecodeContext), +}; +#endif diff --git a/libavcodec/dxva2_hevc.c b/libavcodec/dxva2_hevc.c index 6b239d9917..2b8c0e2a9a 100644 --- a/libavcodec/dxva2_hevc.c +++ b/libavcodec/dxva2_hevc.c @@ -56,9 +56,10 @@ static int get_refpic_index(const DXVA_PicParams_HEVC *pp, int surface_index) return 0xff; } -static void fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, const HEVCContext *h, +void ff_dxva2_hevc_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_HEVC *pp) { + const HEVCContext *h = avctx->priv_data; const HEVCFrame *current_picture = h->ref; const HEVCSPS *sps = h->ps.sps; const HEVCPPS *pps = h->ps.pps; @@ -199,8 +200,9 @@ static void fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext * pp->StatusReportFeedbackNumber = 1 + DXVA_CONTEXT_REPORT_ID(avctx, ctx)++; } -static void fill_scaling_lists(AVDXVAContext *ctx, const HEVCContext *h, DXVA_Qmatrix_HEVC *qm) +void ff_dxva2_hevc_fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_Qmatrix_HEVC *qm) { + const HEVCContext *h = avctx->priv_data; unsigned i, j, pos; const ScalingList *sl = h->ps.pps->scaling_list_data_present_flag ? 
&h->ps.pps->scaling_list : &h->ps.sps->scaling_list; @@ -368,10 +370,10 @@ static int dxva2_hevc_start_frame(AVCodecContext *avctx, av_assert0(ctx_pic); /* Fill up DXVA_PicParams_HEVC */ - fill_picture_parameters(avctx, ctx, h, &ctx_pic->pp); + ff_dxva2_hevc_fill_picture_parameters(avctx, ctx, &ctx_pic->pp); /* Fill up DXVA_Qmatrix_HEVC */ - fill_scaling_lists(ctx, h, &ctx_pic->qm); + ff_dxva2_hevc_fill_scaling_lists(avctx, ctx, &ctx_pic->qm); ctx_pic->slice_count = 0; ctx_pic->bitstream_size = 0; diff --git a/libavcodec/dxva2_internal.h b/libavcodec/dxva2_internal.h index a9a1fc090e..08847aef22 100644 --- a/libavcodec/dxva2_internal.h +++ b/libavcodec/dxva2_internal.h @@ -166,4 +166,8 @@ void ff_dxva2_h264_fill_picture_parameters(const AVCodecContext *avctx, AVDXVACo void ff_dxva2_h264_fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_Qmatrix_H264 *qm); +void ff_dxva2_hevc_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_HEVC *pp); + +void ff_dxva2_hevc_fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_Qmatrix_HEVC *qm); + #endif /* AVCODEC_DXVA2_INTERNAL_H */ diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c index eee77ec4db..f515725aa7 100644 --- a/libavcodec/hevcdec.c +++ b/libavcodec/hevcdec.c @@ -402,6 +402,7 @@ static enum AVPixelFormat get_format(HEVCContext *s, const HEVCSPS *sps) { #define HWACCEL_MAX (CONFIG_HEVC_DXVA2_HWACCEL + \ CONFIG_HEVC_D3D11VA_HWACCEL * 2 + \ + CONFIG_HEVC_D3D12VA_HWACCEL + \ CONFIG_HEVC_NVDEC_HWACCEL + \ CONFIG_HEVC_VAAPI_HWACCEL + \ CONFIG_HEVC_VIDEOTOOLBOX_HWACCEL + \ @@ -419,6 +420,9 @@ static enum AVPixelFormat get_format(HEVCContext *s, const HEVCSPS *sps) *fmt++ = AV_PIX_FMT_D3D11VA_VLD; *fmt++ = AV_PIX_FMT_D3D11; #endif +#if CONFIG_HEVC_D3D12VA_HWACCEL + *fmt++ = AV_PIX_FMT_D3D12; +#endif #if CONFIG_HEVC_VAAPI_HWACCEL *fmt++ = AV_PIX_FMT_VAAPI; #endif @@ -443,6 +447,9 @@ static enum AVPixelFormat get_format(HEVCContext *s, const 
HEVCSPS *sps) *fmt++ = AV_PIX_FMT_D3D11VA_VLD; *fmt++ = AV_PIX_FMT_D3D11; #endif +#if CONFIG_HEVC_D3D12VA_HWACCEL + *fmt++ = AV_PIX_FMT_D3D12; +#endif #if CONFIG_HEVC_VAAPI_HWACCEL *fmt++ = AV_PIX_FMT_VAAPI; #endif @@ -3778,6 +3785,9 @@ const FFCodec ff_hevc_decoder = { #if CONFIG_HEVC_D3D11VA2_HWACCEL HWACCEL_D3D11VA2(hevc), #endif +#if CONFIG_HEVC_D3D12VA_HWACCEL + HWACCEL_D3D12VA(hevc), +#endif #if CONFIG_HEVC_NVDEC_HWACCEL HWACCEL_NVDEC(hevc), #endif diff --git a/libavcodec/hwaccels.h b/libavcodec/hwaccels.h index be54604b81..70e115f78a 100644 --- a/libavcodec/hwaccels.h +++ b/libavcodec/hwaccels.h @@ -41,6 +41,7 @@ extern const AVHWAccel ff_h264_videotoolbox_hwaccel; extern const AVHWAccel ff_h264_vulkan_hwaccel; extern const AVHWAccel ff_hevc_d3d11va_hwaccel; extern const AVHWAccel ff_hevc_d3d11va2_hwaccel; +extern const AVHWAccel ff_hevc_d3d12va_hwaccel; extern const AVHWAccel ff_hevc_dxva2_hwaccel; extern const AVHWAccel ff_hevc_nvdec_hwaccel; extern const AVHWAccel ff_hevc_vaapi_hwaccel; From patchwork Fri Jun 2 08:06:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41952 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp1055255pzb; Fri, 2 Jun 2023 01:12:17 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ60XH3d1tAnbb+LtFtSe+e8lME6TctI9s066GOpGxXvsJaF9Fi5vPMJMN+lZ2jRt9HlJD5f X-Received: by 2002:a17:907:940e:b0:974:4a33:83a7 with SMTP id dk14-20020a170907940e00b009744a3383a7mr4026010ejc.12.1685693537331; Fri, 02 Jun 2023 01:12:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685693537; cv=none; d=google.com; s=arc-20160816; b=yIK5dvAHz0aBBiRm4RUBvzfWO+YvAnFjcPbEEakoQc2Q8s440Z3BajMNpHYmNoCdxH NeSqtItuUX4EYdiBWrsHFd2c2VMZ3uwUJZk7eXYwMOLeWB6jE/wfkk6h39U4bkaZn6y+ fPd2z46geOIYxQ5L4ZjlRVpyBC5W0riKLd4iBDUaCi/Mfy6FPMXQ+T+P+4VlEnpMKttv w5HMgp27ta66S2h/bksn0MGwXitLQm1Lrn31F7kCWKz5rfVtOtI5h3c/okSiLQL24Hz9 
pf2BCeJ6HGcM4j3hBouyJ2fj7q25hIFCdkhIVpf9Xl/HTGy5zBhZSW70H1lpsDgxJm7p Y5wQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=vnzV4vsqKnhFMGXbIP1sdTpobHCQBZUmv3P8FeWZeUw=; b=lYBY6guCngdDMg8X1bTqthDm6Z3HklJtywxH30Dy8zbhj3OcVGP+KNO7P19Uvhroh5 +VlPSL8ntMTDR9fOwQ6Wwen4B57mX6sLAiFRzGw/P7sZ3QNWn2Fg4N+gfMsuvZ9CX5Mg cC8ln6xbyYqRrGGd6vvLI9a5+L2IEH6d97Zc0XV10FSbUPluhNB9cDgrmujRKLNevexF y0In7Sxo9hMgOrSir8FLQcAnIYO7BsSKy1BzyLfkYowBarlVnyYJ2ngx9ONwy9V6j341 LjxWoswy7rx8U/jL08Re9LNXJCwkfGsoSuSpuN72QNKHM1ABf0cXJmH3MQaJI5QeKWRU /gUg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=JD8L8tbq; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id x8-20020a170906148800b0094edce87118si430275ejc.1042.2023.06.02.01.12.16; Fri, 02 Jun 2023 01:12:17 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=JD8L8tbq; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 93F1668C310; Fri, 2 Jun 2023 11:11:46 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id B3C3068C30E for ; Fri, 2 Jun 2023 11:11:37 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685693502; x=1717229502; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=B5Zc4VkJiA/22MQ5gqr6Vqoh7Z6WG9TIUOQGH/unK4w=; b=JD8L8tbq70pR8b2oOtd8ryI0yNo7ttRazIErsESyBYT71QB5n6D3XE6L D6q2vUF5hI4DT0yT6dkRDOguWl0UgHllqIWrdUnW6mQp/xAoJaCXjPlcZ QbAqLs7DCqu65a1yH0PkEBIISriGdBkUhMovq1PfZsz0nG6UqqXgNqMIH KTSoZMyNhKnBfftCPtf1pAG0wValp2RlcxMoX2lm3B8ZP17qrEG5i7DNX CJ/Zv70cdQNgCnPj7Aiyki2CnzJy6164ZoVAs4gQB899K/mUmDiOnYyRi gBH+q90OZFl04N/RX61cYBLP+NKOC2/zAcXN+bHDNAjLUSqrh3o2UBm9C A==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="421629870" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="421629870" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2023 01:11:23 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="852060674" 
X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="852060674" Received: from desktop-qn7n0nf.sh.intel.com (HELO localhost.localdomain) ([10.239.160.59]) by fmsmga001.fm.intel.com with ESMTP; 02 Jun 2023 01:11:22 -0700 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:06:56 +0800 Message-Id: <20230602080701.1754-4-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 4/9] avcodec: add D3D12VA hardware accelerated VP9 decoding X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Tong Wu , Wu Jianhua Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: dz+6fMHd8vYI From: Wu Jianhua The command below is how to enable d3d12va: ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4 Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- configure | 2 + libavcodec/Makefile | 1 + libavcodec/d3d12va_vp9.c | 176 ++++++++++++++++++++++++++++++++++++ libavcodec/dxva2_internal.h | 2 + libavcodec/dxva2_vp9.c | 7 +- libavcodec/hwaccels.h | 1 + libavcodec/vp9.c | 7 ++ 7 files changed, 193 insertions(+), 3 deletions(-) create mode 100644 libavcodec/d3d12va_vp9.c diff --git a/configure b/configure index 3d25e5fdea..fa2548f3e2 100755 --- a/configure +++ b/configure @@ -3119,6 +3119,8 @@ vp9_d3d11va_hwaccel_deps="d3d11va DXVA_PicParams_VP9" vp9_d3d11va_hwaccel_select="vp9_decoder" vp9_d3d11va2_hwaccel_deps="d3d11va DXVA_PicParams_VP9" vp9_d3d11va2_hwaccel_select="vp9_decoder" +vp9_d3d12va_hwaccel_deps="d3d12va DXVA_PicParams_VP9" +vp9_d3d12va_hwaccel_select="vp9_decoder" vp9_dxva2_hwaccel_deps="dxva2 DXVA_PicParams_VP9" vp9_dxva2_hwaccel_select="vp9_decoder" 
vp9_nvdec_hwaccel_deps="nvdec" diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 6cc28f2fd0..d5a1bfef7a 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -1036,6 +1036,7 @@ OBJS-$(CONFIG_VP8_NVDEC_HWACCEL) += nvdec_vp8.o OBJS-$(CONFIG_VP8_VAAPI_HWACCEL) += vaapi_vp8.o OBJS-$(CONFIG_VP9_D3D11VA_HWACCEL) += dxva2_vp9.o OBJS-$(CONFIG_VP9_DXVA2_HWACCEL) += dxva2_vp9.o +OBJS-$(CONFIG_VP9_D3D12VA_HWACCEL) += dxva2_vp9.o d3d12va_vp9.o OBJS-$(CONFIG_VP9_NVDEC_HWACCEL) += nvdec_vp9.o OBJS-$(CONFIG_VP9_VAAPI_HWACCEL) += vaapi_vp9.o OBJS-$(CONFIG_VP9_VDPAU_HWACCEL) += vdpau_vp9.o diff --git a/libavcodec/d3d12va_vp9.c b/libavcodec/d3d12va_vp9.c new file mode 100644 index 0000000000..dc1c461f5c --- /dev/null +++ b/libavcodec/d3d12va_vp9.c @@ -0,0 +1,176 @@ +/* + * Direct3D 12 VP9 HW acceleration + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" + +#include "libavutil/avassert.h" +#include "libavutil/pixdesc.h" +#include "libavutil/hwcontext_d3d12va_internal.h" + +#include "vp9shared.h" +#include "dxva2_internal.h" +#include "d3d12va.h" + +typedef struct VP9DecodePictureContext { + DXVA_PicParams_VP9 pp; + DXVA_Slice_VPx_Short slice; + const uint8_t *bitstream; + unsigned bitstream_size; +} VP9DecodePictureContext; + +static void fill_slice_short(DXVA_Slice_VPx_Short *slice, unsigned position, unsigned size) +{ + memset(slice, 0, sizeof(*slice)); + slice->BSNALunitDataLocation = position; + slice->SliceBytesInBuffer = size; + slice->wBadSliceChopping = 0; +} + +static int d3d12va_vp9_start_frame(AVCodecContext *avctx, av_unused const uint8_t *buffer, av_unused uint32_t size) +{ + const VP9SharedContext *h = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + VP9DecodePictureContext *ctx_pic = h->frames[CUR_FRAME].hwaccel_picture_private; + + if (!ctx) + return -1; + + av_assert0(ctx_pic); + + if (ff_dxva2_vp9_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->pp) < 0) + return -1; + + ctx_pic->bitstream_size = 0; + ctx_pic->bitstream = NULL; + + return 0; +} + +static int d3d12va_vp9_decode_slice(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size) +{ + const VP9SharedContext *h = avctx->priv_data; + VP9DecodePictureContext *ctx_pic = h->frames[CUR_FRAME].hwaccel_picture_private; + unsigned position; + + if (!ctx_pic->bitstream) + ctx_pic->bitstream = buffer; + ctx_pic->bitstream_size += size; + + position = buffer - ctx_pic->bitstream; + fill_slice_short(&ctx_pic->slice, position, size); + + return 0; +} + +static int update_input_arguments(AVCodecContext *avctx, 
D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *input_args, ID3D12Resource *buffer) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + + const VP9SharedContext *h = avctx->priv_data; + VP9DecodePictureContext *ctx_pic = h->frames[CUR_FRAME].hwaccel_picture_private; + + int index; + uint8_t *mapped_data; + D3D12_VIDEO_DECODE_FRAME_ARGUMENT *args; + + if (FAILED(ID3D12Resource_Map(buffer, 0, NULL, &mapped_data))) { + av_log(avctx, AV_LOG_ERROR, "Failed to map D3D12 Buffer resource!\n"); + return AVERROR(EINVAL); + } + + args = &input_args->FrameArguments[input_args->NumFrameArguments++]; + args->Type = D3D12_VIDEO_DECODE_ARGUMENT_TYPE_SLICE_CONTROL; + args->Size = sizeof(ctx_pic->slice); + args->pData = &ctx_pic->slice; + + memcpy(mapped_data, ctx_pic->bitstream, ctx_pic->slice.SliceBytesInBuffer); + + ID3D12Resource_Unmap(buffer, 0, NULL); + + input_args->CompressedBitstream = (D3D12_VIDEO_DECODE_COMPRESSED_BITSTREAM){ + .pBuffer = buffer, + .Offset = 0, + .Size = ctx_pic->slice.SliceBytesInBuffer, + }; + + index = ctx_pic->pp.CurrPic.Index7Bits; + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + + for (int i = 0; i < FF_ARRAY_ELEMS(ctx_pic->pp.frame_refs); i++) { + index = ctx_pic->pp.frame_refs[i].Index7Bits; + if (index != 0x7f) + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + } + + return 0; +} + +static int d3d12va_vp9_end_frame(AVCodecContext *avctx) +{ + VP9SharedContext *h = avctx->priv_data; + VP9DecodePictureContext *ctx_pic = h->frames[CUR_FRAME].hwaccel_picture_private; + + if (ctx_pic->bitstream_size <= 0) + return -1; + + return ff_d3d12va_common_end_frame(avctx, h->frames[CUR_FRAME].tf.f, + &ctx_pic->pp, sizeof(ctx_pic->pp), NULL, 0, update_input_arguments); +} + +static int d3d12va_vp9_decode_init(AVCodecContext *avctx) +{ + D3D12VADecodeContext *ctx = 
D3D12VA_DECODE_CONTEXT(avctx); + + switch (avctx->profile) { + case FF_PROFILE_VP9_2: + case FF_PROFILE_VP9_3: + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_VP9_10BIT_PROFILE2; + break; + + case FF_PROFILE_VP9_0: + case FF_PROFILE_VP9_1: + default: + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_VP9; + break; + }; + + return ff_d3d12va_decode_init(avctx); +} + +#if CONFIG_VP9_D3D12VA_HWACCEL +const AVHWAccel ff_vp9_d3d12va_hwaccel = { + .name = "vp9_d3d12va", + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_VP9, + .pix_fmt = AV_PIX_FMT_D3D12, + .init = d3d12va_vp9_decode_init, + .uninit = ff_d3d12va_decode_uninit, + .start_frame = d3d12va_vp9_start_frame, + .decode_slice = d3d12va_vp9_decode_slice, + .end_frame = d3d12va_vp9_end_frame, + .frame_params = ff_d3d12va_common_frame_params, + .frame_priv_data_size = sizeof(VP9DecodePictureContext), + .priv_data_size = sizeof(D3D12VADecodeContext), +}; +#endif diff --git a/libavcodec/dxva2_internal.h b/libavcodec/dxva2_internal.h index 08847aef22..1de749f83b 100644 --- a/libavcodec/dxva2_internal.h +++ b/libavcodec/dxva2_internal.h @@ -170,4 +170,6 @@ void ff_dxva2_hevc_fill_picture_parameters(const AVCodecContext *avctx, AVDXVACo void ff_dxva2_hevc_fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_Qmatrix_HEVC *qm); +int ff_dxva2_vp9_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_VP9 *pp); + #endif /* AVCODEC_DXVA2_INTERNAL_H */ diff --git a/libavcodec/dxva2_vp9.c b/libavcodec/dxva2_vp9.c index dbe6c08ad1..480d734acd 100644 --- a/libavcodec/dxva2_vp9.c +++ b/libavcodec/dxva2_vp9.c @@ -42,11 +42,12 @@ static void fill_picture_entry(DXVA_PicEntry_VPx *pic, pic->bPicEntry = index | (flag << 7); } -static int fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, const VP9SharedContext *h, +int ff_dxva2_vp9_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_VP9 *pp) { + const VP9SharedContext 
*h = avctx->priv_data; + const AVPixFmtDescriptor *pixdesc = av_pix_fmt_desc_get(avctx->sw_pix_fmt); int i; - const AVPixFmtDescriptor * pixdesc = av_pix_fmt_desc_get(avctx->sw_pix_fmt); if (!pixdesc) return -1; @@ -264,7 +265,7 @@ static int dxva2_vp9_start_frame(AVCodecContext *avctx, av_assert0(ctx_pic); /* Fill up DXVA_PicParams_VP9 */ - if (fill_picture_parameters(avctx, ctx, h, &ctx_pic->pp) < 0) + if (ff_dxva2_vp9_fill_picture_parameters(avctx, ctx, &ctx_pic->pp) < 0) return -1; ctx_pic->bitstream_size = 0; diff --git a/libavcodec/hwaccels.h b/libavcodec/hwaccels.h index 70e115f78a..1059b6db70 100644 --- a/libavcodec/hwaccels.h +++ b/libavcodec/hwaccels.h @@ -75,6 +75,7 @@ extern const AVHWAccel ff_vp8_nvdec_hwaccel; extern const AVHWAccel ff_vp8_vaapi_hwaccel; extern const AVHWAccel ff_vp9_d3d11va_hwaccel; extern const AVHWAccel ff_vp9_d3d11va2_hwaccel; +extern const AVHWAccel ff_vp9_d3d12va_hwaccel; extern const AVHWAccel ff_vp9_dxva2_hwaccel; extern const AVHWAccel ff_vp9_nvdec_hwaccel; extern const AVHWAccel ff_vp9_vaapi_hwaccel; diff --git a/libavcodec/vp9.c b/libavcodec/vp9.c index 4f704ec0dd..9a34cd331e 100644 --- a/libavcodec/vp9.c +++ b/libavcodec/vp9.c @@ -184,6 +184,7 @@ static int update_size(AVCodecContext *avctx, int w, int h) { #define HWACCEL_MAX (CONFIG_VP9_DXVA2_HWACCEL + \ CONFIG_VP9_D3D11VA_HWACCEL * 2 + \ + CONFIG_VP9_D3D12VA_HWACCEL + \ CONFIG_VP9_NVDEC_HWACCEL + \ CONFIG_VP9_VAAPI_HWACCEL + \ CONFIG_VP9_VDPAU_HWACCEL + \ @@ -210,6 +211,9 @@ static int update_size(AVCodecContext *avctx, int w, int h) *fmtp++ = AV_PIX_FMT_D3D11VA_VLD; *fmtp++ = AV_PIX_FMT_D3D11; #endif +#if CONFIG_VP9_D3D12VA_HWACCEL + *fmtp++ = AV_PIX_FMT_D3D12; +#endif #if CONFIG_VP9_NVDEC_HWACCEL *fmtp++ = AV_PIX_FMT_CUDA; #endif @@ -1910,6 +1914,9 @@ const FFCodec ff_vp9_decoder = { #if CONFIG_VP9_D3D11VA2_HWACCEL HWACCEL_D3D11VA2(vp9), #endif +#if CONFIG_VP9_D3D12VA_HWACCEL + HWACCEL_D3D12VA(vp9), +#endif #if CONFIG_VP9_NVDEC_HWACCEL HWACCEL_NVDEC(vp9), #endif From 
patchwork Fri Jun 2 08:06:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41953 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp1055318pzb; Fri, 2 Jun 2023 01:12:27 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6LJqx63fhcaxGWyhRIxXZJ+BSaTkYEUxNhzgqy22XBi++C5HyO9Cwc19gIaZxel/KyIsKM X-Received: by 2002:a17:907:7245:b0:969:e993:6ff0 with SMTP id ds5-20020a170907724500b00969e9936ff0mr4197294ejc.25.1685693547124; Fri, 02 Jun 2023 01:12:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685693547; cv=none; d=google.com; s=arc-20160816; b=O9RrhKOEXyzEUt2vZPJQfPH8XmX5FFi/ygKjiQp3mh8UMfO14rJ4GmfZ/EtG7i5LWU nAsMyrncRHhS5BOKicxgFuTDDT23gb1HqS8iPQ7f0z8E9W6uioyMDNcTcXk2oGgrYfhs fVSXYk+4MDL+cQbwP33+N2jG/2AyafWEA0fHb8NcG8yhyVk1RBrQop2fJveAr3S1LdNi xZC3YDNDWaGu4i7hn5Zl7mygH+9p5iwrxFXLMWdeURElp2Gu98lF03Sj/oUtVXeBXfL+ g4qg+eLU5c6r3OYNpGMPCjnJciNFOj9s3OxqevTMGE4SytBAjkDO/+1sKkHvvWgtJswA nMMQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=ajlPovBgUELvJuJyf3r6x36Hoa4qLMi1MkWSemaRNFc=; b=k6GDLoBzsqinECqlA4pavczKXOQgT5KkdjJNH51zTxBJFCKmYhpCfQPobei6QLQFKi 47PpZWi8vJj52VMwSsxEkymJa9JDXf6YJ8W4kqD+3y1fmHdOVQfqPIEHe208kYkVcjtx wZynF3px+nLPZR4qTSWOlrMFGgYES8kGohXkI2tJpE1iX9F7sdJ5dGjI/UBcW6acSUVD FRcsJKil3MdcQ24awCOUmhtaTzeTYdlzgx4/dvZAYe+VmIzPCpNQj8Q0iwElrzp64F9m cqHKgvBPBuq3Jv40cYwAANOFnzdk2wQUmJhd30JEahQaycNQNMz0+EQqKiTU8MkAdxh1 NUJQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=HDlHllBw; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 
79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id hs31-20020a1709073e9f00b009750bb45a23si154213ejc.588.2023.06.02.01.12.26; Fri, 02 Jun 2023 01:12:27 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=HDlHllBw; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A3E8B68C31D; Fri, 2 Jun 2023 11:11:49 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id D021E68C31D for ; Fri, 2 Jun 2023 11:11:42 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685693508; x=1717229508; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=itLxLVwecb+/4A70RQNgcy6o712jNb+ybMHlu8Gxxc8=; b=HDlHllBwyIvkIoUSCcp/0KMneL3wIVhhzSeRyfPbGYMEqv1moqoMV6T4 lM7OIxbwNEK9hFTWXRhLWqmGeAsZ8DNJYHRY58ECMXJjInW4xlWUwuaNP pLwVgaTkeMYNqihKklPP2Z7zfCGOH5twu/qmjwhTSV2FPhTFj7peVGJqN wMyOy7Ukee3Dn7NI1J0GxHp+ETQQQDCzrthjdQ6xUvw5OwvDZClRn8ydX eBbSGhZFjXGkQJ7aSw2iK3GoLpsvtnpQmQboyNN4D60U2tbkGQ9yaOfx6 7j7wfcf2q0bogh4rSY2ZhApfeB8WkRxpqg9xAYxzwmDcHM/TdhEkFhyPL Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="421629893" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="421629893" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com 
with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2023 01:11:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="852060678" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="852060678" Received: from desktop-qn7n0nf.sh.intel.com (HELO localhost.localdomain) ([10.239.160.59]) by fmsmga001.fm.intel.com with ESMTP; 02 Jun 2023 01:11:23 -0700 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:06:57 +0800 Message-Id: <20230602080701.1754-5-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 5/9] avcodec: add D3D12VA hardware accelerated AV1 decoding X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Tong Wu , Wu Jianhua Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: IIRd9tIeC90s From: Wu Jianhua The command below is how to enable d3d12va: ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4 Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- configure | 2 + libavcodec/Makefile | 1 + libavcodec/av1dec.c | 10 ++ libavcodec/d3d12va_av1.c | 220 ++++++++++++++++++++++++++++++++++++ libavcodec/dxva2_av1.c | 5 +- libavcodec/dxva2_internal.h | 4 + libavcodec/hwaccels.h | 1 + 7 files changed, 241 insertions(+), 2 deletions(-) create mode 100644 libavcodec/d3d12va_av1.c diff --git a/configure b/configure index fa2548f3e2..1d05c898fb 100755 --- a/configure +++ b/configure @@ -3015,6 +3015,8 @@ av1_d3d11va_hwaccel_deps="d3d11va DXVA_PicParams_AV1" av1_d3d11va_hwaccel_select="av1_decoder" av1_d3d11va2_hwaccel_deps="d3d11va DXVA_PicParams_AV1" av1_d3d11va2_hwaccel_select="av1_decoder" +av1_d3d12va_hwaccel_deps="d3d12va 
DXVA_PicParams_AV1" +av1_d3d12va_hwaccel_select="av1_decoder" av1_dxva2_hwaccel_deps="dxva2 DXVA_PicParams_AV1" av1_dxva2_hwaccel_select="av1_decoder" av1_nvdec_hwaccel_deps="nvdec CUVIDAV1PICPARAMS" diff --git a/libavcodec/Makefile b/libavcodec/Makefile index d5a1bfef7a..d2940aad4c 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -987,6 +987,7 @@ OBJS-$(CONFIG_VULKAN) += vulkan.o vulkan_video.o OBJS-$(CONFIG_AV1_D3D11VA_HWACCEL) += dxva2_av1.o OBJS-$(CONFIG_AV1_DXVA2_HWACCEL) += dxva2_av1.o +OBJS-$(CONFIG_AV1_D3D12VA_HWACCEL) += dxva2_av1.o d3d12va_av1.o OBJS-$(CONFIG_AV1_NVDEC_HWACCEL) += nvdec_av1.o OBJS-$(CONFIG_AV1_VAAPI_HWACCEL) += vaapi_av1.o OBJS-$(CONFIG_AV1_VDPAU_HWACCEL) += vdpau_av1.o diff --git a/libavcodec/av1dec.c b/libavcodec/av1dec.c index 5cc5d87c64..bbf13d4717 100644 --- a/libavcodec/av1dec.c +++ b/libavcodec/av1dec.c @@ -448,6 +448,7 @@ static int get_pixel_format(AVCodecContext *avctx) enum AVPixelFormat pix_fmt = AV_PIX_FMT_NONE; #define HWACCEL_MAX (CONFIG_AV1_DXVA2_HWACCEL + \ CONFIG_AV1_D3D11VA_HWACCEL * 2 + \ + CONFIG_AV1_D3D12VA_HWACCEL + \ CONFIG_AV1_NVDEC_HWACCEL + \ CONFIG_AV1_VAAPI_HWACCEL + \ CONFIG_AV1_VDPAU_HWACCEL + \ @@ -523,6 +524,9 @@ static int get_pixel_format(AVCodecContext *avctx) *fmtp++ = AV_PIX_FMT_D3D11VA_VLD; *fmtp++ = AV_PIX_FMT_D3D11; #endif +#if CONFIG_AV1_D3D12VA_HWACCEL + *fmtp++ = AV_PIX_FMT_D3D12; +#endif #if CONFIG_AV1_NVDEC_HWACCEL *fmtp++ = AV_PIX_FMT_CUDA; #endif @@ -544,6 +548,9 @@ static int get_pixel_format(AVCodecContext *avctx) *fmtp++ = AV_PIX_FMT_D3D11VA_VLD; *fmtp++ = AV_PIX_FMT_D3D11; #endif +#if CONFIG_AV1_D3D12VA_HWACCEL + *fmtp++ = AV_PIX_FMT_D3D12; +#endif #if CONFIG_AV1_NVDEC_HWACCEL *fmtp++ = AV_PIX_FMT_CUDA; #endif @@ -1541,6 +1548,9 @@ const FFCodec ff_av1_decoder = { #if CONFIG_AV1_D3D11VA2_HWACCEL HWACCEL_D3D11VA2(av1), #endif +#if CONFIG_AV1_D3D12VA_HWACCEL + HWACCEL_D3D12VA(av1), +#endif #if CONFIG_AV1_NVDEC_HWACCEL HWACCEL_NVDEC(av1), #endif diff --git 
a/libavcodec/d3d12va_av1.c b/libavcodec/d3d12va_av1.c new file mode 100644 index 0000000000..44ae689341 --- /dev/null +++ b/libavcodec/d3d12va_av1.c @@ -0,0 +1,220 @@ +/* + * Direct3D 12 AV1 HW acceleration + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" +#include "libavutil/avassert.h" +#include "libavutil/hwcontext_d3d12va_internal.h" +#include "av1dec.h" +#include "dxva2_internal.h" +#include "d3d12va.h" + +#define MAX_TILES 256 + +typedef struct D3D12AV1DecodeContext { + D3D12VADecodeContext ctx; + uint8_t *bitstream_buffer; +} D3D12AV1DecodeContext; + +#define D3D12_AV1_DECODE_CONTEXT(avctx) ((D3D12AV1DecodeContext *)D3D12VA_DECODE_CONTEXT(avctx)) + +typedef struct AV1DecodePictureContext { + DXVA_PicParams_AV1 pp; + unsigned tile_count; + DXVA_Tile_AV1 tiles[MAX_TILES]; + uint8_t *bitstream; + unsigned bitstream_size; +} AV1DecodePictureContext; + +static int d3d12va_av1_start_frame(AVCodecContext *avctx, av_unused const uint8_t *buffer, av_unused uint32_t size) +{ + const AV1DecContext *h = avctx->priv_data; + AV1DecodePictureContext *ctx_pic = h->cur_frame.hwaccel_picture_private; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + if (!ctx) + return -1; 
+ + av_assert0(ctx_pic); + + if (ff_dxva2_av1_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->pp) < 0) + return -1; + + ctx_pic->bitstream = NULL; + ctx_pic->bitstream_size = 0; + ctx_pic->tile_count = 0; + + return 0; +} + +static int d3d12va_av1_decode_slice(AVCodecContext *avctx, + const uint8_t *buffer, + uint32_t size) +{ + const AV1DecContext *h = avctx->priv_data; + const AV1RawFrameHeader *frame_header = h->raw_frame_header; + AV1DecodePictureContext *ctx_pic = h->cur_frame.hwaccel_picture_private; + int offset = 0; + uint32_t tg_start, tg_end; + + ctx_pic->tile_count = frame_header->tile_cols * frame_header->tile_rows; + + if (ctx_pic->tile_count > MAX_TILES) + return AVERROR(ENOSYS); + + if (ctx_pic->tile_count == h->tg_end - h->tg_start + 1) { + tg_start = 0; + tg_end = ctx_pic->tile_count - 1; + ctx_pic->bitstream = (uint8_t *)buffer; + ctx_pic->bitstream_size = size; + } else { + ctx_pic->bitstream = D3D12_AV1_DECODE_CONTEXT(avctx)->bitstream_buffer; + memcpy(ctx_pic->bitstream + ctx_pic->bitstream_size, buffer, size); + tg_start = h->tg_start; + tg_end = h->tg_end; + offset = ctx_pic->bitstream_size; + ctx_pic->bitstream_size += size; + } + + for (uint32_t tile_num = tg_start; tile_num <= tg_end; tile_num++) { + ctx_pic->tiles[tile_num].DataOffset = offset + h->tile_group_info[tile_num].tile_offset; + ctx_pic->tiles[tile_num].DataSize = h->tile_group_info[tile_num].tile_size; + ctx_pic->tiles[tile_num].row = h->tile_group_info[tile_num].tile_row; + ctx_pic->tiles[tile_num].column = h->tile_group_info[tile_num].tile_column; + ctx_pic->tiles[tile_num].anchor_frame = 0xFF; + } + + return 0; +} + +static int update_input_arguments(AVCodecContext *avctx, D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *input_args, ID3D12Resource *buffer) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + const AV1DecContext *h 
= avctx->priv_data; + AV1DecodePictureContext *ctx_pic = h->cur_frame.hwaccel_picture_private; + + int index; + uint8_t *mapped_data; + + D3D12_VIDEO_DECODE_FRAME_ARGUMENT *args = &input_args->FrameArguments[input_args->NumFrameArguments++]; + args->Type = D3D12_VIDEO_DECODE_ARGUMENT_TYPE_SLICE_CONTROL; + args->Size = sizeof(DXVA_Tile_AV1) * ctx_pic->tile_count; + args->pData = ctx_pic->tiles; + + input_args->CompressedBitstream = (D3D12_VIDEO_DECODE_COMPRESSED_BITSTREAM){ + .pBuffer = buffer, + .Offset = 0, + .Size = ctx_pic->bitstream_size, + }; + + if (FAILED(ID3D12Resource_Map(buffer, 0, NULL, &mapped_data))) { + av_log(avctx, AV_LOG_ERROR, "Failed to map D3D12 Buffer resource!\n"); + return AVERROR(EINVAL); + } + + memcpy(mapped_data, ctx_pic->bitstream, ctx_pic->bitstream_size); + + ID3D12Resource_Unmap(buffer, 0, NULL); + + index = ctx_pic->pp.CurrPicTextureIndex; + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + + for (int i = 0; i < FF_ARRAY_ELEMS(ctx_pic->pp.RefFrameMapTextureIndex); i++) { + index = ctx_pic->pp.RefFrameMapTextureIndex[i]; + if (index != 0xFF) + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; + } + + return 0; +} + +static int d3d12va_av1_end_frame(AVCodecContext *avctx) +{ + int ret; + const AV1DecContext *h = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AV1DecodePictureContext *ctx_pic = h->cur_frame.hwaccel_picture_private; + + if (ctx_pic->tile_count <= 0 || ctx_pic->bitstream_size <= 0) + return -1; + + ret = ff_d3d12va_common_end_frame(avctx, h->cur_frame.f, &ctx_pic->pp, sizeof(ctx_pic->pp), + NULL, 0, update_input_arguments); + + return ret; +} + +static int d3d12va_av1_decode_init(AVCodecContext *avctx) +{ + const AV1DecContext *h = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + D3D12AV1DecodeContext *av1_ctx = D3D12_AV1_DECODE_CONTEXT(avctx); + AV1DecodePictureContext *ctx_pic = h->cur_frame.hwaccel_picture_private;
+ + int ret; + + if (avctx->profile != FF_PROFILE_AV1_MAIN) + return AVERROR(EINVAL); + + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_AV1_PROFILE0; + + ret = ff_d3d12va_decode_init(avctx); + if (ret < 0) + return ret; + + if (!av1_ctx->bitstream_buffer) { + av1_ctx->bitstream_buffer = av_malloc(ff_d3d12va_get_suitable_max_bitstream_size(avctx)); + if (!av1_ctx->bitstream_buffer) + return AVERROR(ENOMEM); + } + + return 0; +} + +static int d3d12va_av1_decode_uninit(AVCodecContext *avctx) +{ + const AV1DecContext *h = avctx->priv_data; + D3D12AV1DecodeContext *ctx = D3D12_AV1_DECODE_CONTEXT(avctx); + AV1DecodePictureContext *ctx_pic = h->cur_frame.hwaccel_picture_private; + + if (ctx->bitstream_buffer) + av_freep(&ctx->bitstream_buffer); + + return ff_d3d12va_decode_uninit(avctx); +} + +#if CONFIG_AV1_D3D12VA_HWACCEL +const AVHWAccel ff_av1_d3d12va_hwaccel = { + .name = "av1_d3d12va", + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_AV1, + .pix_fmt = AV_PIX_FMT_D3D12, + .init = d3d12va_av1_decode_init, + .uninit = d3d12va_av1_decode_uninit, + .start_frame = d3d12va_av1_start_frame, + .decode_slice = d3d12va_av1_decode_slice, + .end_frame = d3d12va_av1_end_frame, + .frame_params = ff_d3d12va_common_frame_params, + .frame_priv_data_size = sizeof(AV1DecodePictureContext), + .priv_data_size = sizeof(D3D12AV1DecodeContext), +}; +#endif diff --git a/libavcodec/dxva2_av1.c b/libavcodec/dxva2_av1.c index 228f72ba18..7cb1c74fee 100644 --- a/libavcodec/dxva2_av1.c +++ b/libavcodec/dxva2_av1.c @@ -55,10 +55,11 @@ static int get_bit_depth_from_seq(const AV1RawSequenceHeader *seq) return 8; } -static int fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, const AV1DecContext *h, +int ff_dxva2_av1_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_AV1 *pp) { int i,j, uses_lr; + const AV1DecContext *h = avctx->priv_data; const AV1RawSequenceHeader *seq = h->raw_seq; const AV1RawFrameHeader *frame_header = 
h->raw_frame_header; const AV1RawFilmGrainParams *film_grain = &h->cur_frame.film_grain; @@ -280,7 +281,7 @@ static int dxva2_av1_start_frame(AVCodecContext *avctx, av_assert0(ctx_pic); /* Fill up DXVA_PicParams_AV1 */ - if (fill_picture_parameters(avctx, ctx, h, &ctx_pic->pp) < 0) + if (ff_dxva2_av1_fill_picture_parameters(avctx, ctx, &ctx_pic->pp) < 0) return -1; ctx_pic->bitstream_size = 0; diff --git a/libavcodec/dxva2_internal.h b/libavcodec/dxva2_internal.h index 1de749f83b..5f317ad0fe 100644 --- a/libavcodec/dxva2_internal.h +++ b/libavcodec/dxva2_internal.h @@ -172,4 +172,8 @@ void ff_dxva2_hevc_fill_scaling_lists(const AVCodecContext *avctx, AVDXVAContext int ff_dxva2_vp9_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_VP9 *pp); +#if CONFIG_AV1_D3D12VA_HWACCEL || CONFIG_AV1_D3D11VA_HWACCEL || CONFIG_AV1_D3D11VA2_HWACCEL || CONFIG_AV1_DXVA2_HWACCEL +int ff_dxva2_av1_fill_picture_parameters(const AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PicParams_AV1 *pp); +#endif + #endif /* AVCODEC_DXVA2_INTERNAL_H */ diff --git a/libavcodec/hwaccels.h b/libavcodec/hwaccels.h index 1059b6db70..517e70d5c4 100644 --- a/libavcodec/hwaccels.h +++ b/libavcodec/hwaccels.h @@ -23,6 +23,7 @@ extern const AVHWAccel ff_av1_d3d11va_hwaccel; extern const AVHWAccel ff_av1_d3d11va2_hwaccel; +extern const AVHWAccel ff_av1_d3d12va_hwaccel; extern const AVHWAccel ff_av1_dxva2_hwaccel; extern const AVHWAccel ff_av1_nvdec_hwaccel; extern const AVHWAccel ff_av1_vaapi_hwaccel; From patchwork Fri Jun 2 08:06:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41954 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp1055401pzb; Fri, 2 Jun 2023 01:12:37 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6hjDQVReuO80LrSGRUb7oYXXIrQqnyP6tYTKgo2IL4h7m8NIilQHTVlnBzOBSKUHuMxCli X-Received: by 
2002:a17:906:fe44:b0:96f:be1e:bf1d with SMTP id wz4-20020a170906fe4400b0096fbe1ebf1dmr8404925ejb.69.1685693556934; Fri, 02 Jun 2023 01:12:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685693556; cv=none; d=google.com; s=arc-20160816; b=uHKLCrjM3KIBWqQRJ51HRSiyIXDncdiVLPp4MtbTSG7g4FDxC8k6WwwX/aPwkqgiFK 2a3o+KoJzGgHf/38262N7uJ6lrM08JtvDHsFi7Q+7IJeb6uibDfnLx4BaroqsgjIzaL9 AreuqidgR2if7pHfXxShVdPTu2kcGIbN8L91UOYDxdjcGNjUdSsAHEZ4msieDmfOh5o+ e26PhKVmlidFYYa0UBR/z4qKrUbXTi6Yi2cCCUE/jZvo9fAQqGZ571lwMPyo1OGH5KSD LcwtZ66xWnX9sBP4cBLBPVP4uFQm4DaJ2HMdID1vhKC6TyQGjFzqD6/s+wzKe40fLeT6 Q+Hg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=XYuN0LouRSNe/Vcp3yVrO+Qu/oS2J77Vp/7IuRNPwyw=; b=gNYRa/5ht7JRinbnxQgPiT4l7Y3NDhtny32SSl5jMvNVsF449qfeyYcF78uwOweyKW 3qeHqLZlZTh0OsleKWlYTHzHd5EZTH0AjzZFG/R5VHG7O2STY05/OaX5rYZVLiRQf8NI cxRUHianQsCI5Zm0Y7dcPp97N9DXqvzz6IF8OdQGcf14f5WqCmOJ/k/suh4I0ByjEjUe qAnTv6egXDYSBQ/0nkIXNOdDGq0OfN4ji2vxissCdzcQ12ROTkXJCdIRviiRFvjabs35 eTkU0SDloxmhEFST6/RuO6EAltV8xVB4tT5NLgUyzYkcAYhX+7Mq8MjwHRLR1HN5YJc2 /85w== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=a8CmCzMM; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id hh4-20020a170906a94400b00970025c05f9si514823ejb.507.2023.06.02.01.12.36; Fri, 02 Jun 2023 01:12:36 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=a8CmCzMM; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 94AA968C334; Fri, 2 Jun 2023 11:11:50 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 8408768C31D for ; Fri, 2 Jun 2023 11:11:43 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685693508; x=1717229508; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=IhxyWWQe3Y15EC3CPUykByhaowWOe5YKLrgIBtQhix0=; b=a8CmCzMMhkOnkHldLqV2zabt1vsoHGWBSqJIyz/s3OtNBliCzHIga+cs AtrWmZCHRjlaWUh7SiSbJtMPHFgv8gfg+P5ZL+byk5P4IcUxEj5m59omx 1ua+lSe/OaTcKvYGXScGFn0CRK0+ARVRZNXVZpARB3BN/2M81PRmGJqI/ uUTwtdKRh71JxpC5AsZy/NTKvpYRSlehGE7pwUMxnfHCIcF+FT5t8XzOL 3Fl6OIYpFt/rHDJdt3GFvLwbR31wsky7tbGzendaepglT1urDfkJ3vHea ENKYBAgR9MoSjOY3IenQMoRHFlDuIUbf29xYfxRVWdo5ID0figeIxottO A==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="421629911" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="421629911" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2023 01:11:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="852060682" 
X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="852060682" Received: from desktop-qn7n0nf.sh.intel.com (HELO localhost.localdomain) ([10.239.160.59]) by fmsmga001.fm.intel.com with ESMTP; 02 Jun 2023 01:11:24 -0700 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:06:58 +0800 Message-Id: <20230602080701.1754-6-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 6/9] avcodec: add D3D12VA hardware accelerated MPEG-2 decoding X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Tong Wu , Wu Jianhua Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: iU1J3trxNx3h From: Wu Jianhua The command below is how to enable d3d12va: ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4 Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- configure | 2 + libavcodec/Makefile | 1 + libavcodec/d3d12va_mpeg2.c | 191 ++++++++++++++++++++++++++++++++++++ libavcodec/dxva2_internal.h | 6 ++ libavcodec/dxva2_mpeg2.c | 18 ++-- libavcodec/hwaccels.h | 1 + libavcodec/mpeg12dec.c | 6 ++ 7 files changed, 216 insertions(+), 9 deletions(-) create mode 100644 libavcodec/d3d12va_mpeg2.c diff --git a/configure b/configure index 1d05c898fb..9f8c535f5c 100755 --- a/configure +++ b/configure @@ -3081,6 +3081,8 @@ mpeg2_d3d11va_hwaccel_deps="d3d11va" mpeg2_d3d11va_hwaccel_select="mpeg2video_decoder" mpeg2_d3d11va2_hwaccel_deps="d3d11va" mpeg2_d3d11va2_hwaccel_select="mpeg2video_decoder" +mpeg2_d3d12va_hwaccel_deps="d3d12va" +mpeg2_d3d12va_hwaccel_select="mpeg2video_decoder" mpeg2_dxva2_hwaccel_deps="dxva2" mpeg2_dxva2_hwaccel_select="mpeg2video_decoder" 
mpeg2_nvdec_hwaccel_deps="nvdec" diff --git a/libavcodec/Makefile b/libavcodec/Makefile index d2940aad4c..98d4ff814d 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -1018,6 +1018,7 @@ OBJS-$(CONFIG_MPEG1_VDPAU_HWACCEL) += vdpau_mpeg12.o OBJS-$(CONFIG_MPEG1_VIDEOTOOLBOX_HWACCEL) += videotoolbox.o OBJS-$(CONFIG_MPEG2_D3D11VA_HWACCEL) += dxva2_mpeg2.o OBJS-$(CONFIG_MPEG2_DXVA2_HWACCEL) += dxva2_mpeg2.o +OBJS-$(CONFIG_MPEG2_D3D12VA_HWACCEL) += dxva2_mpeg2.o d3d12va_mpeg2.o OBJS-$(CONFIG_MPEG2_NVDEC_HWACCEL) += nvdec_mpeg12.o OBJS-$(CONFIG_MPEG2_QSV_HWACCEL) += qsvdec.o OBJS-$(CONFIG_MPEG2_VAAPI_HWACCEL) += vaapi_mpeg2.o diff --git a/libavcodec/d3d12va_mpeg2.c b/libavcodec/d3d12va_mpeg2.c new file mode 100644 index 0000000000..3f93a417c6 --- /dev/null +++ b/libavcodec/d3d12va_mpeg2.c @@ -0,0 +1,191 @@ +/* + * Direct3D12 MPEG-2 HW acceleration + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" +#include "libavutil/avassert.h" +#include "libavutil/hwcontext_d3d12va_internal.h" +#include "mpegutils.h" +#include "mpegvideodec.h" +#include "d3d12va.h" +#include "dxva2_internal.h" + +#define MAX_SLICES 1024 +#define INVALID_REF 0xffff + +#define REF_RESOURCE(index) if (index != INVALID_REF) { \ + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; \ +} + +typedef struct D3D12DecodePictureContext { + DXVA_PictureParameters pp; + DXVA_QmatrixData qm; + unsigned slice_count; + DXVA_SliceInfo slices[MAX_SLICES]; + const uint8_t *bitstream; + unsigned bitstream_size; +} D3D12DecodePictureContext; + +static int d3d12va_mpeg2_start_frame(AVCodecContext *avctx, av_unused const uint8_t *buffer, av_unused uint32_t size) +{ + const MpegEncContext *s = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + D3D12DecodePictureContext *ctx_pic = s->current_picture_ptr->hwaccel_picture_private; + DXVA_QmatrixData *qm = &ctx_pic->qm; + + if (!ctx) + return -1; + + av_assert0(ctx_pic); + + ff_dxva2_mpeg2_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->pp); + ff_dxva2_mpeg2_fill_quantization_matrices(avctx, (AVDXVAContext *)ctx, &ctx_pic->qm); + + // Post processing operations are not supported in D3D12 Video + ctx_pic->pp.wDeblockedPictureIndex = INVALID_REF; + + ctx_pic->bitstream = NULL; + ctx_pic->bitstream_size = 0; + ctx_pic->slice_count = 0; + + return 0; +} + +static int d3d12va_mpeg2_decode_slice(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size) +{ + const MpegEncContext *s = avctx->priv_data; + D3D12DecodePictureContext *ctx_pic = s->current_picture_ptr->hwaccel_picture_private; + + int is_field = s->picture_structure != PICT_FRAME; + + if 
(ctx_pic->slice_count >= MAX_SLICES) { + return AVERROR(ERANGE); + } + + if (!ctx_pic->bitstream) + ctx_pic->bitstream = buffer; + ctx_pic->bitstream_size += size; + + ff_dxva2_mpeg2_fill_slice(avctx, &ctx_pic->slices[ctx_pic->slice_count++], + buffer - ctx_pic->bitstream, buffer, size); + + return 0; +} + +static int update_input_arguments(AVCodecContext *avctx, D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *input_args, ID3D12Resource *buffer) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + const MpegEncContext *s = avctx->priv_data; + D3D12DecodePictureContext *ctx_pic = s->current_picture_ptr->hwaccel_picture_private; + + const int is_field = s->picture_structure != PICT_FRAME; + const unsigned mb_count = s->mb_width * (s->mb_height >> is_field); + + int i; + uint8_t *mapped_data = NULL; + D3D12_VIDEO_DECODE_FRAME_ARGUMENT *args = &input_args->FrameArguments[input_args->NumFrameArguments++]; + + D3D12_RANGE range = { + .Begin = 0, + .End = ctx_pic->bitstream_size, + }; + + if (FAILED(ID3D12Resource_Map(buffer, 0, &range, &mapped_data))) { + av_log(avctx, AV_LOG_ERROR, "Failed to map D3D12 Buffer resource!\n"); + return AVERROR(EINVAL); + } + + for (i = 0; i < ctx_pic->slice_count; i++) { + DXVA_SliceInfo *slice = &ctx_pic->slices[i]; + + if (i < ctx_pic->slice_count - 1) + slice->wNumberMBsInSlice = slice[1].wNumberMBsInSlice - slice[0].wNumberMBsInSlice; + else + slice->wNumberMBsInSlice = mb_count - slice[0].wNumberMBsInSlice; + } + + memcpy(mapped_data, ctx_pic->bitstream, ctx_pic->bitstream_size); + + ID3D12Resource_Unmap(buffer, 0, &range); + + args->Type = D3D12_VIDEO_DECODE_ARGUMENT_TYPE_SLICE_CONTROL; + args->Size = sizeof(DXVA_SliceInfo) * ctx_pic->slice_count; + args->pData = ctx_pic->slices; + + input_args->CompressedBitstream = (D3D12_VIDEO_DECODE_COMPRESSED_BITSTREAM){ + .pBuffer = buffer, + .Offset = 0, + .Size 
= ctx_pic->bitstream_size, + }; + + REF_RESOURCE(ctx_pic->pp.wDecodedPictureIndex ) + REF_RESOURCE(ctx_pic->pp.wForwardRefPictureIndex ) + REF_RESOURCE(ctx_pic->pp.wBackwardRefPictureIndex) + + return 0; +} + +static int d3d12va_mpeg2_end_frame(AVCodecContext *avctx) +{ + int ret; + MpegEncContext *s = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + D3D12DecodePictureContext *ctx_pic = s->current_picture_ptr->hwaccel_picture_private; + + if (ctx_pic->slice_count <= 0 || ctx_pic->bitstream_size <= 0) + return -1; + + ret = ff_d3d12va_common_end_frame(avctx, s->current_picture_ptr->f, &ctx_pic->pp, sizeof(ctx_pic->pp), + &ctx_pic->qm, sizeof(ctx_pic->qm), update_input_arguments); + if (!ret) + ff_mpeg_draw_horiz_band(s, 0, avctx->height); + + return ret; +} + +static int d3d12va_mpeg2_decode_init(AVCodecContext *avctx) +{ + const MpegEncContext *s = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_MPEG2; + + return ff_d3d12va_decode_init(avctx); +} + +#if CONFIG_MPEG2_D3D12VA_HWACCEL +const AVHWAccel ff_mpeg2_d3d12va_hwaccel = { + .name = "mpeg2_d3d12va", + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_MPEG2VIDEO, + .pix_fmt = AV_PIX_FMT_D3D12, + .init = d3d12va_mpeg2_decode_init, + .uninit = ff_d3d12va_decode_uninit, + .start_frame = d3d12va_mpeg2_start_frame, + .decode_slice = d3d12va_mpeg2_decode_slice, + .end_frame = d3d12va_mpeg2_end_frame, + .frame_params = ff_d3d12va_common_frame_params, + .frame_priv_data_size = sizeof(D3D12DecodePictureContext), + .priv_data_size = sizeof(D3D12VADecodeContext), +}; +#endif diff --git a/libavcodec/dxva2_internal.h b/libavcodec/dxva2_internal.h index 5f317ad0fe..0c2097001c 100644 --- a/libavcodec/dxva2_internal.h +++ b/libavcodec/dxva2_internal.h @@ -176,4 +176,10 @@ int ff_dxva2_vp9_fill_picture_parameters(const AVCodecContext *avctx, AVDXVACont int ff_dxva2_av1_fill_picture_parameters(const AVCodecContext 
*avctx, AVDXVAContext *ctx, DXVA_PicParams_AV1 *pp); #endif +void ff_dxva2_mpeg2_fill_picture_parameters(AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PictureParameters *pp); + +void ff_dxva2_mpeg2_fill_quantization_matrices(AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_QmatrixData *qm); + +void ff_dxva2_mpeg2_fill_slice(AVCodecContext *avctx, DXVA_SliceInfo *slice, unsigned position, const uint8_t *buffer, unsigned size); + #endif /* AVCODEC_DXVA2_INTERNAL_H */ diff --git a/libavcodec/dxva2_mpeg2.c b/libavcodec/dxva2_mpeg2.c index 1989c588dc..2e8e545113 100644 --- a/libavcodec/dxva2_mpeg2.c +++ b/libavcodec/dxva2_mpeg2.c @@ -39,11 +39,11 @@ struct dxva2_picture_context { unsigned bitstream_size; }; -static void fill_picture_parameters(AVCodecContext *avctx, +void ff_dxva2_mpeg2_fill_picture_parameters(AVCodecContext *avctx, AVDXVAContext *ctx, - const struct MpegEncContext *s, DXVA_PictureParameters *pp) { + const struct MpegEncContext *s = avctx->priv_data; const Picture *current_picture = s->current_picture_ptr; int is_field = s->picture_structure != PICT_FRAME; @@ -105,11 +105,11 @@ static void fill_picture_parameters(AVCodecContext *avctx, pp->bBitstreamConcealmentMethod = 0; } -static void fill_quantization_matrices(AVCodecContext *avctx, +void ff_dxva2_mpeg2_fill_quantization_matrices(AVCodecContext *avctx, AVDXVAContext *ctx, - const struct MpegEncContext *s, DXVA_QmatrixData *qm) { + const struct MpegEncContext *s = avctx->priv_data; int i; for (i = 0; i < 4; i++) qm->bNewQmatrix[i] = 1; @@ -122,12 +122,12 @@ static void fill_quantization_matrices(AVCodecContext *avctx, } } -static void fill_slice(AVCodecContext *avctx, - const struct MpegEncContext *s, +void ff_dxva2_mpeg2_fill_slice(AVCodecContext *avctx, DXVA_SliceInfo *slice, unsigned position, const uint8_t *buffer, unsigned size) { + const struct MpegEncContext *s = avctx->priv_data; int is_field = s->picture_structure != PICT_FRAME; GetBitContext gb; @@ -265,8 +265,8 @@ static int 
dxva2_mpeg2_start_frame(AVCodecContext *avctx, return -1; assert(ctx_pic); - fill_picture_parameters(avctx, ctx, s, &ctx_pic->pp); - fill_quantization_matrices(avctx, ctx, s, &ctx_pic->qm); + ff_dxva2_mpeg2_fill_picture_parameters(avctx, ctx, &ctx_pic->pp); + ff_dxva2_mpeg2_fill_quantization_matrices(avctx, ctx, &ctx_pic->qm); ctx_pic->slice_count = 0; ctx_pic->bitstream_size = 0; @@ -292,7 +292,7 @@ static int dxva2_mpeg2_decode_slice(AVCodecContext *avctx, ctx_pic->bitstream_size += size; position = buffer - ctx_pic->bitstream; - fill_slice(avctx, s, &ctx_pic->slice[ctx_pic->slice_count++], position, + ff_dxva2_mpeg2_fill_slice(avctx, &ctx_pic->slice[ctx_pic->slice_count++], position, buffer, size); return 0; } diff --git a/libavcodec/hwaccels.h b/libavcodec/hwaccels.h index 517e70d5c4..7443eaa5d8 100644 --- a/libavcodec/hwaccels.h +++ b/libavcodec/hwaccels.h @@ -56,6 +56,7 @@ extern const AVHWAccel ff_mpeg1_vdpau_hwaccel; extern const AVHWAccel ff_mpeg1_videotoolbox_hwaccel; extern const AVHWAccel ff_mpeg2_d3d11va_hwaccel; extern const AVHWAccel ff_mpeg2_d3d11va2_hwaccel; +extern const AVHWAccel ff_mpeg2_d3d12va_hwaccel; extern const AVHWAccel ff_mpeg2_nvdec_hwaccel; extern const AVHWAccel ff_mpeg2_dxva2_hwaccel; extern const AVHWAccel ff_mpeg2_vaapi_hwaccel; diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c index 27c862ffb2..2ceec3f9f0 100644 --- a/libavcodec/mpeg12dec.c +++ b/libavcodec/mpeg12dec.c @@ -1129,6 +1129,9 @@ static const enum AVPixelFormat mpeg2_hwaccel_pixfmt_list_420[] = { AV_PIX_FMT_D3D11VA_VLD, AV_PIX_FMT_D3D11, #endif +#if CONFIG_MPEG2_D3D12VA_HWACCEL + AV_PIX_FMT_D3D12, +#endif #if CONFIG_MPEG2_VAAPI_HWACCEL AV_PIX_FMT_VAAPI, #endif @@ -2922,6 +2925,9 @@ const FFCodec ff_mpeg2video_decoder = { #if CONFIG_MPEG2_D3D11VA2_HWACCEL HWACCEL_D3D11VA2(mpeg2), #endif +#if CONFIG_MPEG2_D3D12VA_HWACCEL + HWACCEL_D3D12VA(mpeg2), +#endif #if CONFIG_MPEG2_NVDEC_HWACCEL HWACCEL_NVDEC(mpeg2), #endif From patchwork Fri Jun 2 08:06:59 2023 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41957 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp1055638pzb; Fri, 2 Jun 2023 01:13:03 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5Wxw2fOJY9YNsmkQT6Hmih+cXZSt3QU/2oTUr3tQbX3M7H9JmMSW7ybOuXi5+QaG7oYzGy X-Received: by 2002:a17:907:d19:b0:974:9aa2:8577 with SMTP id gn25-20020a1709070d1900b009749aa28577mr626221ejc.9.1685693583365; Fri, 02 Jun 2023 01:13:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685693583; cv=none; d=google.com; s=arc-20160816; b=MpBpp/4Btkqjk6Ct47STpssv5Gu/E4Jt/tiYiSoehECH0MSU79bnKmrT/cK2UGf3x0 aHuJsFxY/hV4CkI9CTFS5PsFSzu/I/xPY4htTFCkDWlsCLNQN4ucHU1FFHfpIiR7s9El 5BOdsLkJWR56NyZeQ2GyFaTpaQrQNi0uWIZg0UeXmC4G9yhLPni3mHFb8iJGs/Qivtlh vIDnqLb6BhCE/k4dfVd6geiol/0rPoAM3scxIRGd4vCfRFM5TDUSLGw7bfzjeSAOuych oO/N673bky6wkyDrkjLUmeaYZqQwz6ZtYOAXYGdsmMoefXJ3aZPVp+VjBnfAa6qUlPDA 8PJg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=4qGIuClPCcEyV2pp2Oj0MqQGuMVHnwbij/HvIKllm7g=; b=Wyo7SiIrOCxMb4ruAyAmd7I/ACabCPUkhxTOYN/vz6DEVSP+OpuY7/3IpO+CfJ0XDP 8f/47HflSwmk9V49hogahIRRLPiGUyb06AmR/CXXCns4HG6z+s0NR/9BdUYzjBYjeF5h mL26xRWeUTuga1/GkGbV7ds5oK7FOvibr18sCVe8kpM7++46HDpKLVNr9pjv3g8EvGas yMQB5AtbZ0lP+g71/q3endS/Zv5odn7Yt48kXLM/Bpb+H1kNQi44fbVk4u95ZeEngHwm 2RcFl99P3NeLRVG6GkFlVA91iP4OtD1OOsXnXdMvKu5V8qhxSOymrGk5GWWK4R/TDoA1 XuTQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=gvWDH9mF; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) 
smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id bm10-20020a0564020b0a00b0051640238e83si271620edb.319.2023.06.02.01.13.03; Fri, 02 Jun 2023 01:13:03 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=gvWDH9mF; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 871F768C358; Fri, 2 Jun 2023 11:11:53 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id A8ADB68C337 for ; Fri, 2 Jun 2023 11:11:44 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685693510; x=1717229510; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Y0o5iqcMOYMOiF7MtIaH7ikn3OnzitLjdC4TmH9Dyow=; b=gvWDH9mFNSJpYBlM9AamrlwGjZ9ORlLhW72ZBnavgFK319aO2gm0pOXg UaF9qxnxYIzGoNdPdXMN43LnFW5oZ75WidEI4MM4XoUBtdN8D+e/qAjMl OI+Li66U1P/3TI1zzka4VUIZKZF/lqJM816McFjjCuXgJiU5gsOiWm7vG LfMkVsq6ccWAFCAG3v2FGiCJBkbKEouPoKOgs7Jk+Qj9jaLgtg36jLgZ8 tNId+hyYy8qcdiKY0qa/ZSm09aLbLUMW3ZJhrOOAk3NLJjnEUbY82mBvc BKNNZh4EnRdgaitrLbENJ18gD6bnlNbw8rtVP0tuhpv4ux9zYbx9u6XeI Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="421629925" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="421629925" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2023 01:11:27 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="852060685" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="852060685" Received: from desktop-qn7n0nf.sh.intel.com (HELO localhost.localdomain) ([10.239.160.59]) by fmsmga001.fm.intel.com with ESMTP; 02 Jun 2023 01:11:26 -0700 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:06:59 +0800 Message-Id: <20230602080701.1754-7-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 7/9] avcodec: add D3D12VA hardware accelerated VC1 decoding X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Tong Wu , Wu Jianhua Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 8NBi/7jiwB8r From: Wu Jianhua The command below is how to enable d3d12va: ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4 Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- configure | 3 + libavcodec/Makefile | 1 + libavcodec/d3d12va_vc1.c | 214 ++++++++++++++++++++++++++++++++++++ libavcodec/dxva2_internal.h | 4 + libavcodec/dxva2_vc1.c | 11 +- libavcodec/hwaccels.h | 2 + libavcodec/vc1dec.c | 9 ++ 7 files changed, 239 insertions(+), 5 deletions(-) create mode 100644 libavcodec/d3d12va_vc1.c diff --git a/configure b/configure index 9f8c535f5c..c4a93a9d6e 100755 --- a/configure +++ b/configure @@ -3107,6 +3107,8 @@ vc1_d3d11va_hwaccel_deps="d3d11va" vc1_d3d11va_hwaccel_select="vc1_decoder" vc1_d3d11va2_hwaccel_deps="d3d11va" vc1_d3d11va2_hwaccel_select="vc1_decoder" +vc1_d3d12va_hwaccel_deps="d3d12va" +vc1_d3d12va_hwaccel_select="vc1_decoder" 
vc1_dxva2_hwaccel_deps="dxva2" vc1_dxva2_hwaccel_select="vc1_decoder" vc1_nvdec_hwaccel_deps="nvdec" @@ -3137,6 +3139,7 @@ vp9_videotoolbox_hwaccel_deps="videotoolbox" vp9_videotoolbox_hwaccel_select="vp9_decoder" wmv3_d3d11va_hwaccel_select="vc1_d3d11va_hwaccel" wmv3_d3d11va2_hwaccel_select="vc1_d3d11va2_hwaccel" +wmv3_d3d12va_hwaccel_select="vc1_d3d12va_hwaccel" wmv3_dxva2_hwaccel_select="vc1_dxva2_hwaccel" wmv3_nvdec_hwaccel_select="vc1_nvdec_hwaccel" wmv3_vaapi_hwaccel_select="vc1_vaapi_hwaccel" diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 98d4ff814d..9d5350d6e1 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -1030,6 +1030,7 @@ OBJS-$(CONFIG_MPEG4_VDPAU_HWACCEL) += vdpau_mpeg4.o OBJS-$(CONFIG_MPEG4_VIDEOTOOLBOX_HWACCEL) += videotoolbox.o OBJS-$(CONFIG_VC1_D3D11VA_HWACCEL) += dxva2_vc1.o OBJS-$(CONFIG_VC1_DXVA2_HWACCEL) += dxva2_vc1.o +OBJS-$(CONFIG_VC1_D3D12VA_HWACCEL) += dxva2_vc1.o d3d12va_vc1.o OBJS-$(CONFIG_VC1_NVDEC_HWACCEL) += nvdec_vc1.o OBJS-$(CONFIG_VC1_QSV_HWACCEL) += qsvdec.o OBJS-$(CONFIG_VC1_VAAPI_HWACCEL) += vaapi_vc1.o diff --git a/libavcodec/d3d12va_vc1.c b/libavcodec/d3d12va_vc1.c new file mode 100644 index 0000000000..d577582a3f --- /dev/null +++ b/libavcodec/d3d12va_vc1.c @@ -0,0 +1,214 @@ +/* + * Direct3D12 WMV3/VC-1 HW acceleration + * + * copyright (c) 2022-2023 Wu Jianhua + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" +#include "libavutil/avassert.h" +#include "libavutil/hwcontext_d3d12va_internal.h" +#include "mpegutils.h" +#include "mpegvideodec.h" +#include "vc1.h" +#include "vc1data.h" +#include "d3d12va.h" +#include "dxva2_internal.h" + +#define MAX_SLICES 1024 +#define INVALID_REF 0xffff + +#define REF_RESOURCE(index) if (index != INVALID_REF) { \ + ctx->ref_resources[index] = frames_hwctx->texture_infos[index].texture; \ +} + +typedef struct D3D12DecodePictureContext { + DXVA_PictureParameters pp; + unsigned slice_count; + DXVA_SliceInfo slices[MAX_SLICES]; + const uint8_t *bitstream; + unsigned bitstream_size; +} D3D12DecodePictureContext; + +static int d3d12va_vc1_start_frame(AVCodecContext *avctx, av_unused const uint8_t *buffer, av_unused uint32_t size) +{ + const VC1Context *v = avctx->priv_data; + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + D3D12DecodePictureContext *ctx_pic = v->s.current_picture_ptr->hwaccel_picture_private; + + if (!ctx) + return -1; + + av_assert0(ctx_pic); + + ff_dxva2_vc1_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->pp); + ctx_pic->pp.wDeblockedPictureIndex = INVALID_REF; + + ctx_pic->bitstream = NULL; + ctx_pic->bitstream_size = 0; + ctx_pic->slice_count = 0; + + return 0; +} + +static int d3d12va_vc1_decode_slice(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size) +{ + const VC1Context *v = avctx->priv_data; + D3D12DecodePictureContext *ctx_pic = v->s.current_picture_ptr->hwaccel_picture_private; + + if (ctx_pic->slice_count >= MAX_SLICES) { + return AVERROR(ERANGE); + } + + if (avctx->codec_id == AV_CODEC_ID_VC1 && + size >= 4 && IS_MARKER(AV_RB32(buffer))) { + buffer += 4; + size -= 4; + } + + if (!ctx_pic->bitstream) + ctx_pic->bitstream = 
buffer; + ctx_pic->bitstream_size += size; + + ff_dxva2_vc1_fill_slice(avctx, &ctx_pic->slices[ctx_pic->slice_count++], + buffer - ctx_pic->bitstream, size); + + return 0; +} + +static int update_input_arguments(AVCodecContext *avctx, D3D12_VIDEO_DECODE_INPUT_STREAM_ARGUMENTS *input_args, ID3D12Resource *buffer) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + AVHWFramesContext *frames_ctx = D3D12VA_FRAMES_CONTEXT(avctx); + AVD3D12VAFramesContext *frames_hwctx = frames_ctx->hwctx; + const VC1Context *v = avctx->priv_data; + const MpegEncContext *s = &v->s; + D3D12DecodePictureContext *ctx_pic = s->current_picture_ptr->hwaccel_picture_private; + D3D12_VIDEO_DECODE_FRAME_ARGUMENT *args = &input_args->FrameArguments[input_args->NumFrameArguments++]; + + const unsigned mb_count = s->mb_width * (s->mb_height >> v->field_mode); + uint8_t *mapped_data, *mapped_ptr; + + static const uint8_t start_code[] = { 0, 0, 1, 0x0d }; + + if (FAILED(ID3D12Resource_Map(buffer, 0, NULL, &mapped_data))) { + av_log(avctx, AV_LOG_ERROR, "Failed to map D3D12 Buffer resource!\n"); + return AVERROR(EINVAL); + } + + mapped_ptr = mapped_data; + for (int i = 0; i < ctx_pic->slice_count; i++) { + DXVA_SliceInfo *slice = &ctx_pic->slices[i]; + unsigned position = slice->dwSliceDataLocation; + unsigned size = slice->dwSliceBitsInBuffer / 8; + + slice->dwSliceDataLocation = mapped_ptr - mapped_data; + if (i < ctx_pic->slice_count - 1) + slice->wNumberMBsInSlice = slice[1].wNumberMBsInSlice - slice[0].wNumberMBsInSlice; + else + slice->wNumberMBsInSlice = mb_count - slice[0].wNumberMBsInSlice; + + if (avctx->codec_id == AV_CODEC_ID_VC1) { + memcpy(mapped_ptr, start_code, sizeof(start_code)); + if (i == 0 && v->second_field) + mapped_ptr[3] = 0x0c; + else if (i > 0) + mapped_ptr[3] = 0x0b; + + mapped_ptr += sizeof(start_code); + slice->dwSliceBitsInBuffer += sizeof(start_code) * 8; + } + + memcpy(mapped_ptr, &ctx_pic->bitstream[position], size); + mapped_ptr += size; + } + + 
ID3D12Resource_Unmap(buffer, 0, NULL); + + args->Type = D3D12_VIDEO_DECODE_ARGUMENT_TYPE_SLICE_CONTROL; + args->Size = sizeof(DXVA_SliceInfo) * ctx_pic->slice_count; + args->pData = ctx_pic->slices; + + input_args->CompressedBitstream = (D3D12_VIDEO_DECODE_COMPRESSED_BITSTREAM){ + .pBuffer = buffer, + .Offset = 0, + .Size = mapped_ptr - mapped_data, + }; + + REF_RESOURCE(ctx_pic->pp.wDecodedPictureIndex ) + REF_RESOURCE(ctx_pic->pp.wForwardRefPictureIndex ) + REF_RESOURCE(ctx_pic->pp.wBackwardRefPictureIndex) + + return 0; +} + +static int d3d12va_vc1_end_frame(AVCodecContext *avctx) +{ + const VC1Context *v = avctx->priv_data; + D3D12DecodePictureContext *ctx_pic = v->s.current_picture_ptr->hwaccel_picture_private; + + if (ctx_pic->slice_count <= 0 || ctx_pic->bitstream_size <= 0) + return -1; + + return ff_d3d12va_common_end_frame(avctx, v->s.current_picture_ptr->f, + &ctx_pic->pp, sizeof(ctx_pic->pp), + NULL, 0, + update_input_arguments); +} + +static int d3d12va_vc1_decode_init(AVCodecContext *avctx) +{ + D3D12VADecodeContext *ctx = D3D12VA_DECODE_CONTEXT(avctx); + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_VC1; + + return ff_d3d12va_decode_init(avctx); +} + +#if CONFIG_WMV3_D3D12VA_HWACCEL +const AVHWAccel ff_wmv3_d3d12va_hwaccel = { + .name = "wmv3_d3d12va", + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_WMV3, + .pix_fmt = AV_PIX_FMT_D3D12, + .init = d3d12va_vc1_decode_init, + .uninit = ff_d3d12va_decode_uninit, + .start_frame = d3d12va_vc1_start_frame, + .decode_slice = d3d12va_vc1_decode_slice, + .end_frame = d3d12va_vc1_end_frame, + .frame_params = ff_d3d12va_common_frame_params, + .frame_priv_data_size = sizeof(D3D12DecodePictureContext), + .priv_data_size = sizeof(D3D12VADecodeContext), +}; +#endif + +#if CONFIG_VC1_D3D12VA_HWACCEL +const AVHWAccel ff_vc1_d3d12va_hwaccel = { + .name = "vc1_d3d12va", + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_VC1, + .pix_fmt = AV_PIX_FMT_D3D12, + .init = d3d12va_vc1_decode_init, + .uninit = 
ff_d3d12va_decode_uninit, + .start_frame = d3d12va_vc1_start_frame, + .decode_slice = d3d12va_vc1_decode_slice, + .end_frame = d3d12va_vc1_end_frame, + .frame_params = ff_d3d12va_common_frame_params, + .frame_priv_data_size = sizeof(D3D12DecodePictureContext), + .priv_data_size = sizeof(D3D12VADecodeContext), +}; +#endif diff --git a/libavcodec/dxva2_internal.h b/libavcodec/dxva2_internal.h index 0c2097001c..2532e54877 100644 --- a/libavcodec/dxva2_internal.h +++ b/libavcodec/dxva2_internal.h @@ -182,4 +182,8 @@ void ff_dxva2_mpeg2_fill_quantization_matrices(AVCodecContext *avctx, AVDXVACont void ff_dxva2_mpeg2_fill_slice(AVCodecContext *avctx, DXVA_SliceInfo *slice, unsigned position, const uint8_t *buffer, unsigned size); +void ff_dxva2_vc1_fill_picture_parameters(AVCodecContext *avctx, AVDXVAContext *ctx, DXVA_PictureParameters *pp); + +void ff_dxva2_vc1_fill_slice(AVCodecContext *avctx, DXVA_SliceInfo *slice, unsigned position, unsigned size); + #endif /* AVCODEC_DXVA2_INTERNAL_H */ diff --git a/libavcodec/dxva2_vc1.c b/libavcodec/dxva2_vc1.c index 12e3de59ec..be1baa418e 100644 --- a/libavcodec/dxva2_vc1.c +++ b/libavcodec/dxva2_vc1.c @@ -39,10 +39,11 @@ struct dxva2_picture_context { unsigned bitstream_size; }; -static void fill_picture_parameters(AVCodecContext *avctx, - AVDXVAContext *ctx, const VC1Context *v, +void ff_dxva2_vc1_fill_picture_parameters(AVCodecContext *avctx, + AVDXVAContext *ctx, DXVA_PictureParameters *pp) { + const VC1Context *v = avctx->priv_data; const MpegEncContext *s = &v->s; const Picture *current_picture = s->current_picture_ptr; int intcomp = 0; @@ -162,7 +163,7 @@ static void fill_picture_parameters(AVCodecContext *avctx, pp->bBitstreamConcealmentMethod = 0; } -static void fill_slice(AVCodecContext *avctx, DXVA_SliceInfo *slice, +void ff_dxva2_vc1_fill_slice(AVCodecContext *avctx, DXVA_SliceInfo *slice, unsigned position, unsigned size) { const VC1Context *v = avctx->priv_data; @@ -321,7 +322,7 @@ static int 
dxva2_vc1_start_frame(AVCodecContext *avctx, return -1; assert(ctx_pic); - fill_picture_parameters(avctx, ctx, v, &ctx_pic->pp); + ff_dxva2_vc1_fill_picture_parameters(avctx, ctx, &ctx_pic->pp); ctx_pic->slice_count = 0; ctx_pic->bitstream_size = 0; @@ -355,7 +356,7 @@ static int dxva2_vc1_decode_slice(AVCodecContext *avctx, ctx_pic->bitstream_size += size; position = buffer - ctx_pic->bitstream; - fill_slice(avctx, &ctx_pic->slice[ctx_pic->slice_count++], position, size); + ff_dxva2_vc1_fill_slice(avctx, &ctx_pic->slice[ctx_pic->slice_count++], position, size); return 0; } diff --git a/libavcodec/hwaccels.h b/libavcodec/hwaccels.h index 7443eaa5d8..79c4db3624 100644 --- a/libavcodec/hwaccels.h +++ b/libavcodec/hwaccels.h @@ -69,6 +69,7 @@ extern const AVHWAccel ff_mpeg4_videotoolbox_hwaccel; extern const AVHWAccel ff_prores_videotoolbox_hwaccel; extern const AVHWAccel ff_vc1_d3d11va_hwaccel; extern const AVHWAccel ff_vc1_d3d11va2_hwaccel; +extern const AVHWAccel ff_vc1_d3d12va_hwaccel; extern const AVHWAccel ff_vc1_dxva2_hwaccel; extern const AVHWAccel ff_vc1_nvdec_hwaccel; extern const AVHWAccel ff_vc1_vaapi_hwaccel; @@ -85,6 +86,7 @@ extern const AVHWAccel ff_vp9_vdpau_hwaccel; extern const AVHWAccel ff_vp9_videotoolbox_hwaccel; extern const AVHWAccel ff_wmv3_d3d11va_hwaccel; extern const AVHWAccel ff_wmv3_d3d11va2_hwaccel; +extern const AVHWAccel ff_wmv3_d3d12va_hwaccel; extern const AVHWAccel ff_wmv3_dxva2_hwaccel; extern const AVHWAccel ff_wmv3_nvdec_hwaccel; extern const AVHWAccel ff_wmv3_vaapi_hwaccel; diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c index 9e343d003f..db1e667dfc 100644 --- a/libavcodec/vc1dec.c +++ b/libavcodec/vc1dec.c @@ -1385,6 +1385,9 @@ static const enum AVPixelFormat vc1_hwaccel_pixfmt_list_420[] = { AV_PIX_FMT_D3D11VA_VLD, AV_PIX_FMT_D3D11, #endif +#if CONFIG_VC1_D3D12VA_HWACCEL + AV_PIX_FMT_D3D12, +#endif #if CONFIG_VC1_NVDEC_HWACCEL AV_PIX_FMT_CUDA, #endif @@ -1420,6 +1423,9 @@ const FFCodec ff_vc1_decoder = { #if 
CONFIG_VC1_D3D11VA2_HWACCEL HWACCEL_D3D11VA2(vc1), #endif +#if CONFIG_VC1_D3D12VA_HWACCEL + HWACCEL_D3D12VA(vc1), +#endif #if CONFIG_VC1_NVDEC_HWACCEL HWACCEL_NVDEC(vc1), #endif @@ -1457,6 +1463,9 @@ const FFCodec ff_wmv3_decoder = { #if CONFIG_WMV3_D3D11VA2_HWACCEL HWACCEL_D3D11VA2(wmv3), #endif +#if CONFIG_WMV3_D3D12VA_HWACCEL + HWACCEL_D3D12VA(wmv3), +#endif #if CONFIG_WMV3_NVDEC_HWACCEL HWACCEL_NVDEC(wmv3), #endif From patchwork Fri Jun 2 08:07:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41955 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp1055492pzb; Fri, 2 Jun 2023 01:12:46 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6z+6I/KqFd0paEYdi4CCC2rYMmdvBoNTQGJeatwltD4ItinrxyXh5OllKDdbAekVlLfvMI X-Received: by 2002:a17:907:748:b0:94e:116:8581 with SMTP id xc8-20020a170907074800b0094e01168581mr10920244ejb.5.1685693566412; Fri, 02 Jun 2023 01:12:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685693566; cv=none; d=google.com; s=arc-20160816; b=ZHtM4fX9I5QDg9b1Lkf7Sfex/RlWhnr72bghthJ7lUGID/hxcBN+fSSDkZH5chDB7f WNGSENfBp0zXhs7DqToGg+f5ogMQIpLZjKjKcGdOlv5+WdEbIvV43nPEwtXVEggBm+P8 VHE1zdlhylT5ZW+ortgzClUbMwjfrp3qdbdDEK3vjkc6TaykYxnI/RwAweHrTBMwd5q5 LUawZQ3FMxZKQZvcZqTVbgwysRl828MSgyib5TVvOD1Pq6Tv30rH8wbCY9VeV2L1fCHf 4ziH80FRKg00pOgxVYuZtlaAxN14YwA2aF4ONaYnkFv6tjjTr4g5Rb+6pSZjzYwx8dJ2 gzUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=njyck86I2PoQkhzrR4G+POWmCtxI9VZQAGuFth8q0yY=; b=QNiq4iKSMwntNx+UzH09iiE3APhsMRtCn4FqQ5B5gS3ZVtUQ3dvbzpJ+fuDlUZZHLJ SyOdgkUqKim9CQjMR3xPTBn8nW8jYjbFsUMVVO52n8hZQJpV9aWy15s731HuiQGxRngH 
cKTcojP0nkl5m4JJUirB9gzozXQ6kfkEm9Nj3UQOXL+sKECAY6PEjSxrUo4wvppiP+TN ceFakRJXitALQPPPDLHnS8mf99FwhZASPNDwDCkqYh83sP0xsaYy5N6g38G9hnIIm7co vsDKr/XB+v00tgdZJmsdPKIgxp09DooOMQuze2Mnc5cDBUY9TLsWz5ZS5Zqlo6iXK4WL O/cg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b="P9W/aHDd"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id w13-20020aa7cb4d000000b005149e886287si486912edt.94.2023.06.02.01.12.46; Fri, 02 Jun 2023 01:12:46 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b="P9W/aHDd"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 770C068C33D; Fri, 2 Jun 2023 11:11:51 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id ECDA468C334 for ; Fri, 2 Jun 2023 11:11:48 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685693509; x=1717229509; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WelcEbbfM8s4noKGfpKbdOMbZScHyMb+HehSa9ftff0=; b=P9W/aHDdO5nuacKyr24CAuYR+WVd3f1jFkTxdJMIbYQW+lnb0C1FIEVN Hg2I8kORmepuVRof4RACeSd1O6VGm2AUF9gsb+guWUwJp3zqZzrti4fTb 
vrUnv8QNeaT3JqI5yJ7IWnkuA3QujLRkwjohlJdfysG/cg3YfkCnQhg/a FYPPGmZvUiu8wdWTK8PbN6FxSYGUply+T7Rn4uAjxXRnyMEIW2xl5yHor qgtdFZPvCQqHSaUDhUDr1/UETI8bQXtpoXeFLFLVJYPXyvR87EQZhGoE0 Mc14ukbXT7WnRiZlKEIHvw4sH8cVKIU1MJ/4mpQsoUAJpFPnYRnHm1b3Z g==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="421629936" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="421629936" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2023 01:11:28 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="852060688" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="852060688" Received: from desktop-qn7n0nf.sh.intel.com (HELO localhost.localdomain) ([10.239.160.59]) by fmsmga001.fm.intel.com with ESMTP; 02 Jun 2023 01:11:27 -0700 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:07:00 +0800 Message-Id: <20230602080701.1754-8-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 8/9] Changelog: D3D12VA hardware accelerated H264, HEVC, VP9, AV1, MPEG-2 and VC1 decoding X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Tong Wu , Wu Jianhua Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 5ceEkVO62sYO From: Wu Jianhua Signed-off-by: Wu Jianhua Signed-off-by: Tong Wu --- Changelog | 1 + 1 file changed, 1 insertion(+) diff --git a/Changelog b/Changelog index 3d197f2b94..d03feb485d 100644 --- a/Changelog +++ b/Changelog @@ -14,6 +14,7 @@ version : - color_vulkan filter - bwdif_vulkan filter - nlmeans_vulkan filter +- D3D12VA hardware accelerated H264, HEVC, VP9, AV1, 
MPEG-2 and VC1 decoding version 6.0: - Radiance HDR image support From patchwork Fri Jun 2 08:07:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wu, Tong1" X-Patchwork-Id: 41956 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp1055556pzb; Fri, 2 Jun 2023 01:12:54 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5vcBRjX+RZhaRhTWldy2tYebx1zscURDbM83zFLUbMGlwLXn1dGtUPYuFfc9xJKTFdz0Ip X-Received: by 2002:a17:907:72d6:b0:96f:45cd:6c21 with SMTP id du22-20020a17090772d600b0096f45cd6c21mr5167077ejc.30.1685693574712; Fri, 02 Jun 2023 01:12:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685693574; cv=none; d=google.com; s=arc-20160816; b=ylD4jFdts65Sq/EKli3qkLYC3tgDn2tK+u6Pkpm3ndNK0DZ4iDq8IcsE4CdDTxLB6I wjU4ZxEHc/eBAOMcWdaoQT9octFNiguz5dR25f9TOcU9G98hfg3bS7rIoFxCam0Vjd37 kMzC3SYNYxvkYxZY8IjPPV2QYJ5xJWFOiff4B8rjGN+X7V9QMQOoF+Y9K95gepEk6A7S CXVB6klCA5QCOqtEjMhToBffttGT5o3na2DmyH9P3QxRARCPgrNCeGX3wQXjvuKL2Il0 xzNqfD5m/xsbJ8jdKK0RooiY8QcM/n1v/TwwyXw4HM5LFJRwL/F//qEfTV33fWDOZcnH dKBQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=bdnVksLy9EYKlKNe0uFKv4FBQMhzP4bAgFkQkSVkw/A=; b=PuGFX4ADxg7Wa8NbvaxiagXFsMAb2FZjQrlpFKIECCtnCxsufyuL/QG5nd/8KJEVQN DPMRSM2hJmtKXf5utySbuIDcflM4W8p1Z+zwHh6O5m+clb2hSnSmY/a1X3ofnMq+Yf+7 cfzAsPjX3PW19KtfXyPlaROHAnPRR7bjkdBsbHaWN7nPjGMWZExguQbK7game5zLjJ9s XS4dav2+5wOF3KqaCR9+d1juijbgbiIlQ8UfbmF4oYJM3le7iLeXKxkkzriJ/p2iRwQJ KgS5Y6rNXUrRCltwbPEtBDpK3pBeM0jy73Q6fHYUcluiWQByrI5/18+yiV5VevR/PfAh fncQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=j3TJscXC; spf=pass 
(google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id fp29-20020a1709069e1d00b0096f5610177bsi497507ejc.103.2023.06.02.01.12.54; Fri, 02 Jun 2023 01:12:54 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@intel.com header.s=Intel header.b=j3TJscXC; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 885EA68C33B; Fri, 2 Jun 2023 11:11:52 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 9410868C33C for ; Fri, 2 Jun 2023 11:11:49 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685693510; x=1717229510; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TqUPpMMiio9aWicp1yORPJOnEJVLOh5Rr2n07YwwLdM=; b=j3TJscXCis0ulLA1qJWtegJDbrEcHBLRayRP9khrXzgiIeujhOjO19bM iOdpf2ymaLuN12I1JkFvi8t3OGQ6XMtZNa3mOwG3giNX6ySBnQBNDkzNY dFMy35ejdSqvz9cVifgB8fy5gz3U2ZwM7Q4O0cN2bRTpgAQ/yNA9SVqV3 PQbREk4BKBTtSGjKVWpiBAJfnb89YtUStRGOBhJsNFPLeI4fqQx2E8w3w tlSLLkMYP9ed85WsIPTvuwR6shlVoMD8N6YB2UIt3XBC1rIugQazKbvdJ STWSof8QTgyg4/c4YzrsufBWmuLRq3WdhE91d3vEcW+TF0hLGCxn6Xxz9 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="421629950" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="421629950" Received: from 
fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2023 01:11:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="852060691" X-IronPort-AV: E=Sophos;i="6.00,212,1681196400"; d="scan'208";a="852060691" Received: from desktop-qn7n0nf.sh.intel.com (HELO localhost.localdomain) ([10.239.160.59]) by fmsmga001.fm.intel.com with ESMTP; 02 Jun 2023 01:11:28 -0700 From: Tong Wu To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Jun 2023 16:07:01 +0800 Message-Id: <20230602080701.1754-9-tong1.wu@intel.com> X-Mailer: git-send-email 2.35.1.windows.2 In-Reply-To: <20230602080701.1754-1-tong1.wu@intel.com> References: <20230602080701.1754-1-tong1.wu@intel.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 9/9] avcodec/d3d12va_hevc: enable allow_profile_mismatch flag for d3d12va msp profile X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Tong Wu Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: z4xqpphh9wsV As with d3d11va, the allow_profile_mismatch hwaccel flag lets d3d12va decode the HEVC Main Still Picture profile by falling back to the HEVC Main decode profile. Users must set this flag when decoding Main Still Picture streams; without it, decoder init fails with EINVAL. 
Signed-off-by: Tong Wu --- libavcodec/d3d12va_hevc.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/libavcodec/d3d12va_hevc.c b/libavcodec/d3d12va_hevc.c index 1d94831e01..59ca500eb2 100644 --- a/libavcodec/d3d12va_hevc.c +++ b/libavcodec/d3d12va_hevc.c @@ -181,8 +181,13 @@ static int d3d12va_hevc_decode_init(AVCodecContext *avctx) break; case FF_PROFILE_HEVC_MAIN_STILL_PICTURE: - av_log(avctx, AV_LOG_ERROR, "D3D12 doesn't support PROFILE_HEVC_MAIN_STILL_PICTURE!\n"); - return AVERROR(EINVAL); + if (avctx->hwaccel_flags & AV_HWACCEL_FLAG_ALLOW_PROFILE_MISMATCH) { + ctx->cfg.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_HEVC_MAIN; + break; + } else { + av_log(avctx, AV_LOG_ERROR, "D3D12 doesn't support PROFILE_HEVC_MAIN_STILL_PICTURE!\n"); + return AVERROR(EINVAL); + } case FF_PROFILE_HEVC_MAIN: default: