From patchwork Mon Jan 22 23:56:59 2018
X-Patchwork-Submitter: Mark Thompson
X-Patchwork-Id: 7390
From: Mark Thompson
To: FFmpeg development discussions and patches
Message-ID: <02129811-5f4d-9437-36d4-bfa3f890c4bd@jkqxz.net>
Date: Mon, 22 Jan 2018 23:56:59 +0000
Subject: [FFmpeg-devel] [RFC] amfenc: Add support for OpenCL input

---
This allows passing OpenCL frames to AMF without a download/upload step to get around AMD's lack of support for D3D11 mapping.

For example:

./ffmpeg -hwaccel dxva2 -hwaccel_output_format dxva2_vld -i input.mp4 -an -vf 'hwmap=derive_device=opencl,program_opencl=source=examples.cl:kernel=rotate_image' -c:v h264_amf output.mp4

* I can't find any documentation or examples for these functions, so I'm guessing a bit exactly how they are meant to work. In particular, there are some locking functions which I have ignored because I have no idea under what circumstances something might want to be locked.

* I tried to write common parts with D3D11, but I might well have broken D3D11 support in the process - it doesn't work at all for me so I can't test it.

* Not sure how to get non-NV12 to work. I may be missing something, or it may just not be there - the trace messages suggest it doesn't like the width of RGB0 or the second plane of GRAY8.

- Mark

 libavcodec/amfenc.c | 178 +++++++++++++++++++++++++++++++++++-----------------
 libavcodec/amfenc.h |   1 +
 2 files changed, 123 insertions(+), 56 deletions(-)

diff --git a/libavcodec/amfenc.c b/libavcodec/amfenc.c
index 89a10ff253..220cdd278f 100644
--- a/libavcodec/amfenc.c
+++ b/libavcodec/amfenc.c
@@ -24,6 +24,9 @@
 #if CONFIG_D3D11VA
 #include "libavutil/hwcontext_d3d11va.h"
 #endif
+#if CONFIG_OPENCL
+#include "libavutil/hwcontext_opencl.h"
+#endif
 #include "libavutil/mem.h"
 #include "libavutil/pixdesc.h"
 #include "libavutil/time.h"
@@ -51,6 +54,9 @@ const enum AVPixelFormat ff_amf_pix_fmts[] = {
 #if CONFIG_D3D11VA
     AV_PIX_FMT_D3D11,
 #endif
+#if CONFIG_OPENCL
+    AV_PIX_FMT_OPENCL,
+#endif
     AV_PIX_FMT_NONE
 };
 
@@ -69,6 +75,7 @@ static const FormatMap format_map[] =
     { AV_PIX_FMT_YUV420P, AMF_SURFACE_YUV420P },
     { AV_PIX_FMT_YUYV422, AMF_SURFACE_YUY2 },
     { AV_PIX_FMT_D3D11, AMF_SURFACE_NV12 },
+    { AV_PIX_FMT_OPENCL, AMF_SURFACE_NV12 },
 };
 
 
@@ -154,8 +161,9 @@ static int amf_load_library(AVCodecContext *avctx)
 
 static int amf_init_context(AVCodecContext *avctx)
 {
-    AmfContext *ctx = avctx->priv_data;
-    AMF_RESULT res = AMF_OK;
+    AmfContext *ctx = avctx->priv_data;
+    AMF_RESULT res;
+    AVHWDeviceContext *hwdev = NULL;
 
     // configure AMF logger
     // the return of these functions indicates old state and do not affect behaviour
@@ -173,59 +181,91 @@ static int amf_init_context(AVCodecContext *avctx)
 
     res = ctx->factory->pVtbl->CreateContext(ctx->factory, &ctx->context);
     AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN, "CreateContext() failed with error %d\n", res);
-    // try to reuse existing DX device
-#if CONFIG_D3D11VA
+
+    // Attempt to initialise from an existing D3D11 or OpenCL device.
     if (avctx->hw_frames_ctx) {
-        AVHWFramesContext *device_ctx = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
-        if (device_ctx->device_ctx->type == AV_HWDEVICE_TYPE_D3D11VA) {
-            if (amf_av_to_amf_format(device_ctx->sw_format) != AMF_SURFACE_UNKNOWN) {
-                if (device_ctx->device_ctx->hwctx) {
-                    AVD3D11VADeviceContext *device_d3d11 = (AVD3D11VADeviceContext *)device_ctx->device_ctx->hwctx;
-                    res = ctx->context->pVtbl->InitDX11(ctx->context, device_d3d11->device, AMF_DX11_1);
-                    if (res == AMF_OK) {
-                        ctx->hw_frames_ctx = av_buffer_ref(avctx->hw_frames_ctx);
-                        if (!ctx->hw_frames_ctx) {
-                            return AVERROR(ENOMEM);
-                        }
-                    } else {
-                        if(res == AMF_NOT_SUPPORTED)
-                            av_log(avctx, AV_LOG_INFO, "avctx->hw_frames_ctx has D3D11 device which doesn't have D3D11VA interface, switching to default\n");
-                        else
-                            av_log(avctx, AV_LOG_INFO, "avctx->hw_frames_ctx has non-AMD device, switching to default\n");
-                    }
-                }
-            } else {
-                av_log(avctx, AV_LOG_INFO, "avctx->hw_frames_ctx has format not uspported by AMF, switching to default\n");
-            }
+        AVHWFramesContext *hwfc = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
+
+        if (amf_av_to_amf_format(hwfc->sw_format) == AMF_SURFACE_UNKNOWN) {
+            av_log(avctx, AV_LOG_VERBOSE, "Input hardware frame format (%s) is not supported.\n",
+                   av_get_pix_fmt_name(hwfc->sw_format));
+        } else {
+            hwdev = hwfc->device_ctx;
+
+            ctx->hw_frames_ctx = av_buffer_ref(avctx->hw_frames_ctx);
+            if (!ctx->hw_frames_ctx)
+                return AVERROR(ENOMEM);
         }
-    } else if (avctx->hw_device_ctx) {
-        AVHWDeviceContext *device_ctx = (AVHWDeviceContext*)(avctx->hw_device_ctx->data);
-        if (device_ctx->type == AV_HWDEVICE_TYPE_D3D11VA) {
-            if (device_ctx->hwctx) {
-                AVD3D11VADeviceContext *device_d3d11 = (AVD3D11VADeviceContext *)device_ctx->hwctx;
-                res = ctx->context->pVtbl->InitDX11(ctx->context, device_d3d11->device, AMF_DX11_1);
+    }
+    if (!hwdev && avctx->hw_device_ctx) {
+        hwdev = (AVHWDeviceContext*)avctx->hw_device_ctx->data;
+
+        ctx->hw_device_ctx = av_buffer_ref(avctx->hw_device_ctx);
+        if (!ctx->hw_device_ctx)
+            return AVERROR(ENOMEM);
+    }
+    if (hwdev) {
+#if CONFIG_D3D11VA
+        if (hwdev->type == AV_HWDEVICE_TYPE_D3D11VA) {
+            AVD3D11VADeviceContext *d3d11dev = hwdev->hwctx;
+
+            res = ctx->context->pVtbl->InitDX11(ctx->context,
+                                                d3d11dev->device, AMF_DX11_1);
+            if (res == AMF_OK) {
+                av_log(avctx, AV_LOG_VERBOSE, "Initialised from "
+                       "external D3D11 device.\n");
+                return 0;
+            }
+
+            av_log(avctx, AV_LOG_INFO, "Failed to initialise from "
+                   "external D3D11 device: %d.\n", res);
+        } else
+#endif
+#if CONFIG_OPENCL
+        if (hwdev->type == AV_HWDEVICE_TYPE_OPENCL) {
+            AVOpenCLDeviceContext *cldev = hwdev->hwctx;
+            cl_int cle;
+
+            ctx->cl_command_queue =
+                clCreateCommandQueue(cldev->context, cldev->device_id, 0, &cle);
+            if (!ctx->cl_command_queue) {
+                av_log(avctx, AV_LOG_INFO, "Failed to create OpenCL "
+                       "command queue: %d.\n", cle);
+            } else {
+                res = ctx->context->pVtbl->InitOpenCL(ctx->context,
+                                                      ctx->cl_command_queue);
                 if (res == AMF_OK) {
-                    ctx->hw_device_ctx = av_buffer_ref(avctx->hw_device_ctx);
-                    if (!ctx->hw_device_ctx) {
-                        return AVERROR(ENOMEM);
-                    }
-                } else {
-                    if (res == AMF_NOT_SUPPORTED)
-                        av_log(avctx, AV_LOG_INFO, "avctx->hw_device_ctx has D3D11 device which doesn't have D3D11VA interface, switching to default\n");
-                    else
-                        av_log(avctx, AV_LOG_INFO, "avctx->hw_device_ctx has non-AMD device, switching to default\n");
+                    av_log(avctx, AV_LOG_VERBOSE, "Initialised from "
+                           "external OpenCL device.\n");
+                    return 0;
                 }
+                av_log(avctx, AV_LOG_INFO, "Failed to initialise from "
+                       "external OpenCL device: %d.\n", res);
             }
+        } else
+#endif
+        {
+            av_log(avctx, AV_LOG_INFO, "Input device type %s is not supported.\n",
+                   av_hwdevice_get_type_name(hwdev->type));
         }
     }
-#endif
-    if (!ctx->hw_frames_ctx && !ctx->hw_device_ctx) {
-        res = ctx->context->pVtbl->InitDX11(ctx->context, NULL, AMF_DX11_1);
-        if (res != AMF_OK) {
-            res = ctx->context->pVtbl->InitDX9(ctx->context, NULL);
-            AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN, "InitDX9() failed with error %d\n", res);
+
+    // Initialise from a new D3D11 device, or D3D9 if D3D11 is not available.
+    res = ctx->context->pVtbl->InitDX11(ctx->context, NULL, AMF_DX11_1);
+    if (res == AMF_OK) {
+        av_log(avctx, AV_LOG_VERBOSE, "Initialised from internal D3D11 device.\n");
+    } else {
+        av_log(avctx, AV_LOG_VERBOSE, "Failed to initialise from internal D3D11 device: %d.\n", res);
+        res = ctx->context->pVtbl->InitDX9(ctx->context, NULL);
+        if (res == AMF_OK) {
+            av_log(avctx, AV_LOG_VERBOSE, "Initialised from internal D3D9 device.\n");
+        } else {
+            av_log(avctx, AV_LOG_VERBOSE, "Failed to initialise from internal D3D9 device: %d.\n", res);
+            av_log(avctx, AV_LOG_ERROR, "Unable to initialise AMF.\n");
+            return AVERROR_UNKNOWN;
         }
     }
+
     return 0;
 }
 
@@ -279,6 +319,11 @@ int av_cold ff_amf_encode_close(AVCodecContext *avctx)
     av_buffer_unref(&ctx->hw_device_ctx);
     av_buffer_unref(&ctx->hw_frames_ctx);
 
+#if CONFIG_OPENCL
+    if (ctx->cl_command_queue)
+        clReleaseCommandQueue(ctx->cl_command_queue);
+#endif
+
     if (ctx->trace) {
         ctx->trace->pVtbl->UnregisterWriter(ctx->trace, FFMPEG_AMF_WRITER_ID);
     }
@@ -485,17 +530,38 @@ int ff_amf_send_frame(AVCodecContext *avctx, const AVFrame *frame)
             (AVHWDeviceContext*)ctx->hw_device_ctx->data)
         )) {
 #if CONFIG_D3D11VA
-            static const GUID AMFTextureArrayIndexGUID = { 0x28115527, 0xe7c3, 0x4b66, { 0x99, 0xd3, 0x4f, 0x2a, 0xe6, 0xb4, 0x7f, 0xaf } };
-            ID3D11Texture2D *texture = (ID3D11Texture2D*)frame->data[0]; // actual texture
-            int index = (int)(size_t)frame->data[1]; // index is a slice in texture array is - set to tell AMF which slice to use
-            texture->lpVtbl->SetPrivateData(texture, &AMFTextureArrayIndexGUID, sizeof(index), &index);
-
-            res = ctx->context->pVtbl->CreateSurfaceFromDX11Native(ctx->context, texture, &surface, NULL); // wrap to AMF surface
-            AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR(ENOMEM), "CreateSurfaceFromDX11Native() failed with error %d\n", res);
-
-            // input HW surfaces can be vertically aligned by 16; tell AMF the real size
-            surface->pVtbl->SetCrop(surface, 0, 0, frame->width, frame->height);
+            if (frame->format == AV_PIX_FMT_D3D11) {
+                static const GUID AMFTextureArrayIndexGUID = { 0x28115527, 0xe7c3, 0x4b66, { 0x99, 0xd3, 0x4f, 0x2a, 0xe6, 0xb4, 0x7f, 0xaf } };
+                ID3D11Texture2D *texture = (ID3D11Texture2D*)frame->data[0]; // actual texture
+                int index = (int)(size_t)frame->data[1]; // index is a slice in texture array is - set to tell AMF which slice to use
+                texture->lpVtbl->SetPrivateData(texture, &AMFTextureArrayIndexGUID, sizeof(index), &index);
+
+                res = ctx->context->pVtbl->CreateSurfaceFromDX11Native(ctx->context, texture, &surface, NULL); // wrap to AMF surface
+                AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR(ENOMEM), "CreateSurfaceFromDX11Native() failed with error %d\n", res);
+
+                // input HW surfaces can be vertically aligned by 16; tell AMF the real size
+                surface->pVtbl->SetCrop(surface, 0, 0, frame->width, frame->height);
+            } else
+#endif
+#if CONFIG_OPENCL
+            if (frame->format == AV_PIX_FMT_OPENCL) {
+                void *planes[AV_NUM_DATA_POINTERS];
+                AMF_SURFACE_FORMAT format;
+                int i;
+
+                for (i = 0; i < AV_NUM_DATA_POINTERS; i++)
+                    planes[i] = frame->data[i];
+
+                format = amf_av_to_amf_format(frame->format);
+
+                res = ctx->context->pVtbl->CreateSurfaceFromOpenCLNative(ctx->context, format,
+                                                                         frame->width, frame->height,
+                                                                         planes, &surface, NULL);
+                AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN,
+                                    "CreateSurfaceFromOpenCLNative() failed with error %d\n", res);
+            } else
 #endif
+                av_assert0(0 && "Invalid hardware input format.");
         } else {
             res = ctx->context->pVtbl->AllocSurface(ctx->context, AMF_MEMORY_HOST, ctx->format, avctx->width, avctx->height, &surface);
             AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR(ENOMEM), "AllocSurface() failed with error %d\n", res);
diff --git a/libavcodec/amfenc.h b/libavcodec/amfenc.h
index 84f0aad2fa..bb8fd1807a 100644
--- a/libavcodec/amfenc.h
+++ b/libavcodec/amfenc.h
@@ -61,6 +61,7 @@ typedef struct AmfContext {
 
     AVBufferRef *hw_device_ctx; ///< pointer to HW accelerator (decoder)
     AVBufferRef *hw_frames_ctx; ///< pointer to HW accelerator (frame allocator)
+    void *cl_command_queue; ///< Command queue for use with OpenCL input
 
     // helpers to handle async calls
     int delayed_drain;
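
For anyone reviewing the new control flow in amf_init_context(), the fallback order can be modelled in isolation roughly as below. This is only a sketch of my reading of the patch, not code from it: every type, enum and function name here (device_type, init_path, init_inputs, choose_init_path) is a hypothetical stand-in, and the *_ok flags simply stand for whether the corresponding AMF or OpenCL call would succeed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these exist in FFmpeg or AMF. */
enum device_type { DEV_NONE, DEV_D3D11, DEV_OPENCL, DEV_OTHER };

enum init_path {
    INIT_EXTERNAL_D3D11,
    INIT_EXTERNAL_OPENCL,
    INIT_INTERNAL_D3D11,
    INIT_INTERNAL_D3D9,
    INIT_FAILED
};

struct init_inputs {
    enum device_type frames_dev; /* device behind hw_frames_ctx, DEV_NONE if unset */
    int frames_format_ok;        /* sw_format maps to a known AMF surface format */
    enum device_type device_dev; /* device in hw_device_ctx, DEV_NONE if unset */
    int ext_d3d11_ok;            /* init from the external D3D11 device would succeed */
    int ext_opencl_ok;           /* command queue creation and OpenCL init would succeed */
    int int_d3d11_ok;            /* init from a new internal D3D11 device would succeed */
    int int_d3d9_ok;             /* init from a new internal D3D9 device would succeed */
};

/* Mirrors the order of attempts: frames context device, then device context,
 * then a new D3D11 device, then a new D3D9 device. */
enum init_path choose_init_path(const struct init_inputs *in)
{
    enum device_type dev = DEV_NONE;

    assert(in != NULL);

    /* The frames context is only used if its software format is supported. */
    if (in->frames_dev != DEV_NONE && in->frames_format_ok)
        dev = in->frames_dev;
    /* Otherwise fall back to the device context, if there is one. */
    if (dev == DEV_NONE)
        dev = in->device_dev;

    /* A failed external init is not fatal; it falls through to the internal path. */
    if (dev == DEV_D3D11 && in->ext_d3d11_ok)
        return INIT_EXTERNAL_D3D11;
    if (dev == DEV_OPENCL && in->ext_opencl_ok)
        return INIT_EXTERNAL_OPENCL;

    if (in->int_d3d11_ok)
        return INIT_INTERNAL_D3D11;
    if (in->int_d3d9_ok)
        return INIT_INTERNAL_D3D9;
    return INIT_FAILED;
}
```

Calling choose_init_path() with an OpenCL frames context whose format is supported selects the external OpenCL path; an unsupported device type or a failed external init ends up on the internal D3D11/D3D9 path, which is the behaviour the patch appears to intend.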