Message ID | 0cbfcf22-2474-70a5-bd33-4c45ae396f7a@jkqxz.net |
---|---|
State | New |
It seems that the failure is not in the new extension but before, in the interop from D3D11 to OCL. It can happen in two cases: OCL device/context are created without D3D11 device or format of the texture is not supported. NV12 is supported.

I went through the latest ffmpeg snapshot and found that function opencl_enumerate_d3d11_devices() looks correct, pointer to the function is set to OpenCLDeviceSelector::enumerate_devices member but I cannot find a call to selector->enumerate_devices(). Instead opencl_enumerate_devices() is called directly. So my guess is that created OCL device is not created from D3D11.

Just in case OCL device creation sample:
https://github.com/GPUOpen-LibrariesAndSDKs/AMF/blob/master/amf/public/samples/CPPSamples/common/DeviceOpenCL.cpp

Regarding the new split extension: here is a working snippet:

    cl_mem clImage2D = 0;
    cl_mem clImages[AMF_SURFACE_MAX_PLANES];
    // index can be not 0 if texture is allocated as an array.
    clImage2D = clCreateFromD3D11Texture2DKHR(m_clContext, memflags, pTexture, index, &clStatus);
    for(int i = 0; i < planesNumber; i++)
    {
        clImages[i] = clGetPlaneFromImageAMD(m_clContext, clImage2D, (cl_uint)i, &clStatus);
    }
    // don’t forget to release clImages[i] and clImage2D

Regards,
Mikhail

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Mark Thompson
> Sent: November 25, 2018 2:35 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension for NV12 and P010 textures - split planes
>
> On 24/05/2018 15:26, Mironov, Mikhail wrote:
> > AMD has published OpenCL extension which allows split D3D11 texture interoped as a single 2D image into two 2D images representing Y and UV planes.
> > https://www.khronos.org/registry/OpenCL/extensions/amd/cl_amd_planar_yuv.txt
>
> I had a go at implementing this now that it is actually visible in released drivers, but I can't get it to work.
>
> Patch trying to implement it is below.
> It finds the extension and the new function correctly, but I'm stuck on the creation of the whole-texture image with clCreateFromD3D11Texture2DKHR(). The error returned is CL_INVALID_D3D11_RESOURCE_KHR (-1007), but as far as I can tell none of the documented failure cases which would return that error code apply.
>
> $ ./ffmpeg_g.exe -report -v debug -y -hwaccel d3d11va -hwaccel_device 0 -hwaccel_output_format d3d11 -i input.mp4 -an -vf "hwmap=derive_device=opencl:mode=read,unsharp_opencl,hwdownload,format=nv12" -f null -
> ...
> [AVHWDeviceContext @ 0000000001c0de80] Using device 1002:665f (AMD Radeon (TM) R7 360 Series).
> ...
> [h264 @ 000000000284adc0] Format d3d11 chosen by get_format().
> ...
> [Parsed_hwmap_0 @ 0000000002a2be00] Configure hwmap d3d11 -> opencl.
> [AVHWDeviceContext @ 000000000d328500] 2 OpenCL platforms found.
> [AVHWDeviceContext @ 000000000d328500] 1 OpenCL devices found on platform "Intel(R) OpenCL".
> [AVHWDeviceContext @ 000000000d328500] Device Intel(R) Core(TM) i3-6300 CPU @ 3.80GHz skipped (not GPU).
> [AVHWDeviceContext @ 000000000d328500] 1 OpenCL devices found on platform "AMD Accelerated Parallel Processing".
> [AVHWDeviceContext @ 000000000d328500] 1.0: AMD Accelerated Parallel Processing / Bonaire
> [AVHWDeviceContext @ 000000000d328500] DXVA2 to OpenCL mapping function found (clCreateFromDX9MediaSurfaceKHR).
> [AVHWDeviceContext @ 000000000d328500] DXVA2 in OpenCL acquire function found (clEnqueueAcquireDX9MediaSurfacesKHR).
> [AVHWDeviceContext @ 000000000d328500] DXVA2 in OpenCL release function found (clEnqueueReleaseDX9MediaSurfacesKHR).
> [AVHWDeviceContext @ 000000000d328500] cl_khr_d3d11_sharing found as platform extension.
> [AVHWDeviceContext @ 000000000d328500] cl_amd_planar_yuv found as device extension.
> [AVHWDeviceContext @ 000000000d328500] D3D11 to OpenCL mapping function found (clCreateFromD3D11Texture2DKHR).
> [AVHWDeviceContext @ 000000000d328500] D3D11 in OpenCL acquire function found (clEnqueueAcquireD3D11ObjectsKHR).
> [AVHWDeviceContext @ 000000000d328500] D3D11 in OpenCL release function found (clEnqueueReleaseD3D11ObjectsKHR).
> [AVHWDeviceContext @ 000000000d328500] D3D11 to OpenCL mapping on AMD function found (clGetPlaneFromImageAMD).
> [AVHWFramesContext @ 0000000002c13180] Failed to create CL image from D3D texture index 0: -1007.
> [Parsed_hwmap_0 @ 0000000002a2be00] Failed to create derived frames context: -5.
> [Parsed_hwmap_0 @ 0000000002a2be00] Failed to configure output pad on Parsed_hwmap_0
>
> Are there any examples of using this extension that I could compare with? Alternatively, is the source code for the CL driver available somewhere so that I can work out what that error actually means?
>
> Thanks,
>
> - Mark
>
> From 25fb98f021b1347394d56ecf4781466096616542 Mon Sep 17 00:00:00 2001
> From: Mark Thompson <sw@jkqxz.net>
> Date: Sun, 25 Nov 2018 16:59:24 +0000
> Subject: [PATCH] hwcontext_opencl: Add support for D3D11 to OpenCL mapping on AMD
>
> Uses cl_amd_planar_yuv.
> ---
>  libavutil/hwcontext_opencl.c | 106 ++++++++++++++++++++++++++++-------
>  1 file changed, 86 insertions(+), 20 deletions(-)
>
> diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> index 728877553f..c745b91775 100644
> --- a/libavutil/hwcontext_opencl.c
> +++ b/libavutil/hwcontext_opencl.c
> @@ -64,6 +64,12 @@
>  #if HAVE_OPENCL_D3D11
>  #include <CL/cl_d3d11.h>
>  #include "hwcontext_d3d11va.h"
> +
> +// From cl_amd_planar_yuv; unfortunately no header is provided.
> +typedef cl_mem (*clGetPlaneFromImageAMD_fn)(cl_context context,
> +                                            cl_mem mem,
> +                                            cl_uint plane,
> +                                            cl_int *errcode_ret);
>  #endif
>
>  #if HAVE_OPENCL_DRM_ARM
> @@ -113,12 +119,17 @@ typedef struct OpenCLDeviceContext {
>
>  #if HAVE_OPENCL_D3D11
>      int d3d11_mapping_usable;
> +    int d3d11_map_amd;
> +    int d3d11_map_intel;
> +
>      clCreateFromD3D11Texture2DKHR_fn
>          clCreateFromD3D11Texture2DKHR;
>      clEnqueueAcquireD3D11ObjectsKHR_fn
>          clEnqueueAcquireD3D11ObjectsKHR;
>      clEnqueueReleaseD3D11ObjectsKHR_fn
>          clEnqueueReleaseD3D11ObjectsKHR;
> +    clGetPlaneFromImageAMD_fn
> +        clGetPlaneFromImageAMD;
>  #endif
>
>  #if HAVE_OPENCL_DRM_ARM
> @@ -817,17 +828,25 @@ static int opencl_device_init(AVHWDeviceContext *hwdev)
>  #if HAVE_OPENCL_D3D11
>      {
>          const char *d3d11_ext = "cl_khr_d3d11_sharing";
> -        const char *nv12_ext = "cl_intel_d3d11_nv12_media_sharing";
> +        const char *amd_ext = "cl_amd_planar_yuv";
> +        const char *intel_ext = "cl_intel_d3d11_nv12_media_sharing";
>          int fail = 0;
>
>          if (!opencl_check_extension(hwdev, d3d11_ext)) {
>              av_log(hwdev, AV_LOG_VERBOSE, "The %s extension is "
>                     "required for D3D11 to OpenCL mapping.\n", d3d11_ext);
>              fail = 1;
> -        } else if (!opencl_check_extension(hwdev, nv12_ext)) {
> -            av_log(hwdev, AV_LOG_VERBOSE, "The %s extension may be "
> -                   "required for D3D11 to OpenCL mapping.\n", nv12_ext);
> -            // Not fatal.
> +        } else {
> +            if (opencl_check_extension(hwdev, amd_ext)) {
> +                priv->d3d11_map_amd = 1;
> +            } else if (opencl_check_extension(hwdev, intel_ext)) {
> +                priv->d3d11_map_intel = 1;
> +            } else {
> +                av_log(hwdev, AV_LOG_VERBOSE, "One of the %s or %s "
> +                       "extensions is required for D3D11 to OpenCL "
> +                       "mapping.\n", amd_ext, intel_ext);
> +                fail = 1;
> +            }
>          }
>
>          CL_FUNC(clCreateFromD3D11Texture2DKHR,
> @@ -837,6 +856,11 @@ static int opencl_device_init(AVHWDeviceContext *hwdev)
>          CL_FUNC(clEnqueueReleaseD3D11ObjectsKHR,
>                  "D3D11 in OpenCL release");
>
> +        if (priv->d3d11_map_amd) {
> +            CL_FUNC(clGetPlaneFromImageAMD,
> +                    "D3D11 to OpenCL mapping on AMD");
> +        }
> +
>          if (fail) {
>              av_log(hwdev, AV_LOG_WARNING, "D3D11 to OpenCL mapping "
>                     "not usable.\n");
> @@ -2573,10 +2597,22 @@ static int opencl_frames_derive_from_d3d11(AVHWFramesContext *dst_fc,
>      cl_int cle;
>      int err, i, p, nb_planes;
>
> -    if (src_fc->sw_format != AV_PIX_FMT_NV12) {
> -        av_log(dst_fc, AV_LOG_ERROR, "Only NV12 textures are supported "
> -               "for D3D11 to OpenCL mapping.\n");
> -        return AVERROR(EINVAL);
> +    // AMD supports NV12 and P010, Intel only supports NV12.
> +    if (device_priv->d3d11_map_amd) {
> +        if (src_fc->sw_format != AV_PIX_FMT_NV12 &&
> +            src_fc->sw_format != AV_PIX_FMT_P010) {
> +            av_log(dst_fc, AV_LOG_ERROR, "Only NV12 and P010 textures are "
> +                   "supported with AMD for D3D11 to OpenCL mapping.\n");
> +            return AVERROR(EINVAL);
> +        }
> +    } else if (device_priv->d3d11_map_intel) {
> +        if (src_fc->sw_format != AV_PIX_FMT_NV12) {
> +            av_log(dst_fc, AV_LOG_ERROR, "Only NV12 textures are "
> +                   "supported with Intel for D3D11 to OpenCL mapping.\n");
> +            return AVERROR(EINVAL);
> +        }
> +    } else {
> +        av_assert0(0);
>      }
>      nb_planes = 2;
>
> @@ -2601,21 +2637,51 @@ static int opencl_frames_derive_from_d3d11(AVHWFramesContext *dst_fc,
>      for (i = 0; i < frames_priv->nb_mapped_frames; i++) {
>          AVOpenCLFrameDescriptor *desc = &frames_priv->mapped_frames[i];
>          desc->nb_planes = nb_planes;
> -        for (p = 0; p < nb_planes; p++) {
> -            UINT subresource = 2 * i + p;
>
> -            desc->planes[p] =
> -                device_priv->clCreateFromD3D11Texture2DKHR(
> -                    dst_dev->context, cl_flags, src_hwctx->texture,
> -                    subresource, &cle);
> -            if (!desc->planes[p]) {
> -                av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL "
> -                       "image from plane %d of D3D texture "
> -                       "index %d (subresource %u): %d.\n",
> -                       p, i, (unsigned int)subresource, cle);
> +        if (device_priv->d3d11_map_amd) {
> +            cl_mem image;
> +
> +            image = device_priv->clCreateFromD3D11Texture2DKHR(
> +                dst_dev->context, cl_flags, src_hwctx->texture, i, &cle);
> +            if (!image) {
> +                av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL image "
> +                       "from D3D texture index %d: %d.\n", i, cle);
>                  err = AVERROR(EIO);
>                  goto fail;
>              }
> +
> +            for (p = 0; p < nb_planes; p++) {
> +                desc->planes[p] = device_priv->clGetPlaneFromImageAMD(
> +                    dst_dev->context, image, p, &cle);
> +                if (!desc->planes[p]) {
> +                    av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL image "
> +                           "from plane %d of image created from D3D11 "
> +                           "texture index %d: %d.\n", p, i, cle);
> +                    clReleaseMemObject(image);
> +                    err = AVERROR(EIO);
> +                    goto fail;
> +                }
> +            }
> +
> +            clReleaseMemObject(image);
> +
> +        } else {
> +            for (p = 0; p < nb_planes; p++) {
> +                UINT subresource = 2 * i + p;
> +
> +                desc->planes[p] =
> +                    device_priv->clCreateFromD3D11Texture2DKHR(
> +                        dst_dev->context, cl_flags, src_hwctx->texture,
> +                        subresource, &cle);
> +                if (!desc->planes[p]) {
> +                    av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL "
> +                           "image from plane %d of D3D texture "
> +                           "index %d (subresource %u): %d.\n",
> +                           p, i, (unsigned int)subresource, cle);
> +                    err = AVERROR(EIO);
> +                    goto fail;
> +                }
> +            }
>          }
>      }
>
> --
> 2.19.1
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
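The -1007 in the log above is easier to spot in traces when the cl_khr_d3d11_sharing error codes are printed by name. A minimal sketch in C, assuming the numeric values are the ones in the Khronos `CL/cl_d3d11.h` header (the -1007 value is confirmed by the message above; the others should be checked against the header). `d3d11_sharing_error_name` is a hypothetical helper, not an FFmpeg or OpenCL function, and the constants are redefined locally with an `EX_` prefix so the block builds without the OpenCL SDK:

```c
#include <assert.h>
#include <string.h>

/* Error codes added by cl_khr_d3d11_sharing.
 * Assumption: these values match CL/cl_d3d11.h. */
#define EX_CL_INVALID_D3D11_DEVICE_KHR            (-1006)
#define EX_CL_INVALID_D3D11_RESOURCE_KHR          (-1007)
#define EX_CL_D3D11_RESOURCE_ALREADY_ACQUIRED_KHR (-1008)
#define EX_CL_D3D11_RESOURCE_NOT_ACQUIRED_KHR     (-1009)

/* Hypothetical helper: map a D3D11-sharing error code to its symbolic
 * name, or "unknown" for any other value. */
static const char *d3d11_sharing_error_name(int cle)
{
    switch (cle) {
    case EX_CL_INVALID_D3D11_DEVICE_KHR:
        return "CL_INVALID_D3D11_DEVICE_KHR";
    case EX_CL_INVALID_D3D11_RESOURCE_KHR:
        return "CL_INVALID_D3D11_RESOURCE_KHR";
    case EX_CL_D3D11_RESOURCE_ALREADY_ACQUIRED_KHR:
        return "CL_D3D11_RESOURCE_ALREADY_ACQUIRED_KHR";
    case EX_CL_D3D11_RESOURCE_NOT_ACQUIRED_KHR:
        return "CL_D3D11_RESOURCE_NOT_ACQUIRED_KHR";
    default:
        return "unknown";
    }
}
```

A log call could then print both forms, for example `"... index %d: %d (%s).\n", i, cle, d3d11_sharing_error_name(cle)`.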
On 25/11/2018 21:28, Mironov, Mikhail wrote:
> It seems that the failure is not in the new extension but before, in the interop from D3D11 to OCL. It can happen in two cases: OCL device/context are created without D3D11 device or format of the texture is not supported. NV12 is supported. I went through the latest ffmpeg snapshot and found that function opencl_enumerate_d3d11_devices() looks correct, pointer to the function is set to OpenCLDeviceSelector::enumerate_devices member but I cannot find a call to selector->enumerate_devices(). Instead opencl_enumerate_devices() is called directly. So my guess is that created OCL device is not created from D3D11.

Hmm, right - patch just sent to fix the selection call.

It doesn't actually make any difference to this case, though, since the filter made it choose the right device anyway and CL_CONTEXT_D3D11_DEVICE_KHR was always set when deriving from D3D11. (It could only have made a difference if there were other conflicting D3D11 devices it could have picked incorrectly.)

> Just in case OCL device creation sample: https://github.com/GPUOpen-LibrariesAndSDKs/AMF/blob/master/amf/public/samples/CPPSamples/common/DeviceOpenCL.cpp
>
> Regarding the new split extension: here is a working snippet:
> cl_mem clImage2D = 0;
> cl_mem clImages[AMF_SURFACE_MAX_PLANES];
> // index can be not 0 if texture is allocated as an array.
> clImage2D = clCreateFromD3D11Texture2DKHR(m_clContext, memflags, pTexture, index, &clStatus);

Where is the comment about index being nonzero coming from there? Other callers to this definitely start from a zero index. (I tried adding one to my index values but it didn't change the result.)

> for(int i = 0; i < planesNumber; i++)
> {
> clImages[i] = clGetPlaneFromImageAMD(m_clContext, clImage2D, (cl_uint)i, &clStatus);
> }
> // don’t forget to release clImages[i] and clImage2D

Otherwise, that agrees with how I read the extension document.

Thanks,

- Mark
On 25/11/2018 22:22, Mark Thompson wrote:
> On 25/11/2018 21:28, Mironov, Mikhail wrote:
>> It seems that the failure is not in the new extension but before, in the interop from D3D11 to OCL. It can happen in two cases: OCL device/context are created without D3D11 device or format of the texture is not supported. NV12 is supported. I went through the latest ffmpeg snapshot and found that function opencl_enumerate_d3d11_devices() looks correct, pointer to the function is set to OpenCLDeviceSelector::enumerate_devices member but I cannot find a call to selector->enumerate_devices(). Instead opencl_enumerate_devices() is called directly. So my guess is that created OCL device is not created from D3D11.
>
> Hmm, right - patch just sent to fix the selection call.
>
> It doesn't actually make any difference to this case, though, since the filter made it choose the right device anyway and CL_CONTEXT_D3D11_DEVICE_KHR was always set when deriving from D3D11. (It could only have made a difference if there were other conflicting D3D11 devices it could have picked incorrectly.)
>
>> Just in case OCL device creation sample: https://github.com/GPUOpen-LibrariesAndSDKs/AMF/blob/master/amf/public/samples/CPPSamples/common/DeviceOpenCL.cpp
>>
>> Regarding the new split extension: here is a working snippet:
>> cl_mem clImage2D = 0;
>> cl_mem clImages[AMF_SURFACE_MAX_PLANES];
>> // index can be not 0 if texture is allocated as an array.
>> clImage2D = clCreateFromD3D11Texture2DKHR(m_clContext, memflags, pTexture, index, &clStatus);
>
> Where is the comment about index being nonzero coming from there? Other callers to this definitely start from a zero index. (I tried adding one to my index values but it didn't change the result.)

Urgh, sorry - ignore that question, I misread "can be not 0" as "cannot be 0".

>> for(int i = 0; i < planesNumber; i++)
>> {
>> clImages[i] = clGetPlaneFromImageAMD(m_clContext, clImage2D, (cl_uint)i, &clStatus);
>> }
>> // don’t forget to release clImages[i] and clImage2D
>
> Otherwise, that agrees with how I read the extension document.

Thanks,

- Mark
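The index question comes down to which subresource number each path passes to clCreateFromD3D11Texture2DKHR(). A small sketch in C of the two schemes as they appear in the patch: the per-plane branch passes `2 * i + p`, while the AMD branch passes the array index `i` and selects the plane afterwards with clGetPlaneFromImageAMD(). The function names here are hypothetical and exist only for illustration:

```c
#include <assert.h>

/* NV12 and P010 both have two planes: Y, then interleaved UV. */
enum { EX_NB_PLANES = 2 };

/* Per-plane scheme (non-AMD branch of the patch): every plane of every
 * array slice is addressed as its own subresource. */
static unsigned int per_plane_subresource(unsigned int slice,
                                          unsigned int plane)
{
    assert(plane < EX_NB_PLANES);
    return EX_NB_PLANES * slice + plane;
}

/* Whole-texture scheme (AMD branch of the patch): the subresource is
 * just the array slice; the plane is chosen later, when the image is
 * split with clGetPlaneFromImageAMD(). */
static unsigned int whole_texture_subresource(unsigned int slice)
{
    return slice;
}
```

For a decoder pool allocated as a texture array, slice 3 therefore maps to subresources 6 and 7 in the per-plane scheme and to index 3 in the whole-texture scheme, which is why the index "can be not 0".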
You assume that device ID returned from regular enumeration is the same as device ID returned from clGetDeviceIDsFromD3D11KHR. It is not guaranteed and I didn't try this.

Also I would add tracing to ensure that CL_CONTEXT_D3D11_DEVICE_KHR is actually set and clGetDeviceIDsFromD3D11KHR is called.

In AMF code I always set CL_CONTEXT_INTEROP_USER_SYNC to true.

Also I would trace other parameters to clCreateFromD3D11Texture2DKHR: memory flags and texture descriptor.

BTW: does the interop work for NV or Intel?

Thanks,
Mikhail

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Mark Thompson
> Sent: November 25, 2018 5:22 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension for NV12 and P010 textures - split planes
>
> On 25/11/2018 21:28, Mironov, Mikhail wrote:
> > It seems that the failure is not in the new extension but before, in the interop from D3D11 to OCL. It can happen in two cases: OCL device/context are created without D3D11 device or format of the texture is not supported. NV12 is supported. I went through the latest ffmpeg snapshot and found that function opencl_enumerate_d3d11_devices() looks correct, pointer to the function is set to OpenCLDeviceSelector::enumerate_devices member but I cannot find a call to selector->enumerate_devices(). Instead opencl_enumerate_devices() is called directly. So my guess is that created OCL device is not created from D3D11.
>
> Hmm, right - patch just sent to fix the selection call.
>
> It doesn't actually make any difference to this case, though, since the filter made it choose the right device anyway and CL_CONTEXT_D3D11_DEVICE_KHR was always set when deriving from D3D11. (It could only have made a difference if there were other conflicting D3D11 devices it could have picked incorrectly.)
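The tracing suggested above can be as simple as walking the zero-terminated property list right before it is handed to clCreateContext(). A minimal sketch in C, assuming the property values are those in the Khronos `CL/cl.h` and `CL/cl_d3d11.h` headers; the `EX_` constants, the `ex_context_properties` typedef (a stand-in for `cl_context_properties`) and `find_context_property` are all hypothetical names defined locally so the block builds without the OpenCL SDK:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: these values match CL/cl.h and CL/cl_d3d11.h. */
#define EX_CL_CONTEXT_PLATFORM          0x1084
#define EX_CL_CONTEXT_INTEROP_USER_SYNC 0x1085
#define EX_CL_CONTEXT_D3D11_DEVICE_KHR  0x401D

/* Stand-in for cl_context_properties (intptr_t in the Khronos headers). */
typedef intptr_t ex_context_properties;

/* Hypothetical helper: walk a zero-terminated list of (name, value)
 * pairs and return a pointer to the value stored for `name`, or NULL
 * if the property is absent. */
static const ex_context_properties *
find_context_property(const ex_context_properties *props,
                      ex_context_properties name)
{
    if (!props)
        return NULL;
    for (; props[0] != 0; props += 2) {
        if (props[0] == name)
            return &props[1];
    }
    return NULL;
}
```

Logging the result for `EX_CL_CONTEXT_D3D11_DEVICE_KHR` and `EX_CL_CONTEXT_INTEROP_USER_SYNC` just before context creation would show whether the D3D11 device pointer and the sync option actually reach the driver.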
On 26/11/2018 15:32, Mironov, Mikhail wrote:
> You assume that device ID returned from regular enumeration is the same as device ID returned from clGetDeviceIDsFromD3D11KHR. It is not guaranteed and I didn't try this.

Ok, that's fair I suppose. Fixing it hasn't changed anything, though.

> Also I would add tracing to ensure that CL_CONTEXT_D3D11_DEVICE_KHR is actually set and clGetDeviceIDsFromD3D11KHR is called.

The first was always true (Intel requires it too), the second is now.

> In AMF code I always set CL_CONTEXT_INTEROP_USER_SYNC to true.

I'm not completely sure of the precise semantics of D3D11, but I don't think we want that here - the clEnqueueAcquireD3D11ObjectsKHR() call should be the first synchronisation point following the previous component (generally a decoder).

I tried setting it anyway, but the behaviour doesn't change - I still get CL_INVALID_D3D11_RESOURCE_KHR.

> Also I would trace other parameters to clCreateFromD3D11Texture2DKHR: memory flags and texture descriptor.

For the flags, I tried all of CL_MEM_READ_WRITE / CL_MEM_WRITE_ONLY / CL_MEM_READ_ONLY, which are the only allowed values. The texture descriptor is just the one created for the decoder.

> BTW: does the interop work for NV or Intel?

The D3D11 interop works on Intel, though not directly in the ffmpeg utility without a little change because it requires the textures to be created with D3D11_RESOURCE_MISC_FLAG (as described in the extension document <https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_d3d11_nv12_media_sharing.txt>). Intel doesn't care whether clGetDeviceIDsFromD3D11KHR() is used or not (though I've fixed that anyway), but it does require the CL_CONTEXT_D3D11_DEVICE_KHR option to clCreateContext() (fails with CL_INVALID_CONTEXT if it isn't).

It can't work on Nvidia because they don't offer any way to share NV12 textures.

The DXVA2/D3D9 interop works correctly on both AMD and Intel with only the common standard extension. The one Nvidia device I can find easily doesn't have cl_khr_dx9_media_sharing at all, so that doesn't work.

Thanks,

- Mark

>> -----Original Message-----
>> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Mark Thompson
>> Sent: November 25, 2018 5:22 PM
>> To: ffmpeg-devel@ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension for NV12 and P010 textures - split planes
>>
>> On 25/11/2018 21:28, Mironov, Mikhail wrote:
>>> It seems that the failure is not in the new extension but before, in the interop from D3D11 to OCL. It can happen in two cases: OCL device/context are created without D3D11 device or format of the texture is not supported. NV12 is supported. I went through the latest ffmpeg snapshot and found that function opencl_enumerate_d3d11_devices() looks correct, pointer to the function is set to OpenCLDeviceSelector::enumerate_devices member but I cannot find a call to selector->enumerate_devices(). Instead opencl_enumerate_devices() is called directly. So my guess is that created OCL device is not created from D3D11.
>>
>> Hmm, right - patch just sent to fix the selection call.
>>
>> It doesn't actually make any difference to this case, though, since the filter made it choose the right device anyway and CL_CONTEXT_D3D11_DEVICE_KHR was always set when deriving from D3D11. (It could only have made a difference if there were other conflicting D3D11 devices it could have picked incorrectly.)
>>
>>> Just in case OCL device creation sample:
>>> https://github.com/GPUOpen-LibrariesAndSDKs/AMF/blob/master/amf/public/samples/CPPSamples/common/DeviceOpenCL.cpp
>>>
>>> Regarding the new split extension: here is a working snippet:
>>> cl_mem clImage2D = 0;
>>> cl_mem clImages[AMF_SURFACE_MAX_PLANES];
>>> // index can be not 0 if texture is allocated as an array.
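When tracing the memory flags passed to clCreateFromD3D11Texture2DKHR(), a guard that rejects anything other than the three access flags named above catches accidental combinations early. A minimal sketch in C, following the statement in the message above that these are the only allowed values; the `EX_` constants are local copies (assumed to match `CL/cl.h`) and `d3d11_interop_flags_ok` is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: these values match the CL_MEM_* access flags in CL/cl.h. */
#define EX_CL_MEM_READ_WRITE (UINT64_C(1) << 0)
#define EX_CL_MEM_WRITE_ONLY (UINT64_C(1) << 1)
#define EX_CL_MEM_READ_ONLY  (UINT64_C(1) << 2)

/* Hypothetical helper: return 1 if `flags` is exactly one of the three
 * access flags, 0 for zero, for combinations, or for any other bit.
 * cl_mem_flags is a 64-bit bitfield, hence uint64_t here. */
static int d3d11_interop_flags_ok(uint64_t flags)
{
    return flags == EX_CL_MEM_READ_WRITE ||
           flags == EX_CL_MEM_WRITE_ONLY ||
           flags == EX_CL_MEM_READ_ONLY;
}
```

Since all three values were tried in the thread and the error did not change, a check like this mainly rules the flags out as the cause rather than explaining the failure.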
Hi,
I've written a small sample you can use:
https://www.dropbox.com/s/c8m8evoao731tbm/OCLDX11Interop.zip?dl=0
If it doesn’t work, you have a conflict of drivers with Intel; I have seen this before.
Thanks,
Mikhail

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Mark Thompson
> Sent: November 27, 2018 7:05 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension for NV12 and P010 textures - split planes
>
> On 26/11/2018 15:32, Mironov, Mikhail wrote:
> > You assume that device ID returned from regular enumeration is the same as device ID returned from clGetDeviceIDsFromD3D11KHR. It is not guaranteed and I didn't try this.
>
> Ok, that's fair I suppose. Fixing it hasn't changed anything, though.
>
> > Also I would add tracing to ensure that CL_CONTEXT_D3D11_DEVICE_KHR is actually set and clGetDeviceIDsFromD3D11KHR is called.
>
> The first was always true (Intel requires it too), the second is now.
>
> > In AMF code I always set CL_CONTEXT_INTEROP_USER_SYNC to true.
>
> I'm not completely sure of the precise semantics of D3D11, but I don't think we want that here - the clEnqueueAcquireD3D11ObjectsKHR() call should be the first synchronisation point following the previous component (generally a decoder).
>
> I tried setting it anyway, but the behaviour doesn't change - I still get CL_INVALID_D3D11_RESOURCE_KHR.
>
> > Also I would trace other parameters to clCreateFromD3D11Texture2DKHR: memory flags and texture descriptor.
>
> For the flags, I tried all of CL_MEM_READ_WRITE / CL_MEM_WRITE_ONLY / CL_MEM_READ_ONLY, which are the only allowed values. The texture descriptor is just the one created for the decoder.
>
> > BTW: does the interop work for NV or Intel?
Hi Mark,
Did you have a chance to try this sample?
BTW: Alex Karvchenko has been waiting for some time for your review and commit of the AMF context. With this commit he can add more interesting things. Could you please take a look?
Thanks,
Mikhail

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Mironov, Mikhail
> Sent: November 29, 2018 10:52 AM
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension for NV12 and P010 textures - split planes
>
> Hi,
> I've written a small sample you can use:
> https://www.dropbox.com/s/c8m8evoao731tbm/OCLDX11Interop.zip?dl=0
> If it doesn’t work, you have a conflict of drivers with Intel; I have seen this before.
> Thanks,
> Mikhail
>
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Mark Thompson
> > Sent: November 27, 2018 7:05 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension for NV12 and P010 textures - split planes
> >
> > On 26/11/2018 15:32, Mironov, Mikhail wrote:
> > > You assume that device ID returned from regular enumeration is the same as device ID returned from clGetDeviceIDsFromD3D11KHR. It is not guaranteed and I didn't try this.
> >
> > Ok, that's fair I suppose. Fixing it hasn't changed anything, though.
> >
> > > Also I would add tracing to ensure that CL_CONTEXT_D3D11_DEVICE_KHR is actually set and clGetDeviceIDsFromD3D11KHR is called.
> >
> > The first was always true (Intel requires it too), the second is now.
> >
> > > In AMF code I always set CL_CONTEXT_INTEROP_USER_SYNC to true.
> >
> > I'm not completely sure of the precise semantics of D3D11, but I don't think we want that here - the clEnqueueAcquireD3D11ObjectsKHR() call should be the first synchronisation point following the previous component (generally a decoder).
> >
> > I tried setting it anyway, but the behaviour doesn't change - I still
> > get CL_INVALID_D3D11_RESOURCE_KHR.
> >
> > > Also I would trace other parameters to clCreateFromD3D11Texture2DKHR:
> > > memory flags and texture descriptor.
> >
> > For the flags, I tried all of CL_MEM_READ_WRITE / CL_MEM_WRITE_ONLY /
> > CL_MEM_READ_ONLY, which are the only allowed values. The texture
> > descriptor is just the one created for the decoder.
> >
> > > BTW: does the interop work for NV or Intel?
> >
> > The D3D11 interop works on Intel, though not directly in the ffmpeg
> > utility without a little change because it requires the textures to be
> > created with D3D11_RESOURCE_MISC_FLAG (as described in the extension
> > document
> > <https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_d3d11_nv12_media_sharing.txt>).
> > Intel doesn't care whether clGetDeviceIDsFromD3D11KHR() is used or not
> > (though I've fixed that anyway), but it does require the
> > CL_CONTEXT_D3D11_DEVICE_KHR option to clCreateContext() (fails with
> > CL_INVALID_CONTEXT if it isn't).
> >
> > It can't work on Nvidia because they don't offer any way to share NV12
> > textures.
> >
> > The DXVA2/D3D9 interop works correctly on both AMD and Intel with only
> > the common standard extension. The one Nvidia device I can find
> > easily doesn't have cl_khr_dx9_media_sharing at all, so that doesn't work.
> >
> > Thanks,
> >
> > - Mark
> >
> >
> > >> -----Original Message-----
> > >> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > >> Mark Thompson
> > >> Sent: November 25, 2018 5:22 PM
> > >> To: ffmpeg-devel@ffmpeg.org
> > >> Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop
> > >> extension for NV12 and P010 textures - split planes
> > >>
> > >> On 25/11/2018 21:28, Mironov, Mikhail wrote:
> > >>> It seems that the failure is not in the new extension but before,
> > >>> in the interop from D3D11 to OCL. It can happen in two cases: OCL
> > >>> device/context are created without D3D11 device or format of the
> > >>> texture is not supported. NV12 is supported. I went through the
> > >>> latest ffmpeg snapshot and found that function
> > >>> opencl_enumerate_d3d11_devices() looks correct, pointer to the
> > >>> function is set to OpenCLDeviceSelector::enumerate_devices member
> > >>> but I cannot find a call to selector->enumerate_devices(). Instead
> > >>> opencl_enumerate_devices() is called directly. So my guess is that
> > >>> created OCL device is not created from D3D11.
> > >>
> > >> Hmm, right - patch just sent to fix the selection call.
> > >>
> > >> It doesn't actually make any difference to this case, though, since
> > >> the filter made it choose the right device anyway and
> > >> CL_CONTEXT_D3D11_DEVICE_KHR was always set when deriving from D3D11.
> > >> (It could only have made a difference if there were other
> > >> conflicting D3D11 devices it could have picked incorrectly.)
> > >>
> > >>> Just in case OCL device creation sample:
> > >>> https://github.com/GPUOpen-LibrariesAndSDKs/AMF/blob/master/amf/public/samples/CPPSamples/common/DeviceOpenCL.cpp
> > >>>
> > >>> Regarding the new split extension: here is a working snippet:
> > >>> cl_mem clImage2D = 0;
> > >>> cl_mem clImages[AMF_SURFACE_MAX_PLANES];
> > >>> // index can be not 0 if texture is allocated as an array.
> > >>> clImage2D = clCreateFromD3D11Texture2DKHR(m_clContext, memflags,
> > >>>     pTexture, index, &clStatus);
> > >>
> > >> Where is the comment about index being nonzero coming from there?
> > >> Other callers to this definitely start from a zero index. (I tried
> > >> adding one to my index values but it didn't change the result.)
> > >>
> > >>>
> > >>> for(int i = 0; i < planesNumber; i++)
> > >>> {
> > >>>     clImages[i] = clGetPlaneFromImageAMD(m_clContext, clImage2D,
> > >>>         (cl_uint)i, &clStatus);
> > >>>
> > >>> }
> > >>> // don’t forget to release clImages[i] and clImage2D
> > >>
> > >> Otherwise, that agrees with how I read the extension document.
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
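The index question above can be made concrete with a small sketch. It only restates the arithmetic used in the patch below: the AMD path passes the texture array index i directly and then splits planes with clGetPlaneFromImageAMD(), while the existing per-plane path passes 2 * i + p. The enum, the function name and its parameters are hypothetical names used for illustration; none of them exist in ffmpeg or OpenCL.

```c
/* Hypothetical illustration, not part of the patch. */
enum map_mode {
    MAP_WHOLE_IMAGE, /* one CL image per texture, planes split afterwards */
    MAP_PER_PLANE    /* one CL image per plane of each texture */
};

/* Value to pass as the subresource argument of
 * clCreateFromD3D11Texture2DKHR() for array slice `slice` and
 * plane `plane` of a texture array whose format has `nb_planes` planes. */
unsigned int subresource_index(enum map_mode mode, unsigned int slice,
                               unsigned int plane, unsigned int nb_planes)
{
    if (mode == MAP_WHOLE_IMAGE)
        return slice;                 /* the plane is selected later */
    return nb_planes * slice + plane; /* 2 * i + p when nb_planes == 2 */
}
```

With nb_planes == 2 the per-plane branch reproduces the subresource values the current code passes, so the two paths differ only in whether the plane is part of the index.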
diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
index 728877553f..c745b91775 100644
--- a/libavutil/hwcontext_opencl.c
+++ b/libavutil/hwcontext_opencl.c
@@ -64,6 +64,12 @@
 #if HAVE_OPENCL_D3D11
 #include <CL/cl_d3d11.h>
 #include "hwcontext_d3d11va.h"
+
+// From cl_amd_planar_yuv; unfortunately no header is provided.
+typedef cl_mem (*clGetPlaneFromImageAMD_fn)(cl_context context,
+                                            cl_mem mem,
+                                            cl_uint plane,
+                                            cl_int *errcode_ret);
 #endif
 
 #if HAVE_OPENCL_DRM_ARM
@@ -113,12 +119,17 @@ typedef struct OpenCLDeviceContext {
 
 #if HAVE_OPENCL_D3D11
     int d3d11_mapping_usable;
+    int d3d11_map_amd;
+    int d3d11_map_intel;
+
     clCreateFromD3D11Texture2DKHR_fn
         clCreateFromD3D11Texture2DKHR;
     clEnqueueAcquireD3D11ObjectsKHR_fn
         clEnqueueAcquireD3D11ObjectsKHR;
     clEnqueueReleaseD3D11ObjectsKHR_fn
         clEnqueueReleaseD3D11ObjectsKHR;
+    clGetPlaneFromImageAMD_fn
+        clGetPlaneFromImageAMD;
 #endif
 
 #if HAVE_OPENCL_DRM_ARM
@@ -817,17 +828,25 @@ static int opencl_device_init(AVHWDeviceContext *hwdev)
 #if HAVE_OPENCL_D3D11
     {
         const char *d3d11_ext = "cl_khr_d3d11_sharing";
-        const char *nv12_ext  = "cl_intel_d3d11_nv12_media_sharing";
+        const char *amd_ext   = "cl_amd_planar_yuv";
+        const char *intel_ext = "cl_intel_d3d11_nv12_media_sharing";
         int fail = 0;
 
         if (!opencl_check_extension(hwdev, d3d11_ext)) {
             av_log(hwdev, AV_LOG_VERBOSE, "The %s extension is "
                    "required for D3D11 to OpenCL mapping.\n", d3d11_ext);
             fail = 1;
-        } else if (!opencl_check_extension(hwdev, nv12_ext)) {
-            av_log(hwdev, AV_LOG_VERBOSE, "The %s extension may be "
-                   "required for D3D11 to OpenCL mapping.\n", nv12_ext);
-            // Not fatal.
+        } else {
+            if (opencl_check_extension(hwdev, amd_ext)) {
+                priv->d3d11_map_amd = 1;
+            } else if (opencl_check_extension(hwdev, intel_ext)) {
+                priv->d3d11_map_intel = 1;
+            } else {
+                av_log(hwdev, AV_LOG_VERBOSE, "One of the %s or %s "
+                       "extensions is required for D3D11 to OpenCL "
+                       "mapping.\n", amd_ext, intel_ext);
+                fail = 1;
+            }
         }
 
         CL_FUNC(clCreateFromD3D11Texture2DKHR,
@@ -837,6 +856,11 @@ static int opencl_device_init(AVHWDeviceContext *hwdev)
         CL_FUNC(clEnqueueReleaseD3D11ObjectsKHR,
                 "D3D11 in OpenCL release");
 
+        if (priv->d3d11_map_amd) {
+            CL_FUNC(clGetPlaneFromImageAMD,
+                    "D3D11 to OpenCL mapping on AMD");
+        }
+
         if (fail) {
             av_log(hwdev, AV_LOG_WARNING, "D3D11 to OpenCL mapping "
                    "not usable.\n");
@@ -2573,10 +2597,22 @@ static int opencl_frames_derive_from_d3d11(AVHWFramesContext *dst_fc,
     cl_int cle;
     int err, i, p, nb_planes;
 
-    if (src_fc->sw_format != AV_PIX_FMT_NV12) {
-        av_log(dst_fc, AV_LOG_ERROR, "Only NV12 textures are supported "
-               "for D3D11 to OpenCL mapping.\n");
-        return AVERROR(EINVAL);
+    // AMD supports NV12 and P010, Intel only supports NV12.
+    if (device_priv->d3d11_map_amd) {
+        if (src_fc->sw_format != AV_PIX_FMT_NV12 &&
+            src_fc->sw_format != AV_PIX_FMT_P010) {
+            av_log(dst_fc, AV_LOG_ERROR, "Only NV12 and P010 textures are "
+                   "supported with AMD for D3D11 to OpenCL mapping.\n");
+            return AVERROR(EINVAL);
+        }
+    } else if (device_priv->d3d11_map_intel) {
+        if (src_fc->sw_format != AV_PIX_FMT_NV12) {
+            av_log(dst_fc, AV_LOG_ERROR, "Only NV12 textures are "
+                   "supported with Intel for D3D11 to OpenCL mapping.\n");
+            return AVERROR(EINVAL);
+        }
+    } else {
+        av_assert0(0);
     }
     nb_planes = 2;
 
@@ -2601,21 +2637,51 @@ static int opencl_frames_derive_from_d3d11(AVHWFramesContext *dst_fc,
     for (i = 0; i < frames_priv->nb_mapped_frames; i++) {
         AVOpenCLFrameDescriptor *desc = &frames_priv->mapped_frames[i];
         desc->nb_planes = nb_planes;
-        for (p = 0; p < nb_planes; p++) {
-            UINT subresource = 2 * i + p;
 
-            desc->planes[p] =
-                device_priv->clCreateFromD3D11Texture2DKHR(
-                    dst_dev->context, cl_flags, src_hwctx->texture,
-                    subresource, &cle);
-            if (!desc->planes[p]) {
-                av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL "
-                       "image from plane %d of D3D texture "
-                       "index %d (subresource %u): %d.\n",
-                       p, i, (unsigned int)subresource, cle);
+        if (device_priv->d3d11_map_amd) {
+            cl_mem image;
+
+            image = device_priv->clCreateFromD3D11Texture2DKHR(
+                dst_dev->context, cl_flags, src_hwctx->texture, i, &cle);
+            if (!image) {
+                av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL image "
+                       "from D3D texture index %d: %d.\n", i, cle);
                 err = AVERROR(EIO);
                 goto fail;
             }
+
+            for (p = 0; p < nb_planes; p++) {
+                desc->planes[p] = device_priv->clGetPlaneFromImageAMD(
+                    dst_dev->context, image, p, &cle);
+                if (!desc->planes[p]) {
+                    av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL image "
+                           "from plane %d of image created from D3D11 "
+                           "texture index %d: %d.\n", p, i, cle);
+                    clReleaseMemObject(image);
+                    err = AVERROR(EIO);
+                    goto fail;
+                }
+            }
+
+            clReleaseMemObject(image);
+
+        } else {
+            for (p = 0; p < nb_planes; p++) {
+                UINT subresource = 2 * i + p;
+
+                desc->planes[p] =
+                    device_priv->clCreateFromD3D11Texture2DKHR(
+                        dst_dev->context, cl_flags, src_hwctx->texture,
+                        subresource, &cle);
+                if (!desc->planes[p]) {
+                    av_log(dst_fc, AV_LOG_ERROR, "Failed to create CL "
+                           "image from plane %d of D3D texture "
+                           "index %d (subresource %u): %d.\n",
+                           p, i, (unsigned int)subresource, cle);
+                    err = AVERROR(EIO);
+                    goto fail;
+                }
+            }
         }
     }
 
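For readers who want to try the selection logic from opencl_device_init() above outside ffmpeg, here is a minimal standalone sketch. It assumes the device extension list is available as a single space-separated string (the form returned for CL_DEVICE_EXTENSIONS). has_extension() and select_d3d11_map() are hypothetical stand-ins written for this sketch; they are not the ffmpeg opencl_check_extension() helper and may not match how it compares names.

```c
#include <string.h>

/* Hypothetical result type for this sketch only. */
enum map_select { SEL_NONE, SEL_AMD, SEL_INTEL };

/* Return 1 if `name` appears as a whole space-separated token in `list`. */
int has_extension(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    if (len == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == list) || (p[-1] == ' ');
        int ends   = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends)
            return 1;
        p++; /* keep searching after a partial match */
    }
    return 0;
}

/* Mirrors the order of checks in the patch: the common D3D11 sharing
 * extension is mandatory, then AMD is preferred over Intel. */
enum map_select select_d3d11_map(const char *extensions)
{
    if (!has_extension(extensions, "cl_khr_d3d11_sharing"))
        return SEL_NONE;
    if (has_extension(extensions, "cl_amd_planar_yuv"))
        return SEL_AMD;
    if (has_extension(extensions, "cl_intel_d3d11_nv12_media_sharing"))
        return SEL_INTEL;
    return SEL_NONE;
}
```

Matching whole tokens rather than substrings is a choice made for this sketch, so that an extension name that is a prefix of another name does not produce a false positive.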