From patchwork Sun Apr 22 16:47:47 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rostislav Pehlivanov
X-Patchwork-Id: 8585
From: Rostislav Pehlivanov
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 22 Apr 2018 17:47:47 +0100
Message-Id: <20180422164751.22628-5-atomnuker@gmail.com>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180422164751.22628-1-atomnuker@gmail.com>
References: <20180422164751.22628-1-atomnuker@gmail.com>
Subject: [FFmpeg-devel] [PATCH v2 4/8] lavfi: add common Vulkan filtering code
List-Id: FFmpeg development discussions and patches
Cc: Rostislav Pehlivanov

This commit adds common code for use in Vulkan filters. It attempts to reduce the burden of writing Vulkan image filters to a minimum, which is pretty much a requirement considering how verbose the API is.
It supports both compute and graphics pipelines and manages to abstract the API to such a level that there's no need to call any Vulkan functions inside the init path of the code. Handling shader descriptors is probably the bulk of the code, and despite the abstraction, it loses none of the features for describing shader IO.

In order to produce linkable shaders, it depends on the libshaderc library (specifically, the latest stable version of it). This allows for greater performance and flexibility than static built-in shaders and also eliminates the cumbersome process of interfacing with glslang to compile GLSL to SPIR-V.

It's based on the common OpenCL code and provides similar interfaces for filter pad init and config, with the addition that it also supports in-place filtering.

Signed-off-by: Rostislav Pehlivanov
---
 configure | 10 +-
 libavfilter/vulkan.c | 1450 ++++++++++++++++++++++++++++++++++++++++++
 libavfilter/vulkan.h | 234 +++++++
 3 files changed, 1693 insertions(+), 1 deletion(-)
 create mode 100644 libavfilter/vulkan.c
 create mode 100644 libavfilter/vulkan.h

diff --git a/configure b/configure
index 59dd3f85fc..b9bda095d3 100755
--- a/configure
+++ b/configure
@@ -252,6 +252,7 @@ External library support:
   --enable-librsvg enable SVG rasterization via librsvg [no]
   --enable-librubberband enable rubberband needed for rubberband filter [no]
   --enable-librtmp enable RTMP[E] support via librtmp [no]
+  --enable-libshaderc enable GLSL->SPIRV compilation via libshaderc [no]
   --enable-libshine enable fixed-point MP3 encoding via libshine [no]
   --enable-libsmbclient enable Samba protocol via libsmbclient [no]
   --enable-libsnappy enable Snappy compression, needed for hap encoding [no]
@@ -1702,6 +1703,7 @@ EXTERNAL_LIBRARY_LIST="
     libpulse
     librsvg
     librtmp
+    libshaderc
     libshine
     libsmbclient
     libsnappy
@@ -2219,6 +2221,7 @@ HAVE_LIST="
     opencl_dxva2
     opencl_vaapi_beignet
     opencl_vaapi_intel_media
+    shaderc_opt_perf
     vulkan_drm_mod
     perl
     pod2man
@@ -3449,7 +3452,7 @@
avformat_deps="avcodec avutil" avformat_suggest="libm network zlib" avresample_deps="avutil" avresample_suggest="libm" -avutil_suggest="clock_gettime ffnvcodec libm libdrm libmfx opencl user32 vaapi videotoolbox corefoundation corevideo coremedia bcrypt" +avutil_suggest="clock_gettime ffnvcodec libm libdrm libmfx opencl vulkan libshaderc user32 vaapi videotoolbox corefoundation corevideo coremedia bcrypt" postproc_deps="avutil gpl" postproc_suggest="libm" swresample_deps="avutil" @@ -6029,6 +6032,7 @@ enabled libpulse && require_pkg_config libpulse libpulse pulse/pulseaud enabled librsvg && require_pkg_config librsvg librsvg-2.0 librsvg-2.0/librsvg/rsvg.h rsvg_handle_render_cairo enabled librtmp && require_pkg_config librtmp librtmp librtmp/rtmp.h RTMP_Socket enabled librubberband && require_pkg_config librubberband "rubberband >= 1.8.1" rubberband/rubberband-c.h rubberband_new -lstdc++ && append librubberband_extralibs "-lstdc++" +enabled libshaderc && require libshaderc shaderc/shaderc.h shaderc_compiler_initialize -lshaderc_shared enabled libshine && require_pkg_config libshine shine shine/layer3.h shine_encode_buffer enabled libsmbclient && { check_pkg_config libsmbclient smbclient libsmbclient.h smbc_init || require libsmbclient libsmbclient.h smbc_init -lsmbclient; } @@ -6328,6 +6332,10 @@ enabled crystalhd && check_lib crystalhd "stdint.h libcrystalhd/libcrystalhd_if. 
enabled vulkan && check_pkg_config vulkan "vulkan >= 1.1.73" "vulkan/vulkan.h" vkCreateInstance +if enabled_all vulkan libshaderc ; then + check_cc shaderc_opt_perf shaderc/shaderc.h "int t = shaderc_optimization_level_performance" +fi + if enabled_all vulkan libdrm ; then check_cpp_condition vulkan_drm_mod vulkan/vulkan.h "defined VK_EXT_IMAGE_DRM_FORMAT_MODIFIER_EXTENSION_NAME" fi diff --git a/libavfilter/vulkan.c b/libavfilter/vulkan.c new file mode 100644 index 0000000000..4266791618 --- /dev/null +++ b/libavfilter/vulkan.c @@ -0,0 +1,1450 @@ +/* + * Vulkan utilities + * Copyright (c) 2018 Rostislav Pehlivanov + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "formats.h" +#include "vulkan.h" + +#define VK_LOAD_PFN(inst, name) PFN_##name pfn_##name = (PFN_##name) \ + vkGetInstanceProcAddr(inst, #name) + +/* Converts return values to strings */ +const char *ff_vk_ret2str(VkResult res) +{ +#define CASE(VAL) case VAL: return #VAL + switch (res) { + CASE(VK_SUCCESS); + CASE(VK_NOT_READY); + CASE(VK_TIMEOUT); + CASE(VK_EVENT_SET); + CASE(VK_EVENT_RESET); + CASE(VK_INCOMPLETE); + CASE(VK_ERROR_OUT_OF_HOST_MEMORY); + CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY); + CASE(VK_ERROR_INITIALIZATION_FAILED); + CASE(VK_ERROR_DEVICE_LOST); + CASE(VK_ERROR_MEMORY_MAP_FAILED); + CASE(VK_ERROR_LAYER_NOT_PRESENT); + CASE(VK_ERROR_EXTENSION_NOT_PRESENT); + CASE(VK_ERROR_FEATURE_NOT_PRESENT); + CASE(VK_ERROR_INCOMPATIBLE_DRIVER); + CASE(VK_ERROR_TOO_MANY_OBJECTS); + CASE(VK_ERROR_FORMAT_NOT_SUPPORTED); + CASE(VK_ERROR_FRAGMENTED_POOL); + CASE(VK_ERROR_SURFACE_LOST_KHR); + CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR); + CASE(VK_SUBOPTIMAL_KHR); + CASE(VK_ERROR_OUT_OF_DATE_KHR); + CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR); + CASE(VK_ERROR_VALIDATION_FAILED_EXT); + CASE(VK_ERROR_INVALID_SHADER_NV); + CASE(VK_ERROR_OUT_OF_POOL_MEMORY); + CASE(VK_ERROR_INVALID_EXTERNAL_HANDLE); + CASE(VK_ERROR_NOT_PERMITTED_EXT); + default: return "Unknown error"; + } +#undef CASE +} + +int ff_vk_alloc_mem(AVFilterContext *avctx, VkMemoryRequirements *req, + VkMemoryPropertyFlagBits req_flags, void *alloc_extension, + VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory *mem) +{ + VkResult ret; + int index = -1; + VkPhysicalDeviceProperties props; + VkPhysicalDeviceMemoryProperties mprops; + VulkanFilterContext *s = avctx->priv; + + VkMemoryAllocateInfo alloc_info = { + .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, + .pNext = alloc_extension, + 
}; + + vkGetPhysicalDeviceProperties(s->hwctx->phys_dev, &props); + vkGetPhysicalDeviceMemoryProperties(s->hwctx->phys_dev, &mprops); + + /* Align if we need to */ + if (req_flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) + req->size = FFALIGN(req->size, props.limits.minMemoryMapAlignment); + + alloc_info.allocationSize = req->size; + + /* The vulkan spec requires memory types to be sorted in the "optimal" + * order, so the first matching type we find will be the best/fastest one */ + for (int i = 0; i < mprops.memoryTypeCount; i++) { + /* The memory type must be supported by the requirements (bitfield) */ + if (!(req->memoryTypeBits & (1 << i))) + continue; + + /* The memory type flags must include our properties */ + if ((mprops.memoryTypes[i].propertyFlags & req_flags) != req_flags) + continue; + + /* Found a suitable memory type */ + index = i; + break; + } + + if (index < 0) { + av_log(avctx, AV_LOG_ERROR, "No memory type found for flags 0x%x\n", + req_flags); + return AVERROR(EINVAL); + } + + alloc_info.memoryTypeIndex = index; + + ret = vkAllocateMemory(s->hwctx->act_dev, &alloc_info, + s->hwctx->alloc, mem); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to allocate memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR(ENOMEM); + } + + *mem_flags |= mprops.memoryTypes[index].propertyFlags; + + return 0; +} + +int ff_vk_create_buf(AVFilterContext *avctx, FFVkBuffer *buf, size_t size, + VkBufferUsageFlags usage, VkMemoryPropertyFlagBits flags) +{ + int err; + VkResult ret; + VkMemoryRequirements req; + VulkanFilterContext *s = avctx->priv; + + VkBufferCreateInfo buf_spawn = { + .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO, + .pNext = NULL, + .usage = usage, + .sharingMode = VK_SHARING_MODE_EXCLUSIVE, + .size = size, /* Gets FFALIGNED during alloc if host visible + but should be ok */ + }; + + ret = vkCreateBuffer(s->hwctx->act_dev, &buf_spawn, NULL, &buf->buf); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to create buffer: 
%s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + vkGetBufferMemoryRequirements(s->hwctx->act_dev, buf->buf, &req); + + err = ff_vk_alloc_mem(avctx, &req, flags, NULL, &buf->flags, &buf->mem); + if (err) + return err; + + ret = vkBindBufferMemory(s->hwctx->act_dev, buf->buf, buf->mem, 0); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to bind memory to buffer: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +int ff_vk_map_buffers(AVFilterContext *avctx, FFVkBuffer *buf, uint8_t *mem[], + int nb_buffers, int invalidate) +{ + int i; + VkResult ret; + VulkanFilterContext *s = avctx->priv; + VkMappedMemoryRange *inval_list = NULL; + int inval_count = 0; + + for (i = 0; i < nb_buffers; i++) { + ret = vkMapMemory(s->hwctx->act_dev, buf[i].mem, 0, + VK_WHOLE_SIZE, 0, (void **)&mem[i]); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to map buffer memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + if (!invalidate) + return 0; + + for (i = 0; i < nb_buffers; i++) { + const VkMappedMemoryRange ival_buf = { + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, + .memory = buf[i].mem, + .size = VK_WHOLE_SIZE, + }; + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) + continue; + inval_list = av_fast_realloc(s->scratch, &s->scratch_size, + (++inval_count)*sizeof(*inval_list)); + if (!inval_list) + return AVERROR(ENOMEM); + inval_list[inval_count - 1] = ival_buf; + } + + if (inval_count) { + ret = vkInvalidateMappedMemoryRanges(s->hwctx->act_dev, inval_count, + inval_list); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to invalidate memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + return 0; +} + +int ff_vk_unmap_buffers(AVFilterContext *avctx, FFVkBuffer *buf, int nb_buffers, + int flush) +{ + VkResult ret; + int i, err = 0; + VulkanFilterContext *s = avctx->priv; + VkMappedMemoryRange *flush_list = NULL; + int flush_count 
= 0; + + if (flush) { + for (i = 0; i < nb_buffers; i++) { + const VkMappedMemoryRange flush_buf = { + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, + .memory = buf[i].mem, + .size = VK_WHOLE_SIZE, + }; + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) + continue; + flush_list = av_fast_realloc(s->scratch, &s->scratch_size, + (++flush_count)*sizeof(*flush_list)); + if (!flush_list) + return AVERROR(ENOMEM); + flush_list[flush_count - 1] = flush_buf; + } + } + + if (flush_count) { + ret = vkFlushMappedMemoryRanges(s->hwctx->act_dev, flush_count, + flush_list); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to flush memory: %s\n", + ff_vk_ret2str(ret)); + err = AVERROR_EXTERNAL; /* We still want to try to unmap them */ + } + } + + for (i = 0; i < nb_buffers; i++) + vkUnmapMemory(s->hwctx->act_dev, buf[i].mem); + + return err; +} + +void ff_vk_free_buf(AVFilterContext *avctx, FFVkBuffer *buf) +{ + VulkanFilterContext *s = avctx->priv; + if (!buf) + return; + + if (buf->buf != VK_NULL_HANDLE) + vkDestroyBuffer(s->hwctx->act_dev, buf->buf, s->hwctx->alloc); + if (buf->mem != VK_NULL_HANDLE) + vkFreeMemory(s->hwctx->act_dev, buf->mem, s->hwctx->alloc); +} + +int ff_vk_create_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e, int queue) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + VkCommandPoolCreateInfo cqueue_create = { + .sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO, + .flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT, + .queueFamilyIndex = queue, + }; + VkCommandBufferAllocateInfo cbuf_create = { + .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO, + .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY, + .commandBufferCount = 1, + }; + VkFenceCreateInfo fence_spawn = { VK_STRUCTURE_TYPE_FENCE_CREATE_INFO }; + + ret = vkCreateCommandPool(s->hwctx->act_dev, &cqueue_create, + s->hwctx->alloc, &e->pool); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Command pool creation failure: %s\n", + 
ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + cbuf_create.commandPool = e->pool; + + ret = vkAllocateCommandBuffers(s->hwctx->act_dev, &cbuf_create, &e->buf); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Command buffer alloc failure: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + ret = vkCreateFence(s->hwctx->act_dev, &fence_spawn, + s->hwctx->alloc, &e->fence); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to create frame fence: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + vkGetDeviceQueue(s->hwctx->act_dev, queue, 0, &e->queue); + + return 0; +} + +void ff_vk_free_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e) +{ + VulkanFilterContext *s = avctx->priv; + + if (!e) + return; + + if (e->fence != VK_NULL_HANDLE) + vkDestroyFence(s->hwctx->act_dev, e->fence, s->hwctx->alloc); + if (e->buf != VK_NULL_HANDLE) + vkFreeCommandBuffers(s->hwctx->act_dev, e->pool, 1, &e->buf); + if (e->pool != VK_NULL_HANDLE) + vkDestroyCommandPool(s->hwctx->act_dev, e->pool, s->hwctx->alloc); +} + +int ff_vk_filter_query_formats(AVFilterContext *avctx) +{ + static const enum AVPixelFormat pixel_formats[] = { + AV_PIX_FMT_VULKAN, AV_PIX_FMT_NONE, + }; + AVFilterFormats *pix_fmts = ff_make_format_list(pixel_formats); + if (!pix_fmts) + return AVERROR(ENOMEM); + + return ff_set_common_formats(avctx, pix_fmts); +} + +static int vulkan_filter_set_device(AVFilterContext *avctx, + AVBufferRef *device) +{ + VulkanFilterContext *s = avctx->priv; + + av_buffer_unref(&s->device_ref); + + s->device_ref = av_buffer_ref(device); + if (!s->device_ref) + return AVERROR(ENOMEM); + + s->device = (AVHWDeviceContext*)s->device_ref->data; + s->hwctx = s->device->hwctx; + + return 0; +} + +static int vulkan_filter_set_frames(AVFilterContext *avctx, + AVBufferRef *frames) +{ + VulkanFilterContext *s = avctx->priv; + + av_buffer_unref(&s->frames_ref); + + s->frames_ref = av_buffer_ref(frames); + if (!s->frames_ref) + return AVERROR(ENOMEM); + + return 0; +} + +int 
ff_vk_filter_config_input(AVFilterLink *inlink) +{ + int err; + AVFilterContext *avctx = inlink->dst; + VulkanFilterContext *s = avctx->priv; + AVHWFramesContext *input_frames; + + if (!inlink->hw_frames_ctx) { + av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a " + "hardware frames context on the input.\n"); + return AVERROR(EINVAL); + } + + /* Extract the device and default output format from the first input. */ + if (avctx->inputs[0] != inlink) + return 0; + + input_frames = (AVHWFramesContext*)inlink->hw_frames_ctx->data; + if (input_frames->format != AV_PIX_FMT_VULKAN) + return AVERROR(EINVAL); + + err = vulkan_filter_set_device(avctx, input_frames->device_ref); + if (err < 0) + return err; + err = vulkan_filter_set_frames(avctx, inlink->hw_frames_ctx); + if (err < 0) + return err; + + /* Default output parameters match input parameters. */ + s->input_format = input_frames->sw_format; + if (s->output_format == AV_PIX_FMT_NONE) + s->output_format = input_frames->sw_format; + if (!s->output_width) + s->output_width = inlink->w; + if (!s->output_height) + s->output_height = inlink->h; + + return 0; +} + +int ff_vk_filter_config_output_inplace(AVFilterLink *outlink) +{ + int err; + AVFilterContext *avctx = outlink->src; + VulkanFilterContext *s = avctx->priv; + + av_buffer_unref(&outlink->hw_frames_ctx); + + if (!s->device_ref) { + if (!avctx->hw_device_ctx) { + av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a " + "Vulkan device.\n"); + return AVERROR(EINVAL); + } + + err = vulkan_filter_set_device(avctx, avctx->hw_device_ctx); + if (err < 0) + return err; + } + + outlink->hw_frames_ctx = av_buffer_ref(s->frames_ref); + outlink->w = s->output_width; + outlink->h = s->output_height; + + return 0; +} + +int ff_vk_filter_config_output(AVFilterLink *outlink) +{ + int err; + AVFilterContext *avctx = outlink->src; + VulkanFilterContext *s = avctx->priv; + AVBufferRef *output_frames_ref; + AVHWFramesContext *output_frames; + + 
av_buffer_unref(&outlink->hw_frames_ctx); + + if (!s->device_ref) { + if (!avctx->hw_device_ctx) { + av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a " + "Vulkan device.\n"); + return AVERROR(EINVAL); + } + + err = vulkan_filter_set_device(avctx, avctx->hw_device_ctx); + if (err < 0) + return err; + } + + output_frames_ref = av_hwframe_ctx_alloc(s->device_ref); + if (!output_frames_ref) { + err = AVERROR(ENOMEM); + goto fail; + } + output_frames = (AVHWFramesContext*)output_frames_ref->data; + + output_frames->format = AV_PIX_FMT_VULKAN; + output_frames->sw_format = s->output_format; + output_frames->width = s->output_width; + output_frames->height = s->output_height; + + err = av_hwframe_ctx_init(output_frames_ref); + if (err < 0) { + av_log(avctx, AV_LOG_ERROR, "Failed to initialise output " + "frames: %d.\n", err); + goto fail; + } + + outlink->hw_frames_ctx = output_frames_ref; + outlink->w = s->output_width; + outlink->h = s->output_height; + + return 0; +fail: + av_buffer_unref(&output_frames_ref); + return err; +} + +int ff_vk_filter_init(AVFilterContext *avctx) +{ + VulkanFilterContext *s = avctx->priv; + const shaderc_env_version opt_ver = shaderc_env_version_vulkan_1_1; + const shaderc_optimization_level opt_lvl = shaderc_optimization_level_performance; + + s->output_format = AV_PIX_FMT_NONE; + + s->sc_compiler = shaderc_compiler_initialize(); + if (!s->sc_compiler) + return AVERROR_EXTERNAL; + + s->sc_opts = shaderc_compile_options_initialize(); + if (!s->sc_opts) + return AVERROR_EXTERNAL; + + shaderc_compile_options_set_target_env(s->sc_opts, + shaderc_target_env_vulkan, + opt_ver); + shaderc_compile_options_set_optimization_level(s->sc_opts, opt_lvl); + + return 0; +} + +void ff_vk_filter_uninit(AVFilterContext *avctx) +{ + int i; + VulkanFilterContext *s = avctx->priv; + + shaderc_compile_options_release(s->sc_opts); + shaderc_compiler_release(s->sc_compiler); + + for (i = 0; i < s->shaders_num; i++) { + SPIRVShader *shd = &s->shaders[i]; 
+ vkDestroyShaderModule(s->hwctx->act_dev, shd->shader.module, + s->hwctx->alloc); + } + + if (s->pipeline != VK_NULL_HANDLE) + vkDestroyPipeline(s->hwctx->act_dev, s->pipeline, s->hwctx->alloc); + if (s->pipeline_layout != VK_NULL_HANDLE) + vkDestroyPipelineLayout(s->hwctx->act_dev, s->pipeline_layout, + s->hwctx->alloc); + + for (i = 0; i < s->samplers_num; i++) { + VulkanSampler *sampler = &s->samplers[i]; + VK_LOAD_PFN(s->hwctx->inst, vkDestroySamplerYcbcrConversionKHR); + vkDestroySampler(s->hwctx->act_dev, sampler->sampler, s->hwctx->alloc); + pfn_vkDestroySamplerYcbcrConversionKHR(s->hwctx->act_dev, + sampler->yuv_conv.conversion, + s->hwctx->alloc); + } + + ff_vk_free_buf(avctx, &s->vbuffer); + + for (i = 0; i < s->descriptor_sets_num; i++) { + VK_LOAD_PFN(s->hwctx->inst, vkDestroyDescriptorUpdateTemplateKHR); + pfn_vkDestroyDescriptorUpdateTemplateKHR(s->hwctx->act_dev, + s->desc_template[i], + s->hwctx->alloc); + vkDestroyDescriptorSetLayout(s->hwctx->act_dev, s->desc_layout[i], + s->hwctx->alloc); + } + + if (s->desc_pool != VK_NULL_HANDLE) + vkDestroyDescriptorPool(s->hwctx->act_dev, s->desc_pool, + s->hwctx->alloc); + if (s->renderpass != VK_NULL_HANDLE) + vkDestroyRenderPass(s->hwctx->act_dev, s->renderpass, + s->hwctx->alloc); + + av_freep(&s->desc_layout); + av_freep(&s->pool_size_desc); + av_freep(&s->shaders); + av_freep(&s->samplers); + av_buffer_unref(&s->device_ref); + av_buffer_unref(&s->frames_ref); + + /* Only freed in case of failure */ + av_freep(&s->push_consts); + av_freep(&s->pool_size_desc); + if (s->desc_template_info) { + for (i = 0; i < s->descriptor_sets_num; i++) + av_free((void *)s->desc_template_info[i].pDescriptorUpdateEntries); + av_freep(&s->desc_template_info); + } +} + +SPIRVShader *ff_vk_init_shader(AVFilterContext *avctx, const char *name, + VkShaderStageFlags stage) +{ + SPIRVShader *shd; + VulkanFilterContext *s = avctx->priv; + + s->shaders = av_realloc_array(s->shaders, sizeof(*s->shaders), + s->shaders_num + 1); + if 
(!s->shaders) + return NULL; + + shd = &s->shaders[s->shaders_num++]; + memset(shd, 0, sizeof(*shd)); + av_bprint_init(&shd->src, 0, AV_BPRINT_SIZE_UNLIMITED); + + shd->shader.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO; + shd->shader.stage = stage; + + shd->name = name; + + GLSLF(0, #version %i ,460); + GLSLC(0, #define AREA(v) ((v).x*(v).y) ); + GLSLC(0, #define IS_WITHIN(v1, v2) ((v1.x < v2.x) && (v1.y < v2.y)) ); + GLSLC(0, ); + + return shd; +} + +void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, SPIRVShader *shd, + int local_size[3]) +{ + shd->local_size[0] = local_size[0]; + shd->local_size[1] = local_size[1]; + shd->local_size[2] = local_size[2]; + + av_bprintf(&shd->src, "layout (local_size_x = %i, " + "local_size_y = %i, local_size_z = %i) in;\n", + shd->local_size[0], shd->local_size[1], shd->local_size[2]); +} + +static void print_shader(AVFilterContext *avctx, SPIRVShader *shd) +{ + int i; + int line = 0; + const char *p = shd->src.str; + const char *start = p; + + AVBPrint buf; + av_bprint_init(&buf, 0, AV_BPRINT_SIZE_UNLIMITED); + + for (i = 0; i < strlen(p); i++) { + if (p[i] == '\n') { + av_bprintf(&buf, "%i\t", ++line); + av_bprint_append_data(&buf, start, &p[i] - start + 1); + start = &p[i + 1]; + } + } + + av_log(avctx, AV_LOG_VERBOSE, "Compiling shader %s: \n%s\n", + shd->name, buf.str); + av_bprint_finalize(&buf, NULL); +} + +int ff_vk_compile_shader(AVFilterContext *avctx, SPIRVShader *shd, + const char *entry) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + VkShaderModuleCreateInfo shader_create; + + shaderc_compilation_result_t res; + static const shaderc_shader_kind type_map[] = { + [VK_SHADER_STAGE_VERTEX_BIT] = shaderc_vertex_shader, + [VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT] = shaderc_tess_control_shader, + [VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT] = shaderc_tess_evaluation_shader, + [VK_SHADER_STAGE_GEOMETRY_BIT] = shaderc_geometry_shader, + [VK_SHADER_STAGE_FRAGMENT_BIT] = 
shaderc_fragment_shader, + [VK_SHADER_STAGE_COMPUTE_BIT] = shaderc_compute_shader, + }; + + shd->shader.pName = entry; + + print_shader(avctx, shd); + + res = shaderc_compile_into_spv(s->sc_compiler, shd->src.str, shd->src.len, + type_map[shd->shader.stage], shd->name, + entry, s->sc_opts); + av_bprint_finalize(&shd->src, NULL); + + if (shaderc_result_get_compilation_status(res) != + shaderc_compilation_status_success) { + av_log(avctx, AV_LOG_ERROR, "%s", shaderc_result_get_error_message(res)); + return AVERROR_EXTERNAL; + } + + shader_create.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO; + shader_create.pNext = NULL; + shader_create.codeSize = shaderc_result_get_length(res); + shader_create.flags = 0; + shader_create.pCode = (const uint32_t *)shaderc_result_get_bytes(res); + + ret = vkCreateShaderModule(s->hwctx->act_dev, &shader_create, NULL, + &shd->shader.module); + shaderc_result_release(res); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to create shader module: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + av_log(avctx, AV_LOG_VERBOSE, "Shader linked! 
Size: %zu bytes\n", + shader_create.codeSize); + + return 0; +} + +int ff_vk_init_renderpass(AVFilterContext *avctx) +{ + VulkanFilterContext *s = avctx->priv; + + VkAttachmentDescription rpass_att[] = { + { + .format = av_vkfmt_from_pixfmt(s->output_format), + .samples = VK_SAMPLE_COUNT_1_BIT, + .loadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE, + .storeOp = VK_ATTACHMENT_STORE_OP_STORE, + .stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE, + .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE, + .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED, + .finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR, + }, + }; + + VkSubpassDescription rpass_sub_desc[] = { + { + .pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS, + .colorAttachmentCount = 1, + .pColorAttachments = (VkAttachmentReference[]) { + { 0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL }, + }, + .pDepthStencilAttachment = NULL, + .preserveAttachmentCount = 0, + } + }; + + VkRenderPassCreateInfo renderpass_spawn = { + .sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO, + .pAttachments = rpass_att, + .attachmentCount = FF_ARRAY_ELEMS(rpass_att), + .pSubpasses = rpass_sub_desc, + .subpassCount = FF_ARRAY_ELEMS(rpass_sub_desc), + }; + + VkResult ret = vkCreateRenderPass(s->hwctx->act_dev, &renderpass_spawn, + s->hwctx->alloc, &s->renderpass); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Renderpass init failure: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +static VkSamplerYcbcrModelConversion conv_primaries(enum AVColorPrimaries color_primaries) +{ + switch(color_primaries) { + case AVCOL_PRI_BT470BG: + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601; + case AVCOL_PRI_BT709: + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709; + case AVCOL_PRI_BT2020: + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020; + } + /* Just assume its 709 */ + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709; +} + +const VulkanSampler *ff_vk_init_sampler(AVFilterContext *avctx, AVFrame *input, + int 
unnorm_coords, VkFilter filt) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + VulkanSampler *sampler; + + VkSamplerCreateInfo sampler_info = { + .sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO, + .magFilter = filt, + .minFilter = filt, + .mipmapMode = unnorm_coords ? VK_SAMPLER_MIPMAP_MODE_NEAREST : + VK_SAMPLER_MIPMAP_MODE_LINEAR, + .addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER, + .addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER, + .addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER, + .anisotropyEnable = VK_FALSE, + .compareOp = VK_COMPARE_OP_NEVER, + .borderColor = VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK, + .unnormalizedCoordinates = unnorm_coords, + }; + + s->samplers = av_realloc_array(s->samplers, sizeof(*s->samplers), + s->samplers_num + 1); + if (!s->samplers) + return NULL; + + sampler = &s->samplers[s->samplers_num++]; + memset(sampler, 0, sizeof(*sampler)); + + sampler->converting = !!input; + sampler->yuv_conv.sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO; + + if (input) { + VkSamplerYcbcrConversion *conv = &sampler->yuv_conv.conversion; + VkComponentMapping comp_map = { + .r = VK_COMPONENT_SWIZZLE_IDENTITY, + .g = VK_COMPONENT_SWIZZLE_IDENTITY, + .b = VK_COMPONENT_SWIZZLE_IDENTITY, + .a = VK_COMPONENT_SWIZZLE_IDENTITY, + }; + + VkSamplerYcbcrConversionCreateInfo c_info = { + .sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO, + .format = av_vkfmt_from_pixfmt(s->input_format), + .chromaFilter = VK_FILTER_LINEAR, + .ycbcrModel = conv_primaries(input->color_primaries), + .ycbcrRange = input->color_range == AVCOL_RANGE_JPEG ? + VK_SAMPLER_YCBCR_RANGE_ITU_FULL : + VK_SAMPLER_YCBCR_RANGE_ITU_NARROW, + .xChromaOffset = input->chroma_location == AVCHROMA_LOC_CENTER ? 
+ VK_CHROMA_LOCATION_MIDPOINT : + VK_CHROMA_LOCATION_COSITED_EVEN, + .components = comp_map, + }; + + VK_LOAD_PFN(s->hwctx->inst, vkCreateSamplerYcbcrConversionKHR); + + sampler_info.pNext = &sampler->yuv_conv; + + if (unnorm_coords) { + av_log(avctx, AV_LOG_ERROR, "Cannot create a converting sampler " + "with unnormalized addressing, forbidden by spec!\n"); + return NULL; + } + + ret = pfn_vkCreateSamplerYcbcrConversionKHR(s->hwctx->act_dev, &c_info, + s->hwctx->alloc, conv); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init conversion: %s\n", + ff_vk_ret2str(ret)); + return NULL; + } + } + + ret = vkCreateSampler(s->hwctx->act_dev, &sampler_info, + s->hwctx->alloc, &sampler->sampler); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init sampler: %s\n", + ff_vk_ret2str(ret)); + return NULL; + } + + return sampler; +} + +/* A 3x2 matrix, with the translation part separate. */ +struct transform { + /* row-major, e.g. in mathematical notation: + * | m[0][0] m[0][1] | + * | m[1][0] m[1][1] | */ + float m[2][2]; + float t[2]; +}; + +/* Standard parallel 2D projection, except y1 < y0 means that the coordinate + * system is flipped, not the projection. */ +static inline void transform_ortho(struct transform *t, float x0, float x1, + float y0, float y1) +{ + if (y1 < y0) { + float tmp = y0; + y0 = tmp - y1; + y1 = tmp; + } + + t->m[0][0] = 2.0f / (x1 - x0); + t->m[0][1] = 0.0f; + t->m[1][0] = 0.0f; + t->m[1][1] = 2.0f / (y1 - y0); + t->t[0] = -(x1 + x0) / (x1 - x0); + t->t[1] = -(y1 + y0) / (y1 - y0); +} + +/* This treats m as an affine transformation, in other words m[2][n] gets + * added to the output. 
*/ +static inline void transform_vec(struct transform t, float *x, float *y) +{ + float vx = *x, vy = *y; + *x = vx * t.m[0][0] + vy * t.m[0][1] + t.t[0]; + *y = vx * t.m[1][0] + vy * t.m[1][1] + t.t[1]; +} + +/* Vertex buffer structure */ +struct vertex { + struct { + float x, y; + } position; + struct { + float x, y; + } texcoord[4]; +}; + +int ff_vk_init_simple_vbuffer(AVFilterContext *avctx) +{ + struct vertex *va; + struct transform t; + VulkanFilterContext *s = avctx->priv; + + int i, n, err, vp_w = s->output_width, vp_h = s->output_height; + float x[2] = { 0, vp_w }; + float y[2] = { 0, vp_h }; + + s->num_verts = 4; + + err = ff_vk_create_buf(avctx, &s->vbuffer, + sizeof(struct vertex)*s->num_verts, + VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, + VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT); + if (err) + return err; + + err = ff_vk_map_buffers(avctx, &s->vbuffer, (uint8_t **)&va, 1, 0); + if (err) + return err; + + transform_ortho(&t, 0, vp_w, 0, vp_h); + transform_vec(t, &x[0], &y[0]); + transform_vec(t, &x[1], &y[1]); + + for (n = 0; n < s->num_verts; n++) { + struct vertex *v = &va[n]; + v->position.x = x[n / 2]; + v->position.y = y[n % 2]; + for (i = 0; i < 4; i++) { + struct transform tr = { { { 0 } } }; + float tx = (n / 2) * vp_w; + float ty = (n % 2) * vp_h; + tr.m[0][0] = 1.0f; + tr.m[1][1] = 1.0f; + transform_vec(tr, &tx, &ty); + v->texcoord[i].x = tx / vp_w; + v->texcoord[i].y = ty / vp_h; + } + } + + err = ff_vk_unmap_buffers(avctx, &s->vbuffer, 1, 1); + if (err) + return err; + + return 0; +} + +int ff_vk_add_push_constant(AVFilterContext *avctx, int offset, int size, + VkShaderStageFlagBits stage) +{ + VkPushConstantRange *pc; + VulkanFilterContext *s = avctx->priv; + + s->push_consts = av_realloc_array(s->push_consts, sizeof(*s->push_consts), + s->push_consts_num + 1); + if (!s->push_consts) + return AVERROR(ENOMEM); + + pc = &s->push_consts[s->push_consts_num++]; + memset(pc, 0, sizeof(*pc)); + + pc->stageFlags = stage; + pc->offset = offset; + pc->size = 
size; + + return s->push_consts_num - 1; +} + +static const struct descriptor_props { + size_t struct_size; /* Size of the opaque which updates the descriptor */ + const char *type; + int is_uniform; + int mem_quali; /* Can use a memory qualifier */ + int dim_needed; /* Must indicate dimension */ + int buf_content; /* Must indicate buffer contents */ +} descriptor_props[] = { + [VK_DESCRIPTOR_TYPE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 0, 0, }, + [VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE] = { sizeof(VkDescriptorImageInfo), "texture", 1, 0, 1, 0, }, + [VK_DESCRIPTOR_TYPE_STORAGE_IMAGE] = { sizeof(VkDescriptorImageInfo), "image", 1, 1, 1, 0, }, + [VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT] = { sizeof(VkDescriptorImageInfo), "subpassInput", 1, 0, 0, 0, }, + [VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 1, 0, }, + [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, }, + [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, }, + [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, }, + [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, }, + [VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER] = { sizeof(VkBufferView), "samplerBuffer", 1, 0, 0, 0, }, + [VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER] = { sizeof(VkBufferView), "imageBuffer", 1, 0, 0, 0, }, +}; + +int ff_vk_add_descriptor_set(AVFilterContext *avctx, SPIRVShader *shd, + VulkanDescriptorSetBinding *desc, int num, + int only_print_to_shader) +{ + int i, j; + VkResult ret; + VkDescriptorSetLayout *layout; + VulkanFilterContext *s = avctx->priv; + + if (only_print_to_shader) + goto print; + + s->desc_layout = av_realloc_array(s->desc_layout, sizeof(*s->desc_layout), + s->descriptor_sets_num + 1); + if (!s->desc_layout) + return AVERROR(ENOMEM); + + layout = &s->desc_layout[s->descriptor_sets_num]; + 
memset(layout, 0, sizeof(*layout)); + + { /* Create descriptor set layout descriptions */ + VkDescriptorSetLayoutCreateInfo desc_create_layout = { 0 }; + VkDescriptorSetLayoutBinding *desc_binding; + + desc_binding = av_mallocz(sizeof(*desc_binding)*num); + if (!desc_binding) + return AVERROR(ENOMEM); + + for (i = 0; i < num; i++) { + desc_binding[i].binding = i; + desc_binding[i].descriptorType = desc[i].type; + desc_binding[i].descriptorCount = FFMAX(desc[i].elems, 1); + desc_binding[i].stageFlags = desc[i].stages; + desc_binding[i].pImmutableSamplers = desc[i].samplers; + } + + desc_create_layout.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + desc_create_layout.pBindings = desc_binding; + desc_create_layout.bindingCount = num; + + ret = vkCreateDescriptorSetLayout(s->hwctx->act_dev, &desc_create_layout, + s->hwctx->alloc, layout); + av_free(desc_binding); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set " + "layout: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Pool each descriptor by type and update pool counts */ + for (i = 0; i < num; i++) { + for (j = 0; j < s->pool_size_desc_num; j++) + if (s->pool_size_desc[j].type == desc[i].type) + break; + if (j >= s->pool_size_desc_num) { + s->pool_size_desc = av_realloc_array(s->pool_size_desc, + sizeof(*s->pool_size_desc), + ++s->pool_size_desc_num); + if (!s->pool_size_desc) + return AVERROR(ENOMEM); + memset(&s->pool_size_desc[j], 0, sizeof(VkDescriptorPoolSize)); + } + s->pool_size_desc[j].type = desc[i].type; + s->pool_size_desc[j].descriptorCount += FFMAX(desc[i].elems, 1); + } + } + + { /* Create template creation struct */ + VkDescriptorUpdateTemplateCreateInfo *dt; + VkDescriptorUpdateTemplateEntry *des_entries; + + /* Freed after descriptor set initialization */ + des_entries = av_mallocz(num*sizeof(VkDescriptorUpdateTemplateEntry)); + if (!des_entries) + return AVERROR(ENOMEM); + + for (i = 0; i < num; i++) { + 
des_entries[i].dstBinding = i; + des_entries[i].descriptorType = desc[i].type; + des_entries[i].descriptorCount = FFMAX(desc[i].elems, 1); + des_entries[i].dstArrayElement = 0; + des_entries[i].offset = ((uint8_t *)desc[i].updater) - (uint8_t *)s; + des_entries[i].stride = descriptor_props[desc[i].type].struct_size; + } + + s->desc_template_info = av_realloc_array(s->desc_template_info, + sizeof(*s->desc_template_info), + s->descriptor_sets_num + 1); + if (!s->desc_template_info) + return AVERROR(ENOMEM); + + dt = &s->desc_template_info[s->descriptor_sets_num]; + memset(dt, 0, sizeof(*dt)); + + dt->sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO; + dt->templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET; + dt->descriptorSetLayout = *layout; + dt->pDescriptorUpdateEntries = des_entries; + dt->descriptorUpdateEntryCount = num; + } + + s->descriptor_sets_num++; + +print: + /* Write shader info */ + for (i = 0; i < num; i++) { + const struct descriptor_props *prop = &descriptor_props[desc[i].type]; + GLSLA("layout (set = %i, binding = %i", s->descriptor_sets_num - 1, i); + + if (desc[i].mem_layout) + GLSLA(", %s", desc[i].mem_layout); + GLSLA(")"); + + if (prop->is_uniform) + GLSLA(" uniform"); + + if (prop->mem_quali && desc[i].mem_quali) + GLSLA(" %s", desc[i].mem_quali); + + if (prop->type) + GLSLA(" %s", prop->type); + + if (prop->dim_needed) + GLSLA("%iD", desc[i].dimensions); + + GLSLA(" %s", desc[i].name); + + if (prop->buf_content) + GLSLA(" {\n %s\n}", desc[i].buf_content); + else if (desc[i].elems > 0) + GLSLA("[%i]", desc[i].elems); + + GLSLA(";\n"); + } + + return 0; +} + +void ff_vk_update_descriptor_set(AVFilterContext *avctx, int set_id) +{ + VulkanFilterContext *s = avctx->priv; + + VK_LOAD_PFN(s->hwctx->inst, vkUpdateDescriptorSetWithTemplateKHR); + pfn_vkUpdateDescriptorSetWithTemplateKHR(s->hwctx->act_dev, + s->desc_set[set_id], + s->desc_template[set_id], s); +} + +const enum VkImageAspectFlagBits ff_vk_aspect_flags(enum 
AVPixelFormat pixfmt, + int plane) +{ + const int tot_planes = av_pix_fmt_count_planes(pixfmt); + static const enum VkImageAspectFlagBits m[] = { VK_IMAGE_ASPECT_PLANE_0_BIT, + VK_IMAGE_ASPECT_PLANE_1_BIT, + VK_IMAGE_ASPECT_PLANE_2_BIT, }; + /* Reject invalid formats (negative count) and out-of-range plane indices */ + if (tot_planes <= 0 || (plane >= tot_planes)) + return 0; + if (tot_planes == 1) + return VK_IMAGE_ASPECT_COLOR_BIT; + if (plane < 0) + return m[0] | m[1] | (tot_planes > 2 ? m[2] : 0); + return m[plane]; +} + +const VkFormat ff_vk_plane_rep_fmt(enum AVPixelFormat pixfmt, int plane) +{ + const int tot_planes = av_pix_fmt_count_planes(pixfmt); + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pixfmt); + const int high = desc->comp[plane].depth > 8; + if (tot_planes == 1) { /* Single-plane formats (RGB, etc.) are their own representation */ + return av_vkfmt_from_pixfmt(pixfmt); + } else if (tot_planes == 2) { /* Must be NV12 or P010 */ + if (!high) + return !plane ? VK_FORMAT_R8_UNORM : VK_FORMAT_R8G8_UNORM; + else + return !plane ? VK_FORMAT_R16_UNORM : VK_FORMAT_R16G16_UNORM; + } else { /* Regular planar YUV */ + return !high ? 
VK_FORMAT_R8_UNORM : VK_FORMAT_R16_UNORM; + } +} + +int ff_vk_create_imageview(AVFilterContext *avctx, VkImageView *v, AVVkFrame *f, + VkFormat fmt, enum VkImageAspectFlagBits aspect, + VkComponentMapping map, const void *pnext) +{ + VulkanFilterContext *s = avctx->priv; + VkImageViewCreateInfo imgview_spawn = { + .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO, + .pNext = pnext, + .image = f->img, + .viewType = VK_IMAGE_VIEW_TYPE_2D, + .format = fmt, + .components = map, + .subresourceRange = { + .aspectMask = aspect, + .baseMipLevel = 0, + .levelCount = 1, + .baseArrayLayer = 0, + .layerCount = 1, + }, + }; + + VkResult ret = vkCreateImageView(s->hwctx->act_dev, &imgview_spawn, + s->hwctx->alloc, v); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to create imageview: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +void ff_vk_destroy_imageview(AVFilterContext *avctx, VkImageView v) +{ + VulkanFilterContext *s = avctx->priv; + vkDestroyImageView(s->hwctx->act_dev, v, s->hwctx->alloc); +} + +int ff_vk_init_pipeline_layout(AVFilterContext *avctx) +{ + int i; + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + { /* Init descriptor set pool */ + VkDescriptorPoolCreateInfo pool_create_info = { + .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO, + .poolSizeCount = s->pool_size_desc_num, + .pPoolSizes = s->pool_size_desc, + .maxSets = s->descriptor_sets_num, + }; + + ret = vkCreateDescriptorPool(s->hwctx->act_dev, &pool_create_info, + s->hwctx->alloc, &s->desc_pool); + av_freep(&s->pool_size_desc); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set " + "pool: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Allocate descriptor sets */ + VkDescriptorSetAllocateInfo alloc_info = { + .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO, + .descriptorPool = s->desc_pool, + .descriptorSetCount = s->descriptor_sets_num, + .pSetLayouts = s->desc_layout, + 
}; + + s->desc_set = av_malloc(s->descriptor_sets_num*sizeof(*s->desc_set)); + if (!s->desc_set) + return AVERROR(ENOMEM); + + ret = vkAllocateDescriptorSets(s->hwctx->act_dev, &alloc_info, + s->desc_set); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to allocate descriptor set: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Finally create the pipeline layout */ + VkPipelineLayoutCreateInfo spawn_pipeline_layout = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO, + .setLayoutCount = s->descriptor_sets_num, + .pSetLayouts = s->desc_layout, + .pushConstantRangeCount = s->push_consts_num, + .pPushConstantRanges = s->push_consts, + }; + + ret = vkCreatePipelineLayout(s->hwctx->act_dev, &spawn_pipeline_layout, + s->hwctx->alloc, &s->pipeline_layout); + av_freep(&s->push_consts); + s->push_consts_num = 0; + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init pipeline layout: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Descriptor template (for tightly packed descriptors) */ + VK_LOAD_PFN(s->hwctx->inst, vkCreateDescriptorUpdateTemplateKHR); + VkDescriptorUpdateTemplateCreateInfo *desc_template_info; + + s->desc_template = av_malloc(s->descriptor_sets_num*sizeof(*s->desc_template)); + if (!s->desc_template) + return AVERROR(ENOMEM); + + /* Create update templates for the descriptor sets */ + for (i = 0; i < s->descriptor_sets_num; i++) { + desc_template_info = &s->desc_template_info[i]; + desc_template_info->pipelineLayout = s->pipeline_layout; + ret = pfn_vkCreateDescriptorUpdateTemplateKHR(s->hwctx->act_dev, + desc_template_info, + s->hwctx->alloc, + &s->desc_template[i]); + av_free((void *)desc_template_info->pDescriptorUpdateEntries); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor " + "template: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + av_freep(&s->desc_template_info); + } + + return 0; +} + +int 
ff_vk_init_compute_pipeline(AVFilterContext *avctx) +{ + int i; + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + VkComputePipelineCreateInfo pipe = { + .sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO, + .layout = s->pipeline_layout, + }; + + for (i = 0; i < s->shaders_num; i++) { + if (s->shaders[i].shader.stage & VK_SHADER_STAGE_COMPUTE_BIT) { + pipe.stage = s->shaders[i].shader; + break; + } + } + if (i == s->shaders_num) { + av_log(avctx, AV_LOG_ERROR, "Can't init compute pipeline, no shader\n"); + return AVERROR(EINVAL); + } + + ret = vkCreateComputePipelines(s->hwctx->act_dev, VK_NULL_HANDLE, 1, &pipe, + s->hwctx->alloc, &s->pipeline); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init compute pipeline: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +int ff_vk_init_graphics_pipeline(AVFilterContext *avctx) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + VkVertexInputBindingDescription vbind_desc = { + .binding = 0, + .stride = sizeof(struct vertex), + .inputRate = VK_VERTEX_INPUT_RATE_VERTEX, + }; + + VkVertexInputAttributeDescription vatt_desc[4] = { { 0 } }; + for (int i = 0; i < 4; i++) { + VkVertexInputAttributeDescription *att = &vatt_desc[i]; + att->location = i; + att->binding = 0; + att->format = VK_FORMAT_R32G32_SFLOAT; + att->offset = i*2*sizeof(float); + } + + VkPipelineVertexInputStateCreateInfo vpipe_info = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO, + .vertexAttributeDescriptionCount = FF_ARRAY_ELEMS(vatt_desc), + .pVertexAttributeDescriptions = vatt_desc, + .vertexBindingDescriptionCount = 1, + .pVertexBindingDescriptions = &vbind_desc, + }; + + VkPipelineDynamicStateCreateInfo dynamic_states = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO, + .dynamicStateCount = 2, + .pDynamicStates = (VkDynamicState []) { + VK_DYNAMIC_STATE_VIEWPORT, VK_DYNAMIC_STATE_SCISSOR, + }, + }; + + 
VkPipelineInputAssemblyStateCreateInfo spawn_input_asm = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO, + .topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP, + .primitiveRestartEnable = VK_FALSE, + }; + + VkRect2D scissor = { .extent = { .width = s->output_width, .height = s->output_height } }; + VkViewport viewport = { .width = s->output_width, .height = s->output_height, .maxDepth = 1.0f }; + + VkPipelineViewportStateCreateInfo spawn_viewport = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO, + .viewportCount = 1, + .pViewports = &viewport, + .scissorCount = 1, + .pScissors = &scissor, + }; + + VkPipelineRasterizationStateCreateInfo rasterizer = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO, + .depthClampEnable = VK_FALSE, + .rasterizerDiscardEnable = VK_FALSE, + .polygonMode = VK_POLYGON_MODE_FILL, + .lineWidth = 1.0f, + .cullMode = VK_CULL_MODE_NONE, + .frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE, + .depthBiasEnable = VK_FALSE, + }; + + VkPipelineMultisampleStateCreateInfo multisampling = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO, + .sampleShadingEnable = VK_FALSE, + .rasterizationSamples = VK_SAMPLE_COUNT_1_BIT, + .minSampleShading = 1.0f, + .alphaToCoverageEnable = VK_FALSE, + .alphaToOneEnable = VK_FALSE, + }; + + VkPipelineColorBlendAttachmentState col_blend_att = { + .colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | + VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT, + .blendEnable = VK_FALSE, + .srcColorBlendFactor = VK_BLEND_FACTOR_ONE, + .dstColorBlendFactor = VK_BLEND_FACTOR_ZERO, + .colorBlendOp = VK_BLEND_OP_ADD, + .srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE, + .dstAlphaBlendFactor = VK_BLEND_FACTOR_ZERO, + .alphaBlendOp = VK_BLEND_OP_ADD, + }; + + VkPipelineColorBlendStateCreateInfo col_blend = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO, + .logicOpEnable = VK_FALSE, + .logicOp = VK_LOGIC_OP_COPY, + 
.attachmentCount = 1, + .pAttachments = &col_blend_att, + }; + + VkGraphicsPipelineCreateInfo spawn_pipeline = { + .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO, + .pVertexInputState = &vpipe_info, + .stageCount = s->shaders_num, + //.pStages = s->shaders, TODO + .renderPass = s->renderpass, + .subpass = 0, + .layout = s->pipeline_layout, + .pDynamicState = &dynamic_states, + .pInputAssemblyState = &spawn_input_asm, + .pViewportState = &spawn_viewport, + .pRasterizationState = &rasterizer, + .pMultisampleState = &multisampling, + .pColorBlendState = &col_blend, + }; + + ret = vkCreateGraphicsPipelines(s->hwctx->act_dev, VK_NULL_HANDLE, 1, + &spawn_pipeline, + s->hwctx->alloc, &s->pipeline); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init pipeline: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} diff --git a/libavfilter/vulkan.h b/libavfilter/vulkan.h new file mode 100644 index 0000000000..45a13d4932 --- /dev/null +++ b/libavfilter/vulkan.h @@ -0,0 +1,234 @@ +/* + * Vulkan utilities + * Copyright (c) 2018 Rostislav Pehlivanov + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFILTER_VULKAN_COMMON_H +#define AVFILTER_VULKAN_COMMON_H + +#include "avfilter.h" +#include "libavutil/pixdesc.h" +#include "libavutil/bprint.h" +#include "libavutil/hwcontext.h" +#include "libavutil/hwcontext_vulkan.h" + +#include + +/* GLSL management macros */ +#define INDENT(N) INDENT_##N +#define INDENT_0 +#define INDENT_1 INDENT_0 " " +#define INDENT_2 INDENT_1 INDENT_1 +#define INDENT_3 INDENT_2 INDENT_1 +#define INDENT_4 INDENT_3 INDENT_1 +#define INDENT_5 INDENT_4 INDENT_1 +#define INDENT_6 INDENT_5 INDENT_1 +#define C(N, S) INDENT(N) #S "\n" +#define GLSLC(N, S) av_bprintf(&shd->src, C(N, S)) +#define GLSLA(...) av_bprintf(&shd->src, __VA_ARGS__) +#define GLSLF(N, S, ...) av_bprintf(&shd->src, C(N, S), __VA_ARGS__) +#define GLSLD(D) GLSLC(0, ); \ + av_bprint_append_data(&shd->src, D, strlen(D)); \ + GLSLC(0, ) + +/* Helper, pretty much every Vulkan return value needs to be checked */ +#define RET(x) \ + do { \ + if ((err = (x)) < 0) \ + goto fail; \ + } while (0) + +/* Useful for attaching immutable samplers to arrays */ +#define DUP_SAMPLER_ARRAY4(x) (const VkSampler []){ x, x, x, x, } + +typedef struct SPIRVShader { + const char *name; /* Name for id/debugging purposes */ + AVBPrint src; + int local_size[3]; /* Compute shader workgroup sizes */ + VkPipelineShaderStageCreateInfo shader; +} SPIRVShader; + +typedef struct VulkanDescriptorSetBinding { + const char *name; + VkDescriptorType type; + const char *mem_layout; /* Storage images (rgba8, etc.) and buffers (std430, etc.) */ + const char *mem_quali; /* readonly, writeonly, etc. */ + const char *buf_content; /* For buffers */ + uint32_t dimensions; /* Needed for e.g. 
sampler%iD */ + uint32_t elems; /* 0 - scalar, 1 or more - vector */ + VkShaderStageFlags stages; + const VkSampler *samplers; /* Immutable samplers, length - #elems */ + void *updater; +} VulkanDescriptorSetBinding; + +typedef struct VulkanSampler { + VkSampler sampler; + VkSamplerYcbcrConversionInfo yuv_conv; /* For imageview creation */ + int converting; /* Indicates whether sampler is a converting one */ +} VulkanSampler; + +typedef struct FFVkExecContext { + VkCommandPool pool; + VkCommandBuffer buf; + VkQueue queue; + VkFence fence; +} FFVkExecContext; + +typedef struct FFVkBuffer { + VkBuffer buf; + VkDeviceMemory mem; + VkMemoryPropertyFlagBits flags; +} FFVkBuffer; + +typedef struct VulkanFilterContext { + const AVClass *class; + + AVBufferRef *device_ref; + AVBufferRef *frames_ref; /* For in-place filtering */ + AVHWDeviceContext *device; + AVVulkanDeviceContext *hwctx; + + /* Properties */ + int output_width; + int output_height; + enum AVPixelFormat output_format; + enum AVPixelFormat input_format; + + /* Samplers */ + VulkanSampler *samplers; + int samplers_num; + + /* Shaders */ + SPIRVShader *shaders; + int shaders_num; + shaderc_compiler_t sc_compiler; + shaderc_compile_options_t sc_opts; + + /* Contexts */ + VkRenderPass renderpass; + VkPipelineLayout pipeline_layout; + VkPipeline pipeline; + + /* Descriptors */ + VkDescriptorSetLayout *desc_layout; + VkDescriptorPool desc_pool; + VkDescriptorSet *desc_set; + VkDescriptorUpdateTemplate *desc_template; + int push_consts_num; + int descriptor_sets_num; + int pool_size_desc_num; + + /* Vertex buffer */ + FFVkBuffer vbuffer; + int num_verts; + + /* Temporary, used to store data in between initialization stages */ + VkDescriptorUpdateTemplateCreateInfo *desc_template_info; + VkDescriptorPoolSize *pool_size_desc; + VkPushConstantRange *push_consts; + void *scratch; /* Scratch memory used only in functions */ + unsigned int scratch_size; +} VulkanFilterContext; + +/* Generic memory allocation. 
+ * Will align size to the minimum map alignment requirement in case req_flags + * has VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT set */ +int ff_vk_alloc_mem(AVFilterContext *avctx, VkMemoryRequirements *req, + VkMemoryPropertyFlagBits req_flags, void *alloc_extension, + VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory *mem); + +/* Buffer I/O */ +int ff_vk_create_buf(AVFilterContext *avctx, FFVkBuffer *buf, size_t size, + VkBufferUsageFlags usage, VkMemoryPropertyFlagBits flags); +int ff_vk_map_buffers(AVFilterContext *avctx, FFVkBuffer *buf, uint8_t *mem[], + int nb_buffers, int invalidate); +int ff_vk_unmap_buffers(AVFilterContext *avctx, FFVkBuffer *buf, int nb_buffers, + int flush); +void ff_vk_free_buf(AVFilterContext *avctx, FFVkBuffer *buf); + +/* Command context init/uninit */ +int ff_vk_create_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e, int queue); +void ff_vk_free_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e); + +/* Converts Vulkan return values to strings */ +const char *ff_vk_ret2str(VkResult res); + +/* Create a Vulkan sampler, if input isn't NULL the sampler will convert to RGB */ +const VulkanSampler *ff_vk_init_sampler(AVFilterContext *avctx, AVFrame *input, + int unnorm_coords, VkFilter filt); + +/* Gets the single-plane representation format */ +const VkFormat ff_vk_plane_rep_fmt(enum AVPixelFormat pixfmt, int plane); +/* Gets the image aspect flags of a plane */ +const enum VkImageAspectFlagBits ff_vk_aspect_flags(enum AVPixelFormat pixfmt, + int plane); +/* Creates an imageview */ +int ff_vk_create_imageview(AVFilterContext *avctx, VkImageView *v, AVVkFrame *f, + VkFormat fmt, enum VkImageAspectFlagBits aspect, + VkComponentMapping map, const void *pnext); +/* Destroys an imageview */ +void ff_vk_destroy_imageview(AVFilterContext *avctx, VkImageView v); +/* Creates a shader */ +SPIRVShader *ff_vk_init_shader(AVFilterContext *avctx, const char *name, + VkShaderStageFlags stage); +/* For compute shaders, defines the workgroup size */ 
+void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, SPIRVShader *shd, + int local_size[3]); +/* Compiles a completed shader into a module */ +int ff_vk_compile_shader(AVFilterContext *avctx, SPIRVShader *shd, + const char *entry); + + + + + +/* Needs to be abstracted so it adds them to a certain pipeline layout */ +int ff_vk_add_descriptor_set(AVFilterContext *avctx, SPIRVShader *shd, + VulkanDescriptorSetBinding *desc, int num, + int only_print_to_shader); +int ff_vk_add_push_constant(AVFilterContext *avctx, int offset, int size, + VkShaderStageFlagBits stage); + + + + +/* Creates a Vulkan pipeline layout */ +int ff_vk_init_pipeline_layout(AVFilterContext *avctx); + +/* Creates a compute pipeline */ +int ff_vk_init_compute_pipeline(AVFilterContext *avctx); + +/* Creates a Vulkan renderpass */ +int ff_vk_init_renderpass(AVFilterContext *avctx); +/* Creates a graphics pipeline */ +int ff_vk_init_graphics_pipeline(AVFilterContext *avctx); +/* Init a simple vertex buffer (4 vertices, a rectangle matching the video) */ +int ff_vk_init_simple_vbuffer(AVFilterContext *avctx); +/* Updates a given descriptor set after pipeline initialization */ +void ff_vk_update_descriptor_set(AVFilterContext *avctx, int set_id); + +/* General lavfi IO functions */ +int ff_vk_filter_query_formats (AVFilterContext *avctx); +int ff_vk_filter_init (AVFilterContext *avctx); +int ff_vk_filter_config_input (AVFilterLink *inlink); +int ff_vk_filter_config_output (AVFilterLink *outlink); +int ff_vk_filter_config_output_inplace(AVFilterLink *outlink); +void ff_vk_filter_uninit (AVFilterContext *avctx); + +#endif /* AVFILTER_VULKAN_COMMON_H */
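
Note on the vertex buffer setup: ff_vk_init_simple_vbuffer() relies on transform_ortho() and transform_vec() to map the output rectangle (0,0)-(w,h) into clip space. The sketch below is a standalone copy of those two helpers, lifted out of the patch so the mapping can be checked without a Vulkan device. The print_ndc_corners() helper is hypothetical and not part of the patch. Treat it as an illustration of the arithmetic, not as the code the filter actually runs.

```c
#include <stdio.h>

/* Standalone copy of the patch's 2D affine helper, for illustration only. */
struct transform {
    float m[2][2]; /* row-major linear part */
    float t[2];    /* translation part */
};

/* Parallel 2D projection mapping [x0,x1] x [y0,y1] onto [-1,1] x [-1,1]. */
static void transform_ortho(struct transform *t, float x0, float x1,
                            float y0, float y1)
{
    if (y1 < y0) { /* flipped coordinate system, same handling as the patch */
        float tmp = y0;
        y0 = tmp - y1;
        y1 = tmp;
    }

    t->m[0][0] = 2.0f / (x1 - x0);
    t->m[0][1] = 0.0f;
    t->m[1][0] = 0.0f;
    t->m[1][1] = 2.0f / (y1 - y0);
    t->t[0]    = -(x1 + x0) / (x1 - x0);
    t->t[1]    = -(y1 + y0) / (y1 - y0);
}

/* Applies the affine transform: the translation t.t[] is added to the output. */
static void transform_vec(struct transform t, float *x, float *y)
{
    float vx = *x, vy = *y;
    *x = vx * t.m[0][0] + vy * t.m[0][1] + t.t[0];
    *y = vx * t.m[1][0] + vy * t.m[1][1] + t.t[1];
}

/* Hypothetical helper (not in the patch): prints where the two opposite
 * corners of a w x h output land after the projection. */
static void print_ndc_corners(float w, float h)
{
    struct transform t;
    float x0 = 0.0f, y0 = 0.0f, x1 = w, y1 = h;

    transform_ortho(&t, 0.0f, w, 0.0f, h);
    transform_vec(t, &x0, &y0);
    transform_vec(t, &x1, &y1);
    printf("(0,0) -> (%g,%g), (%g,%g) -> (%g,%g)\n", x0, y0, w, h, x1, y1);
}
```

Calling print_ndc_corners(1920, 1080) from a test program should show the top-left corner landing at roughly (-1,-1) and the opposite corner at roughly (1,1), which is the range the four vertices in the buffer are expected to cover.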