From patchwork Tue May 22 04:28:33 2018
X-Patchwork-Submitter: Rostislav Pehlivanov
X-Patchwork-Id: 9055
From: Rostislav Pehlivanov
To: ffmpeg-devel@ffmpeg.org
Cc: Rostislav Pehlivanov
Date: Tue, 22 May 2018 05:28:33 +0100
Message-Id: <20180522042833.11209-1-atomnuker@gmail.com>
In-Reply-To: <4f962cc1-9f77-c034-6f24-4d4e542079a6@gmail.com>
References: <4f962cc1-9f77-c034-6f24-4d4e542079a6@gmail.com>
Subject: [FFmpeg-devel] [PATCH v4 4/8] lavfi: add common Vulkan filtering code

This commit adds common code for use in Vulkan filters. It attempts to reduce the burden of writing Vulkan image filtering code to a minimum, which is pretty much a requirement considering how verbose the API is.
It supports both compute and graphics pipelines and abstracts the API to such a level that there is no need to call any Vulkan functions inside the init path of the code. Handling shader descriptors is probably the bulk of the code, and despite the abstraction, it loses none of the features for describing shader IO.

In order to produce linkable shaders, it depends on the libshaderc library (specifically, on its latest stable version). This allows for greater performance and flexibility than static built-in shaders and also eliminates the cumbersome process of interfacing with glslang to compile GLSL to SPIR-V.

It is based on the common OpenCL filtering code and provides similar interfaces for filter pad init and config, with the addition that it also supports in-place filtering.

Signed-off-by: Rostislav Pehlivanov
---
 configure            |   12 +-
 libavfilter/vulkan.c | 1190 ++++++++++++++++++++++++++++++++++++++++++
 libavfilter/vulkan.h |  223 ++++++++
 3 files changed, 1423 insertions(+), 2 deletions(-)
 create mode 100644 libavfilter/vulkan.c
 create mode 100644 libavfilter/vulkan.h

diff --git a/configure b/configure
index 5f4407b753..abcfe32625 100755
--- a/configure
+++ b/configure
@@ -252,6 +252,7 @@ External library support:
   --enable-librsvg         enable SVG rasterization via librsvg [no]
   --enable-librubberband   enable rubberband needed for rubberband filter [no]
   --enable-librtmp         enable RTMP[E] support via librtmp [no]
+  --enable-libshaderc      enable GLSL->SPIRV compilation via libshaderc [no]
   --enable-libshine        enable fixed-point MP3 encoding via libshine [no]
   --enable-libsmbclient    enable Samba protocol via libsmbclient [no]
   --enable-libsnappy       enable Snappy compression, needed for hap encoding [no]
@@ -1707,6 +1708,7 @@ EXTERNAL_LIBRARY_LIST="
     libpulse
     librsvg
     librtmp
+    libshaderc
     libshine
     libsmbclient
     libsnappy
@@ -2225,6 +2227,7 @@ HAVE_LIST="
     opencl_dxva2
     opencl_vaapi_beignet
     opencl_vaapi_intel_media
+    shaderc_opt_perf
     vulkan_drm_mod
     perl
     pod2man
@@ -3456,12 +3459,12 @@
avcodec_select="null_bsf" avdevice_deps="avformat avcodec avutil" avdevice_suggest="libm" avfilter_deps="avutil" -avfilter_suggest="libm" +avfilter_suggest="libm libshaderc" avformat_deps="avcodec avutil" avformat_suggest="libm network zlib" avresample_deps="avutil" avresample_suggest="libm" -avutil_suggest="clock_gettime ffnvcodec libm libdrm libmfx opencl user32 vaapi videotoolbox corefoundation corevideo coremedia bcrypt" +avutil_suggest="clock_gettime ffnvcodec libm libdrm libmfx opencl vulkan user32 vaapi videotoolbox corefoundation corevideo coremedia bcrypt" postproc_deps="avutil gpl" postproc_suggest="libm" swresample_deps="avutil" @@ -6050,6 +6053,7 @@ enabled libpulse && require_pkg_config libpulse libpulse pulse/pulseaud enabled librsvg && require_pkg_config librsvg librsvg-2.0 librsvg-2.0/librsvg/rsvg.h rsvg_handle_render_cairo enabled librtmp && require_pkg_config librtmp librtmp librtmp/rtmp.h RTMP_Socket enabled librubberband && require_pkg_config librubberband "rubberband >= 1.8.1" rubberband/rubberband-c.h rubberband_new -lstdc++ && append librubberband_extralibs "-lstdc++" +enabled libshaderc && require libshaderc shaderc/shaderc.h shaderc_compiler_initialize -lshaderc_shared enabled libshine && require_pkg_config libshine shine shine/layer3.h shine_encode_buffer enabled libsmbclient && { check_pkg_config libsmbclient smbclient libsmbclient.h smbc_init || require libsmbclient libsmbclient.h smbc_init -lsmbclient; } @@ -6355,6 +6359,10 @@ enabled crystalhd && check_lib crystalhd "stdint.h libcrystalhd/libcrystalhd_if. 
enabled vulkan && require_pkg_config vulkan "vulkan >= 1.1.73" "vulkan/vulkan.h" vkCreateInstance +if enabled_all vulkan libshaderc ; then + check_cc shaderc_opt_perf shaderc/shaderc.h "int t = shaderc_optimization_level_performance" +fi + if enabled_all vulkan libdrm ; then check_cpp_condition vulkan_drm_mod vulkan/vulkan.h "defined VK_EXT_IMAGE_DRM_FORMAT_MODIFIER_EXTENSION_NAME" fi diff --git a/libavfilter/vulkan.c b/libavfilter/vulkan.c new file mode 100644 index 0000000000..7954c6f665 --- /dev/null +++ b/libavfilter/vulkan.c @@ -0,0 +1,1190 @@ +/* + * Vulkan utilities + * Copyright (c) 2018 Rostislav Pehlivanov + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "formats.h" +#include "vulkan.h" + +#define VK_LOAD_PFN(inst, name) PFN_##name pfn_##name = (PFN_##name) \ + vkGetInstanceProcAddr(inst, #name) + +/* Converts return values to strings */ +const char *ff_vk_ret2str(VkResult res) +{ +#define CASE(VAL) case VAL: return #VAL + switch (res) { + CASE(VK_SUCCESS); + CASE(VK_NOT_READY); + CASE(VK_TIMEOUT); + CASE(VK_EVENT_SET); + CASE(VK_EVENT_RESET); + CASE(VK_INCOMPLETE); + CASE(VK_ERROR_OUT_OF_HOST_MEMORY); + CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY); + CASE(VK_ERROR_INITIALIZATION_FAILED); + CASE(VK_ERROR_DEVICE_LOST); + CASE(VK_ERROR_MEMORY_MAP_FAILED); + CASE(VK_ERROR_LAYER_NOT_PRESENT); + CASE(VK_ERROR_EXTENSION_NOT_PRESENT); + CASE(VK_ERROR_FEATURE_NOT_PRESENT); + CASE(VK_ERROR_INCOMPATIBLE_DRIVER); + CASE(VK_ERROR_TOO_MANY_OBJECTS); + CASE(VK_ERROR_FORMAT_NOT_SUPPORTED); + CASE(VK_ERROR_FRAGMENTED_POOL); + CASE(VK_ERROR_SURFACE_LOST_KHR); + CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR); + CASE(VK_SUBOPTIMAL_KHR); + CASE(VK_ERROR_OUT_OF_DATE_KHR); + CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR); + CASE(VK_ERROR_VALIDATION_FAILED_EXT); + CASE(VK_ERROR_INVALID_SHADER_NV); + CASE(VK_ERROR_OUT_OF_POOL_MEMORY); + CASE(VK_ERROR_INVALID_EXTERNAL_HANDLE); + CASE(VK_ERROR_NOT_PERMITTED_EXT); + default: return "Unknown error"; + } +#undef CASE +} + +int ff_vk_alloc_mem(AVFilterContext *avctx, VkMemoryRequirements *req, + VkMemoryPropertyFlagBits req_flags, void *alloc_extension, + VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory *mem) +{ + VkResult ret; + int index = -1; + VkPhysicalDeviceProperties props; + VkPhysicalDeviceMemoryProperties mprops; + VulkanFilterContext *s = avctx->priv; + + VkMemoryAllocateInfo alloc_info = { + .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, + .pNext = alloc_extension, + 
}; + + vkGetPhysicalDeviceProperties(s->hwctx->phys_dev, &props); + vkGetPhysicalDeviceMemoryProperties(s->hwctx->phys_dev, &mprops); + + /* Align if we need to */ + if (req_flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) + req->size = FFALIGN(req->size, props.limits.minMemoryMapAlignment); + + alloc_info.allocationSize = req->size; + + /* The vulkan spec requires memory types to be sorted in the "optimal" + * order, so the first matching type we find will be the best/fastest one */ + for (int i = 0; i < mprops.memoryTypeCount; i++) { + /* The memory type must be supported by the requirements (bitfield) */ + if (!(req->memoryTypeBits & (1 << i))) + continue; + + /* The memory type flags must include our properties */ + if ((mprops.memoryTypes[i].propertyFlags & req_flags) != req_flags) + continue; + + /* Found a suitable memory type */ + index = i; + break; + } + + if (index < 0) { + av_log(avctx, AV_LOG_ERROR, "No memory type found for flags 0x%x\n", + req_flags); + return AVERROR(EINVAL); + } + + alloc_info.memoryTypeIndex = index; + + ret = vkAllocateMemory(s->hwctx->act_dev, &alloc_info, + s->hwctx->alloc, mem); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to allocate memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR(ENOMEM); + } + + *mem_flags |= mprops.memoryTypes[index].propertyFlags; + + return 0; +} + +int ff_vk_create_buf(AVFilterContext *avctx, FFVkBuffer *buf, size_t size, + VkBufferUsageFlags usage, VkMemoryPropertyFlagBits flags) +{ + int err; + VkResult ret; + VkMemoryRequirements req; + VulkanFilterContext *s = avctx->priv; + + VkBufferCreateInfo buf_spawn = { + .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO, + .pNext = NULL, + .usage = usage, + .sharingMode = VK_SHARING_MODE_EXCLUSIVE, + .size = size, /* Gets FFALIGNED during alloc if host visible + but should be ok */ + }; + + ret = vkCreateBuffer(s->hwctx->act_dev, &buf_spawn, NULL, &buf->buf); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to create buffer: 
%s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + vkGetBufferMemoryRequirements(s->hwctx->act_dev, buf->buf, &req); + + err = ff_vk_alloc_mem(avctx, &req, flags, NULL, &buf->flags, &buf->mem); + if (err) + return err; + + ret = vkBindBufferMemory(s->hwctx->act_dev, buf->buf, buf->mem, 0); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to bind memory to buffer: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +int ff_vk_map_buffers(AVFilterContext *avctx, FFVkBuffer *buf, uint8_t *mem[], + int nb_buffers, int invalidate) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + VkMappedMemoryRange *inval_list = NULL; + int inval_count = 0; + + for (int i = 0; i < nb_buffers; i++) { + ret = vkMapMemory(s->hwctx->act_dev, buf[i].mem, 0, + VK_WHOLE_SIZE, 0, (void **)&mem[i]); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to map buffer memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + if (!invalidate) + return 0; + + for (int i = 0; i < nb_buffers; i++) { + const VkMappedMemoryRange ival_buf = { + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, + .memory = buf[i].mem, + .size = VK_WHOLE_SIZE, + }; + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) + continue; + inval_list = av_fast_realloc(s->scratch, &s->scratch_size, + (++inval_count)*sizeof(*inval_list)); + if (!inval_list) + return AVERROR(ENOMEM); + inval_list[inval_count - 1] = ival_buf; + } + + if (inval_count) { + ret = vkInvalidateMappedMemoryRanges(s->hwctx->act_dev, inval_count, + inval_list); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to invalidate memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + return 0; +} + +int ff_vk_unmap_buffers(AVFilterContext *avctx, FFVkBuffer *buf, int nb_buffers, + int flush) +{ + int err = 0; + VkResult ret; + VulkanFilterContext *s = avctx->priv; + VkMappedMemoryRange *flush_list = NULL; + int flush_count = 0; 
+ + if (flush) { + for (int i = 0; i < nb_buffers; i++) { + const VkMappedMemoryRange flush_buf = { + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, + .memory = buf[i].mem, + .size = VK_WHOLE_SIZE, + }; + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) + continue; + flush_list = av_fast_realloc(s->scratch, &s->scratch_size, + (++flush_count)*sizeof(*flush_list)); + if (!flush_list) + return AVERROR(ENOMEM); + flush_list[flush_count - 1] = flush_buf; + } + } + + if (flush_count) { + ret = vkFlushMappedMemoryRanges(s->hwctx->act_dev, flush_count, + flush_list); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to flush memory: %s\n", + ff_vk_ret2str(ret)); + err = AVERROR_EXTERNAL; /* We still want to try to unmap them */ + } + } + + for (int i = 0; i < nb_buffers; i++) + vkUnmapMemory(s->hwctx->act_dev, buf[i].mem); + + return err; +} + +void ff_vk_free_buf(AVFilterContext *avctx, FFVkBuffer *buf) +{ + VulkanFilterContext *s = avctx->priv; + if (!buf) + return; + + if (buf->buf != VK_NULL_HANDLE) + vkDestroyBuffer(s->hwctx->act_dev, buf->buf, s->hwctx->alloc); + if (buf->mem != VK_NULL_HANDLE) + vkFreeMemory(s->hwctx->act_dev, buf->mem, s->hwctx->alloc); +} + +int ff_vk_create_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e, int queue) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + VkCommandPoolCreateInfo cqueue_create = { + .sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO, + .flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT, + .queueFamilyIndex = queue, + }; + VkCommandBufferAllocateInfo cbuf_create = { + .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO, + .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY, + .commandBufferCount = 1, + }; + VkFenceCreateInfo fence_spawn = { VK_STRUCTURE_TYPE_FENCE_CREATE_INFO }; + + ret = vkCreateCommandPool(s->hwctx->act_dev, &cqueue_create, + s->hwctx->alloc, &e->pool); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Command pool creation failure: %s\n", + 
               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    cbuf_create.commandPool = e->pool;
+
+    ret = vkAllocateCommandBuffers(s->hwctx->act_dev, &cbuf_create, &e->buf);
+    if (ret != VK_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "Command buffer alloc failure: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    ret = vkCreateFence(s->hwctx->act_dev, &fence_spawn,
+                        s->hwctx->alloc, &e->fence);
+    if (ret != VK_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to create frame fence: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    vkGetDeviceQueue(s->hwctx->act_dev, queue, 0, &e->queue);
+
+    return 0;
+}
+
+void ff_vk_free_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    if (!e)
+        return;
+
+    if (e->fence != VK_NULL_HANDLE)
+        vkDestroyFence(s->hwctx->act_dev, e->fence, s->hwctx->alloc);
+    if (e->buf != VK_NULL_HANDLE)
+        vkFreeCommandBuffers(s->hwctx->act_dev, e->pool, 1, &e->buf);
+    if (e->pool != VK_NULL_HANDLE)
+        vkDestroyCommandPool(s->hwctx->act_dev, e->pool, s->hwctx->alloc);
+}
+
+int ff_vk_filter_query_formats(AVFilterContext *avctx)
+{
+    static const enum AVPixelFormat pixel_formats[] = {
+        AV_PIX_FMT_VULKAN, AV_PIX_FMT_NONE,
+    };
+    AVFilterFormats *pix_fmts = ff_make_format_list(pixel_formats);
+    if (!pix_fmts)
+        return AVERROR(ENOMEM);
+
+    return ff_set_common_formats(avctx, pix_fmts);
+}
+
+static int vulkan_filter_set_device(AVFilterContext *avctx,
+                                    AVBufferRef *device)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    av_buffer_unref(&s->device_ref);
+
+    s->device_ref = av_buffer_ref(device);
+    if (!s->device_ref)
+        return AVERROR(ENOMEM);
+
+    s->device = (AVHWDeviceContext*)s->device_ref->data;
+    s->hwctx = s->device->hwctx;
+
+    return 0;
+}
+
+static int vulkan_filter_set_frames(AVFilterContext *avctx,
+                                    AVBufferRef *frames)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    av_buffer_unref(&s->frames_ref);
+
+    s->frames_ref = av_buffer_ref(frames);
+    if (!s->frames_ref)
+        return AVERROR(ENOMEM);
+
+    return 0;
+}
+
+int
ff_vk_filter_config_input(AVFilterLink *inlink) +{ + int err; + AVFilterContext *avctx = inlink->dst; + VulkanFilterContext *s = avctx->priv; + AVHWFramesContext *input_frames; + + if (!inlink->hw_frames_ctx) { + av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a " + "hardware frames context on the input.\n"); + return AVERROR(EINVAL); + } + + /* Extract the device and default output format from the first input. */ + if (avctx->inputs[0] != inlink) + return 0; + + input_frames = (AVHWFramesContext*)inlink->hw_frames_ctx->data; + if (input_frames->format != AV_PIX_FMT_VULKAN) + return AVERROR(EINVAL); + + err = vulkan_filter_set_device(avctx, input_frames->device_ref); + if (err < 0) + return err; + err = vulkan_filter_set_frames(avctx, inlink->hw_frames_ctx); + if (err < 0) + return err; + + /* Default output parameters match input parameters. */ + s->input_format = input_frames->sw_format; + if (s->output_format == AV_PIX_FMT_NONE) + s->output_format = input_frames->sw_format; + if (!s->output_width) + s->output_width = inlink->w; + if (!s->output_height) + s->output_height = inlink->h; + + return 0; +} + +int ff_vk_filter_config_output_inplace(AVFilterLink *outlink) +{ + int err; + AVFilterContext *avctx = outlink->src; + VulkanFilterContext *s = avctx->priv; + + av_buffer_unref(&outlink->hw_frames_ctx); + + if (!s->device_ref) { + if (!avctx->hw_device_ctx) { + av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a " + "Vulkan device.\n"); + return AVERROR(EINVAL); + } + + err = vulkan_filter_set_device(avctx, avctx->hw_device_ctx); + if (err < 0) + return err; + } + + outlink->hw_frames_ctx = av_buffer_ref(s->frames_ref); + outlink->w = s->output_width; + outlink->h = s->output_height; + + return 0; +} + +int ff_vk_filter_config_output(AVFilterLink *outlink) +{ + int err; + AVFilterContext *avctx = outlink->src; + VulkanFilterContext *s = avctx->priv; + AVBufferRef *output_frames_ref; + AVHWFramesContext *output_frames; + + 
    av_buffer_unref(&outlink->hw_frames_ctx);
+
+    if (!s->device_ref) {
+        if (!avctx->hw_device_ctx) {
+            av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a "
+                   "Vulkan device.\n");
+            return AVERROR(EINVAL);
+        }
+
+        err = vulkan_filter_set_device(avctx, avctx->hw_device_ctx);
+        if (err < 0)
+            return err;
+    }
+
+    output_frames_ref = av_hwframe_ctx_alloc(s->device_ref);
+    if (!output_frames_ref) {
+        err = AVERROR(ENOMEM);
+        goto fail;
+    }
+    output_frames = (AVHWFramesContext*)output_frames_ref->data;
+
+    output_frames->format    = AV_PIX_FMT_VULKAN;
+    output_frames->sw_format = s->output_format;
+    output_frames->width     = s->output_width;
+    output_frames->height    = s->output_height;
+
+    err = av_hwframe_ctx_init(output_frames_ref);
+    if (err < 0) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to initialise output "
+               "frames: %d.\n", err);
+        goto fail;
+    }
+
+    outlink->hw_frames_ctx = output_frames_ref;
+    outlink->w = s->output_width;
+    outlink->h = s->output_height;
+
+    return 0;
+fail:
+    av_buffer_unref(&output_frames_ref);
+    return err;
+}
+
+int ff_vk_filter_init(AVFilterContext *avctx)
+{
+    VulkanFilterContext *s = avctx->priv;
+    const shaderc_env_version opt_ver = shaderc_env_version_vulkan_1_1;
+#if HAVE_SHADERC_OPT_PERF
+    const shaderc_optimization_level opt_lvl = shaderc_optimization_level_performance;
+#else
+    const shaderc_optimization_level opt_lvl = shaderc_optimization_level_size;
+#endif
+
+    s->output_format = AV_PIX_FMT_NONE;
+
+    s->sc_compiler = shaderc_compiler_initialize();
+    if (!s->sc_compiler)
+        return AVERROR_EXTERNAL;
+
+    s->sc_opts = shaderc_compile_options_initialize();
+    if (!s->sc_opts)
+        return AVERROR_EXTERNAL;
+
+    shaderc_compile_options_set_target_env(s->sc_opts,
+                                           shaderc_target_env_vulkan,
+                                           opt_ver);
+    shaderc_compile_options_set_optimization_level(s->sc_opts, opt_lvl);
+
+    return 0;
+}
+
+void ff_vk_filter_uninit(AVFilterContext *avctx)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    shaderc_compile_options_release(s->sc_opts);
+
shaderc_compiler_release(s->sc_compiler); + + for (int i = 0; i < s->shaders_num; i++) { + SPIRVShader *shd = &s->shaders[i]; + vkDestroyShaderModule(s->hwctx->act_dev, shd->shader.module, + s->hwctx->alloc); + } + + if (s->pipeline != VK_NULL_HANDLE) + vkDestroyPipeline(s->hwctx->act_dev, s->pipeline, s->hwctx->alloc); + if (s->pipeline_layout != VK_NULL_HANDLE) + vkDestroyPipelineLayout(s->hwctx->act_dev, s->pipeline_layout, + s->hwctx->alloc); + + for (int i = 0; i < s->samplers_num; i++) { + VulkanSampler *sampler = &s->samplers[i]; + VK_LOAD_PFN(s->hwctx->inst, vkDestroySamplerYcbcrConversionKHR); + vkDestroySampler(s->hwctx->act_dev, sampler->sampler, s->hwctx->alloc); + pfn_vkDestroySamplerYcbcrConversionKHR(s->hwctx->act_dev, + sampler->yuv_conv.conversion, + s->hwctx->alloc); + } + + ff_vk_free_buf(avctx, &s->vbuffer); + + for (int i = 0; i < s->descriptor_sets_num; i++) { + VK_LOAD_PFN(s->hwctx->inst, vkDestroyDescriptorUpdateTemplateKHR); + pfn_vkDestroyDescriptorUpdateTemplateKHR(s->hwctx->act_dev, + s->desc_template[i], + s->hwctx->alloc); + vkDestroyDescriptorSetLayout(s->hwctx->act_dev, s->desc_layout[i], + s->hwctx->alloc); + } + + if (s->desc_pool != VK_NULL_HANDLE) + vkDestroyDescriptorPool(s->hwctx->act_dev, s->desc_pool, + s->hwctx->alloc); + + av_freep(&s->desc_layout); + av_freep(&s->pool_size_desc); + av_freep(&s->shaders); + av_freep(&s->samplers); + av_buffer_unref(&s->device_ref); + av_buffer_unref(&s->frames_ref); + + /* Only freed in case of failure */ + av_freep(&s->push_consts); + av_freep(&s->pool_size_desc); + if (s->desc_template_info) { + for (int i = 0; i < s->descriptor_sets_num; i++) + av_free((void *)s->desc_template_info[i].pDescriptorUpdateEntries); + av_freep(&s->desc_template_info); + } +} + +SPIRVShader *ff_vk_init_shader(AVFilterContext *avctx, const char *name, + VkShaderStageFlags stage) +{ + SPIRVShader *shd; + VulkanFilterContext *s = avctx->priv; + + s->shaders = av_realloc_array(s->shaders, sizeof(*s->shaders), + 
s->shaders_num + 1); + if (!s->shaders) + return NULL; + + shd = &s->shaders[s->shaders_num++]; + memset(shd, 0, sizeof(*shd)); + av_bprint_init(&shd->src, 0, AV_BPRINT_SIZE_UNLIMITED); + + shd->shader.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO; + shd->shader.stage = stage; + + shd->name = name; + + GLSLF(0, #version %i ,460); + GLSLC(0, #define AREA(v) ((v).x*(v).y) ); + GLSLC(0, #define IS_WITHIN(v1, v2) ((v1.x < v2.x) && (v1.y < v2.y)) ); + GLSLC(0, ); + + return shd; +} + +void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, SPIRVShader *shd, + int local_size[3]) +{ + shd->local_size[0] = local_size[0]; + shd->local_size[1] = local_size[1]; + shd->local_size[2] = local_size[2]; + + av_bprintf(&shd->src, "layout (local_size_x = %i, " + "local_size_y = %i, local_size_z = %i) in;\n", + shd->local_size[0], shd->local_size[1], shd->local_size[2]); +} + +static void print_shader(AVFilterContext *avctx, SPIRVShader *shd) +{ + int line = 0; + const char *p = shd->src.str; + const char *start = p; + + AVBPrint buf; + av_bprint_init(&buf, 0, AV_BPRINT_SIZE_UNLIMITED); + + for (int i = 0; i < strlen(p); i++) { + if (p[i] == '\n') { + av_bprintf(&buf, "%i\t", ++line); + av_bprint_append_data(&buf, start, &p[i] - start + 1); + start = &p[i + 1]; + } + } + + av_log(avctx, AV_LOG_VERBOSE, "Compiling shader %s: \n%s\n", + shd->name, buf.str); + av_bprint_finalize(&buf, NULL); +} + +int ff_vk_compile_shader(AVFilterContext *avctx, SPIRVShader *shd, + const char *entry) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + VkShaderModuleCreateInfo shader_create; + + shaderc_compilation_result_t res; + static const shaderc_shader_kind type_map[] = { + [VK_SHADER_STAGE_VERTEX_BIT] = shaderc_vertex_shader, + [VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT] = shaderc_tess_control_shader, + [VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT] = shaderc_tess_evaluation_shader, + [VK_SHADER_STAGE_GEOMETRY_BIT] = shaderc_geometry_shader, + 
        [VK_SHADER_STAGE_FRAGMENT_BIT] = shaderc_fragment_shader,
+        [VK_SHADER_STAGE_COMPUTE_BIT] = shaderc_compute_shader,
+    };
+
+    shd->shader.pName = entry;
+
+    print_shader(avctx, shd);
+
+    res = shaderc_compile_into_spv(s->sc_compiler, shd->src.str, shd->src.len,
+                                   type_map[shd->shader.stage], shd->name,
+                                   entry, s->sc_opts);
+    av_bprint_finalize(&shd->src, NULL);
+
+    if (shaderc_result_get_compilation_status(res) !=
+        shaderc_compilation_status_success) {
+        av_log(avctx, AV_LOG_ERROR, "%s", shaderc_result_get_error_message(res));
+        /* Release the result on the error path too, otherwise it leaks */
+        shaderc_result_release(res);
+        return AVERROR_EXTERNAL;
+    }
+
+    shader_create.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
+    shader_create.pNext = NULL;
+    shader_create.codeSize = shaderc_result_get_length(res);
+    shader_create.flags = 0;
+    shader_create.pCode = (const uint32_t *)shaderc_result_get_bytes(res);
+
+    ret = vkCreateShaderModule(s->hwctx->act_dev, &shader_create, NULL,
+                               &shd->shader.module);
+    shaderc_result_release(res);
+    if (ret != VK_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "Unable to create shader module: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    av_log(avctx, AV_LOG_VERBOSE, "Shader linked!
Size: %zu bytes\n", + shader_create.codeSize); + + return 0; +} + +static VkSamplerYcbcrModelConversion conv_primaries(enum AVColorPrimaries color_primaries) +{ + switch(color_primaries) { + case AVCOL_PRI_BT470BG: + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601; + case AVCOL_PRI_BT709: + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709; + case AVCOL_PRI_BT2020: + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020; + } + /* Just assume its 709 */ + return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709; +} + +const VulkanSampler *ff_vk_init_sampler(AVFilterContext *avctx, AVFrame *input, + int unnorm_coords, VkFilter filt) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + VulkanSampler *sampler; + + VkSamplerCreateInfo sampler_info = { + .sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO, + .magFilter = filt, + .minFilter = sampler_info.magFilter, + .mipmapMode = unnorm_coords ? VK_SAMPLER_MIPMAP_MODE_NEAREST : + VK_SAMPLER_MIPMAP_MODE_LINEAR, + .addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER, + .addressModeV = sampler_info.addressModeU, + .addressModeW = sampler_info.addressModeU, + .anisotropyEnable = VK_FALSE, + .compareOp = VK_COMPARE_OP_NEVER, + .borderColor = VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK, + .unnormalizedCoordinates = unnorm_coords, + }; + + s->samplers = av_realloc_array(s->samplers, sizeof(*s->samplers), + s->samplers_num + 1); + if (!s->samplers) + return NULL; + + sampler = &s->samplers[s->samplers_num++]; + memset(sampler, 0, sizeof(*sampler)); + + sampler->converting = !!input; + sampler->yuv_conv.sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO; + + if (input) { + VkSamplerYcbcrConversion *conv = &sampler->yuv_conv.conversion; + VkComponentMapping comp_map = { + .r = VK_COMPONENT_SWIZZLE_IDENTITY, + .g = VK_COMPONENT_SWIZZLE_IDENTITY, + .b = VK_COMPONENT_SWIZZLE_IDENTITY, + .a = VK_COMPONENT_SWIZZLE_IDENTITY, + }; + + VkSamplerYcbcrConversionCreateInfo c_info = { + .sType = 
VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO, + .format = av_vkfmt_from_pixfmt(s->input_format), + .chromaFilter = VK_FILTER_LINEAR, + .ycbcrModel = conv_primaries(input->color_primaries), + .ycbcrRange = input->color_range == AVCOL_RANGE_JPEG ? + VK_SAMPLER_YCBCR_RANGE_ITU_FULL : + VK_SAMPLER_YCBCR_RANGE_ITU_NARROW, + .xChromaOffset = input->chroma_location == AVCHROMA_LOC_CENTER ? + VK_CHROMA_LOCATION_MIDPOINT : + VK_CHROMA_LOCATION_COSITED_EVEN, + .components = comp_map, + }; + + VK_LOAD_PFN(s->hwctx->inst, vkCreateSamplerYcbcrConversionKHR); + + sampler_info.pNext = &sampler->yuv_conv; + + if (unnorm_coords) { + av_log(avctx, AV_LOG_ERROR, "Cannot create a converting sampler " + "with unnormalized addressing, forbidden by spec!\n"); + return NULL; + } + + ret = pfn_vkCreateSamplerYcbcrConversionKHR(s->hwctx->act_dev, &c_info, + s->hwctx->alloc, conv); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init conversion: %s\n", + ff_vk_ret2str(ret)); + return NULL; + } + } + + ret = vkCreateSampler(s->hwctx->act_dev, &sampler_info, + s->hwctx->alloc, &sampler->sampler); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init sampler: %s\n", + ff_vk_ret2str(ret)); + return NULL; + } + + return sampler; +} + +int ff_vk_add_push_constant(AVFilterContext *avctx, int offset, int size, + VkShaderStageFlagBits stage) +{ + VkPushConstantRange *pc; + VulkanFilterContext *s = avctx->priv; + + s->push_consts = av_realloc_array(s->push_consts, sizeof(*s->push_consts), + s->push_consts_num + 1); + if (!s->push_consts) + return AVERROR(ENOMEM); + + pc = &s->push_consts[s->push_consts_num++]; + memset(pc, 0, sizeof(*pc)); + + pc->stageFlags = stage; + pc->offset = offset; + pc->size = size; + + return s->push_consts_num - 1; +} + +static const struct descriptor_props { + size_t struct_size; /* Size of the opaque which updates the descriptor */ + const char *type; + int is_uniform; + int mem_quali; /* Can use a memory qualifier */ + int 
dim_needed; /* Must indicate dimension */ + int buf_content; /* Must indicate buffer contents */ +} descriptor_props[] = { + [VK_DESCRIPTOR_TYPE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 0, 0, }, + [VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE] = { sizeof(VkDescriptorImageInfo), "texture", 1, 0, 1, 0, }, + [VK_DESCRIPTOR_TYPE_STORAGE_IMAGE] = { sizeof(VkDescriptorImageInfo), "image", 1, 1, 1, 0, }, + [VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT] = { sizeof(VkDescriptorImageInfo), "subpassInput", 1, 0, 0, 0, }, + [VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 1, 0, }, + [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, }, + [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, }, + [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, }, + [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, }, + [VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER] = { sizeof(VkBufferView), "samplerBuffer", 1, 0, 0, 0, }, + [VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER] = { sizeof(VkBufferView), "imageBuffer", 1, 0, 0, 0, }, +}; + +int ff_vk_add_descriptor_set(AVFilterContext *avctx, SPIRVShader *shd, + VulkanDescriptorSetBinding *desc, int num, + int only_print_to_shader) +{ + VkResult ret; + VkDescriptorSetLayout *layout; + VulkanFilterContext *s = avctx->priv; + + if (only_print_to_shader) + goto print; + + s->desc_layout = av_realloc_array(s->desc_layout, sizeof(*s->desc_layout), + s->descriptor_sets_num + 1); + if (!s->desc_layout) + return AVERROR(ENOMEM); + + layout = &s->desc_layout[s->descriptor_sets_num]; + memset(layout, 0, sizeof(*layout)); + + { /* Create descriptor set layout descriptions */ + VkDescriptorSetLayoutCreateInfo desc_create_layout = { 0 }; + VkDescriptorSetLayoutBinding *desc_binding; + + desc_binding = av_mallocz(sizeof(*desc_binding)*num); + if 
(!desc_binding) + return AVERROR(ENOMEM); + + for (int i = 0; i < num; i++) { + desc_binding[i].binding = i; + desc_binding[i].descriptorType = desc[i].type; + desc_binding[i].descriptorCount = FFMAX(desc[i].elems, 1); + desc_binding[i].stageFlags = desc[i].stages; + desc_binding[i].pImmutableSamplers = desc[i].samplers; + } + + desc_create_layout.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + desc_create_layout.pBindings = desc_binding; + desc_create_layout.bindingCount = num; + + ret = vkCreateDescriptorSetLayout(s->hwctx->act_dev, &desc_create_layout, + s->hwctx->alloc, layout); + av_free(desc_binding); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set " + "layout: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Pool each descriptor by type and update pool counts */ + for (int i = 0; i < num; i++) { + int j; + for (j = 0; j < s->pool_size_desc_num; j++) + if (s->pool_size_desc[j].type == desc[i].type) + break; + if (j >= s->pool_size_desc_num) { + s->pool_size_desc = av_realloc_array(s->pool_size_desc, + sizeof(*s->pool_size_desc), + ++s->pool_size_desc_num); + if (!s->pool_size_desc) + return AVERROR(ENOMEM); + memset(&s->pool_size_desc[j], 0, sizeof(VkDescriptorPoolSize)); + } + s->pool_size_desc[j].type = desc[i].type; + s->pool_size_desc[j].descriptorCount += FFMAX(desc[i].elems, 1); + } + } + + { /* Create template creation struct */ + VkDescriptorUpdateTemplateCreateInfo *dt; + VkDescriptorUpdateTemplateEntry *des_entries; + + /* Freed after descriptor set initialization */ + des_entries = av_mallocz(num*sizeof(VkDescriptorUpdateTemplateEntry)); + if (!des_entries) + return AVERROR(ENOMEM); + + for (int i = 0; i < num; i++) { + des_entries[i].dstBinding = i; + des_entries[i].descriptorType = desc[i].type; + des_entries[i].descriptorCount = FFMAX(desc[i].elems, 1); + des_entries[i].dstArrayElement = 0; + des_entries[i].offset = ((uint8_t *)desc[i].updater) - (uint8_t *)s; + 
des_entries[i].stride = descriptor_props[desc[i].type].struct_size; + } + + s->desc_template_info = av_realloc_array(s->desc_template_info, + sizeof(*s->desc_template_info), + s->descriptor_sets_num + 1); + if (!s->desc_template_info) + return AVERROR(ENOMEM); + + dt = &s->desc_template_info[s->descriptor_sets_num]; + memset(dt, 0, sizeof(*dt)); + + dt->sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO; + dt->templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET; + dt->descriptorSetLayout = *layout; + dt->pDescriptorUpdateEntries = des_entries; + dt->descriptorUpdateEntryCount = num; + } + + s->descriptor_sets_num++; + +print: + /* Write shader info */ + for (int i = 0; i < num; i++) { + const struct descriptor_props *prop = &descriptor_props[desc[i].type]; + GLSLA("layout (set = %i, binding = %i", s->descriptor_sets_num - 1, i); + + if (desc[i].mem_layout) + GLSLA(", %s", desc[i].mem_layout); + GLSLA(")"); + + if (prop->is_uniform) + GLSLA(" uniform"); + + if (prop->mem_quali && desc[i].mem_quali) + GLSLA(" %s", desc[i].mem_quali); + + if (prop->type) + GLSLA(" %s", prop->type); + + if (prop->dim_needed) + GLSLA("%iD", desc[i].dimensions); + + GLSLA(" %s", desc[i].name); + + if (prop->buf_content) + GLSLA(" {\n %s\n}", desc[i].buf_content); + else if (desc[i].elems > 0) + GLSLA("[%i]", desc[i].elems); + + GLSLA(";\n"); + } + + return 0; +} + +void ff_vk_update_descriptor_set(AVFilterContext *avctx, int set_id) +{ + VulkanFilterContext *s = avctx->priv; + + VK_LOAD_PFN(s->hwctx->inst, vkUpdateDescriptorSetWithTemplateKHR); + pfn_vkUpdateDescriptorSetWithTemplateKHR(s->hwctx->act_dev, + s->desc_set[set_id], + s->desc_template[set_id], s); +} + +const enum VkImageAspectFlagBits ff_vk_aspect_flags(enum AVPixelFormat pixfmt, + int plane) +{ + const int tot_planes = av_pix_fmt_count_planes(pixfmt); + static const enum VkImageAspectFlagBits m[] = { VK_IMAGE_ASPECT_PLANE_0_BIT, + VK_IMAGE_ASPECT_PLANE_1_BIT, + VK_IMAGE_ASPECT_PLANE_2_BIT, }; + if 
(!tot_planes || (plane >= tot_planes)) + return 0; + if (tot_planes == 1) + return VK_IMAGE_ASPECT_COLOR_BIT; + if (plane < 0) + return m[0] | m[1] | (tot_planes > 2 ? m[2] : 0); + return m[plane]; +} + +const VkFormat ff_vk_plane_rep_fmt(enum AVPixelFormat pixfmt, int plane) +{ + const int tot_planes = av_pix_fmt_count_planes(pixfmt); + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pixfmt); + const int high = desc->comp[plane].depth > 8; + if (tot_planes == 1) { /* RGB, etc.'s singleplane rep is itself */ + return av_vkfmt_from_pixfmt(pixfmt); + } else if (tot_planes == 2) { /* Must be NV12 or P010 */ + if (!high) + return !plane ? VK_FORMAT_R8_UNORM : VK_FORMAT_R8G8_UNORM; + else + return !plane ? VK_FORMAT_R16_UNORM : VK_FORMAT_R16G16_UNORM; + } else { /* Regular planar YUV */ + return !high ? VK_FORMAT_R8_UNORM : VK_FORMAT_R16_UNORM; + } +} + +const char *ff_vk_shader_rep_fmt(enum AVPixelFormat pixfmt) +{ + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pixfmt); + const int high = desc->comp[0].depth > 8; + return high ? 
"rgba16f" : "rgba8"; +} + +int ff_vk_create_imageview(AVFilterContext *avctx, VkImageView *v, AVVkFrame *f, + VkFormat fmt, enum VkImageAspectFlagBits aspect, + VkComponentMapping map, const void *pnext) +{ + VulkanFilterContext *s = avctx->priv; + VkImageViewCreateInfo imgview_spawn = { + .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO, + .pNext = pnext, + .image = f->img, + .viewType = VK_IMAGE_VIEW_TYPE_2D, + .format = fmt, + .components = map, + .subresourceRange = { + .aspectMask = aspect, + .baseMipLevel = 0, + .levelCount = 1, + .baseArrayLayer = 0, + .layerCount = 1, + }, + }; + + VkResult ret = vkCreateImageView(s->hwctx->act_dev, &imgview_spawn, + s->hwctx->alloc, v); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to create imageview: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +void ff_vk_destroy_imageview(AVFilterContext *avctx, VkImageView v) +{ + VulkanFilterContext *s = avctx->priv; + vkDestroyImageView(s->hwctx->act_dev, v, s->hwctx->alloc); +} + +int ff_vk_init_pipeline_layout(AVFilterContext *avctx) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + { /* Init descriptor set pool */ + VkDescriptorPoolCreateInfo pool_create_info = { + .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO, + .poolSizeCount = s->pool_size_desc_num, + .pPoolSizes = s->pool_size_desc, + .maxSets = s->descriptor_sets_num, + }; + + ret = vkCreateDescriptorPool(s->hwctx->act_dev, &pool_create_info, + s->hwctx->alloc, &s->desc_pool); + av_freep(&s->pool_size_desc); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set " + "pool: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Allocate descriptor sets */ + VkDescriptorSetAllocateInfo alloc_info = { + .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO, + .descriptorPool = s->desc_pool, + .descriptorSetCount = s->descriptor_sets_num, + .pSetLayouts = s->desc_layout, + }; + + s->desc_set = 
av_malloc(s->descriptor_sets_num*sizeof(*s->desc_set)); + if (!s->desc_set) + return AVERROR(ENOMEM); + + ret = vkAllocateDescriptorSets(s->hwctx->act_dev, &alloc_info, + s->desc_set); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to allocate descriptor set: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Finally create the pipeline layout */ + VkPipelineLayoutCreateInfo spawn_pipeline_layout = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO, + .setLayoutCount = s->descriptor_sets_num, + .pSetLayouts = s->desc_layout, + .pushConstantRangeCount = s->push_consts_num, + .pPushConstantRanges = s->push_consts, + }; + + ret = vkCreatePipelineLayout(s->hwctx->act_dev, &spawn_pipeline_layout, + s->hwctx->alloc, &s->pipeline_layout); + av_freep(&s->push_consts); + s->push_consts_num = 0; + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init pipeline layout: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Descriptor template (for tightly packed descriptors) */ + VK_LOAD_PFN(s->hwctx->inst, vkCreateDescriptorUpdateTemplateKHR); + VkDescriptorUpdateTemplateCreateInfo *desc_template_info; + + s->desc_template = av_malloc(s->descriptor_sets_num*sizeof(*s->desc_template)); + if (!s->desc_template) + return AVERROR(ENOMEM); + + /* Create update templates for the descriptor sets */ + for (int i = 0; i < s->descriptor_sets_num; i++) { + desc_template_info = &s->desc_template_info[i]; + desc_template_info->pipelineLayout = s->pipeline_layout; + ret = pfn_vkCreateDescriptorUpdateTemplateKHR(s->hwctx->act_dev, + desc_template_info, + s->hwctx->alloc, + &s->desc_template[i]); + av_free((void *)desc_template_info->pDescriptorUpdateEntries); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor " + "template: %s\n", ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + av_freep(&s->desc_template_info); + } + + return 0; +} + +int 
ff_vk_init_compute_pipeline(AVFilterContext *avctx) +{ + int i; + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + VkComputePipelineCreateInfo pipe = { + .sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO, + .layout = s->pipeline_layout, + }; + + for (i = 0; i < s->shaders_num; i++) { + if (s->shaders[i].shader.stage & VK_SHADER_STAGE_COMPUTE_BIT) { + pipe.stage = s->shaders[i].shader; + break; + } + } + if (i == s->shaders_num) { + av_log(avctx, AV_LOG_ERROR, "Can't init compute pipeline, no shader\n"); + return AVERROR(EINVAL); + } + + ret = vkCreateComputePipelines(s->hwctx->act_dev, VK_NULL_HANDLE, 1, &pipe, + s->hwctx->alloc, &s->pipeline); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init compute pipeline: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} diff --git a/libavfilter/vulkan.h b/libavfilter/vulkan.h new file mode 100644 index 0000000000..cac06f6920 --- /dev/null +++ b/libavfilter/vulkan.h @@ -0,0 +1,223 @@ +/* + * Vulkan utilities + * Copyright (c) 2018 Rostislav Pehlivanov + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFILTER_VULKAN_COMMON_H +#define AVFILTER_VULKAN_COMMON_H + +#include "avfilter.h" +#include "libavutil/pixdesc.h" +#include "libavutil/bprint.h" +#include "libavutil/hwcontext.h" +#include "libavutil/hwcontext_vulkan.h" + +#include <shaderc/shaderc.h> + +/* GLSL management macros */ +#define INDENT(N) INDENT_##N +#define INDENT_0 +#define INDENT_1 INDENT_0 " " +#define INDENT_2 INDENT_1 INDENT_1 +#define INDENT_3 INDENT_2 INDENT_1 +#define INDENT_4 INDENT_3 INDENT_1 +#define INDENT_5 INDENT_4 INDENT_1 +#define INDENT_6 INDENT_5 INDENT_1 +#define C(N, S) INDENT(N) #S "\n" +#define GLSLC(N, S) av_bprintf(&shd->src, C(N, S)) +#define GLSLA(...) av_bprintf(&shd->src, __VA_ARGS__) +#define GLSLF(N, S, ...) av_bprintf(&shd->src, C(N, S), __VA_ARGS__) +#define GLSLD(D) GLSLC(0, ); \ + av_bprint_append_data(&shd->src, D, strlen(D)); \ + GLSLC(0, ) + +/* Helper, pretty much every Vulkan return value needs to be checked */ +#define RET(x) \ + do { \ + if ((err = (x)) < 0) \ + goto fail; \ + } while (0) + +/* Useful for attaching immutable samplers to arrays */ +#define DUP_SAMPLER_ARRAY4(x) (const VkSampler []){ x, x, x, x, } + +typedef struct SPIRVShader { + const char *name; /* Name for id/debugging purposes */ + AVBPrint src; + int local_size[3]; /* Compute shader workgroup sizes */ + VkPipelineShaderStageCreateInfo shader; +} SPIRVShader; + +typedef struct VulkanDescriptorSetBinding { + const char *name; + VkDescriptorType type; + const char *mem_layout; /* Storage images (rgba8, etc.) and buffers (std430, etc.) */ + const char *mem_quali; /* readonly, writeonly, etc. */ + const char *buf_content; /* For buffers */ + uint32_t dimensions; /* Needed for e.g. 
sampler%iD */ + uint32_t elems; /* 0 - scalar, 1 or more - vector */ + VkShaderStageFlags stages; + const VkSampler *samplers; /* Immutable samplers, length - #elems */ + void *updater; +} VulkanDescriptorSetBinding; + +typedef struct VulkanSampler { + VkSampler sampler; + VkSamplerYcbcrConversionInfo yuv_conv; /* For imageview creation */ + int converting; /* Indicates whether sampler is a converting one */ +} VulkanSampler; + +typedef struct FFVkExecContext { + VkCommandPool pool; + VkCommandBuffer buf; + VkQueue queue; + VkFence fence; +} FFVkExecContext; + +typedef struct FFVkBuffer { + VkBuffer buf; + VkDeviceMemory mem; + VkMemoryPropertyFlagBits flags; +} FFVkBuffer; + +typedef struct VulkanFilterContext { + const AVClass *class; + + AVBufferRef *device_ref; + AVBufferRef *frames_ref; /* For in-place filtering */ + AVHWDeviceContext *device; + AVVulkanDeviceContext *hwctx; + + /* Properties */ + int output_width; + int output_height; + enum AVPixelFormat output_format; + enum AVPixelFormat input_format; + + /* Samplers */ + VulkanSampler *samplers; + int samplers_num; + + /* Shaders */ + SPIRVShader *shaders; + int shaders_num; + shaderc_compiler_t sc_compiler; + shaderc_compile_options_t sc_opts; + + /* Contexts */ + VkRenderPass renderpass; + VkPipelineLayout pipeline_layout; + VkPipeline pipeline; + + /* Descriptors */ + VkDescriptorSetLayout *desc_layout; + VkDescriptorPool desc_pool; + VkDescriptorSet *desc_set; + VkDescriptorUpdateTemplate *desc_template; + int push_consts_num; + int descriptor_sets_num; + int pool_size_desc_num; + + /* Vertex buffer */ + FFVkBuffer vbuffer; + int num_verts; + + /* Temporary, used to store data in between initialization stages */ + VkDescriptorUpdateTemplateCreateInfo *desc_template_info; + VkDescriptorPoolSize *pool_size_desc; + VkPushConstantRange *push_consts; + void *scratch; /* Scratch memory used only in functions */ + unsigned int scratch_size; +} VulkanFilterContext; + +/* Generic memory allocation. 
+ * Will align size to the minimum map alignment requirement in case req_flags + * has VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT set */ +int ff_vk_alloc_mem(AVFilterContext *avctx, VkMemoryRequirements *req, + VkMemoryPropertyFlagBits req_flags, void *alloc_extension, + VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory *mem); + +/* Buffer I/O */ +int ff_vk_create_buf(AVFilterContext *avctx, FFVkBuffer *buf, size_t size, + VkBufferUsageFlags usage, VkMemoryPropertyFlagBits flags); +int ff_vk_map_buffers(AVFilterContext *avctx, FFVkBuffer *buf, uint8_t *mem[], + int nb_buffers, int invalidate); +int ff_vk_unmap_buffers(AVFilterContext *avctx, FFVkBuffer *buf, int nb_buffers, + int flush); +void ff_vk_free_buf(AVFilterContext *avctx, FFVkBuffer *buf); + +/* Command context init/uninit */ +int ff_vk_create_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e, int queue); +void ff_vk_free_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e); + +/* Converts Vulkan return values to strings */ +const char *ff_vk_ret2str(VkResult res); + +/* Create a Vulkan sampler, if input isn't NULL the sampler will convert to RGB */ +const VulkanSampler *ff_vk_init_sampler(AVFilterContext *avctx, AVFrame *input, + int unnorm_coords, VkFilter filt); + +/* Gets the single-plane representation format */ +const VkFormat ff_vk_plane_rep_fmt(enum AVPixelFormat pixfmt, int plane); +/* Gets the glsl format for an image */ +const char *ff_vk_shader_rep_fmt(enum AVPixelFormat pixfmt); +/* Gets the image aspect flags of a plane */ +const enum VkImageAspectFlagBits ff_vk_aspect_flags(enum AVPixelFormat pixfmt, + int plane); +/* Creates an imageview */ +int ff_vk_create_imageview(AVFilterContext *avctx, VkImageView *v, AVVkFrame *f, + VkFormat fmt, enum VkImageAspectFlagBits aspect, + VkComponentMapping map, const void *pnext); +/* Destroys an imageview */ +void ff_vk_destroy_imageview(AVFilterContext *avctx, VkImageView v); +/* Creates a shader */ +SPIRVShader *ff_vk_init_shader(AVFilterContext *avctx, 
const char *name, + VkShaderStageFlags stage); +/* For compute shaders, defines the workgroup size */ +void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, SPIRVShader *shd, + int local_size[3]); +/* Compiles a completed shader into a module */ +int ff_vk_compile_shader(AVFilterContext *avctx, SPIRVShader *shd, + const char *entry); + +/* Needs to be abstracted so it adds them to a certain pipeline layout */ +int ff_vk_add_descriptor_set(AVFilterContext *avctx, SPIRVShader *shd, + VulkanDescriptorSetBinding *desc, int num, + int only_print_to_shader); +int ff_vk_add_push_constant(AVFilterContext *avctx, int offset, int size, + VkShaderStageFlagBits stage); + +/* Creates a Vulkan pipeline layout */ +int ff_vk_init_pipeline_layout(AVFilterContext *avctx); + +/* Creates a compute pipeline */ +int ff_vk_init_compute_pipeline(AVFilterContext *avctx); + +/* Updates a given descriptor set after pipeline initialization */ +void ff_vk_update_descriptor_set(AVFilterContext *avctx, int set_id); + +/* General lavfi IO functions */ +int ff_vk_filter_query_formats (AVFilterContext *avctx); +int ff_vk_filter_init (AVFilterContext *avctx); +int ff_vk_filter_config_input (AVFilterLink *inlink); +int ff_vk_filter_config_output (AVFilterLink *outlink); +int ff_vk_filter_config_output_inplace(AVFilterLink *outlink); +void ff_vk_filter_uninit (AVFilterContext *avctx); + +#endif /* AVFILTER_VULKAN_COMMON_H */
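For readers following the GLSL management macros in vulkan.h, here is a minimal, self-contained sketch of how INDENT/C/GLSLC/GLSLF build shader source through stringification. It is not part of the patch. TextBuf, buf_printf, Shader and build_shader are hypothetical stand-ins for AVBPrint, av_bprintf and SPIRVShader, and the four-space indent unit is an assumption, since the whitespace inside INDENT_1 is not recoverable from the text as shown.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for AVBPrint: a fixed-size text buffer. */
typedef struct TextBuf {
    char   data[512];
    size_t len;
} TextBuf;

/* Hypothetical stand-in for SPIRVShader: only the source buffer is kept. */
typedef struct Shader {
    TextBuf src;
} Shader;

/* Hypothetical stand-in for av_bprintf(): appends formatted text and
 * truncates silently once the buffer is full. */
static void buf_printf(TextBuf *b, const char *fmt, ...)
{
    va_list ap;
    size_t room;
    int n;

    assert(b->len < sizeof(b->data));
    room = sizeof(b->data) - b->len;

    va_start(ap, fmt);
    n = vsnprintf(b->data + b->len, room, fmt, ap);
    va_end(ap);

    if (n < 0)
        return;
    b->len += (size_t)n < room ? (size_t)n : room - 1;
}

/* Same shape as the macros in the patch; the indent width is assumed. */
#define INDENT(N) INDENT_##N
#define INDENT_0
#define INDENT_1 INDENT_0 "    "
#define INDENT_2 INDENT_1 INDENT_1
#define C(N, S)          INDENT(N) #S "\n"
#define GLSLC(N, S)      buf_printf(&shd->src, C(N, S))
#define GLSLF(N, S, ...) buf_printf(&shd->src, C(N, S), __VA_ARGS__)

/* Emits a tiny GLSL body. Each S argument is stringified by the
 * preprocessor, so the GLSL is written unquoted at the call site. */
static const char *build_shader(Shader *shd, int width)
{
    shd->src.len     = 0;
    shd->src.data[0] = '\0';

    GLSLC(0, void main()                          );
    GLSLC(0, {                                    );
    GLSLF(1, const int width = %i;,          width);
    GLSLC(1, if (gl_GlobalInvocationID.x < width) );
    GLSLC(2, process();                           );
    GLSLC(0, }                                    );

    return shd->src.data;
}
```

Calling build_shader(&shd, 64) on a local Shader and printing the result with fputs() shows the generated text. Two caveats follow from the macro design: a top-level comma in S splits the argument (commas inside parentheses are safe), and any '%' in S is treated as a format specifier because S becomes the format string.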