From patchwork Fri Mar 30 03:14:33 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rostislav Pehlivanov
X-Patchwork-Id: 8232
From: Rostislav Pehlivanov
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 30 Mar 2018 04:14:33 +0100
Message-Id: <20180330031434.22245-3-atomnuker@gmail.com>
X-Mailer: git-send-email 2.16.3
In-Reply-To: <20180330031434.22245-1-atomnuker@gmail.com>
References: <20180330031434.22245-1-atomnuker@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/3] lavfi: add common Vulkan filtering code

This commit adds common code for use in Vulkan filters. It attempts to reduce the burden of writing Vulkan image filters to a minimum, which is pretty much a requirement considering how verbose the API is.

It supports both compute and graphics pipelines and abstracts the API to such a level that there is no need to call any Vulkan functions inside the init path of the code. Handling shader descriptors is probably the bulk of the code, and despite the abstraction, it loses none of the features for describing shader IO.

In order to produce linkable shaders, it depends on the libshaderc library (and on the latest stable version of it). This allows for greater performance and flexibility than static built-in shaders and also eliminates the cumbersome process of interfacing with glslang to compile GLSL to SPIR-V.

It is based on the common OpenCL filtering code and provides similar interfaces for filter pad init and config, with the addition that it also supports in-place filtering.

Signed-off-by: Rostislav Pehlivanov
---
 configure            |    3 +
 libavfilter/vulkan.c | 1101 ++++++++++++++++++++++++++++++++++++++++++++++++++
 libavfilter/vulkan.h |  179 ++++++++
 3 files changed, 1283 insertions(+)
 create mode 100644 libavfilter/vulkan.c
 create mode 100644 libavfilter/vulkan.h

diff --git a/configure b/configure
index 2213f0452d..3621b5cdeb 100755
--- a/configure
+++ b/configure
@@ -252,6 +252,7 @@ External library support:
   --enable-librsvg         enable SVG rasterization via librsvg [no]
   --enable-librubberband   enable rubberband needed for rubberband filter [no]
   --enable-librtmp         enable RTMP[E] support via librtmp [no]
+  --enable-libshaderc      enable GLSL->SPIRV compilation via libshaderc [no]
   --enable-libshine        enable fixed-point MP3 encoding via libshine [no]
   --enable-libsmbclient    enable Samba protocol via libsmbclient [no]
   --enable-libsnappy       enable Snappy compression, needed for hap encoding [no]
@@ -1702,6 +1703,7 @@ EXTERNAL_LIBRARY_LIST="
     libpulse
     librsvg
     librtmp
+    libshaderc
     libshine
     libsmbclient
     libsnappy
@@ -6020,6 +6022,7 @@ enabled libpulse && require_pkg_config libpulse libpulse pulse/pulseaud
 enabled librsvg && require_pkg_config librsvg librsvg-2.0 librsvg-2.0/librsvg/rsvg.h
rsvg_handle_render_cairo
 enabled librtmp && require_pkg_config librtmp librtmp librtmp/rtmp.h RTMP_Socket
 enabled librubberband && require_pkg_config librubberband "rubberband >= 1.8.1" rubberband/rubberband-c.h rubberband_new -lstdc++ && append librubberband_extralibs "-lstdc++"
+enabled libshaderc && require libshaderc shaderc/shaderc.h shaderc_compiler_initialize -lshaderc_shared
 enabled libshine && require_pkg_config libshine shine shine/layer3.h shine_encode_buffer
 enabled libsmbclient && { check_pkg_config libsmbclient smbclient libsmbclient.h smbc_init ||
                           require libsmbclient libsmbclient.h smbc_init -lsmbclient; }
diff --git a/libavfilter/vulkan.c b/libavfilter/vulkan.c
new file mode 100644
index 0000000000..c2e02f5d0a
--- /dev/null
+++ b/libavfilter/vulkan.c
@@ -0,0 +1,1101 @@
+/*
+ * Vulkan utilities
+ * Copyright (c) 2018 Rostislav Pehlivanov
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "formats.h"
+#include "vulkan.h"
+
+int ff_vk_filter_query_formats(AVFilterContext *avctx)
+{
+    static const enum AVPixelFormat pixel_formats[] = {
+        AV_PIX_FMT_VULKAN, AV_PIX_FMT_NONE,
+    };
+    AVFilterFormats *pix_fmts = ff_make_format_list(pixel_formats);
+    if (!pix_fmts)
+        return AVERROR(ENOMEM);
+
+    return ff_set_common_formats(avctx, pix_fmts);
+}
+
+static int vulkan_filter_set_device(AVFilterContext *avctx,
+                                    AVBufferRef *device)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    av_buffer_unref(&s->device_ref);
+
+    s->device_ref = av_buffer_ref(device);
+    if (!s->device_ref)
+        return AVERROR(ENOMEM);
+
+    s->device = (AVHWDeviceContext*)s->device_ref->data;
+    s->hwctx = s->device->hwctx;
+
+    return 0;
+}
+
+static int vulkan_filter_set_frames(AVFilterContext *avctx,
+                                    AVBufferRef *frames)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    av_buffer_unref(&s->frames_ref);
+
+    s->frames_ref = av_buffer_ref(frames);
+    if (!s->frames_ref)
+        return AVERROR(ENOMEM);
+
+    return 0;
+}
+
+int ff_vk_filter_config_input(AVFilterLink *inlink)
+{
+    int err;
+    AVFilterContext *avctx = inlink->dst;
+    VulkanFilterContext *s = avctx->priv;
+    AVHWFramesContext *input_frames;
+
+    if (!inlink->hw_frames_ctx) {
+        av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a "
+               "hardware frames context on the input.\n");
+        return AVERROR(EINVAL);
+    }
+
+    /* Extract the device and default output format from the first input.
*/
+    if (avctx->inputs[0] != inlink)
+        return 0;
+
+    input_frames = (AVHWFramesContext*)inlink->hw_frames_ctx->data;
+    if (input_frames->format != AV_PIX_FMT_VULKAN)
+        return AVERROR(EINVAL);
+
+    err = vulkan_filter_set_device(avctx, input_frames->device_ref);
+    if (err < 0)
+        return err;
+    err = vulkan_filter_set_frames(avctx, inlink->hw_frames_ctx);
+    if (err < 0)
+        return err;
+
+    /* Default output parameters match input parameters. */
+    s->input_format = input_frames->sw_format;
+    if (s->output_format == AV_PIX_FMT_NONE)
+        s->output_format = input_frames->sw_format;
+    if (!s->output_width)
+        s->output_width = inlink->w;
+    if (!s->output_height)
+        s->output_height = inlink->h;
+
+    return 0;
+}
+
+int ff_vk_filter_config_output_inplace(AVFilterLink *outlink)
+{
+    int err;
+    AVFilterContext *avctx = outlink->src;
+    VulkanFilterContext *s = avctx->priv;
+
+    av_buffer_unref(&outlink->hw_frames_ctx);
+
+    if (!s->device_ref) {
+        if (!avctx->hw_device_ctx) {
+            av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a "
+                   "Vulkan device.\n");
+            return AVERROR(EINVAL);
+        }
+
+        err = vulkan_filter_set_device(avctx, avctx->hw_device_ctx);
+        if (err < 0)
+            return err;
+    }
+
+    outlink->hw_frames_ctx = av_buffer_ref(s->frames_ref);
+    if (!outlink->hw_frames_ctx)
+        return AVERROR(ENOMEM);
+    outlink->w = s->output_width;
+    outlink->h = s->output_height;
+
+    return 0;
+}
+
+int ff_vk_filter_config_output(AVFilterLink *outlink)
+{
+    int err;
+    AVFilterContext *avctx = outlink->src;
+    VulkanFilterContext *s = avctx->priv;
+    AVBufferRef *output_frames_ref;
+    AVHWFramesContext *output_frames;
+
+    av_buffer_unref(&outlink->hw_frames_ctx);
+
+    if (!s->device_ref) {
+        if (!avctx->hw_device_ctx) {
+            av_log(avctx, AV_LOG_ERROR, "Vulkan filtering requires a "
+                   "Vulkan device.\n");
+            return AVERROR(EINVAL);
+        }
+
+        err = vulkan_filter_set_device(avctx, avctx->hw_device_ctx);
+        if (err < 0)
+            return err;
+    }
+
+    output_frames_ref = av_hwframe_ctx_alloc(s->device_ref);
+    if (!output_frames_ref) {
+        err = AVERROR(ENOMEM);
+        goto fail;
+    }
+
    output_frames = (AVHWFramesContext*)output_frames_ref->data;
+
+    output_frames->format = AV_PIX_FMT_VULKAN;
+    output_frames->sw_format = s->output_format;
+    output_frames->width = s->output_width;
+    output_frames->height = s->output_height;
+
+    err = av_hwframe_ctx_init(output_frames_ref);
+    if (err < 0) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to initialise output "
+               "frames: %d.\n", err);
+        goto fail;
+    }
+
+    outlink->hw_frames_ctx = output_frames_ref;
+    outlink->w = s->output_width;
+    outlink->h = s->output_height;
+
+    return 0;
+fail:
+    av_buffer_unref(&output_frames_ref);
+    return err;
+}
+
+int ff_vk_filter_init(AVFilterContext *avctx)
+{
+    VulkanFilterContext *s = avctx->priv;
+    const shaderc_env_version opt_ver = shaderc_env_version_vulkan_1_1;
+    const shaderc_optimization_level opt_lvl = shaderc_optimization_level_size;
+
+    s->output_format = AV_PIX_FMT_NONE;
+
+    s->sc_compiler = shaderc_compiler_initialize();
+    if (!s->sc_compiler)
+        return AVERROR_EXTERNAL;
+
+    s->sc_opts = shaderc_compile_options_initialize();
+    if (!s->sc_opts)
+        return AVERROR_EXTERNAL;
+
+    shaderc_compile_options_set_target_env(s->sc_opts,
+                                           shaderc_target_env_vulkan,
+                                           opt_ver);
+    shaderc_compile_options_set_optimization_level(s->sc_opts, opt_lvl);
+
+    return 0;
+}
+
+void ff_vk_filter_uninit(AVFilterContext *avctx)
+{
+    int i;
+    VulkanFilterContext *s = avctx->priv;
+    VK_LOAD_PFN(s->hwctx->inst, vkDestroyDescriptorUpdateTemplateKHR);
+    VK_LOAD_PFN(s->hwctx->inst, vkDestroySamplerYcbcrConversionKHR);
+
+    shaderc_compile_options_release(s->sc_opts);
+    shaderc_compiler_release(s->sc_compiler);
+
+    for (i = 0; i < s->shaders_num; i++) {
+        SPIRVShader *shd = &s->shaders[i];
+        vkDestroyShaderModule(s->hwctx->act_dev, shd->shader.module, NULL);
+    }
+
+    vkDestroyPipeline(s->hwctx->act_dev, s->pipeline, NULL);
+    vkDestroyPipelineLayout(s->hwctx->act_dev, s->pipeline_layout, NULL);
+
+    vkDestroySampler(s->hwctx->act_dev, s->sampler, NULL);
+
    pfn_vkDestroySamplerYcbcrConversionKHR(s->hwctx->act_dev, s->yuv_sampler.conversion, NULL);
+
+    av_vk_free_buf(s->device, &s->vbuffer);
+
+    for (i = 0; i < s->descriptor_sets_num; i++) {
+        pfn_vkDestroyDescriptorUpdateTemplateKHR(s->hwctx->act_dev, s->desc_template[i], NULL);
+        vkDestroyDescriptorSetLayout(s->hwctx->act_dev, s->desc_layout[i], NULL);
+    }
+
+    vkDestroyDescriptorPool(s->hwctx->act_dev, s->desc_pool, NULL);
+    vkDestroyRenderPass(s->hwctx->act_dev, s->renderpass, NULL);
+
+    av_freep(&s->desc_layout);
+    av_freep(&s->desc_set);
+    av_freep(&s->desc_template);
+    av_freep(&s->shaders);
+    av_buffer_unref(&s->device_ref);
+    av_buffer_unref(&s->frames_ref);
+
+    /* Only freed in case of failure */
+    av_freep(&s->push_consts);
+    av_freep(&s->pool_size_desc);
+    if (s->desc_template_info) {
+        for (i = 0; i < s->descriptor_sets_num; i++)
+            av_free((void *)s->desc_template_info[i].pDescriptorUpdateEntries);
+        av_freep(&s->desc_template_info);
+    }
+}
+
+SPIRVShader *ff_vk_init_shader(AVFilterContext *avctx, const char *name,
+                               VkShaderStageFlags stage)
+{
+    SPIRVShader *shd;
+    VulkanFilterContext *s = avctx->priv;
+
+    s->shaders = av_realloc_array(s->shaders, sizeof(*s->shaders),
+                                  s->shaders_num + 1);
+    if (!s->shaders)
+        return NULL;
+
+    shd = &s->shaders[s->shaders_num++];
+    memset(shd, 0, sizeof(*shd));
+    av_bprint_init(&shd->src, 0, AV_BPRINT_SIZE_UNLIMITED);
+
+    shd->shader.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
+    shd->shader.stage = stage;
+
+    shd->name = name;
+
+    GLSLF(0, #version %i, 460);
+    GLSLC(0, #define AREA(v) ((v).x*(v).y));
+    GLSLC(0, #define COPY_IMG(d, s, p) imageStore(d, (p), imageLoad(s, (p))));
+    GLSLC(0, );
+
+    return shd;
+}
+
+void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, SPIRVShader *shd,
+                                    int local_size[3])
+{
+    shd->local_size[0] = local_size[0];
+    shd->local_size[1] = local_size[1];
+    shd->local_size[2] = local_size[2];
+
+    av_bprintf(&shd->src, "layout (local_size_x = %i, "
+               "local_size_y = %i, local_size_z = %i) in;\n",
+
               shd->local_size[0], shd->local_size[1], shd->local_size[2]);
+}
+
+static void print_shader(AVFilterContext *avctx, SPIRVShader *shd)
+{
+    int i;
+    int line = 0;
+    const char *p = shd->src.str;
+    const char *start = p;
+
+    AVBPrint buf;
+    av_bprint_init(&buf, 0, AV_BPRINT_SIZE_UNLIMITED);
+
+    for (i = 0; i < strlen(p); i++) {
+        if (p[i] == '\n') {
+            av_bprintf(&buf, "%i\t", ++line);
+            av_bprint_append_data(&buf, start, &p[i] - start + 1);
+            start = &p[i + 1];
+        }
+    }
+
+    av_log(avctx, AV_LOG_WARNING, "Compiling shader %s: \n%s\n",
+           shd->name, buf.str);
+    av_bprint_finalize(&buf, NULL);
+}
+
+int ff_vk_compile_shader(AVFilterContext *avctx, SPIRVShader *shd,
+                         const char *entry)
+{
+    VkResult ret;
+    VulkanFilterContext *s = avctx->priv;
+    VkShaderModuleCreateInfo shader_create;
+
+    shaderc_compilation_result_t res;
+    static const shaderc_shader_kind type_map[] = {
+        [VK_SHADER_STAGE_VERTEX_BIT] = shaderc_vertex_shader,
+        [VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT] = shaderc_tess_control_shader,
+        [VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT] = shaderc_tess_evaluation_shader,
+        [VK_SHADER_STAGE_GEOMETRY_BIT] = shaderc_geometry_shader,
+        [VK_SHADER_STAGE_FRAGMENT_BIT] = shaderc_fragment_shader,
+        [VK_SHADER_STAGE_COMPUTE_BIT] = shaderc_compute_shader,
+    };
+
+    shd->shader.pName = entry;
+
+    print_shader(avctx, shd);
+
+    res = shaderc_compile_into_spv(s->sc_compiler, shd->src.str, shd->src.len,
+                                   type_map[shd->shader.stage], shd->name,
+                                   entry, s->sc_opts);
+    av_bprint_finalize(&shd->src, NULL);
+
+    if (shaderc_result_get_compilation_status(res) !=
+        shaderc_compilation_status_success) {
+        av_log(avctx, AV_LOG_ERROR, "%s", shaderc_result_get_error_message(res));
+        shaderc_result_release(res);
+        return AVERROR_EXTERNAL;
+    }
+
+    shader_create.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
+    shader_create.pNext = NULL;
+    shader_create.codeSize = shaderc_result_get_length(res);
+    shader_create.flags = 0;
+    shader_create.pCode = (const uint32_t *)shaderc_result_get_bytes(res);
+
+    ret =
          vkCreateShaderModule(s->hwctx->act_dev, &shader_create, NULL,
+                               &shd->shader.module);
+    shaderc_result_release(res);
+    if (ret != VK_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "Unable to create shader module: %s\n",
+               av_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    av_log(avctx, AV_LOG_WARNING, "Shader linked! Size: %zu bytes\n",
+           shader_create.codeSize);
+
+    return 0;
+}
+
+int ff_vk_init_renderpass(AVFilterContext *avctx)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    VkAttachmentDescription rpass_att[] = {
+        {
+            .format = av_vkfmt_from_pixfmt(s->output_format),
+            .samples = VK_SAMPLE_COUNT_1_BIT,
+            .loadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
+            .storeOp = VK_ATTACHMENT_STORE_OP_STORE,
+            .stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
+            .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
+            .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
+            .finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
+        },
+    };
+
+    VkSubpassDescription rpass_sub_desc[] = {
+        {
+            .pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
+            .colorAttachmentCount = 1,
+            .pColorAttachments = (VkAttachmentReference[]) {
+                { 0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL },
+            },
+            .pDepthStencilAttachment = NULL,
+            .preserveAttachmentCount = 0,
+        }
+    };
+
+    VkRenderPassCreateInfo renderpass_spawn = {
+        .sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
+        .pAttachments = rpass_att,
+        .attachmentCount = FF_ARRAY_ELEMS(rpass_att),
+        .pSubpasses = rpass_sub_desc,
+        .subpassCount = FF_ARRAY_ELEMS(rpass_sub_desc),
+    };
+
+    VkResult ret = vkCreateRenderPass(s->hwctx->act_dev, &renderpass_spawn,
+                                      NULL, &s->renderpass);
+    if (ret != VK_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "Renderpass init failure: %s\n", av_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    return 0;
+}
+
+static VkSamplerYcbcrModelConversion conv_primaries(enum AVColorPrimaries color_primaries)
+{
+    switch(color_primaries) {
+    case AVCOL_PRI_BT470BG:
+        return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601;
+    case AVCOL_PRI_BT709:
+        return
               VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709;
+    case AVCOL_PRI_BT2020:
+        return VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020;
+    }
+    return VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY;
+}
+
+int ff_vk_init_sampler(AVFilterContext *avctx, AVFrame *input)
+{
+    VkResult ret;
+    VulkanFilterContext *s = avctx->priv;
+    VkSamplerYcbcrConversionCreateInfo c_info = { 0 };
+
+    if (input) {
+        c_info.sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO;
+        c_info.format = av_vkfmt_from_pixfmt(s->input_format);
+        c_info.chromaFilter = VK_FILTER_LINEAR;
+        c_info.ycbcrModel = conv_primaries(input->color_primaries);
+        c_info.ycbcrRange = input->color_range == AVCOL_RANGE_JPEG ?
+                            VK_SAMPLER_YCBCR_RANGE_ITU_FULL :
+                            VK_SAMPLER_YCBCR_RANGE_ITU_NARROW;
+        c_info.xChromaOffset = input->chroma_location == AVCHROMA_LOC_CENTER ?
+                               VK_CHROMA_LOCATION_MIDPOINT :
+                               VK_CHROMA_LOCATION_COSITED_EVEN;
+        c_info.forceExplicitReconstruction = 0;
+
+        VkComponentMapping comp_map = {
+            .r = VK_COMPONENT_SWIZZLE_IDENTITY,
+            .g = VK_COMPONENT_SWIZZLE_IDENTITY,
+            .b = VK_COMPONENT_SWIZZLE_IDENTITY,
+            .a = VK_COMPONENT_SWIZZLE_ONE,
+        };
+
+        c_info.components = comp_map;
+
+        VK_LOAD_PFN(s->hwctx->inst, vkCreateSamplerYcbcrConversionKHR);
+
+        s->yuv_sampler.sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_INFO;
+
+        ret = pfn_vkCreateSamplerYcbcrConversionKHR(s->hwctx->act_dev, &c_info,
+                                                    NULL, &s->yuv_sampler.conversion);
+        if (ret != VK_SUCCESS) {
+            av_log(avctx, AV_LOG_ERROR, "Unable to init conversion: %s\n", av_vk_ret2str(ret));
+            return AVERROR_EXTERNAL;
+        }
+    }
+
+    VkSamplerCreateInfo sampler = {
+        .sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO,
+        .pNext = input ?
                 &s->yuv_sampler : NULL,
+        .magFilter = VK_FILTER_LINEAR,
+        .minFilter = VK_FILTER_LINEAR,
+        .mipmapMode = VK_SAMPLER_MIPMAP_MODE_NEAREST,
+        .addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE,
+        .addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE,
+        .addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE,
+        .anisotropyEnable = VK_FALSE,
+        .compareOp = VK_COMPARE_OP_NEVER,
+        .borderColor = VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK,
+        .unnormalizedCoordinates = 1,
+    };
+
+    ret = vkCreateSampler(s->hwctx->act_dev, &sampler, NULL, &s->sampler);
+    if (ret != VK_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "Unable to init sampler: %s\n", av_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    return 0;
+}
+
+/* A 3x2 matrix, with the translation part separate. */
+struct transform {
+    /* row-major, e.g. in mathematical notation:
+     *  | m[0][0] m[0][1] |
+     *  | m[1][0] m[1][1] | */
+    float m[2][2];
+    float t[2];
+};
+
+/* Standard parallel 2D projection, except y1 < y0 means that the coordinate
+ * system is flipped, not the projection. */
+static inline void transform_ortho(struct transform *t, float x0, float x1,
+                                   float y0, float y1)
+{
+    if (y1 < y0) {
+        float tmp = y0;
+        y0 = tmp - y1;
+        y1 = tmp;
+    }
+
+    t->m[0][0] = 2.0f / (x1 - x0);
+    t->m[0][1] = 0.0f;
+    t->m[1][0] = 0.0f;
+    t->m[1][1] = 2.0f / (y1 - y0);
+    t->t[0] = -(x1 + x0) / (x1 - x0);
+    t->t[1] = -(y1 + y0) / (y1 - y0);
+}
+
+/* This applies t as an affine transformation, in other words t.t[n] gets
+ * added to the output.
*/
+static inline void transform_vec(struct transform t, float *x, float *y)
+{
+    float vx = *x, vy = *y;
+    *x = vx * t.m[0][0] + vy * t.m[0][1] + t.t[0];
+    *y = vx * t.m[1][0] + vy * t.m[1][1] + t.t[1];
+}
+
+/* Vertex buffer structure */
+struct vertex {
+    struct {
+        float x, y;
+    } position;
+    struct {
+        float x, y;
+    } texcoord[4];
+};
+
+int ff_vk_init_simple_vbuffer(AVFilterContext *avctx)
+{
+    struct vertex *va;
+    struct transform t;
+    VulkanFilterContext *s = avctx->priv;
+
+    int i, n, err, vp_w = s->output_width, vp_h = s->output_height;
+    float x[2] = { 0, vp_w };
+    float y[2] = { 0, vp_h };
+
+    s->num_verts = 4;
+
+    err = av_vk_create_buf(s->device, &s->vbuffer,
+                           sizeof(struct vertex)*s->num_verts,
+                           VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
+                           VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT,
+                           NULL, NULL);
+    if (err)
+        return err;
+
+    err = av_vk_map_buffers(s->device, &s->vbuffer, (uint8_t **)&va, 1, 0);
+    if (err)
+        return err;
+
+    transform_ortho(&t, 0, vp_w, 0, vp_h);
+    transform_vec(t, &x[0], &y[0]);
+    transform_vec(t, &x[1], &y[1]);
+
+    for (n = 0; n < s->num_verts; n++) {
+        struct vertex *v = &va[n];
+        v->position.x = x[n / 2];
+        v->position.y = y[n % 2];
+        for (i = 0; i < 4; i++) {
+            struct transform tr = { { { 0 } } };
+            float tx = (n / 2) * vp_w;
+            float ty = (n % 2) * vp_h;
+            tr.m[0][0] = 1.0f;
+            tr.m[1][1] = 1.0f;
+            transform_vec(tr, &tx, &ty);
+            v->texcoord[i].x = tx / vp_w;
+            v->texcoord[i].y = ty / vp_h;
+        }
+    }
+
+    err = av_vk_unmap_buffers(s->device, &s->vbuffer, 1, 1);
+    if (err)
+        return err;
+
+    return 0;
+}
+
+int ff_vk_add_push_constant(AVFilterContext *avctx, int offset, int size,
+                            VkShaderStageFlagBits stage)
+{
+    VkPushConstantRange *pc;
+    VulkanFilterContext *s = avctx->priv;
+
+    s->push_consts = av_realloc_array(s->push_consts, sizeof(*s->push_consts),
+                                      s->push_consts_num + 1);
+    if (!s->push_consts)
+        return AVERROR(ENOMEM);
+
+    pc = &s->push_consts[s->push_consts_num++];
+    memset(pc, 0, sizeof(*pc));
+
+    pc->stageFlags = stage;
+    pc->offset
               = offset;
+    pc->size = size;
+
+    return s->push_consts_num - 1;
+}
+
+static const struct descriptor_props {
+    size_t struct_size; /* Size of the opaque which updates the descriptor */
+    const char *type;
+    int is_uniform;
+    int mem_quali;      /* Can use a memory qualifier */
+    int dim_needed;     /* Must indicate dimension */
+    int buf_content;    /* Must indicate buffer contents */
+} descriptor_props[] = {
+    [VK_DESCRIPTOR_TYPE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 0, 0, },
+    [VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE] = { sizeof(VkDescriptorImageInfo), "texture", 1, 0, 1, 0, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_IMAGE] = { sizeof(VkDescriptorImageInfo), "image", 1, 1, 1, 0, },
+    [VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT] = { sizeof(VkDescriptorImageInfo), "subpassInput", 1, 0, 0, 0, },
+    [VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 1, 0, },
+    [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER] = { sizeof(VkBufferView), "samplerBuffer", 1, 0, 0, 0, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER] = { sizeof(VkBufferView), "imageBuffer", 1, 0, 0, 0, },
+};
+
+int ff_vk_add_descriptor_set(AVFilterContext *avctx, SPIRVShader *shd,
+                             VulkanDescriptorSetBinding *desc, int num,
+                             int only_print_to_shader)
+{
+    int i, j;
+    VkResult ret;
+    VkDescriptorSetLayout *layout;
+    VulkanFilterContext *s = avctx->priv;
+
+    if (only_print_to_shader)
+        goto print;
+
+    s->desc_layout = av_realloc_array(s->desc_layout, sizeof(*s->desc_layout),
+                                      s->descriptor_sets_num + 1);
+    if (!s->desc_layout)
+        return AVERROR(ENOMEM);
+
+    layout =
             &s->desc_layout[s->descriptor_sets_num];
+    memset(layout, 0, sizeof(*layout));
+
+    { /* Create descriptor set layout descriptions */
+        VkDescriptorSetLayoutCreateInfo desc_create_layout = { 0 };
+        VkDescriptorSetLayoutBinding *desc_binding;
+
+        desc_binding = av_mallocz(sizeof(*desc_binding)*num);
+        if (!desc_binding)
+            return AVERROR(ENOMEM);
+
+        for (i = 0; i < num; i++) {
+            desc_binding[i].binding = i;
+            desc_binding[i].descriptorType = desc[i].type;
+            desc_binding[i].descriptorCount = FFMAX(desc[i].elems, 1);
+            desc_binding[i].stageFlags = desc[i].stages;
+            desc_binding[i].pImmutableSamplers = desc[i].samplers;
+        }
+
+        desc_create_layout.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
+        desc_create_layout.pBindings = desc_binding;
+        desc_create_layout.bindingCount = num;
+
+        ret = vkCreateDescriptorSetLayout(s->hwctx->act_dev, &desc_create_layout,
+                                          NULL, layout);
+        av_free(desc_binding);
+        if (ret != VK_SUCCESS) {
+            av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set "
+                   "layout: %s\n", av_vk_ret2str(ret));
+            return AVERROR_EXTERNAL;
+        }
+    }
+
+    { /* Pool each descriptor by type and update pool counts */
+        for (i = 0; i < num; i++) {
+            for (j = 0; j < s->pool_size_desc_num; j++)
+                if (s->pool_size_desc[j].type == desc[i].type)
+                    break;
+            if (j >= s->pool_size_desc_num) {
+                s->pool_size_desc = av_realloc_array(s->pool_size_desc,
+                                                     sizeof(*s->pool_size_desc),
+                                                     ++s->pool_size_desc_num);
+                if (!s->pool_size_desc)
+                    return AVERROR(ENOMEM);
+                memset(&s->pool_size_desc[j], 0, sizeof(VkDescriptorPoolSize));
+            }
+            s->pool_size_desc[j].type = desc[i].type;
+            s->pool_size_desc[j].descriptorCount += FFMAX(desc[i].elems, 1);
+        }
+    }
+
+    { /* Create template creation struct */
+        VkDescriptorUpdateTemplateCreateInfo *dt;
+        VkDescriptorUpdateTemplateEntry *des_entries;
+
+        /* Freed after descriptor set initialization */
+        des_entries = av_mallocz(num*sizeof(VkDescriptorUpdateTemplateEntry));
+        if (!des_entries)
+            return AVERROR(ENOMEM);
+
+        for (i = 0; i
                      < num; i++) {
+            des_entries[i].dstBinding = i;
+            des_entries[i].descriptorType = desc[i].type;
+            des_entries[i].descriptorCount = FFMAX(desc[i].elems, 1);
+            des_entries[i].dstArrayElement = 0;
+            des_entries[i].offset = ((uint8_t *)desc[i].updater) - (uint8_t *)s;
+            des_entries[i].stride = descriptor_props[desc[i].type].struct_size;
+        }
+
+        s->desc_template_info = av_realloc_array(s->desc_template_info,
+                                                 sizeof(*s->desc_template_info),
+                                                 s->descriptor_sets_num + 1);
+        if (!s->desc_template_info)
+            return AVERROR(ENOMEM);
+
+        dt = &s->desc_template_info[s->descriptor_sets_num];
+        memset(dt, 0, sizeof(*dt));
+
+        dt->sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO;
+        dt->templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET;
+        dt->descriptorSetLayout = *layout;
+        dt->pDescriptorUpdateEntries = des_entries;
+        dt->descriptorUpdateEntryCount = num;
+    }
+
+    s->descriptor_sets_num++;
+
+print:
+    /* Write shader info */
+    for (i = 0; i < num; i++) {
+        const struct descriptor_props *prop = &descriptor_props[desc[i].type];
+        GLSLA("layout (set = %i, binding = %i", s->descriptor_sets_num - 1, i);
+
+        if (desc[i].mem_layout)
+            GLSLA(", %s", desc[i].mem_layout);
+        GLSLA(")");
+
+        if (prop->is_uniform)
+            GLSLA(" uniform");
+
+        if (prop->mem_quali && desc[i].mem_quali)
+            GLSLA(" %s", desc[i].mem_quali);
+
+        if (prop->type)
+            GLSLA(" %s", prop->type);
+
+        if (prop->dim_needed)
+            GLSLA("%iD", desc[i].dimensions);
+
+        GLSLA(" %s", desc[i].name);
+
+        if (prop->buf_content)
+            GLSLA(" {\n %s\n}", desc[i].buf_content);
+        else if (desc[i].elems > 0)
+            GLSLA("[%i]", desc[i].elems);
+
+        GLSLA(";\n");
+    }
+
+    return 0;
+}
+
+void ff_vk_update_descriptor_set(AVFilterContext *avctx, int set_id)
+{
+    VulkanFilterContext *s = avctx->priv;
+
+    VK_LOAD_PFN(s->hwctx->inst, vkUpdateDescriptorSetWithTemplateKHR);
+    pfn_vkUpdateDescriptorSetWithTemplateKHR(s->hwctx->act_dev,
+                                             s->desc_set[set_id],
+                                             s->desc_template[set_id], s);
+}
+
+const enum VkImageAspectFlagBits
ff_vk_aspect_flags(enum AVPixelFormat pixfmt,
+                                                      int plane)
+{
+    const int tot_planes = av_pix_fmt_count_planes(pixfmt);
+    static const enum VkImageAspectFlagBits m[] = { VK_IMAGE_ASPECT_PLANE_0_BIT,
+                                                    VK_IMAGE_ASPECT_PLANE_1_BIT,
+                                                    VK_IMAGE_ASPECT_PLANE_2_BIT, };
+    if (!tot_planes || (plane >= tot_planes))
+        return 0;
+    if (tot_planes == 1)
+        return VK_IMAGE_ASPECT_COLOR_BIT;
+    if (plane < 0)
+        return m[0] | m[1] | (tot_planes > 2 ? m[2] : 0);
+    return m[plane];
+}
+
+const VkFormat ff_vk_plane_rep_fmt(enum AVPixelFormat pixfmt, int plane)
+{
+    const int tot_planes = av_pix_fmt_count_planes(pixfmt);
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pixfmt);
+    const int high = desc->comp[plane].depth > 8;
+    if (tot_planes == 1) { /* RGB, etc.'s singleplane rep is itself */
+        return av_vkfmt_from_pixfmt(pixfmt);
+    } else if (tot_planes == 2) { /* Must be NV12 or P010 */
+        if (!high)
+            return !plane ? VK_FORMAT_R8_UNORM : VK_FORMAT_R8G8_UNORM;
+        else
+            return !plane ? VK_FORMAT_R16_UNORM : VK_FORMAT_R16G16_UNORM;
+    } else { /* Regular planar YUV */
+        return !high ?
VK_FORMAT_R8_UNORM : VK_FORMAT_R16_UNORM; + } +} + +int ff_vk_create_imageview(AVFilterContext *avctx, VkImageView *v, AVVkFrame *f, + VkFormat fmt, enum VkImageAspectFlagBits aspect, + VkComponentMapping m, void *pnext) +{ + VulkanFilterContext *s = avctx->priv; + VkImageViewCreateInfo imgview_spawn = { + .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO, + .pNext = pnext, + .image = f->img, + .viewType = VK_IMAGE_VIEW_TYPE_2D, + .format = fmt, + .components = m, + .subresourceRange = { + .aspectMask = aspect, + .baseMipLevel = 0, + .levelCount = 1, + .baseArrayLayer = 0, + .layerCount = 1, + }, + }; + + VkResult ret = vkCreateImageView(s->hwctx->act_dev, &imgview_spawn, NULL, v); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to create imageview: %s\n", + av_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +void ff_vk_destroy_imageview(AVFilterContext *avctx, VkImageView v) +{ + VulkanFilterContext *s = avctx->priv; + vkDestroyImageView(s->hwctx->act_dev, v, NULL); +} + +int ff_vk_init_pipeline_layout(AVFilterContext *avctx) +{ + int i; + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + { /* Init descriptor set pool */ + VkDescriptorPoolCreateInfo pool_create_info = { + .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO, + .poolSizeCount = s->pool_size_desc_num, + .pPoolSizes = s->pool_size_desc, + .maxSets = s->descriptor_sets_num, + }; + + ret = vkCreateDescriptorPool(s->hwctx->act_dev, &pool_create_info, + NULL, &s->desc_pool); + av_freep(&s->pool_size_desc); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set " + "pool: %s\n", av_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Allocate descriptor sets */ + VkDescriptorSetAllocateInfo alloc_info = { + .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO, + .descriptorPool = s->desc_pool, + .descriptorSetCount = s->descriptor_sets_num, + .pSetLayouts = s->desc_layout, + }; + + s->desc_set = 
av_malloc(s->descriptor_sets_num*sizeof(*s->desc_set)); + if (!s->desc_set) + return AVERROR(ENOMEM); + + ret = vkAllocateDescriptorSets(s->hwctx->act_dev, &alloc_info, + s->desc_set); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to allocate descriptor set: %s\n", + av_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Finally create the pipeline layout */ + VkPipelineLayoutCreateInfo spawn_pipeline_layout = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO, + .setLayoutCount = s->descriptor_sets_num, + .pSetLayouts = s->desc_layout, + .pushConstantRangeCount = s->push_consts_num, + .pPushConstantRanges = s->push_consts, + }; + + ret = vkCreatePipelineLayout(s->hwctx->act_dev, &spawn_pipeline_layout, + NULL, &s->pipeline_layout); + av_freep(&s->push_consts); + s->push_consts_num = 0; + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init pipeline layout: %s\n", + av_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + { /* Descriptor template (for tightly packed descriptors) */ + VK_LOAD_PFN(s->hwctx->inst, vkCreateDescriptorUpdateTemplateKHR); + VkDescriptorUpdateTemplateCreateInfo *desc_template_info; + + s->desc_template = av_malloc(s->descriptor_sets_num*sizeof(*s->desc_template)); + if (!s->desc_template) + return AVERROR(ENOMEM); + + /* Create update templates for the descriptor sets */ + for (i = 0; i < s->descriptor_sets_num; i++) { + desc_template_info = &s->desc_template_info[i]; + desc_template_info->pipelineLayout = s->pipeline_layout; + ret = pfn_vkCreateDescriptorUpdateTemplateKHR(s->hwctx->act_dev, + desc_template_info, + NULL, + &s->desc_template[i]); + av_free((void *)desc_template_info->pDescriptorUpdateEntries); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor " + "template: %s\n", av_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + av_freep(&s->desc_template_info); + } + + return 0; +} + +int ff_vk_init_compute_pipeline(AVFilterContext 
*avctx) +{ + int i; + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + VkComputePipelineCreateInfo pipeline_desc = { + .sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO, + .layout = s->pipeline_layout, + }; + + for (i = 0; i < s->shaders_num; i++) { + if (s->shaders[i].shader.stage & VK_SHADER_STAGE_COMPUTE_BIT) { + pipeline_desc.stage = s->shaders[i].shader; + break; + } + } + if (i == s->shaders_num) { + av_log(avctx, AV_LOG_ERROR, "Can't init compute pipeline, no shader\n"); + return AVERROR(EINVAL); + } + + ret = vkCreateComputePipelines(s->hwctx->act_dev, VK_NULL_HANDLE, 1, + &pipeline_desc, NULL, &s->pipeline); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init compute pipeline: %s\n", + av_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +int ff_vk_init_graphics_pipeline(AVFilterContext *avctx) +{ + VkResult ret; + VulkanFilterContext *s = avctx->priv; + + VkVertexInputBindingDescription vbind_desc = { + .binding = 0, + .stride = sizeof(struct vertex), + .inputRate = VK_VERTEX_INPUT_RATE_VERTEX, + }; + + VkVertexInputAttributeDescription vatt_desc[4] = { { 0 } }; + for (int i = 0; i < 4; i++) { + VkVertexInputAttributeDescription *att = &vatt_desc[i]; + att->location = i; + att->binding = 0; + att->format = VK_FORMAT_R32G32_SFLOAT; + att->offset = i*2*sizeof(float); + } + + VkPipelineVertexInputStateCreateInfo vpipe_info = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO, + .vertexAttributeDescriptionCount = FF_ARRAY_ELEMS(vatt_desc), + .pVertexAttributeDescriptions = vatt_desc, + .vertexBindingDescriptionCount = 1, + .pVertexBindingDescriptions = &vbind_desc, + }; + + VkPipelineDynamicStateCreateInfo dynamic_states = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO, + .dynamicStateCount = 2, + .pDynamicStates = (VkDynamicState []) { + VK_DYNAMIC_STATE_VIEWPORT, VK_DYNAMIC_STATE_SCISSOR, + }, + }; + + VkPipelineInputAssemblyStateCreateInfo spawn_input_asm = { + 
.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO, + .topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP, + .primitiveRestartEnable = VK_FALSE, + }; + + VkRect2D scissor = { .extent = { .width = s->output_width, .height = s->output_height } }; + VkViewport viewport = { .width = s->output_width, .height = s->output_height, .maxDepth = 1.0f }; + + VkPipelineViewportStateCreateInfo spawn_viewport = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO, + .viewportCount = 1, + .pViewports = &viewport, + .scissorCount = 1, + .pScissors = &scissor, + }; + + VkPipelineRasterizationStateCreateInfo rasterizer = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO, + .depthClampEnable = VK_FALSE, + .rasterizerDiscardEnable = VK_FALSE, + .polygonMode = VK_POLYGON_MODE_FILL, + .lineWidth = 1.0f, + .cullMode = VK_CULL_MODE_NONE, + .frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE, + .depthBiasEnable = VK_FALSE, + }; + + VkPipelineMultisampleStateCreateInfo multisampling = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO, + .sampleShadingEnable = VK_FALSE, + .rasterizationSamples = VK_SAMPLE_COUNT_1_BIT, + .minSampleShading = 1.0f, + .alphaToCoverageEnable = VK_FALSE, + .alphaToOneEnable = VK_FALSE, + }; + + VkPipelineColorBlendAttachmentState col_blend_att = { + .colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | + VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT, + .blendEnable = VK_FALSE, + .srcColorBlendFactor = VK_BLEND_FACTOR_ONE, + .dstColorBlendFactor = VK_BLEND_FACTOR_ZERO, + .colorBlendOp = VK_BLEND_OP_ADD, + .srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE, + .dstAlphaBlendFactor = VK_BLEND_FACTOR_ZERO, + .alphaBlendOp = VK_BLEND_OP_ADD, + }; + + VkPipelineColorBlendStateCreateInfo col_blend = { + .sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO, + .logicOpEnable = VK_FALSE, + .logicOp = VK_LOGIC_OP_COPY, + .attachmentCount = 1, + .pAttachments = &col_blend_att, + }; + + 
VkGraphicsPipelineCreateInfo spawn_pipeline = { + .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO, + .pVertexInputState = &vpipe_info, + .stageCount = s->shaders_num, + //.pStages = s->shaders, TODO + .renderPass = s->renderpass, + .subpass = 0, + .layout = s->pipeline_layout, + .pDynamicState = &dynamic_states, + .pInputAssemblyState = &spawn_input_asm, + .pViewportState = &spawn_viewport, + .pRasterizationState = &rasterizer, + .pMultisampleState = &multisampling, + .pColorBlendState = &col_blend, + }; + + ret = vkCreateGraphicsPipelines(s->hwctx->act_dev, VK_NULL_HANDLE, 1, + &spawn_pipeline, NULL, &s->pipeline); + if (ret != VK_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Unable to init pipeline: %s\n", av_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} diff --git a/libavfilter/vulkan.h b/libavfilter/vulkan.h new file mode 100644 index 0000000000..6e059731b7 --- /dev/null +++ b/libavfilter/vulkan.h @@ -0,0 +1,179 @@ +/* + * Vulkan utilities + * Copyright (c) 2018 Rostislav Pehlivanov + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFILTER_VULKAN_COMMON_H +#define AVFILTER_VULKAN_COMMON_H + +#include "avfilter.h" +#include "libavutil/pixdesc.h" +#include "libavutil/bprint.h" +#include "libavutil/hwcontext.h" +#include "libavutil/hwcontext_vulkan.h" + +#include + +/* GLSL management macros */ +#define INDENT(N) INDENT_##N +#define INDENT_0 +#define INDENT_1 INDENT_0 " " +#define INDENT_2 INDENT_1 INDENT_1 +#define INDENT_3 INDENT_2 INDENT_1 +#define INDENT_4 INDENT_3 INDENT_1 +#define INDENT_5 INDENT_4 INDENT_1 +#define INDENT_6 INDENT_5 INDENT_1 +#define C(N, S) INDENT(N) #S "\n" +#define GLSLC(N, S) av_bprintf(&shd->src, C(N, S)) +#define GLSLA(...) av_bprintf(&shd->src, __VA_ARGS__) +#define GLSLF(N, S, ...) av_bprintf(&shd->src, C(N, S), __VA_ARGS__) +#define GLSLD(D) GLSLC(0, ); \ + av_bprint_append_data(&shd->src, D, strlen(D)); \ + GLSLC(0, ) + +typedef struct SPIRVShader { + const char *name; /* Name for id/debugging purposes */ + AVBPrint src; + int local_size[3]; /* Compute shader workgroup sizes */ + VkPipelineShaderStageCreateInfo shader; +} SPIRVShader; + +typedef struct VulkanDescriptorSetBinding { + const char *name; + VkDescriptorType type; + const char *mem_layout; /* Storage images (rgba8, etc.) and buffers (std430, etc.) */ + const char *mem_quali; /* readonly, writeonly, etc. */ + const char *buf_content; /* For buffers */ + uint32_t dimensions; /* Needed for e.g. 
sampler%iD */ + uint32_t elems; /* 0 - scalar, 1 or more - vector */ + VkShaderStageFlags stages; + const VkSampler *samplers; /* Immutable samplers */ + void *updater; +} VulkanDescriptorSetBinding; + +typedef struct VulkanFilterContext { + const AVClass *class; + + AVBufferRef *device_ref; + AVBufferRef *frames_ref; /* For in-place filtering */ + AVHWDeviceContext *device; + AVVulkanDeviceContext *hwctx; + + /* Properties */ + int output_width; + int output_height; + enum AVPixelFormat output_format; + enum AVPixelFormat input_format; + + /* Input */ + VkSampler sampler; + VkSamplerYcbcrConversionInfo yuv_sampler; + + /* Shaders */ + SPIRVShader *shaders; + int shaders_num; + shaderc_compiler_t sc_compiler; + shaderc_compile_options_t sc_opts; + + /* Contexts */ + VkRenderPass renderpass; + VkPipelineLayout pipeline_layout; + VkPipeline pipeline; + + /* Descriptors */ + VkDescriptorSetLayout *desc_layout; + VkDescriptorPool desc_pool; + VkDescriptorSet *desc_set; + VkDescriptorUpdateTemplate *desc_template; + int push_consts_num; + int descriptor_sets_num; + int pool_size_desc_num; + + /* Vertex buffer */ + AVVkBuffer vbuffer; + int num_verts; + + /* Temporary, used to store data in between initialization stages */ + VkDescriptorUpdateTemplateCreateInfo *desc_template_info; + VkDescriptorPoolSize *pool_size_desc; + VkPushConstantRange *push_consts; +} VulkanFilterContext; + +/* Gets the single-plane representation format */ +const VkFormat ff_vk_plane_rep_fmt(enum AVPixelFormat pixfmt, int plane); +/* Gets the image aspect flags of a plane */ +const enum VkImageAspectFlagBits ff_vk_aspect_flags(enum AVPixelFormat pixfmt, + int plane); +/* Creates an imageview */ +int ff_vk_create_imageview(AVFilterContext *avctx, VkImageView *v, AVVkFrame *f, + VkFormat fmt, enum VkImageAspectFlagBits aspect, + VkComponentMapping m, void *pnext); +/* Destroys an imageview */ +void ff_vk_destroy_imageview(AVFilterContext *avctx, VkImageView v); +/* Creates a shader */ +SPIRVShader 
*ff_vk_init_shader(AVFilterContext *avctx, const char *name, + VkShaderStageFlags stage); +/* For compute shaders, defines the workgroup size */ +void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, SPIRVShader *shd, + int local_size[3]); +/* Compiles a completed shader into a module */ +int ff_vk_compile_shader(AVFilterContext *avctx, SPIRVShader *shd, + const char *entry); + + + + + +/* Needs to be abstracted so it adds them to a certain pipeline layout */ +int ff_vk_add_descriptor_set(AVFilterContext *avctx, SPIRVShader *shd, + VulkanDescriptorSetBinding *desc, int num, + int only_print_to_shader); +int ff_vk_add_push_constant(AVFilterContext *avctx, int offset, int size, + VkShaderStageFlagBits stage); + + + + +/* Creates a Vulkan pipeline layout */ +int ff_vk_init_pipeline_layout(AVFilterContext *avctx); + +/* Create a Vulkan sampler, if input isn't NULL the sampler will convert to RGB */ +int ff_vk_init_sampler(AVFilterContext *avctx, AVFrame *input); + +/* Creates a compute pipeline */ +int ff_vk_init_compute_pipeline(AVFilterContext *avctx); + +/* Creates a Vulkan renderpass */ +int ff_vk_init_renderpass(AVFilterContext *avctx); +/* Creates a graphics pipeline */ +int ff_vk_init_graphics_pipeline(AVFilterContext *avctx); +/* Init a simple vertex buffer (4 vertices, a rectangle matching the video) */ +int ff_vk_init_simple_vbuffer(AVFilterContext *avctx); +/* Updates a given descriptor set after pipeline initialization */ +void ff_vk_update_descriptor_set(AVFilterContext *avctx, int set_id); + +/* General lavfi IO functions */ +int ff_vk_filter_query_formats (AVFilterContext *avctx); +int ff_vk_filter_init (AVFilterContext *avctx); +int ff_vk_filter_config_input (AVFilterLink *inlink); +int ff_vk_filter_config_output (AVFilterLink *outlink); +int ff_vk_filter_config_output_inplace(AVFilterLink *outlink); +void ff_vk_filter_uninit (AVFilterContext *avctx); + +#endif /* AVFILTER_VULKAN_COMMON_H */