From patchwork Thu Nov 18 05:46:11 2021
X-Patchwork-Submitter: Lynne
X-Patchwork-Id: 31479
Date: Thu, 18 Nov 2021 06:46:11 +0100 (CET)
From: Lynne
To: Ffmpeg Devel
Subject: [FFmpeg-devel] [PATCH] vulkan: move common Vulkan code from libavfilter to libavutil

This commit moves the common Vulkan framework into libavutil, allowing it
to be used in libavcodec. Each library that uses the framework is expected
to #include the .c/.h files into templates of its own, which may add extra
library-specific functions, and to compile an object file that locally
provides the necessary Vulkan code.

This also decouples the libglslang dependency from the framework,
eliminating vf_libplacebo's dependency on libglslang.

The libavcodec hwaccel I'm writing depends on this.

Patch attached.
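To make the template pattern concrete, here is a minimal sketch of what a
library-side wrapper looks like after this patch. The glslang wrapper is
taken verbatim from the libavfilter/glslang.c hunk below; the vulkan.c
wrapper is an assumption by analogy, since that hunk is truncated in this
copy of the patch.

    /* libavfilter/glslang.c: the entire implementation collapses to a
     * single include of the shared template (verbatim from the diff). */
    #include "libavutil/vulkan_glslang.c"

    /* libavfilter/vulkan.c: assumed analogous wrapper; a library can
     * append its own library-specific helpers after the include and
     * compile this file into its own object. */
    #include "vulkan.h"
    #include "libavutil/vulkan.c"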
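The bulk of the diff below is a mechanical signature change: helpers that
previously took an AVFilterContext now take the FFVulkanContext directly,
so callers outside libavfilter can use them. A representative
before/after, with names taken from the vf_avgblur_vulkan.c hunks (a
reader's summary, not part of the patch):

    /* before: the helpers dug the Vulkan state out of the filter context */
    AvgBlurVulkanContext *s = ctx->priv;
    ff_vk_qf_init(ctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0);
    desc_i[0].sampler = ff_vk_init_sampler(ctx, 1, VK_FILTER_LINEAR);

    /* after: the FFVulkanContext is passed directly, with no libavfilter
     * types involved */
    FFVulkanContext *vkctx = &s->vkctx;
    ff_vk_qf_init(vkctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0);
    desc_i[0].sampler = ff_vk_init_sampler(vkctx, 1, VK_FILTER_LINEAR);

Helpers that never touch device state (ff_vk_init_shader(),
ff_vk_set_compute_shader_sizes(), ff_vk_get_exec_buf(),
ff_vk_discard_exec_deps()) drop the context argument entirely, and
ff_vk_filter_uninit(avctx) is replaced by ff_vk_uninit(&s->vkctx) in the
filters' uninit callbacks.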
Subject: [PATCH] vulkan: move common Vulkan code from libavfilter to libavutil --- configure | 2 +- libavfilter/glslang.c | 239 +-- libavfilter/vf_avgblur_vulkan.c | 72 +- libavfilter/vf_chromaber_vulkan.c | 54 +- libavfilter/vf_gblur_vulkan.c | 86 +- libavfilter/vf_libplacebo.c | 2 +- libavfilter/vf_overlay_vulkan.c | 69 +- libavfilter/vf_scale_vulkan.c | 71 +- libavfilter/vulkan.c | 1391 +--------------- libavfilter/vulkan.h | 380 +---- libavutil/hwcontext_vulkan.c | 10 +- libavutil/vulkan.c | 1399 +++++++++++++++++ libavutil/vulkan.h | 414 +++++ libavutil/vulkan_glslang.c | 256 +++ .../glslang.h => libavutil/vulkan_glslang.h | 8 +- 15 files changed, 2261 insertions(+), 2192 deletions(-) create mode 100644 libavutil/vulkan.c create mode 100644 libavutil/vulkan.h create mode 100644 libavutil/vulkan_glslang.c rename libavfilter/glslang.h => libavutil/vulkan_glslang.h (87%) diff --git a/configure b/configure index 1b47f6512d..79252ac223 100755 --- a/configure +++ b/configure @@ -3620,7 +3620,7 @@ interlace_filter_deps="gpl" kerndeint_filter_deps="gpl" ladspa_filter_deps="ladspa libdl" lensfun_filter_deps="liblensfun version3" -libplacebo_filter_deps="libplacebo vulkan libglslang" +libplacebo_filter_deps="libplacebo vulkan" lv2_filter_deps="lv2" mcdeint_filter_deps="avcodec gpl" metadata_filter_deps="avformat" diff --git a/libavfilter/glslang.c b/libavfilter/glslang.c index e5a8d4dc2a..9aa41567a3 100644 --- a/libavfilter/glslang.c +++ b/libavfilter/glslang.c @@ -16,241 +16,4 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ -#include - -#include -#include - -#include "libavutil/mem.h" -#include "libavutil/avassert.h" - -#include "glslang.h" - -static pthread_mutex_t glslang_mutex = PTHREAD_MUTEX_INITIALIZER; -static int glslang_refcount = 0; - -static const glslang_resource_t glslc_resource_limits = { - .max_lights = 32, - .max_clip_planes = 6, - .max_texture_units = 32, - .max_texture_coords = 32, - .max_vertex_attribs = 64, - .max_vertex_uniform_components = 4096, - .max_varying_floats = 64, - .max_vertex_texture_image_units = 32, - .max_combined_texture_image_units = 80, - .max_texture_image_units = 32, - .max_fragment_uniform_components = 4096, - .max_draw_buffers = 32, - .max_vertex_uniform_vectors = 128, - .max_varying_vectors = 8, - .max_fragment_uniform_vectors = 16, - .max_vertex_output_vectors = 16, - .max_fragment_input_vectors = 15, - .min_program_texel_offset = -8, - .max_program_texel_offset = 7, - .max_clip_distances = 8, - .max_compute_work_group_count_x = 65535, - .max_compute_work_group_count_y = 65535, - .max_compute_work_group_count_z = 65535, - .max_compute_work_group_size_x = 1024, - .max_compute_work_group_size_y = 1024, - .max_compute_work_group_size_z = 64, - .max_compute_uniform_components = 1024, - .max_compute_texture_image_units = 16, - .max_compute_image_uniforms = 8, - .max_compute_atomic_counters = 8, - .max_compute_atomic_counter_buffers = 1, - .max_varying_components = 60, - .max_vertex_output_components = 64, - .max_geometry_input_components = 64, - .max_geometry_output_components = 128, - .max_fragment_input_components = 128, - .max_image_units = 8, - .max_combined_image_units_and_fragment_outputs = 8, - .max_combined_shader_output_resources = 8, - .max_image_samples = 0, - .max_vertex_image_uniforms = 0, - .max_tess_control_image_uniforms = 0, - .max_tess_evaluation_image_uniforms = 0, - .max_geometry_image_uniforms = 0, - .max_fragment_image_uniforms = 8, - .max_combined_image_uniforms = 8, - 
.max_geometry_texture_image_units = 16, - .max_geometry_output_vertices = 256, - .max_geometry_total_output_components = 1024, - .max_geometry_uniform_components = 1024, - .max_geometry_varying_components = 64, - .max_tess_control_input_components = 128, - .max_tess_control_output_components = 128, - .max_tess_control_texture_image_units = 16, - .max_tess_control_uniform_components = 1024, - .max_tess_control_total_output_components = 4096, - .max_tess_evaluation_input_components = 128, - .max_tess_evaluation_output_components = 128, - .max_tess_evaluation_texture_image_units = 16, - .max_tess_evaluation_uniform_components = 1024, - .max_tess_patch_components = 120, - .max_patch_vertices = 32, - .max_tess_gen_level = 64, - .max_viewports = 16, - .max_vertex_atomic_counters = 0, - .max_tess_control_atomic_counters = 0, - .max_tess_evaluation_atomic_counters = 0, - .max_geometry_atomic_counters = 0, - .max_fragment_atomic_counters = 8, - .max_combined_atomic_counters = 8, - .max_atomic_counter_bindings = 1, - .max_vertex_atomic_counter_buffers = 0, - .max_tess_control_atomic_counter_buffers = 0, - .max_tess_evaluation_atomic_counter_buffers = 0, - .max_geometry_atomic_counter_buffers = 0, - .max_fragment_atomic_counter_buffers = 1, - .max_combined_atomic_counter_buffers = 1, - .max_atomic_counter_buffer_size = 16384, - .max_transform_feedback_buffers = 4, - .max_transform_feedback_interleaved_components = 64, - .max_cull_distances = 8, - .max_combined_clip_and_cull_distances = 8, - .max_samples = 4, - .max_mesh_output_vertices_nv = 256, - .max_mesh_output_primitives_nv = 512, - .max_mesh_work_group_size_x_nv = 32, - .max_mesh_work_group_size_y_nv = 1, - .max_mesh_work_group_size_z_nv = 1, - .max_task_work_group_size_x_nv = 32, - .max_task_work_group_size_y_nv = 1, - .max_task_work_group_size_z_nv = 1, - .max_mesh_view_count_nv = 4, - .maxDualSourceDrawBuffersEXT = 1, - - .limits = { - .non_inductive_for_loops = 1, - .while_loops = 1, - .do_while_loops = 1, - .general_uniform_indexing = 1, - .general_attribute_matrix_vector_indexing = 1, - .general_varying_indexing = 1, - .general_sampler_indexing = 1, - .general_variable_indexing = 1, - .general_constant_matrix_vector_indexing = 1, - } -}; - -int ff_vk_glslang_shader_compile(AVFilterContext *avctx, FFSPIRVShader *shd, - uint8_t **data, size_t *size, void **opaque) -{ - const char *messages; - glslang_shader_t *glslc_shader; - glslang_program_t *glslc_program; - - static const glslang_stage_t glslc_stage[] = { - [VK_SHADER_STAGE_VERTEX_BIT] = GLSLANG_STAGE_VERTEX, - [VK_SHADER_STAGE_FRAGMENT_BIT] = GLSLANG_STAGE_FRAGMENT, - [VK_SHADER_STAGE_COMPUTE_BIT] = GLSLANG_STAGE_COMPUTE, - }; - - const glslang_input_t glslc_input = { - .language = GLSLANG_SOURCE_GLSL, - .stage = glslc_stage[shd->shader.stage], - .client = GLSLANG_CLIENT_VULKAN, - /* GLSLANG_TARGET_VULKAN_1_2 before 11.6 resulted in targeting 1.0 */ -#if (((GLSLANG_VERSION_MAJOR) > 11) || ((GLSLANG_VERSION_MAJOR) == 11 && \ - (((GLSLANG_VERSION_MINOR) > 6) || ((GLSLANG_VERSION_MINOR) == 6 && \ - ((GLSLANG_VERSION_PATCH) > 0))))) - .client_version = GLSLANG_TARGET_VULKAN_1_2, - .target_language_version = GLSLANG_TARGET_SPV_1_5, -#else - .client_version = GLSLANG_TARGET_VULKAN_1_1, - .target_language_version = GLSLANG_TARGET_SPV_1_3, -#endif - .target_language = GLSLANG_TARGET_SPV, - .code = shd->src.str, - .default_version = 460, - .default_profile = GLSLANG_NO_PROFILE, - .force_default_version_and_profile = false, - .forward_compatible = false, - .messages = GLSLANG_MSG_DEFAULT_BIT, - 
.resource = &glslc_resource_limits, - }; - - av_assert0(glslang_refcount); - - if (!(glslc_shader = glslang_shader_create(&glslc_input))) - return AVERROR(ENOMEM); - - if (!glslang_shader_preprocess(glslc_shader, &glslc_input)) { - ff_vk_print_shader(avctx, shd, AV_LOG_WARNING); - av_log(avctx, AV_LOG_ERROR, "Unable to preprocess shader: %s (%s)!\n", - glslang_shader_get_info_log(glslc_shader), - glslang_shader_get_info_debug_log(glslc_shader)); - glslang_shader_delete(glslc_shader); - return AVERROR(EINVAL); - } - - if (!glslang_shader_parse(glslc_shader, &glslc_input)) { - ff_vk_print_shader(avctx, shd, AV_LOG_WARNING); - av_log(avctx, AV_LOG_ERROR, "Unable to parse shader: %s (%s)!\n", - glslang_shader_get_info_log(glslc_shader), - glslang_shader_get_info_debug_log(glslc_shader)); - glslang_shader_delete(glslc_shader); - return AVERROR(EINVAL); - } - - if (!(glslc_program = glslang_program_create())) { - glslang_shader_delete(glslc_shader); - return AVERROR(EINVAL); - } - - glslang_program_add_shader(glslc_program, glslc_shader); - - if (!glslang_program_link(glslc_program, GLSLANG_MSG_SPV_RULES_BIT | - GLSLANG_MSG_VULKAN_RULES_BIT)) { - ff_vk_print_shader(avctx, shd, AV_LOG_WARNING); - av_log(avctx, AV_LOG_ERROR, "Unable to link shader: %s (%s)!\n", - glslang_program_get_info_log(glslc_program), - glslang_program_get_info_debug_log(glslc_program)); - glslang_program_delete(glslc_program); - glslang_shader_delete(glslc_shader); - return AVERROR(EINVAL); - } - - glslang_program_SPIRV_generate(glslc_program, glslc_input.stage); - - messages = glslang_program_SPIRV_get_messages(glslc_program); - if (messages) - av_log(avctx, AV_LOG_WARNING, "%s\n", messages); - - glslang_shader_delete(glslc_shader); - - *size = glslang_program_SPIRV_get_size(glslc_program) * sizeof(unsigned int); - *data = (void *)glslang_program_SPIRV_get_ptr(glslc_program); - *opaque = glslc_program; - - return 0; -} - -void ff_vk_glslang_shader_free(void *opaque) -{ - glslang_program_delete(opaque); -} - -int ff_vk_glslang_init(void) -{ - int ret = 0; - - pthread_mutex_lock(&glslang_mutex); - if (glslang_refcount++ == 0) - ret = !glslang_initialize_process(); - pthread_mutex_unlock(&glslang_mutex); - - return ret; -} - -void ff_vk_glslang_uninit(void) -{ - pthread_mutex_lock(&glslang_mutex); - if (glslang_refcount && (--glslang_refcount == 0)) - glslang_finalize_process(); - pthread_mutex_unlock(&glslang_mutex); -} +#include "libavutil/vulkan_glslang.c" diff --git a/libavfilter/vf_avgblur_vulkan.c b/libavfilter/vf_avgblur_vulkan.c index 4795e482a9..bbbf62a3fd 100644 --- a/libavfilter/vf_avgblur_vulkan.c +++ b/libavfilter/vf_avgblur_vulkan.c @@ -73,6 +73,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) int err; FFSPIRVShader *shd; AvgBlurVulkanContext *s = ctx->priv; + FFVulkanContext *vkctx = &s->vkctx; const int planes = av_pix_fmt_count_planes(s->vkctx.output_format); FFVulkanDescriptorSetBinding desc_i[2] = { @@ -94,9 +95,9 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) }, }; - ff_vk_qf_init(ctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); + ff_vk_qf_init(vkctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); - desc_i[0].sampler = ff_vk_init_sampler(ctx, 1, VK_FILTER_LINEAR); + desc_i[0].sampler = ff_vk_init_sampler(vkctx, 1, VK_FILTER_LINEAR); if (!desc_i[0].sampler) return AVERROR_EXTERNAL; @@ -104,16 +105,16 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) desc_i[0].updater = s->input_images; desc_i[1].updater = s->tmp_images; - s->pl_hor = ff_vk_create_pipeline(ctx, 
&s->qf); + s->pl_hor = ff_vk_create_pipeline(vkctx, &s->qf); if (!s->pl_hor) return AVERROR(ENOMEM); - shd = ff_vk_init_shader(ctx, s->pl_hor, "avgblur_compute_hor", + shd = ff_vk_init_shader(s->pl_hor, "avgblur_compute_hor", VK_SHADER_STAGE_COMPUTE_BIT); - ff_vk_set_compute_shader_sizes(ctx, shd, (int [3]){ CGS, 1, 1 }); + ff_vk_set_compute_shader_sizes(shd, (int [3]){ CGS, 1, 1 }); - RET(ff_vk_add_descriptor_set(ctx, s->pl_hor, shd, desc_i, 2, 0)); + RET(ff_vk_add_descriptor_set(vkctx, s->pl_hor, shd, desc_i, 2, 0)); GLSLF(0, #define FILTER_RADIUS (%i) ,s->size_x - 1); GLSLC(0, #define INC(x) (ivec2(x, 0)) ); @@ -137,26 +138,26 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) } GLSLC(0, } ); - RET(ff_vk_compile_shader(ctx, shd, "main")); + RET(ff_vk_compile_shader(vkctx, shd, "main")); - RET(ff_vk_init_pipeline_layout(ctx, s->pl_hor)); - RET(ff_vk_init_compute_pipeline(ctx, s->pl_hor)); + RET(ff_vk_init_pipeline_layout(vkctx, s->pl_hor)); + RET(ff_vk_init_compute_pipeline(vkctx, s->pl_hor)); } { /* Create shader for the vertical pass */ desc_i[0].updater = s->tmp_images; desc_i[1].updater = s->output_images; - s->pl_ver = ff_vk_create_pipeline(ctx, &s->qf); + s->pl_ver = ff_vk_create_pipeline(vkctx, &s->qf); if (!s->pl_ver) return AVERROR(ENOMEM); - shd = ff_vk_init_shader(ctx, s->pl_ver, "avgblur_compute_ver", + shd = ff_vk_init_shader(s->pl_ver, "avgblur_compute_ver", VK_SHADER_STAGE_COMPUTE_BIT); - ff_vk_set_compute_shader_sizes(ctx, shd, (int [3]){ 1, CGS, 1 }); + ff_vk_set_compute_shader_sizes(shd, (int [3]){ 1, CGS, 1 }); - RET(ff_vk_add_descriptor_set(ctx, s->pl_ver, shd, desc_i, 2, 0)); + RET(ff_vk_add_descriptor_set(vkctx, s->pl_ver, shd, desc_i, 2, 0)); GLSLF(0, #define FILTER_RADIUS (%i) ,s->size_y - 1); GLSLC(0, #define INC(x) (ivec2(0, x)) ); @@ -180,14 +181,14 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) } GLSLC(0, } ); - RET(ff_vk_compile_shader(ctx, shd, "main")); + RET(ff_vk_compile_shader(vkctx, shd, "main")); - RET(ff_vk_init_pipeline_layout(ctx, s->pl_ver)); - RET(ff_vk_init_compute_pipeline(ctx, s->pl_ver)); + RET(ff_vk_init_pipeline_layout(vkctx, s->pl_ver)); + RET(ff_vk_init_compute_pipeline(vkctx, s->pl_ver)); } /* Execution context */ - RET(ff_vk_create_exec_ctx(ctx, &s->exec, &s->qf)); + RET(ff_vk_create_exec_ctx(vkctx, &s->exec, &s->qf)); s->initialized = 1; @@ -202,29 +203,30 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *tmp_f int err; VkCommandBuffer cmd_buf; AvgBlurVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkctx.vkfn; + FFVulkanContext *vkctx = &s->vkctx; + FFVulkanFunctions *vk = &vkctx->vkfn; AVVkFrame *in = (AVVkFrame *)in_f->data[0]; AVVkFrame *tmp = (AVVkFrame *)tmp_f->data[0]; AVVkFrame *out = (AVVkFrame *)out_f->data[0]; int planes = av_pix_fmt_count_planes(s->vkctx.output_format); /* Update descriptors and init the exec context */ - ff_vk_start_exec_recording(avctx, s->exec); - cmd_buf = ff_vk_get_exec_buf(avctx, s->exec); + ff_vk_start_exec_recording(vkctx, s->exec); + cmd_buf = ff_vk_get_exec_buf(s->exec); for (int i = 0; i < planes; i++) { - RET(ff_vk_create_imageview(avctx, s->exec, &s->input_images[i].imageView, - in->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->input_images[i].imageView, in->img[i], av_vkfmt_from_pixfmt(s->vkctx.input_format)[i], ff_comp_identity_map)); - RET(ff_vk_create_imageview(avctx, s->exec, &s->tmp_images[i].imageView, - tmp->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->tmp_images[i].imageView, 
tmp->img[i], av_vkfmt_from_pixfmt(s->vkctx.output_format)[i], ff_comp_identity_map)); - RET(ff_vk_create_imageview(avctx, s->exec, &s->output_images[i].imageView, - out->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->output_images[i].imageView, out->img[i], av_vkfmt_from_pixfmt(s->vkctx.output_format)[i], ff_comp_identity_map)); @@ -233,8 +235,8 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *tmp_f s->output_images[i].imageLayout = VK_IMAGE_LAYOUT_GENERAL; } - ff_vk_update_descriptor_set(avctx, s->pl_hor, 0); - ff_vk_update_descriptor_set(avctx, s->pl_ver, 0); + ff_vk_update_descriptor_set(vkctx, s->pl_hor, 0); + ff_vk_update_descriptor_set(vkctx, s->pl_ver, 0); for (int i = 0; i < planes; i++) { VkImageMemoryBarrier bar[] = { @@ -293,20 +295,20 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *tmp_f out->access[i] = bar[2].dstAccessMask; } - ff_vk_bind_pipeline_exec(avctx, s->exec, s->pl_hor); + ff_vk_bind_pipeline_exec(vkctx, s->exec, s->pl_hor); vk->CmdDispatch(cmd_buf, FFALIGN(s->vkctx.output_width, CGS)/CGS, s->vkctx.output_height, 1); - ff_vk_bind_pipeline_exec(avctx, s->exec, s->pl_ver); + ff_vk_bind_pipeline_exec(vkctx, s->exec, s->pl_ver); vk->CmdDispatch(cmd_buf, s->vkctx.output_width, FFALIGN(s->vkctx.output_height, CGS)/CGS, 1); - ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - err = ff_vk_submit_exec_queue(avctx, s->exec); + err = ff_vk_submit_exec_queue(vkctx,s->exec); if (err) return err; @@ -315,7 +317,7 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *tmp_f return err; fail: - ff_vk_discard_exec_deps(avctx, s->exec); + ff_vk_discard_exec_deps(s->exec); return err; } @@ -364,7 +366,7 @@ static void avgblur_vulkan_uninit(AVFilterContext *avctx) { AvgBlurVulkanContext *s = avctx->priv; - ff_vk_filter_uninit(avctx); + ff_vk_uninit(&s->vkctx); s->initialized = 0; } diff --git a/libavfilter/vf_chromaber_vulkan.c b/libavfilter/vf_chromaber_vulkan.c index 83ab72f716..17290e8f25 100644 --- a/libavfilter/vf_chromaber_vulkan.c +++ b/libavfilter/vf_chromaber_vulkan.c @@ -70,16 +70,17 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) int err; FFVkSampler *sampler; ChromaticAberrationVulkanContext *s = ctx->priv; + FFVulkanContext *vkctx = &s->vkctx; const int planes = av_pix_fmt_count_planes(s->vkctx.output_format); - ff_vk_qf_init(ctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); + ff_vk_qf_init(vkctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); /* Create a sampler */ - sampler = ff_vk_init_sampler(ctx, 0, VK_FILTER_LINEAR); + sampler = ff_vk_init_sampler(vkctx, 0, VK_FILTER_LINEAR); if (!sampler) return AVERROR_EXTERNAL; - s->pl = ff_vk_create_pipeline(ctx, &s->qf); + s->pl = ff_vk_create_pipeline(vkctx, &s->qf); if (!s->pl) return AVERROR(ENOMEM); @@ -110,22 +111,22 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) }, }; - FFSPIRVShader *shd = ff_vk_init_shader(ctx, s->pl, "chromaber_compute", + FFSPIRVShader *shd = ff_vk_init_shader(s->pl, "chromaber_compute", VK_SHADER_STAGE_COMPUTE_BIT); if (!shd) return AVERROR(ENOMEM); - ff_vk_set_compute_shader_sizes(ctx, shd, CGROUPS); + ff_vk_set_compute_shader_sizes(shd, CGROUPS); GLSLC(0, layout(push_constant, std430) uniform pushConstants { ); GLSLC(1, vec2 
dist; ); GLSLC(0, }; ); GLSLC(0, ); - ff_vk_add_push_constant(ctx, s->pl, 0, sizeof(s->opts), + ff_vk_add_push_constant(s->pl, 0, sizeof(s->opts), VK_SHADER_STAGE_COMPUTE_BIT); - RET(ff_vk_add_descriptor_set(ctx, s->pl, shd, desc_i, 2, 0)); /* set 0 */ + RET(ff_vk_add_descriptor_set(vkctx, s->pl, shd, desc_i, 2, 0)); /* set 0 */ GLSLD( distort_chroma_kernel ); GLSLC(0, void main() ); @@ -152,14 +153,14 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) } GLSLC(0, } ); - RET(ff_vk_compile_shader(ctx, shd, "main")); + RET(ff_vk_compile_shader(vkctx, shd, "main")); } - RET(ff_vk_init_pipeline_layout(ctx, s->pl)); - RET(ff_vk_init_compute_pipeline(ctx, s->pl)); + RET(ff_vk_init_pipeline_layout(vkctx, s->pl)); + RET(ff_vk_init_compute_pipeline(vkctx, s->pl)); /* Execution context */ - RET(ff_vk_create_exec_ctx(ctx, &s->exec, &s->qf)); + RET(ff_vk_create_exec_ctx(vkctx, &s->exec, &s->qf)); s->initialized = 1; @@ -174,23 +175,24 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) int err = 0; VkCommandBuffer cmd_buf; ChromaticAberrationVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkctx.vkfn; + FFVulkanContext *vkctx = &s->vkctx; + FFVulkanFunctions *vk = &vkctx->vkfn; AVVkFrame *in = (AVVkFrame *)in_f->data[0]; AVVkFrame *out = (AVVkFrame *)out_f->data[0]; int planes = av_pix_fmt_count_planes(s->vkctx.output_format); /* Update descriptors and init the exec context */ - ff_vk_start_exec_recording(avctx, s->exec); - cmd_buf = ff_vk_get_exec_buf(avctx, s->exec); + ff_vk_start_exec_recording(vkctx, s->exec); + cmd_buf = ff_vk_get_exec_buf(s->exec); for (int i = 0; i < planes; i++) { - RET(ff_vk_create_imageview(avctx, s->exec, &s->input_images[i].imageView, - in->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->input_images[i].imageView, in->img[i], av_vkfmt_from_pixfmt(s->vkctx.input_format)[i], ff_comp_identity_map)); - RET(ff_vk_create_imageview(avctx, s->exec, &s->output_images[i].imageView, - out->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->output_images[i].imageView, out->img[i], av_vkfmt_from_pixfmt(s->vkctx.output_format)[i], ff_comp_identity_map)); @@ -198,7 +200,7 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) s->output_images[i].imageLayout = VK_IMAGE_LAYOUT_GENERAL; } - ff_vk_update_descriptor_set(avctx, s->pl, 0); + ff_vk_update_descriptor_set(vkctx, s->pl, 0); for (int i = 0; i < planes; i++) { VkImageMemoryBarrier bar[2] = { @@ -241,19 +243,19 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) out->access[i] = bar[1].dstAccessMask; } - ff_vk_bind_pipeline_exec(avctx, s->exec, s->pl); + ff_vk_bind_pipeline_exec(vkctx, s->exec, s->pl); - ff_vk_update_push_exec(avctx, s->exec, VK_SHADER_STAGE_COMPUTE_BIT, + ff_vk_update_push_exec(vkctx, s->exec, VK_SHADER_STAGE_COMPUTE_BIT, 0, sizeof(s->opts), &s->opts); vk->CmdDispatch(cmd_buf, FFALIGN(s->vkctx.output_width, CGROUPS[0])/CGROUPS[0], FFALIGN(s->vkctx.output_height, CGROUPS[1])/CGROUPS[1], 1); - ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - err = ff_vk_submit_exec_queue(avctx, s->exec); + err = ff_vk_submit_exec_queue(vkctx, s->exec); if (err) return err; @@ -262,7 +264,7 @@ static int 
process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) return err; fail: - ff_vk_discard_exec_deps(avctx, s->exec); + ff_vk_discard_exec_deps(s->exec); return err; } @@ -302,7 +304,7 @@ static void chromaber_vulkan_uninit(AVFilterContext *avctx) { ChromaticAberrationVulkanContext *s = avctx->priv; - ff_vk_filter_uninit(avctx); + ff_vk_uninit(&s->vkctx); s->initialized = 0; } diff --git a/libavfilter/vf_gblur_vulkan.c b/libavfilter/vf_gblur_vulkan.c index 16c8bbb189..6c601db821 100644 --- a/libavfilter/vf_gblur_vulkan.c +++ b/libavfilter/vf_gblur_vulkan.c @@ -160,7 +160,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) .buf_content = NULL, }; - image_descs[0].sampler = ff_vk_init_sampler(ctx, 1, VK_FILTER_LINEAR); + image_descs[0].sampler = ff_vk_init_sampler(&s->vkctx, 1, VK_FILTER_LINEAR); if (!image_descs[0].sampler) return AVERROR_EXTERNAL; @@ -172,28 +172,28 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) buf_desc.buf_content = kernel_def; - ff_vk_qf_init(ctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); + ff_vk_qf_init(&s->vkctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); { /* Create shader for the horizontal pass */ image_descs[0].updater = s->input_images; image_descs[1].updater = s->tmp_images; buf_desc.updater = &s->params_desc_hor; - s->pl_hor = ff_vk_create_pipeline(ctx, &s->qf); + s->pl_hor = ff_vk_create_pipeline(&s->vkctx, &s->qf); if (!s->pl_hor) { err = AVERROR(ENOMEM); goto fail; } - shd = ff_vk_init_shader(ctx, s->pl_hor, "gblur_compute_hor", image_descs[0].stages); + shd = ff_vk_init_shader(s->pl_hor, "gblur_compute_hor", image_descs[0].stages); if (!shd) { err = AVERROR(ENOMEM); goto fail; } - ff_vk_set_compute_shader_sizes(ctx, shd, (int [3]){ CGS, CGS, 1 }); - RET(ff_vk_add_descriptor_set(ctx, s->pl_hor, shd, image_descs, FF_ARRAY_ELEMS(image_descs), 0)); - RET(ff_vk_add_descriptor_set(ctx, s->pl_hor, shd, &buf_desc, 1, 0)); + ff_vk_set_compute_shader_sizes(shd, (int [3]){ CGS, CGS, 1 }); + RET(ff_vk_add_descriptor_set(&s->vkctx, s->pl_hor, shd, image_descs, FF_ARRAY_ELEMS(image_descs), 0)); + RET(ff_vk_add_descriptor_set(&s->vkctx, s->pl_hor, shd, &buf_desc, 1, 0)); GLSLD( gblur_horizontal ); GLSLC(0, void main() ); @@ -214,23 +214,23 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) } GLSLC(0, } ); - RET(ff_vk_compile_shader(ctx, shd, "main")); + RET(ff_vk_compile_shader(&s->vkctx, shd, "main")); - RET(ff_vk_init_pipeline_layout(ctx, s->pl_hor)); - RET(ff_vk_init_compute_pipeline(ctx, s->pl_hor)); + RET(ff_vk_init_pipeline_layout(&s->vkctx, s->pl_hor)); + RET(ff_vk_init_compute_pipeline(&s->vkctx, s->pl_hor)); - RET(ff_vk_create_buf(ctx, &s->params_buf_hor, sizeof(float) * s->kernel_size, + RET(ff_vk_create_buf(&s->vkctx, &s->params_buf_hor, sizeof(float) * s->kernel_size, VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)); - RET(ff_vk_map_buffers(ctx, &s->params_buf_hor, &kernel_mapped, 1, 0)); + RET(ff_vk_map_buffers(&s->vkctx, &s->params_buf_hor, &kernel_mapped, 1, 0)); init_gaussian_kernel((float *)kernel_mapped, s->sigma, s->kernel_size); - RET(ff_vk_unmap_buffers(ctx, &s->params_buf_hor, 1, 1)); + RET(ff_vk_unmap_buffers(&s->vkctx, &s->params_buf_hor, 1, 1)); s->params_desc_hor.buffer = s->params_buf_hor.buf; s->params_desc_hor.range = VK_WHOLE_SIZE; - ff_vk_update_descriptor_set(ctx, s->pl_hor, 1); + ff_vk_update_descriptor_set(&s->vkctx, s->pl_hor, 1); } { /* Create shader for the vertical pass */ @@ -238,21 +238,21 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame 
*in) image_descs[1].updater = s->output_images; buf_desc.updater = &s->params_desc_ver; - s->pl_ver = ff_vk_create_pipeline(ctx, &s->qf); + s->pl_ver = ff_vk_create_pipeline(&s->vkctx, &s->qf); if (!s->pl_ver) { err = AVERROR(ENOMEM); goto fail; } - shd = ff_vk_init_shader(ctx, s->pl_ver, "gblur_compute_ver", image_descs[0].stages); + shd = ff_vk_init_shader(s->pl_ver, "gblur_compute_ver", image_descs[0].stages); if (!shd) { err = AVERROR(ENOMEM); goto fail; } - ff_vk_set_compute_shader_sizes(ctx, shd, (int [3]){ CGS, CGS, 1 }); - RET(ff_vk_add_descriptor_set(ctx, s->pl_ver, shd, image_descs, FF_ARRAY_ELEMS(image_descs), 0)); - RET(ff_vk_add_descriptor_set(ctx, s->pl_ver, shd, &buf_desc, 1, 0)); + ff_vk_set_compute_shader_sizes(shd, (int [3]){ CGS, CGS, 1 }); + RET(ff_vk_add_descriptor_set(&s->vkctx, s->pl_ver, shd, image_descs, FF_ARRAY_ELEMS(image_descs), 0)); + RET(ff_vk_add_descriptor_set(&s->vkctx, s->pl_ver, shd, &buf_desc, 1, 0)); GLSLD( gblur_vertical ); GLSLC(0, void main() ); @@ -273,26 +273,26 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) } GLSLC(0, } ); - RET(ff_vk_compile_shader(ctx, shd, "main")); + RET(ff_vk_compile_shader(&s->vkctx, shd, "main")); - RET(ff_vk_init_pipeline_layout(ctx, s->pl_ver)); - RET(ff_vk_init_compute_pipeline(ctx, s->pl_ver)); + RET(ff_vk_init_pipeline_layout(&s->vkctx, s->pl_ver)); + RET(ff_vk_init_compute_pipeline(&s->vkctx, s->pl_ver)); - RET(ff_vk_create_buf(ctx, &s->params_buf_ver, sizeof(float) * s->kernel_size, + RET(ff_vk_create_buf(&s->vkctx, &s->params_buf_ver, sizeof(float) * s->kernel_size, VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)); - RET(ff_vk_map_buffers(ctx, &s->params_buf_ver, &kernel_mapped, 1, 0)); + RET(ff_vk_map_buffers(&s->vkctx, &s->params_buf_ver, &kernel_mapped, 1, 0)); init_gaussian_kernel((float *)kernel_mapped, s->sigmaV, s->kernel_size); - RET(ff_vk_unmap_buffers(ctx, &s->params_buf_ver, 1, 1)); + RET(ff_vk_unmap_buffers(&s->vkctx, &s->params_buf_ver, 1, 1)); s->params_desc_ver.buffer = s->params_buf_ver.buf; s->params_desc_ver.range = VK_WHOLE_SIZE; - ff_vk_update_descriptor_set(ctx, s->pl_ver, 1); + ff_vk_update_descriptor_set(&s->vkctx, s->pl_ver, 1); } - RET(ff_vk_create_exec_ctx(ctx, &s->exec, &s->qf)); + RET(ff_vk_create_exec_ctx(&s->vkctx, &s->exec, &s->qf)); s->initialized = 1; @@ -307,9 +307,9 @@ static av_cold void gblur_vulkan_uninit(AVFilterContext *avctx) av_frame_free(&s->tmpframe); - ff_vk_filter_uninit(avctx); - ff_vk_free_buf(avctx, &s->params_buf_hor); - ff_vk_free_buf(avctx, &s->params_buf_ver); + ff_vk_free_buf(&s->vkctx, &s->params_buf_hor); + ff_vk_free_buf(&s->vkctx, &s->params_buf_ver); + ff_vk_uninit(&s->vkctx); s->initialized = 0; } @@ -329,23 +329,23 @@ static int process_frames(AVFilterContext *avctx, AVFrame *outframe, AVFrame *in int planes = av_pix_fmt_count_planes(s->vkctx.output_format); - ff_vk_start_exec_recording(avctx, s->exec); - cmd_buf = ff_vk_get_exec_buf(avctx, s->exec); + ff_vk_start_exec_recording(&s->vkctx, s->exec); + cmd_buf = ff_vk_get_exec_buf(s->exec); input_formats = av_vkfmt_from_pixfmt(s->vkctx.input_format); output_formats = av_vkfmt_from_pixfmt(s->vkctx.output_format); for (int i = 0; i < planes; i++) { - RET(ff_vk_create_imageview(avctx, s->exec, &s->input_images[i].imageView, + RET(ff_vk_create_imageview(&s->vkctx, s->exec, &s->input_images[i].imageView, in->img[i], input_formats[i], ff_comp_identity_map)); - RET(ff_vk_create_imageview(avctx, s->exec, &s->tmp_images[i].imageView, + 
RET(ff_vk_create_imageview(&s->vkctx, s->exec, &s->tmp_images[i].imageView, tmp->img[i], output_formats[i], ff_comp_identity_map)); - RET(ff_vk_create_imageview(avctx, s->exec, &s->output_images[i].imageView, + RET(ff_vk_create_imageview(&s->vkctx, s->exec, &s->output_images[i].imageView, out->img[i], output_formats[i], ff_comp_identity_map)); @@ -355,8 +355,8 @@ static int process_frames(AVFilterContext *avctx, AVFrame *outframe, AVFrame *in s->output_images[i].imageLayout = VK_IMAGE_LAYOUT_GENERAL; } - ff_vk_update_descriptor_set(avctx, s->pl_hor, 0); - ff_vk_update_descriptor_set(avctx, s->pl_ver, 0); + ff_vk_update_descriptor_set(&s->vkctx, s->pl_hor, 0); + ff_vk_update_descriptor_set(&s->vkctx, s->pl_ver, 0); for (int i = 0; i < planes; i++) { VkImageMemoryBarrier barriers[] = { @@ -415,20 +415,20 @@ static int process_frames(AVFilterContext *avctx, AVFrame *outframe, AVFrame *in out->access[i] = barriers[2].dstAccessMask; } - ff_vk_bind_pipeline_exec(avctx, s->exec, s->pl_hor); + ff_vk_bind_pipeline_exec(&s->vkctx, s->exec, s->pl_hor); vk->CmdDispatch(cmd_buf, FFALIGN(s->vkctx.output_width, CGS)/CGS, FFALIGN(s->vkctx.output_height, CGS)/CGS, 1); - ff_vk_bind_pipeline_exec(avctx, s->exec, s->pl_ver); + ff_vk_bind_pipeline_exec(&s->vkctx, s->exec, s->pl_ver); vk->CmdDispatch(cmd_buf, FFALIGN(s->vkctx.output_width, CGS)/CGS, FFALIGN(s->vkctx.output_height, CGS)/CGS, 1); - ff_vk_add_exec_dep(avctx, s->exec, inframe, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - ff_vk_add_exec_dep(avctx, s->exec, outframe, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(&s->vkctx, s->exec, inframe, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(&s->vkctx, s->exec, outframe, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - err = ff_vk_submit_exec_queue(avctx, s->exec); + err = ff_vk_submit_exec_queue(&s->vkctx, s->exec); if (err) return err; @@ -436,7 +436,7 @@ static int process_frames(AVFilterContext *avctx, AVFrame *outframe, AVFrame *in return 0; fail: - ff_vk_discard_exec_deps(avctx, s->exec); + ff_vk_discard_exec_deps(s->exec); return err; } diff --git a/libavfilter/vf_libplacebo.c b/libavfilter/vf_libplacebo.c index ede6888bd3..fe301db417 100644 --- a/libavfilter/vf_libplacebo.c +++ b/libavfilter/vf_libplacebo.c @@ -270,7 +270,7 @@ static void libplacebo_uninit(AVFilterContext *avctx) pl_renderer_destroy(&s->renderer); pl_vulkan_destroy(&s->vulkan); pl_log_destroy(&s->log); - ff_vk_filter_uninit(avctx); + ff_vk_uninit(&s->vkctx); s->initialized = 0; s->gpu = NULL; } diff --git a/libavfilter/vf_overlay_vulkan.c b/libavfilter/vf_overlay_vulkan.c index b902ad83f5..73a7bae722 100644 --- a/libavfilter/vf_overlay_vulkan.c +++ b/libavfilter/vf_overlay_vulkan.c @@ -82,15 +82,16 @@ static av_cold int init_filter(AVFilterContext *ctx) int err; FFVkSampler *sampler; OverlayVulkanContext *s = ctx->priv; + FFVulkanContext *vkctx = &s->vkctx; const int planes = av_pix_fmt_count_planes(s->vkctx.output_format); - ff_vk_qf_init(ctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); + ff_vk_qf_init(vkctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); - sampler = ff_vk_init_sampler(ctx, 1, VK_FILTER_NEAREST); + sampler = ff_vk_init_sampler(vkctx, 1, VK_FILTER_NEAREST); if (!sampler) return AVERROR_EXTERNAL; - s->pl = ff_vk_create_pipeline(ctx, &s->qf); + s->pl = ff_vk_create_pipeline(vkctx, &s->qf); if (!s->pl) return AVERROR(ENOMEM); @@ -138,15 +139,15 @@ static av_cold int init_filter(AVFilterContext *ctx) .buf_content = "ivec2 o_offset[3], o_size[3];", }; - FFSPIRVShader *shd = ff_vk_init_shader(ctx, s->pl, "overlay_compute", + FFSPIRVShader 
*shd = ff_vk_init_shader(s->pl, "overlay_compute", VK_SHADER_STAGE_COMPUTE_BIT); if (!shd) return AVERROR(ENOMEM); - ff_vk_set_compute_shader_sizes(ctx, shd, CGROUPS); + ff_vk_set_compute_shader_sizes(shd, CGROUPS); - RET(ff_vk_add_descriptor_set(ctx, s->pl, shd, desc_i, 3, 0)); /* set 0 */ - RET(ff_vk_add_descriptor_set(ctx, s->pl, shd, &desc_b, 1, 0)); /* set 1 */ + RET(ff_vk_add_descriptor_set(vkctx, s->pl, shd, desc_i, 3, 0)); /* set 0 */ + RET(ff_vk_add_descriptor_set(vkctx, s->pl, shd, &desc_b, 1, 0)); /* set 1 */ GLSLD( overlay_noalpha ); GLSLD( overlay_alpha ); @@ -162,11 +163,11 @@ static av_cold int init_filter(AVFilterContext *ctx) GLSLC(1, } ); GLSLC(0, } ); - RET(ff_vk_compile_shader(ctx, shd, "main")); + RET(ff_vk_compile_shader(vkctx, shd, "main")); } - RET(ff_vk_init_pipeline_layout(ctx, s->pl)); - RET(ff_vk_init_compute_pipeline(ctx, s->pl)); + RET(ff_vk_init_pipeline_layout(vkctx, s->pl)); + RET(ff_vk_init_compute_pipeline(vkctx, s->pl)); { /* Create and update buffer */ const AVPixFmtDescriptor *desc; @@ -179,14 +180,14 @@ static av_cold int init_filter(AVFilterContext *ctx) int32_t o_size[2*3]; } *par; - err = ff_vk_create_buf(ctx, &s->params_buf, + err = ff_vk_create_buf(vkctx, &s->params_buf, sizeof(*par), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT); if (err) return err; - err = ff_vk_map_buffers(ctx, &s->params_buf, (uint8_t **)&par, 1, 0); + err = ff_vk_map_buffers(vkctx, &s->params_buf, (uint8_t **)&par, 1, 0); if (err) return err; @@ -206,18 +207,18 @@ static av_cold int init_filter(AVFilterContext *ctx) par->o_size[4] = par->o_size[0] >> desc->log2_chroma_w; par->o_size[5] = par->o_size[1] >> desc->log2_chroma_h; - err = ff_vk_unmap_buffers(ctx, &s->params_buf, 1, 1); + err = ff_vk_unmap_buffers(vkctx, &s->params_buf, 1, 1); if (err) return err; s->params_desc.buffer = s->params_buf.buf; s->params_desc.range = VK_WHOLE_SIZE; - ff_vk_update_descriptor_set(ctx, s->pl, 1); + ff_vk_update_descriptor_set(vkctx, s->pl, 1); } /* Execution context */ - RET(ff_vk_create_exec_ctx(ctx, &s->exec, &s->qf)); + RET(ff_vk_create_exec_ctx(vkctx, &s->exec, &s->qf)); s->initialized = 1; @@ -233,7 +234,8 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, int err; VkCommandBuffer cmd_buf; OverlayVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkctx.vkfn; + FFVulkanContext *vkctx = &s->vkctx; + FFVulkanFunctions *vk = &vkctx->vkfn; int planes = av_pix_fmt_count_planes(s->vkctx.output_format); AVVkFrame *out = (AVVkFrame *)out_f->data[0]; @@ -244,22 +246,22 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVHWFramesContext *overlay_fc = (AVHWFramesContext*)overlay_f->hw_frames_ctx->data; /* Update descriptors and init the exec context */ - ff_vk_start_exec_recording(avctx, s->exec); - cmd_buf = ff_vk_get_exec_buf(avctx, s->exec); + ff_vk_start_exec_recording(vkctx, s->exec); + cmd_buf = ff_vk_get_exec_buf(s->exec); for (int i = 0; i < planes; i++) { - RET(ff_vk_create_imageview(avctx, s->exec, &s->main_images[i].imageView, - main->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->main_images[i].imageView, main->img[i], av_vkfmt_from_pixfmt(main_fc->sw_format)[i], ff_comp_identity_map)); - RET(ff_vk_create_imageview(avctx, s->exec, &s->overlay_images[i].imageView, - overlay->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->overlay_images[i].imageView, overlay->img[i], av_vkfmt_from_pixfmt(overlay_fc->sw_format)[i], ff_comp_identity_map)); - RET(ff_vk_create_imageview(avctx, s->exec, 
&s->output_images[i].imageView, - out->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->output_images[i].imageView, out->img[i], av_vkfmt_from_pixfmt(s->vkctx.output_format)[i], ff_comp_identity_map)); @@ -268,7 +270,7 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, s->output_images[i].imageLayout = VK_IMAGE_LAYOUT_GENERAL; } - ff_vk_update_descriptor_set(avctx, s->pl, 0); + ff_vk_update_descriptor_set(vkctx, s->pl, 0); for (int i = 0; i < planes; i++) { VkImageMemoryBarrier bar[3] = { @@ -327,17 +329,17 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, out->access[i] = bar[2].dstAccessMask; } - ff_vk_bind_pipeline_exec(avctx, s->exec, s->pl); + ff_vk_bind_pipeline_exec(vkctx, s->exec, s->pl); vk->CmdDispatch(cmd_buf, FFALIGN(s->vkctx.output_width, CGROUPS[0])/CGROUPS[0], FFALIGN(s->vkctx.output_height, CGROUPS[1])/CGROUPS[1], 1); - ff_vk_add_exec_dep(avctx, s->exec, main_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - ff_vk_add_exec_dep(avctx, s->exec, overlay_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, main_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, overlay_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - err = ff_vk_submit_exec_queue(avctx, s->exec); + err = ff_vk_submit_exec_queue(vkctx, s->exec); if (err) return err; @@ -346,7 +348,7 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, return err; fail: - ff_vk_discard_exec_deps(avctx, s->exec); + ff_vk_discard_exec_deps(s->exec); return err; } @@ -438,11 +440,10 @@ static void overlay_vulkan_uninit(AVFilterContext *avctx) { OverlayVulkanContext *s = avctx->priv; - ff_vk_filter_uninit(avctx); + ff_vk_free_buf(&s->vkctx, &s->params_buf); + ff_vk_uninit(&s->vkctx); ff_framesync_uninit(&s->fs); - ff_vk_free_buf(avctx, &s->params_buf); - s->initialized = 0; } diff --git a/libavfilter/vf_scale_vulkan.c b/libavfilter/vf_scale_vulkan.c index 3a2251f8df..33f52ed007 100644 --- a/libavfilter/vf_scale_vulkan.c +++ b/libavfilter/vf_scale_vulkan.c @@ -111,6 +111,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) FFVkSampler *sampler; VkFilter sampler_mode; ScaleVulkanContext *s = ctx->priv; + FFVulkanContext *vkctx = &s->vkctx; int crop_x = in->crop_left; int crop_y = in->crop_top; @@ -118,7 +119,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) int crop_h = in->height - (in->crop_top + in->crop_bottom); int in_planes = av_pix_fmt_count_planes(s->vkctx.input_format); - ff_vk_qf_init(ctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); + ff_vk_qf_init(vkctx, &s->qf, VK_QUEUE_COMPUTE_BIT, 0); switch (s->scaler) { case F_NEAREST: @@ -130,11 +131,11 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) }; /* Create a sampler */ - sampler = ff_vk_init_sampler(ctx, 0, sampler_mode); + sampler = ff_vk_init_sampler(vkctx, 0, sampler_mode); if (!sampler) return AVERROR_EXTERNAL; - s->pl = ff_vk_create_pipeline(ctx, &s->qf); + s->pl = ff_vk_create_pipeline(vkctx, &s->qf); if (!s->pl) return AVERROR(ENOMEM); @@ -171,15 +172,15 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) .buf_content = "mat4 yuv_matrix;", }; - FFSPIRVShader *shd = ff_vk_init_shader(ctx, s->pl, "scale_compute", + FFSPIRVShader *shd = ff_vk_init_shader(s->pl, "scale_compute", VK_SHADER_STAGE_COMPUTE_BIT); if (!shd) return AVERROR(ENOMEM); - 
ff_vk_set_compute_shader_sizes(ctx, shd, CGROUPS); + ff_vk_set_compute_shader_sizes(shd, CGROUPS); - RET(ff_vk_add_descriptor_set(ctx, s->pl, shd, desc_i, 2, 0)); /* set 0 */ - RET(ff_vk_add_descriptor_set(ctx, s->pl, shd, &desc_b, 1, 0)); /* set 1 */ + RET(ff_vk_add_descriptor_set(vkctx, s->pl, shd, desc_i, 2, 0)); /* set 0 */ + RET(ff_vk_add_descriptor_set(vkctx, s->pl, shd, &desc_b, 1, 0)); /* set 1 */ GLSLD( scale_bilinear ); @@ -229,11 +230,11 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) GLSLC(0, } ); - RET(ff_vk_compile_shader(ctx, shd, "main")); + RET(ff_vk_compile_shader(vkctx, shd, "main")); } - RET(ff_vk_init_pipeline_layout(ctx, s->pl)); - RET(ff_vk_init_compute_pipeline(ctx, s->pl)); + RET(ff_vk_init_pipeline_layout(vkctx, s->pl)); + RET(ff_vk_init_compute_pipeline(vkctx, s->pl)); if (s->vkctx.output_format != s->vkctx.input_format) { const struct LumaCoefficients *lcoeffs; @@ -249,14 +250,14 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) return AVERROR(EINVAL); } - err = ff_vk_create_buf(ctx, &s->params_buf, + err = ff_vk_create_buf(vkctx, &s->params_buf, sizeof(*par), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT); if (err) return err; - err = ff_vk_map_buffers(ctx, &s->params_buf, (uint8_t **)&par, 1, 0); + err = ff_vk_map_buffers(vkctx, &s->params_buf, (uint8_t **)&par, 1, 0); if (err) return err; @@ -270,18 +271,18 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in) par->yuv_matrix[3][3] = 1.0; - err = ff_vk_unmap_buffers(ctx, &s->params_buf, 1, 1); + err = ff_vk_unmap_buffers(vkctx, &s->params_buf, 1, 1); if (err) return err; s->params_desc.buffer = s->params_buf.buf; s->params_desc.range = VK_WHOLE_SIZE; - ff_vk_update_descriptor_set(ctx, s->pl, 1); + ff_vk_update_descriptor_set(vkctx, s->pl, 1); } /* Execution context */ - RET(ff_vk_create_exec_ctx(ctx, &s->exec, &s->qf)); + RET(ff_vk_create_exec_ctx(vkctx, &s->exec, &s->qf)); s->initialized = 1; @@ -296,19 +297,20 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) int err = 0; VkCommandBuffer cmd_buf; ScaleVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkctx.vkfn; + FFVulkanContext *vkctx = &s->vkctx; + FFVulkanFunctions *vk = &vkctx->vkfn; AVVkFrame *in = (AVVkFrame *)in_f->data[0]; AVVkFrame *out = (AVVkFrame *)out_f->data[0]; VkImageMemoryBarrier barriers[AV_NUM_DATA_POINTERS*2]; int barrier_count = 0; /* Update descriptors and init the exec context */ - ff_vk_start_exec_recording(avctx, s->exec); - cmd_buf = ff_vk_get_exec_buf(avctx, s->exec); + ff_vk_start_exec_recording(vkctx, s->exec); + cmd_buf = ff_vk_get_exec_buf(s->exec); for (int i = 0; i < av_pix_fmt_count_planes(s->vkctx.input_format); i++) { - RET(ff_vk_create_imageview(avctx, s->exec, &s->input_images[i].imageView, - in->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->input_images[i].imageView, in->img[i], av_vkfmt_from_pixfmt(s->vkctx.input_format)[i], ff_comp_identity_map)); @@ -316,15 +318,15 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) } for (int i = 0; i < av_pix_fmt_count_planes(s->vkctx.output_format); i++) { - RET(ff_vk_create_imageview(avctx, s->exec, &s->output_images[i].imageView, - out->img[i], + RET(ff_vk_create_imageview(vkctx, s->exec, + &s->output_images[i].imageView, out->img[i], av_vkfmt_from_pixfmt(s->vkctx.output_format)[i], ff_comp_identity_map)); s->output_images[i].imageLayout = VK_IMAGE_LAYOUT_GENERAL; } - ff_vk_update_descriptor_set(avctx, 
s->pl, 0); + ff_vk_update_descriptor_set(vkctx, s->pl, 0); for (int i = 0; i < av_pix_fmt_count_planes(s->vkctx.input_format); i++) { VkImageMemoryBarrier bar = { @@ -372,16 +374,16 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, 0, 0, NULL, 0, NULL, barrier_count, barriers); - ff_vk_bind_pipeline_exec(avctx, s->exec, s->pl); + ff_vk_bind_pipeline_exec(vkctx, s->exec, s->pl); vk->CmdDispatch(cmd_buf, - FFALIGN(s->vkctx.output_width, CGROUPS[0])/CGROUPS[0], - FFALIGN(s->vkctx.output_height, CGROUPS[1])/CGROUPS[1], 1); + FFALIGN(vkctx->output_width, CGROUPS[0])/CGROUPS[0], + FFALIGN(vkctx->output_height, CGROUPS[1])/CGROUPS[1], 1); - ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); + ff_vk_add_exec_dep(vkctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT); - err = ff_vk_submit_exec_queue(avctx, s->exec); + err = ff_vk_submit_exec_queue(vkctx, s->exec); if (err) return err; @@ -390,7 +392,7 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f) return err; fail: - ff_vk_discard_exec_deps(avctx, s->exec); + ff_vk_discard_exec_deps(s->exec); return err; } @@ -436,11 +438,12 @@ static int scale_vulkan_config_output(AVFilterLink *outlink) int err; AVFilterContext *avctx = outlink->src; ScaleVulkanContext *s = avctx->priv; + FFVulkanContext *vkctx = &s->vkctx; AVFilterLink *inlink = outlink->src->inputs[0]; err = ff_scale_eval_dimensions(s, s->w_expr, s->h_expr, inlink, outlink, - &s->vkctx.output_width, - &s->vkctx.output_height); + &vkctx->output_width, + &vkctx->output_height); if (err < 0) return err; @@ -481,8 +484,8 @@ static void scale_vulkan_uninit(AVFilterContext *avctx) { ScaleVulkanContext *s = avctx->priv; - ff_vk_filter_uninit(avctx); - ff_vk_free_buf(avctx, &s->params_buf); + ff_vk_free_buf(&s->vkctx, &s->params_buf); + ff_vk_uninit(&s->vkctx); s->initialized = 0; } diff --git a/libavfilter/vulkan.c b/libavfilter/vulkan.c index 48f02e4603..e0fcf87f21 100644 --- a/libavfilter/vulkan.c +++ b/libavfilter/vulkan.c @@ -16,659 +16,8 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ -#include "formats.h" #include "vulkan.h" -#include "glslang.h" - -#include "libavutil/avassert.h" -#include "libavutil/vulkan_loader.h" - -/* Generic macro for creating contexts which need to keep their addresses - * if another context is created. 
*/ -#define FN_CREATING(ctx, type, shortname, array, num) \ -static av_always_inline type *create_ ##shortname(ctx *dctx) \ -{ \ - type **array, *sctx = av_mallocz(sizeof(*sctx)); \ - if (!sctx) \ - return NULL; \ - \ - array = av_realloc_array(dctx->array, sizeof(*dctx->array), dctx->num + 1);\ - if (!array) { \ - av_free(sctx); \ - return NULL; \ - } \ - \ - dctx->array = array; \ - dctx->array[dctx->num++] = sctx; \ - \ - return sctx; \ -} - -const VkComponentMapping ff_comp_identity_map = { - .r = VK_COMPONENT_SWIZZLE_IDENTITY, - .g = VK_COMPONENT_SWIZZLE_IDENTITY, - .b = VK_COMPONENT_SWIZZLE_IDENTITY, - .a = VK_COMPONENT_SWIZZLE_IDENTITY, -}; - -/* Converts return values to strings */ -const char *ff_vk_ret2str(VkResult res) -{ -#define CASE(VAL) case VAL: return #VAL - switch (res) { - CASE(VK_SUCCESS); - CASE(VK_NOT_READY); - CASE(VK_TIMEOUT); - CASE(VK_EVENT_SET); - CASE(VK_EVENT_RESET); - CASE(VK_INCOMPLETE); - CASE(VK_ERROR_OUT_OF_HOST_MEMORY); - CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY); - CASE(VK_ERROR_INITIALIZATION_FAILED); - CASE(VK_ERROR_DEVICE_LOST); - CASE(VK_ERROR_MEMORY_MAP_FAILED); - CASE(VK_ERROR_LAYER_NOT_PRESENT); - CASE(VK_ERROR_EXTENSION_NOT_PRESENT); - CASE(VK_ERROR_FEATURE_NOT_PRESENT); - CASE(VK_ERROR_INCOMPATIBLE_DRIVER); - CASE(VK_ERROR_TOO_MANY_OBJECTS); - CASE(VK_ERROR_FORMAT_NOT_SUPPORTED); - CASE(VK_ERROR_FRAGMENTED_POOL); - CASE(VK_ERROR_SURFACE_LOST_KHR); - CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR); - CASE(VK_SUBOPTIMAL_KHR); - CASE(VK_ERROR_OUT_OF_DATE_KHR); - CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR); - CASE(VK_ERROR_VALIDATION_FAILED_EXT); - CASE(VK_ERROR_INVALID_SHADER_NV); - CASE(VK_ERROR_OUT_OF_POOL_MEMORY); - CASE(VK_ERROR_INVALID_EXTERNAL_HANDLE); - CASE(VK_ERROR_NOT_PERMITTED_EXT); - default: return "Unknown error"; - } -#undef CASE -} - -void ff_vk_qf_init(AVFilterContext *avctx, FFVkQueueFamilyCtx *qf, - VkQueueFlagBits dev_family, int nb_queues) -{ - FFVulkanContext *s = avctx->priv; - - switch (dev_family) { - case VK_QUEUE_GRAPHICS_BIT: - qf->queue_family = s->hwctx->queue_family_index; - qf->actual_queues = s->hwctx->nb_graphics_queues; - break; - case VK_QUEUE_COMPUTE_BIT: - qf->queue_family = s->hwctx->queue_family_comp_index; - qf->actual_queues = s->hwctx->nb_comp_queues; - break; - case VK_QUEUE_TRANSFER_BIT: - qf->queue_family = s->hwctx->queue_family_tx_index; - qf->actual_queues = s->hwctx->nb_tx_queues; - break; - case VK_QUEUE_VIDEO_ENCODE_BIT_KHR: - qf->queue_family = s->hwctx->queue_family_encode_index; - qf->actual_queues = s->hwctx->nb_encode_queues; - break; - case VK_QUEUE_VIDEO_DECODE_BIT_KHR: - qf->queue_family = s->hwctx->queue_family_decode_index; - qf->actual_queues = s->hwctx->nb_decode_queues; - break; - default: - av_assert0(0); /* Should never happen */ - } - - if (!nb_queues) - qf->nb_queues = qf->actual_queues; - else - qf->nb_queues = nb_queues; - - return; -} - -void ff_vk_qf_rotate(FFVkQueueFamilyCtx *qf) -{ - qf->cur_queue = (qf->cur_queue + 1) % qf->nb_queues; -} - -static int vk_alloc_mem(AVFilterContext *avctx, VkMemoryRequirements *req, - VkMemoryPropertyFlagBits req_flags, void *alloc_extension, - VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory *mem) -{ - VkResult ret; - int index = -1; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - VkMemoryAllocateInfo alloc_info = { - .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, - .pNext = alloc_extension, - }; - - /* Align if we need to */ - if (req_flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) - req->size = FFALIGN(req->size, 
s->props.limits.minMemoryMapAlignment); - - alloc_info.allocationSize = req->size; - - /* The vulkan spec requires memory types to be sorted in the "optimal" - * order, so the first matching type we find will be the best/fastest one */ - for (int i = 0; i < s->mprops.memoryTypeCount; i++) { - /* The memory type must be supported by the requirements (bitfield) */ - if (!(req->memoryTypeBits & (1 << i))) - continue; - - /* The memory type flags must include our properties */ - if ((s->mprops.memoryTypes[i].propertyFlags & req_flags) != req_flags) - continue; - - /* Found a suitable memory type */ - index = i; - break; - } - - if (index < 0) { - av_log(avctx, AV_LOG_ERROR, "No memory type found for flags 0x%x\n", - req_flags); - return AVERROR(EINVAL); - } - - alloc_info.memoryTypeIndex = index; - - ret = vk->AllocateMemory(s->hwctx->act_dev, &alloc_info, - s->hwctx->alloc, mem); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to allocate memory: %s\n", - ff_vk_ret2str(ret)); - return AVERROR(ENOMEM); - } - - *mem_flags |= s->mprops.memoryTypes[index].propertyFlags; - - return 0; -} - -int ff_vk_create_buf(AVFilterContext *avctx, FFVkBuffer *buf, size_t size, - VkBufferUsageFlags usage, VkMemoryPropertyFlagBits flags) -{ - int err; - VkResult ret; - int use_ded_mem; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - VkBufferCreateInfo buf_spawn = { - .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO, - .pNext = NULL, - .usage = usage, - .sharingMode = VK_SHARING_MODE_EXCLUSIVE, - .size = size, /* Gets FFALIGNED during alloc if host visible - but should be ok */ - }; - - VkBufferMemoryRequirementsInfo2 req_desc = { - .sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2, - }; - VkMemoryDedicatedAllocateInfo ded_alloc = { - .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO, - .pNext = NULL, - }; - VkMemoryDedicatedRequirements ded_req = { - .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS, - }; - VkMemoryRequirements2 req = { - .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2, - .pNext = &ded_req, - }; - - ret = vk->CreateBuffer(s->hwctx->act_dev, &buf_spawn, NULL, &buf->buf); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to create buffer: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - req_desc.buffer = buf->buf; - - vk->GetBufferMemoryRequirements2(s->hwctx->act_dev, &req_desc, &req); - - /* In case the implementation prefers/requires dedicated allocation */ - use_ded_mem = ded_req.prefersDedicatedAllocation | - ded_req.requiresDedicatedAllocation; - if (use_ded_mem) - ded_alloc.buffer = buf->buf; - - err = vk_alloc_mem(avctx, &req.memoryRequirements, flags, - use_ded_mem ? 
&ded_alloc : (void *)ded_alloc.pNext, - &buf->flags, &buf->mem); - if (err) - return err; - - ret = vk->BindBufferMemory(s->hwctx->act_dev, buf->buf, buf->mem, 0); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to bind memory to buffer: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - return 0; -} - -int ff_vk_map_buffers(AVFilterContext *avctx, FFVkBuffer *buf, uint8_t *mem[], - int nb_buffers, int invalidate) -{ - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - VkMappedMemoryRange *inval_list = NULL; - int inval_count = 0; - - for (int i = 0; i < nb_buffers; i++) { - ret = vk->MapMemory(s->hwctx->act_dev, buf[i].mem, 0, - VK_WHOLE_SIZE, 0, (void **)&mem[i]); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to map buffer memory: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - } - - if (!invalidate) - return 0; - - for (int i = 0; i < nb_buffers; i++) { - const VkMappedMemoryRange ival_buf = { - .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, - .memory = buf[i].mem, - .size = VK_WHOLE_SIZE, - }; - if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) - continue; - inval_list = av_fast_realloc(s->scratch, &s->scratch_size, - (++inval_count)*sizeof(*inval_list)); - if (!inval_list) - return AVERROR(ENOMEM); - inval_list[inval_count - 1] = ival_buf; - } - - if (inval_count) { - ret = vk->InvalidateMappedMemoryRanges(s->hwctx->act_dev, inval_count, - inval_list); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to invalidate memory: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - } - - return 0; -} - -int ff_vk_unmap_buffers(AVFilterContext *avctx, FFVkBuffer *buf, int nb_buffers, - int flush) -{ - int err = 0; - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - VkMappedMemoryRange *flush_list = NULL; - int flush_count = 0; - - if (flush) { - for (int i = 0; i < nb_buffers; i++) { - const VkMappedMemoryRange flush_buf = { - .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, - .memory = buf[i].mem, - .size = VK_WHOLE_SIZE, - }; - if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) - continue; - flush_list = av_fast_realloc(s->scratch, &s->scratch_size, - (++flush_count)*sizeof(*flush_list)); - if (!flush_list) - return AVERROR(ENOMEM); - flush_list[flush_count - 1] = flush_buf; - } - } - - if (flush_count) { - ret = vk->FlushMappedMemoryRanges(s->hwctx->act_dev, flush_count, - flush_list); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to flush memory: %s\n", - ff_vk_ret2str(ret)); - err = AVERROR_EXTERNAL; /* We still want to try to unmap them */ - } - } - - for (int i = 0; i < nb_buffers; i++) - vk->UnmapMemory(s->hwctx->act_dev, buf[i].mem); - - return err; -} - -void ff_vk_free_buf(AVFilterContext *avctx, FFVkBuffer *buf) -{ - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - if (!buf) - return; - - if (buf->buf != VK_NULL_HANDLE) - vk->DestroyBuffer(s->hwctx->act_dev, buf->buf, s->hwctx->alloc); - if (buf->mem != VK_NULL_HANDLE) - vk->FreeMemory(s->hwctx->act_dev, buf->mem, s->hwctx->alloc); -} - -int ff_vk_add_push_constant(AVFilterContext *avctx, FFVulkanPipeline *pl, - int offset, int size, VkShaderStageFlagBits stage) -{ - VkPushConstantRange *pc; - - pl->push_consts = av_realloc_array(pl->push_consts, sizeof(*pl->push_consts), - pl->push_consts_num + 1); - if (!pl->push_consts) - return AVERROR(ENOMEM); - - pc = &pl->push_consts[pl->push_consts_num++]; - memset(pc, 0, 
sizeof(*pc)); - - pc->stageFlags = stage; - pc->offset = offset; - pc->size = size; - - return 0; -} - -FN_CREATING(FFVulkanContext, FFVkExecContext, exec_ctx, exec_ctx, exec_ctx_num) -int ff_vk_create_exec_ctx(AVFilterContext *avctx, FFVkExecContext **ctx, - FFVkQueueFamilyCtx *qf) -{ - VkResult ret; - FFVkExecContext *e; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - VkCommandPoolCreateInfo cqueue_create = { - .sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO, - .flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT, - .queueFamilyIndex = qf->queue_family, - }; - VkCommandBufferAllocateInfo cbuf_create = { - .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO, - .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY, - .commandBufferCount = qf->nb_queues, - }; - - e = create_exec_ctx(s); - if (!e) - return AVERROR(ENOMEM); - - e->qf = qf; - - e->queues = av_mallocz(qf->nb_queues * sizeof(*e->queues)); - if (!e->queues) - return AVERROR(ENOMEM); - - e->bufs = av_mallocz(qf->nb_queues * sizeof(*e->bufs)); - if (!e->bufs) - return AVERROR(ENOMEM); - - /* Create command pool */ - ret = vk->CreateCommandPool(s->hwctx->act_dev, &cqueue_create, - s->hwctx->alloc, &e->pool); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Command pool creation failure: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - cbuf_create.commandPool = e->pool; - - /* Allocate command buffer */ - ret = vk->AllocateCommandBuffers(s->hwctx->act_dev, &cbuf_create, e->bufs); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Command buffer alloc failure: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - for (int i = 0; i < qf->nb_queues; i++) { - FFVkQueueCtx *q = &e->queues[i]; - vk->GetDeviceQueue(s->hwctx->act_dev, qf->queue_family, - i % qf->actual_queues, &q->queue); - } - - *ctx = e; - - return 0; -} - -void ff_vk_discard_exec_deps(AVFilterContext *avctx, FFVkExecContext *e) -{ - FFVkQueueCtx *q = &e->queues[e->qf->cur_queue]; - - for (int j = 0; j < q->nb_buf_deps; j++) - av_buffer_unref(&q->buf_deps[j]); - q->nb_buf_deps = 0; - - for (int j = 0; j < q->nb_frame_deps; j++) - av_frame_free(&q->frame_deps[j]); - q->nb_frame_deps = 0; - - e->sem_wait_cnt = 0; - e->sem_sig_cnt = 0; -} - -int ff_vk_start_exec_recording(AVFilterContext *avctx, FFVkExecContext *e) -{ - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - FFVkQueueCtx *q = &e->queues[e->qf->cur_queue]; - - VkCommandBufferBeginInfo cmd_start = { - .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO, - .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, - }; - - /* Create the fence and don't wait for it initially */ - if (!q->fence) { - VkFenceCreateInfo fence_spawn = { - .sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO, - }; - ret = vk->CreateFence(s->hwctx->act_dev, &fence_spawn, s->hwctx->alloc, - &q->fence); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to queue frame fence: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - } else { - vk->WaitForFences(s->hwctx->act_dev, 1, &q->fence, VK_TRUE, UINT64_MAX); - vk->ResetFences(s->hwctx->act_dev, 1, &q->fence); - } - - /* Discard queue dependencies */ - ff_vk_discard_exec_deps(avctx, e); - - ret = vk->BeginCommandBuffer(e->bufs[e->qf->cur_queue], &cmd_start); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to start command recoding: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - return 0; -} - -VkCommandBuffer 
ff_vk_get_exec_buf(AVFilterContext *avctx, FFVkExecContext *e) -{ - return e->bufs[e->qf->cur_queue]; -} - -int ff_vk_add_exec_dep(AVFilterContext *avctx, FFVkExecContext *e, - AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag) -{ - AVFrame **dst; - AVVkFrame *f = (AVVkFrame *)frame->data[0]; - FFVkQueueCtx *q = &e->queues[e->qf->cur_queue]; - AVHWFramesContext *fc = (AVHWFramesContext *)frame->hw_frames_ctx->data; - int planes = av_pix_fmt_count_planes(fc->sw_format); - - for (int i = 0; i < planes; i++) { - e->sem_wait = av_fast_realloc(e->sem_wait, &e->sem_wait_alloc, - (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait)); - if (!e->sem_wait) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - - e->sem_wait_dst = av_fast_realloc(e->sem_wait_dst, &e->sem_wait_dst_alloc, - (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait_dst)); - if (!e->sem_wait_dst) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - - e->sem_wait_val = av_fast_realloc(e->sem_wait_val, &e->sem_wait_val_alloc, - (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait_val)); - if (!e->sem_wait_val) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - - e->sem_sig = av_fast_realloc(e->sem_sig, &e->sem_sig_alloc, - (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig)); - if (!e->sem_sig) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - - e->sem_sig_val = av_fast_realloc(e->sem_sig_val, &e->sem_sig_val_alloc, - (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig_val)); - if (!e->sem_sig_val) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - - e->sem_sig_val_dst = av_fast_realloc(e->sem_sig_val_dst, &e->sem_sig_val_dst_alloc, - (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig_val_dst)); - if (!e->sem_sig_val_dst) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - - e->sem_wait[e->sem_wait_cnt] = f->sem[i]; - e->sem_wait_dst[e->sem_wait_cnt] = in_wait_dst_flag; - e->sem_wait_val[e->sem_wait_cnt] = f->sem_value[i]; - e->sem_wait_cnt++; - - e->sem_sig[e->sem_sig_cnt] = f->sem[i]; - e->sem_sig_val[e->sem_sig_cnt] = f->sem_value[i] + 1; - e->sem_sig_val_dst[e->sem_sig_cnt] = &f->sem_value[i]; - e->sem_sig_cnt++; - } - - dst = av_fast_realloc(q->frame_deps, &q->frame_deps_alloc_size, - (q->nb_frame_deps + 1) * sizeof(*dst)); - if (!dst) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - - q->frame_deps = dst; - q->frame_deps[q->nb_frame_deps] = av_frame_clone(frame); - if (!q->frame_deps[q->nb_frame_deps]) { - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); - } - q->nb_frame_deps++; - - return 0; -} - -int ff_vk_submit_exec_queue(AVFilterContext *avctx, FFVkExecContext *e) -{ - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - FFVkQueueCtx *q = &e->queues[e->qf->cur_queue]; - - VkTimelineSemaphoreSubmitInfo s_timeline_sem_info = { - .sType = VK_STRUCTURE_TYPE_TIMELINE_SEMAPHORE_SUBMIT_INFO, - .pWaitSemaphoreValues = e->sem_wait_val, - .pSignalSemaphoreValues = e->sem_sig_val, - .waitSemaphoreValueCount = e->sem_wait_cnt, - .signalSemaphoreValueCount = e->sem_sig_cnt, - }; - - VkSubmitInfo s_info = { - .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO, - .pNext = &s_timeline_sem_info, - - .commandBufferCount = 1, - .pCommandBuffers = &e->bufs[e->qf->cur_queue], - - .pWaitSemaphores = e->sem_wait, - .pWaitDstStageMask = e->sem_wait_dst, - .waitSemaphoreCount = e->sem_wait_cnt, - - .pSignalSemaphores = e->sem_sig, - .signalSemaphoreCount = e->sem_sig_cnt, - }; - - ret = 
vk->EndCommandBuffer(e->bufs[e->qf->cur_queue]); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to finish command buffer: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - ret = vk->QueueSubmit(q->queue, 1, &s_info, q->fence); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to submit command buffer: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - for (int i = 0; i < e->sem_sig_cnt; i++) - *e->sem_sig_val_dst[i] += 1; - - return 0; -} - -int ff_vk_add_dep_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e, - AVBufferRef **deps, int nb_deps) -{ - AVBufferRef **dst; - FFVkQueueCtx *q = &e->queues[e->qf->cur_queue]; - - if (!deps || !nb_deps) - return 0; - - dst = av_fast_realloc(q->buf_deps, &q->buf_deps_alloc_size, - (q->nb_buf_deps + nb_deps) * sizeof(*dst)); - if (!dst) - goto err; - - q->buf_deps = dst; - - for (int i = 0; i < nb_deps; i++) { - q->buf_deps[q->nb_buf_deps] = deps[i]; - if (!q->buf_deps[q->nb_buf_deps]) - goto err; - q->nb_buf_deps++; - } - - return 0; - -err: - ff_vk_discard_exec_deps(avctx, e); - return AVERROR(ENOMEM); -} +#include "libavutil/vulkan.c" static int vulkan_filter_set_device(AVFilterContext *avctx, AVBufferRef *device) @@ -844,741 +193,3 @@ int ff_vk_filter_init(AVFilterContext *avctx) return 0; } - -FN_CREATING(FFVulkanContext, FFVkSampler, sampler, samplers, samplers_num) -FFVkSampler *ff_vk_init_sampler(AVFilterContext *avctx, int unnorm_coords, - VkFilter filt) -{ - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - VkSamplerCreateInfo sampler_info = { - .sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO, - .magFilter = filt, - .minFilter = sampler_info.magFilter, - .mipmapMode = unnorm_coords ? VK_SAMPLER_MIPMAP_MODE_NEAREST : - VK_SAMPLER_MIPMAP_MODE_LINEAR, - .addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE, - .addressModeV = sampler_info.addressModeU, - .addressModeW = sampler_info.addressModeU, - .anisotropyEnable = VK_FALSE, - .compareOp = VK_COMPARE_OP_NEVER, - .borderColor = VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK, - .unnormalizedCoordinates = unnorm_coords, - }; - - FFVkSampler *sctx = create_sampler(s); - if (!sctx) - return NULL; - - ret = vk->CreateSampler(s->hwctx->act_dev, &sampler_info, - s->hwctx->alloc, &sctx->sampler[0]); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to init sampler: %s\n", - ff_vk_ret2str(ret)); - return NULL; - } - - for (int i = 1; i < 4; i++) - sctx->sampler[i] = sctx->sampler[0]; - - return sctx; -} - -int ff_vk_mt_is_np_rgb(enum AVPixelFormat pix_fmt) -{ - if (pix_fmt == AV_PIX_FMT_ABGR || pix_fmt == AV_PIX_FMT_BGRA || - pix_fmt == AV_PIX_FMT_RGBA || pix_fmt == AV_PIX_FMT_RGB24 || - pix_fmt == AV_PIX_FMT_BGR24 || pix_fmt == AV_PIX_FMT_RGB48 || - pix_fmt == AV_PIX_FMT_RGBA64 || pix_fmt == AV_PIX_FMT_RGB565 || - pix_fmt == AV_PIX_FMT_BGR565 || pix_fmt == AV_PIX_FMT_BGR0 || - pix_fmt == AV_PIX_FMT_0BGR || pix_fmt == AV_PIX_FMT_RGB0) - return 1; - return 0; -} - -const char *ff_vk_shader_rep_fmt(enum AVPixelFormat pixfmt) -{ - const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pixfmt); - const int high = desc->comp[0].depth > 8; - return high ? 
"rgba16f" : "rgba8"; -} - -typedef struct ImageViewCtx { - VkImageView view; -} ImageViewCtx; - -static void destroy_imageview(void *opaque, uint8_t *data) -{ - FFVulkanContext *s = opaque; - FFVulkanFunctions *vk = &s->vkfn; - ImageViewCtx *iv = (ImageViewCtx *)data; - - vk->DestroyImageView(s->hwctx->act_dev, iv->view, s->hwctx->alloc); - av_free(iv); -} - -int ff_vk_create_imageview(AVFilterContext *avctx, FFVkExecContext *e, - VkImageView *v, VkImage img, VkFormat fmt, - const VkComponentMapping map) -{ - int err; - AVBufferRef *buf; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - VkImageViewCreateInfo imgview_spawn = { - .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO, - .pNext = NULL, - .image = img, - .viewType = VK_IMAGE_VIEW_TYPE_2D, - .format = fmt, - .components = map, - .subresourceRange = { - .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT, - .baseMipLevel = 0, - .levelCount = 1, - .baseArrayLayer = 0, - .layerCount = 1, - }, - }; - - ImageViewCtx *iv = av_mallocz(sizeof(*iv)); - - VkResult ret = vk->CreateImageView(s->hwctx->act_dev, &imgview_spawn, - s->hwctx->alloc, &iv->view); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to create imageview: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - buf = av_buffer_create((uint8_t *)iv, sizeof(*iv), destroy_imageview, s, 0); - if (!buf) { - destroy_imageview(s, (uint8_t *)iv); - return AVERROR(ENOMEM); - } - - /* Add to queue dependencies */ - err = ff_vk_add_dep_exec_ctx(avctx, e, &buf, 1); - if (err) { - av_buffer_unref(&buf); - return err; - } - - *v = iv->view; - - return 0; -} - -FN_CREATING(FFVulkanPipeline, FFSPIRVShader, shader, shaders, shaders_num) -FFSPIRVShader *ff_vk_init_shader(AVFilterContext *avctx, FFVulkanPipeline *pl, - const char *name, VkShaderStageFlags stage) -{ - FFSPIRVShader *shd = create_shader(pl); - if (!shd) - return NULL; - - av_bprint_init(&shd->src, 0, AV_BPRINT_SIZE_UNLIMITED); - - shd->shader.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO; - shd->shader.stage = stage; - - shd->name = name; - - GLSLF(0, #version %i ,460); - GLSLC(0, #define IS_WITHIN(v1, v2) ((v1.x < v2.x) && (v1.y < v2.y)) ); - GLSLC(0, ); - - return shd; -} - -void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, FFSPIRVShader *shd, - int local_size[3]) -{ - shd->local_size[0] = local_size[0]; - shd->local_size[1] = local_size[1]; - shd->local_size[2] = local_size[2]; - - av_bprintf(&shd->src, "layout (local_size_x = %i, " - "local_size_y = %i, local_size_z = %i) in;\n\n", - shd->local_size[0], shd->local_size[1], shd->local_size[2]); -} - -void ff_vk_print_shader(AVFilterContext *avctx, FFSPIRVShader *shd, int prio) -{ - int line = 0; - const char *p = shd->src.str; - const char *start = p; - - AVBPrint buf; - av_bprint_init(&buf, 0, AV_BPRINT_SIZE_UNLIMITED); - - for (int i = 0; i < strlen(p); i++) { - if (p[i] == '\n') { - av_bprintf(&buf, "%i\t", ++line); - av_bprint_append_data(&buf, start, &p[i] - start + 1); - start = &p[i + 1]; - } - } - - av_log(avctx, prio, "Shader %s: \n%s", shd->name, buf.str); - av_bprint_finalize(&buf, NULL); -} - -int ff_vk_compile_shader(AVFilterContext *avctx, FFSPIRVShader *shd, - const char *entrypoint) -{ - int err; - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - VkShaderModuleCreateInfo shader_create; - uint8_t *spirv; - size_t spirv_size; - void *priv; - - shd->shader.pName = entrypoint; - - err = ff_vk_glslang_shader_compile(avctx, shd, &spirv, &spirv_size, &priv); - if (err 
< 0) - return err; - - ff_vk_print_shader(avctx, shd, AV_LOG_VERBOSE); - - av_log(avctx, AV_LOG_VERBOSE, "Shader %s compiled! Size: %zu bytes\n", - shd->name, spirv_size); - - shader_create.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO; - shader_create.pNext = NULL; - shader_create.codeSize = spirv_size; - shader_create.flags = 0; - shader_create.pCode = (void *)spirv; - - ret = vk->CreateShaderModule(s->hwctx->act_dev, &shader_create, NULL, - &shd->shader.module); - - ff_vk_glslang_shader_free(priv); - - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to create shader module: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - return 0; -} - -static const struct descriptor_props { - size_t struct_size; /* Size of the opaque which updates the descriptor */ - const char *type; - int is_uniform; - int mem_quali; /* Can use a memory qualifier */ - int dim_needed; /* Must indicate dimension */ - int buf_content; /* Must indicate buffer contents */ -} descriptor_props[] = { - [VK_DESCRIPTOR_TYPE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 0, 0, }, - [VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE] = { sizeof(VkDescriptorImageInfo), "texture", 1, 0, 1, 0, }, - [VK_DESCRIPTOR_TYPE_STORAGE_IMAGE] = { sizeof(VkDescriptorImageInfo), "image", 1, 1, 1, 0, }, - [VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT] = { sizeof(VkDescriptorImageInfo), "subpassInput", 1, 0, 0, 0, }, - [VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER] = { sizeof(VkDescriptorImageInfo), "sampler", 1, 0, 1, 0, }, - [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, }, - [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, }, - [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), NULL, 1, 0, 0, 1, }, - [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), "buffer", 0, 1, 0, 1, }, - [VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER] = { sizeof(VkBufferView), "samplerBuffer", 1, 0, 0, 0, }, - [VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER] = { sizeof(VkBufferView), "imageBuffer", 1, 0, 0, 0, }, -}; - -int ff_vk_add_descriptor_set(AVFilterContext *avctx, FFVulkanPipeline *pl, - FFSPIRVShader *shd, FFVulkanDescriptorSetBinding *desc, - int num, int only_print_to_shader) -{ - VkResult ret; - VkDescriptorSetLayout *layout; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - if (only_print_to_shader) - goto print; - - pl->desc_layout = av_realloc_array(pl->desc_layout, sizeof(*pl->desc_layout), - pl->desc_layout_num + pl->qf->nb_queues); - if (!pl->desc_layout) - return AVERROR(ENOMEM); - - pl->desc_set_initialized = av_realloc_array(pl->desc_set_initialized, - sizeof(*pl->desc_set_initialized), - pl->descriptor_sets_num + 1); - if (!pl->desc_set_initialized) - return AVERROR(ENOMEM); - - pl->desc_set_initialized[pl->descriptor_sets_num] = 0; - layout = &pl->desc_layout[pl->desc_layout_num]; - - { /* Create descriptor set layout descriptions */ - VkDescriptorSetLayoutCreateInfo desc_create_layout = { 0 }; - VkDescriptorSetLayoutBinding *desc_binding; - - desc_binding = av_mallocz(sizeof(*desc_binding)*num); - if (!desc_binding) - return AVERROR(ENOMEM); - - for (int i = 0; i < num; i++) { - desc_binding[i].binding = i; - desc_binding[i].descriptorType = desc[i].type; - desc_binding[i].descriptorCount = FFMAX(desc[i].elems, 1); - desc_binding[i].stageFlags = desc[i].stages; - desc_binding[i].pImmutableSamplers = desc[i].sampler ? 
- desc[i].sampler->sampler : - NULL; - } - - desc_create_layout.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; - desc_create_layout.pBindings = desc_binding; - desc_create_layout.bindingCount = num; - - for (int i = 0; i < pl->qf->nb_queues; i++) { - ret = vk->CreateDescriptorSetLayout(s->hwctx->act_dev, &desc_create_layout, - s->hwctx->alloc, &layout[i]); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set " - "layout: %s\n", ff_vk_ret2str(ret)); - av_free(desc_binding); - return AVERROR_EXTERNAL; - } - } - - av_free(desc_binding); - } - - { /* Pool each descriptor by type and update pool counts */ - for (int i = 0; i < num; i++) { - int j; - for (j = 0; j < pl->pool_size_desc_num; j++) - if (pl->pool_size_desc[j].type == desc[i].type) - break; - if (j >= pl->pool_size_desc_num) { - pl->pool_size_desc = av_realloc_array(pl->pool_size_desc, - sizeof(*pl->pool_size_desc), - ++pl->pool_size_desc_num); - if (!pl->pool_size_desc) - return AVERROR(ENOMEM); - memset(&pl->pool_size_desc[j], 0, sizeof(VkDescriptorPoolSize)); - } - pl->pool_size_desc[j].type = desc[i].type; - pl->pool_size_desc[j].descriptorCount += FFMAX(desc[i].elems, 1)*pl->qf->nb_queues; - } - } - - { /* Create template creation struct */ - VkDescriptorUpdateTemplateCreateInfo *dt; - VkDescriptorUpdateTemplateEntry *des_entries; - - /* Freed after descriptor set initialization */ - des_entries = av_mallocz(num*sizeof(VkDescriptorUpdateTemplateEntry)); - if (!des_entries) - return AVERROR(ENOMEM); - - for (int i = 0; i < num; i++) { - des_entries[i].dstBinding = i; - des_entries[i].descriptorType = desc[i].type; - des_entries[i].descriptorCount = FFMAX(desc[i].elems, 1); - des_entries[i].dstArrayElement = 0; - des_entries[i].offset = ((uint8_t *)desc[i].updater) - (uint8_t *)s; - des_entries[i].stride = descriptor_props[desc[i].type].struct_size; - } - - pl->desc_template_info = av_realloc_array(pl->desc_template_info, - sizeof(*pl->desc_template_info), - pl->total_descriptor_sets + pl->qf->nb_queues); - if (!pl->desc_template_info) - return AVERROR(ENOMEM); - - dt = &pl->desc_template_info[pl->total_descriptor_sets]; - memset(dt, 0, sizeof(*dt)*pl->qf->nb_queues); - - for (int i = 0; i < pl->qf->nb_queues; i++) { - dt[i].sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO; - dt[i].templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET; - dt[i].descriptorSetLayout = layout[i]; - dt[i].pDescriptorUpdateEntries = des_entries; - dt[i].descriptorUpdateEntryCount = num; - } - } - - pl->descriptor_sets_num++; - - pl->desc_layout_num += pl->qf->nb_queues; - pl->total_descriptor_sets += pl->qf->nb_queues; - -print: - /* Write shader info */ - for (int i = 0; i < num; i++) { - const struct descriptor_props *prop = &descriptor_props[desc[i].type]; - GLSLA("layout (set = %i, binding = %i", pl->descriptor_sets_num - 1, i); - - if (desc[i].mem_layout) - GLSLA(", %s", desc[i].mem_layout); - GLSLA(")"); - - if (prop->is_uniform) - GLSLA(" uniform"); - - if (prop->mem_quali && desc[i].mem_quali) - GLSLA(" %s", desc[i].mem_quali); - - if (prop->type) - GLSLA(" %s", prop->type); - - if (prop->dim_needed) - GLSLA("%iD", desc[i].dimensions); - - GLSLA(" %s", desc[i].name); - - if (prop->buf_content) - GLSLA(" {\n %s\n}", desc[i].buf_content); - else if (desc[i].elems > 0) - GLSLA("[%i]", desc[i].elems); - - GLSLA(";\n"); - } - GLSLA("\n"); - - return 0; -} - -void ff_vk_update_descriptor_set(AVFilterContext *avctx, FFVulkanPipeline *pl, - int set_id) -{ - FFVulkanContext *s = 
avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - /* If a set has never been updated, update all queues' sets. */ - if (!pl->desc_set_initialized[set_id]) { - for (int i = 0; i < pl->qf->nb_queues; i++) { - int idx = set_id*pl->qf->nb_queues + i; - vk->UpdateDescriptorSetWithTemplate(s->hwctx->act_dev, - pl->desc_set[idx], - pl->desc_template[idx], - s); - } - pl->desc_set_initialized[set_id] = 1; - return; - } - - set_id = set_id*pl->qf->nb_queues + pl->qf->cur_queue; - - vk->UpdateDescriptorSetWithTemplate(s->hwctx->act_dev, - pl->desc_set[set_id], - pl->desc_template[set_id], - s); -} - -void ff_vk_update_push_exec(AVFilterContext *avctx, FFVkExecContext *e, - VkShaderStageFlagBits stage, int offset, - size_t size, void *src) -{ - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - vk->CmdPushConstants(e->bufs[e->qf->cur_queue], e->bound_pl->pipeline_layout, - stage, offset, size, src); -} - -int ff_vk_init_pipeline_layout(AVFilterContext *avctx, FFVulkanPipeline *pl) -{ - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - pl->desc_staging = av_malloc(pl->descriptor_sets_num*sizeof(*pl->desc_staging)); - if (!pl->desc_staging) - return AVERROR(ENOMEM); - - { /* Init descriptor set pool */ - VkDescriptorPoolCreateInfo pool_create_info = { - .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO, - .poolSizeCount = pl->pool_size_desc_num, - .pPoolSizes = pl->pool_size_desc, - .maxSets = pl->total_descriptor_sets, - }; - - ret = vk->CreateDescriptorPool(s->hwctx->act_dev, &pool_create_info, - s->hwctx->alloc, &pl->desc_pool); - av_freep(&pl->pool_size_desc); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor set " - "pool: %s\n", ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - } - - { /* Allocate descriptor sets */ - VkDescriptorSetAllocateInfo alloc_info = { - .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO, - .descriptorPool = pl->desc_pool, - .descriptorSetCount = pl->total_descriptor_sets, - .pSetLayouts = pl->desc_layout, - }; - - pl->desc_set = av_malloc(pl->total_descriptor_sets*sizeof(*pl->desc_set)); - if (!pl->desc_set) - return AVERROR(ENOMEM); - - ret = vk->AllocateDescriptorSets(s->hwctx->act_dev, &alloc_info, - pl->desc_set); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to allocate descriptor set: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - } - - { /* Finally create the pipeline layout */ - VkPipelineLayoutCreateInfo spawn_pipeline_layout = { - .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO, - .pSetLayouts = (VkDescriptorSetLayout *)pl->desc_staging, - .pushConstantRangeCount = pl->push_consts_num, - .pPushConstantRanges = pl->push_consts, - }; - - for (int i = 0; i < pl->total_descriptor_sets; i += pl->qf->nb_queues) - pl->desc_staging[spawn_pipeline_layout.setLayoutCount++] = pl->desc_layout[i]; - - ret = vk->CreatePipelineLayout(s->hwctx->act_dev, &spawn_pipeline_layout, - s->hwctx->alloc, &pl->pipeline_layout); - av_freep(&pl->push_consts); - pl->push_consts_num = 0; - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to init pipeline layout: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - } - - { /* Descriptor template (for tightly packed descriptors) */ - VkDescriptorUpdateTemplateCreateInfo *dt; - - pl->desc_template = av_malloc(pl->total_descriptor_sets*sizeof(*pl->desc_template)); - if (!pl->desc_template) - return AVERROR(ENOMEM); - - /* Create update templates for the descriptor 
sets */ - for (int i = 0; i < pl->total_descriptor_sets; i++) { - dt = &pl->desc_template_info[i]; - dt->pipelineLayout = pl->pipeline_layout; - ret = vk->CreateDescriptorUpdateTemplate(s->hwctx->act_dev, - dt, s->hwctx->alloc, - &pl->desc_template[i]); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to init descriptor " - "template: %s\n", ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - } - - /* Free the duplicated memory used for the template entries */ - for (int i = 0; i < pl->total_descriptor_sets; i += pl->qf->nb_queues) { - dt = &pl->desc_template_info[i]; - av_free((void *)dt->pDescriptorUpdateEntries); - } - - av_freep(&pl->desc_template_info); - } - - return 0; -} - -FN_CREATING(FFVulkanContext, FFVulkanPipeline, pipeline, pipelines, pipelines_num) -FFVulkanPipeline *ff_vk_create_pipeline(AVFilterContext *avctx, - FFVkQueueFamilyCtx *qf) -{ - FFVulkanPipeline *pl = create_pipeline(avctx->priv); - if (pl) - pl->qf = qf; - - return pl; -} - -int ff_vk_init_compute_pipeline(AVFilterContext *avctx, FFVulkanPipeline *pl) -{ - int i; - VkResult ret; - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - VkComputePipelineCreateInfo pipe = { - .sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO, - .layout = pl->pipeline_layout, - }; - - for (i = 0; i < pl->shaders_num; i++) { - if (pl->shaders[i]->shader.stage & VK_SHADER_STAGE_COMPUTE_BIT) { - pipe.stage = pl->shaders[i]->shader; - break; - } - } - if (i == pl->shaders_num) { - av_log(avctx, AV_LOG_ERROR, "Can't init compute pipeline, no shader\n"); - return AVERROR(EINVAL); - } - - ret = vk->CreateComputePipelines(s->hwctx->act_dev, VK_NULL_HANDLE, 1, &pipe, - s->hwctx->alloc, &pl->pipeline); - if (ret != VK_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Unable to init compute pipeline: %s\n", - ff_vk_ret2str(ret)); - return AVERROR_EXTERNAL; - } - - pl->bind_point = VK_PIPELINE_BIND_POINT_COMPUTE; - - return 0; -} - -void ff_vk_bind_pipeline_exec(AVFilterContext *avctx, FFVkExecContext *e, - FFVulkanPipeline *pl) -{ - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - vk->CmdBindPipeline(e->bufs[e->qf->cur_queue], pl->bind_point, pl->pipeline); - - for (int i = 0; i < pl->descriptor_sets_num; i++) - pl->desc_staging[i] = pl->desc_set[i*pl->qf->nb_queues + pl->qf->cur_queue]; - - vk->CmdBindDescriptorSets(e->bufs[e->qf->cur_queue], pl->bind_point, - pl->pipeline_layout, 0, - pl->descriptor_sets_num, - (VkDescriptorSet *)pl->desc_staging, - 0, NULL); - - e->bound_pl = pl; -} - -static void free_exec_ctx(FFVulkanContext *s, FFVkExecContext *e) -{ - FFVulkanFunctions *vk = &s->vkfn; - - /* Make sure all queues have finished executing */ - for (int i = 0; i < e->qf->nb_queues; i++) { - FFVkQueueCtx *q = &e->queues[i]; - - if (q->fence) { - vk->WaitForFences(s->hwctx->act_dev, 1, &q->fence, VK_TRUE, UINT64_MAX); - vk->ResetFences(s->hwctx->act_dev, 1, &q->fence); - } - - /* Free the fence */ - if (q->fence) - vk->DestroyFence(s->hwctx->act_dev, q->fence, s->hwctx->alloc); - - /* Free buffer dependencies */ - for (int j = 0; j < q->nb_buf_deps; j++) - av_buffer_unref(&q->buf_deps[j]); - av_free(q->buf_deps); - - /* Free frame dependencies */ - for (int j = 0; j < q->nb_frame_deps; j++) - av_frame_free(&q->frame_deps[j]); - av_free(q->frame_deps); - } - - if (e->bufs) - vk->FreeCommandBuffers(s->hwctx->act_dev, e->pool, e->qf->nb_queues, e->bufs); - if (e->pool) - vk->DestroyCommandPool(s->hwctx->act_dev, e->pool, s->hwctx->alloc); - - av_freep(&e->bufs); - 
av_freep(&e->queues); - av_freep(&e->sem_sig); - av_freep(&e->sem_sig_val); - av_freep(&e->sem_sig_val_dst); - av_freep(&e->sem_wait); - av_freep(&e->sem_wait_dst); - av_freep(&e->sem_wait_val); - av_free(e); -} - -static void free_pipeline(FFVulkanContext *s, FFVulkanPipeline *pl) -{ - FFVulkanFunctions *vk = &s->vkfn; - - for (int i = 0; i < pl->shaders_num; i++) { - FFSPIRVShader *shd = pl->shaders[i]; - av_bprint_finalize(&shd->src, NULL); - vk->DestroyShaderModule(s->hwctx->act_dev, shd->shader.module, - s->hwctx->alloc); - av_free(shd); - } - - vk->DestroyPipeline(s->hwctx->act_dev, pl->pipeline, s->hwctx->alloc); - vk->DestroyPipelineLayout(s->hwctx->act_dev, pl->pipeline_layout, - s->hwctx->alloc); - - for (int i = 0; i < pl->desc_layout_num; i++) { - if (pl->desc_template && pl->desc_template[i]) - vk->DestroyDescriptorUpdateTemplate(s->hwctx->act_dev, pl->desc_template[i], - s->hwctx->alloc); - if (pl->desc_layout && pl->desc_layout[i]) - vk->DestroyDescriptorSetLayout(s->hwctx->act_dev, pl->desc_layout[i], - s->hwctx->alloc); - } - - /* Also frees the descriptor sets */ - if (pl->desc_pool) - vk->DestroyDescriptorPool(s->hwctx->act_dev, pl->desc_pool, - s->hwctx->alloc); - - av_freep(&pl->desc_staging); - av_freep(&pl->desc_set); - av_freep(&pl->shaders); - av_freep(&pl->desc_layout); - av_freep(&pl->desc_template); - av_freep(&pl->desc_set_initialized); - av_freep(&pl->push_consts); - pl->push_consts_num = 0; - - /* Only freed in case of failure */ - av_freep(&pl->pool_size_desc); - if (pl->desc_template_info) { - for (int i = 0; i < pl->total_descriptor_sets; i += pl->qf->nb_queues) { - VkDescriptorUpdateTemplateCreateInfo *dt = &pl->desc_template_info[i]; - av_free((void *)dt->pDescriptorUpdateEntries); - } - av_freep(&pl->desc_template_info); - } - - av_free(pl); -} - -void ff_vk_filter_uninit(AVFilterContext *avctx) -{ - FFVulkanContext *s = avctx->priv; - FFVulkanFunctions *vk = &s->vkfn; - - ff_vk_glslang_uninit(); - - for (int i = 0; i < s->exec_ctx_num; i++) - free_exec_ctx(s, s->exec_ctx[i]); - av_freep(&s->exec_ctx); - - for (int i = 0; i < s->samplers_num; i++) { - vk->DestroySampler(s->hwctx->act_dev, s->samplers[i]->sampler[0], - s->hwctx->alloc); - av_free(s->samplers[i]); - } - av_freep(&s->samplers); - - for (int i = 0; i < s->pipelines_num; i++) - free_pipeline(s, s->pipelines[i]); - av_freep(&s->pipelines); - - av_freep(&s->scratch); - s->scratch_size = 0; - - av_buffer_unref(&s->device_ref); - av_buffer_unref(&s->frames_ref); -} diff --git a/libavfilter/vulkan.h b/libavfilter/vulkan.h index df0e830af8..39c139cafa 100644 --- a/libavfilter/vulkan.h +++ b/libavfilter/vulkan.h @@ -19,197 +19,8 @@ #ifndef AVFILTER_VULKAN_H #define AVFILTER_VULKAN_H -#define VK_NO_PROTOTYPES -#define VK_ENABLE_BETA_EXTENSIONS - #include "avfilter.h" -#include "libavutil/pixdesc.h" -#include "libavutil/bprint.h" -#include "libavutil/hwcontext.h" -#include "libavutil/hwcontext_vulkan.h" -#include "libavutil/vulkan_functions.h" - -/* GLSL management macros */ -#define INDENT(N) INDENT_##N -#define INDENT_0 -#define INDENT_1 INDENT_0 " " -#define INDENT_2 INDENT_1 INDENT_1 -#define INDENT_3 INDENT_2 INDENT_1 -#define INDENT_4 INDENT_3 INDENT_1 -#define INDENT_5 INDENT_4 INDENT_1 -#define INDENT_6 INDENT_5 INDENT_1 -#define C(N, S) INDENT(N) #S "\n" -#define GLSLC(N, S) av_bprintf(&shd->src, C(N, S)) -#define GLSLA(...) av_bprintf(&shd->src, __VA_ARGS__) -#define GLSLF(N, S, ...) 
av_bprintf(&shd->src, C(N, S), __VA_ARGS__) -#define GLSLD(D) GLSLC(0, ); \ - av_bprint_append_data(&shd->src, D, strlen(D)); \ - GLSLC(0, ) - -/* Helper, pretty much every Vulkan return value needs to be checked */ -#define RET(x) \ - do { \ - if ((err = (x)) < 0) \ - goto fail; \ - } while (0) - -typedef struct FFSPIRVShader { - const char *name; /* Name for id/debugging purposes */ - AVBPrint src; - int local_size[3]; /* Compute shader workgroup sizes */ - VkPipelineShaderStageCreateInfo shader; -} FFSPIRVShader; - -typedef struct FFVkSampler { - VkSampler sampler[4]; -} FFVkSampler; - -typedef struct FFVulkanDescriptorSetBinding { - const char *name; - VkDescriptorType type; - const char *mem_layout; /* Storage images (rgba8, etc.) and buffers (std430, etc.) */ - const char *mem_quali; /* readonly, writeonly, etc. */ - const char *buf_content; /* For buffers */ - uint32_t dimensions; /* Needed for e.g. sampler%iD */ - uint32_t elems; /* 0 - scalar, 1 or more - vector */ - VkShaderStageFlags stages; - FFVkSampler *sampler; /* Sampler to use for all elems */ - void *updater; /* Pointer to VkDescriptor*Info */ -} FFVulkanDescriptorSetBinding; - -typedef struct FFVkBuffer { - VkBuffer buf; - VkDeviceMemory mem; - VkMemoryPropertyFlagBits flags; -} FFVkBuffer; - -typedef struct FFVkQueueFamilyCtx { - int queue_family; - int nb_queues; - int cur_queue; - int actual_queues; -} FFVkQueueFamilyCtx; - -typedef struct FFVulkanPipeline { - FFVkQueueFamilyCtx *qf; - - VkPipelineBindPoint bind_point; - - /* Contexts */ - VkPipelineLayout pipeline_layout; - VkPipeline pipeline; - - /* Shaders */ - FFSPIRVShader **shaders; - int shaders_num; - - /* Push consts */ - VkPushConstantRange *push_consts; - int push_consts_num; - - /* Descriptors */ - VkDescriptorSetLayout *desc_layout; - VkDescriptorPool desc_pool; - VkDescriptorSet *desc_set; - void **desc_staging; - VkDescriptorSetLayoutBinding **desc_binding; - VkDescriptorUpdateTemplate *desc_template; - int *desc_set_initialized; - int desc_layout_num; - int descriptor_sets_num; - int total_descriptor_sets; - int pool_size_desc_num; - - /* Temporary, used to store data in between initialization stages */ - VkDescriptorUpdateTemplateCreateInfo *desc_template_info; - VkDescriptorPoolSize *pool_size_desc; -} FFVulkanPipeline; - -typedef struct FFVkQueueCtx { - VkFence fence; - VkQueue queue; - - /* Buffer dependencies */ - AVBufferRef **buf_deps; - int nb_buf_deps; - int buf_deps_alloc_size; - - /* Frame dependencies */ - AVFrame **frame_deps; - int nb_frame_deps; - int frame_deps_alloc_size; -} FFVkQueueCtx; - -typedef struct FFVkExecContext { - FFVkQueueFamilyCtx *qf; - - VkCommandPool pool; - VkCommandBuffer *bufs; - FFVkQueueCtx *queues; - - AVBufferRef ***deps; - int *nb_deps; - int *dep_alloc_size; - - FFVulkanPipeline *bound_pl; - - VkSemaphore *sem_wait; - int sem_wait_alloc; /* Allocated sem_wait */ - int sem_wait_cnt; - - uint64_t *sem_wait_val; - int sem_wait_val_alloc; - - VkPipelineStageFlagBits *sem_wait_dst; - int sem_wait_dst_alloc; /* Allocated sem_wait_dst */ - - VkSemaphore *sem_sig; - int sem_sig_alloc; /* Allocated sem_sig */ - int sem_sig_cnt; - - uint64_t *sem_sig_val; - int sem_sig_val_alloc; - - uint64_t **sem_sig_val_dst; - int sem_sig_val_dst_alloc; -} FFVkExecContext; - -typedef struct FFVulkanContext { - const AVClass *class; - FFVulkanFunctions vkfn; - FFVulkanExtensions extensions; - VkPhysicalDeviceProperties props; - VkPhysicalDeviceMemoryProperties mprops; - - AVBufferRef *device_ref; - AVBufferRef *frames_ref; /* For 
in-place filtering */ - AVHWDeviceContext *device; - AVVulkanDeviceContext *hwctx; - - /* Properties */ - int output_width; - int output_height; - enum AVPixelFormat output_format; - enum AVPixelFormat input_format; - - /* Samplers */ - FFVkSampler **samplers; - int samplers_num; - - /* Exec contexts */ - FFVkExecContext **exec_ctx; - int exec_ctx_num; - - /* Pipelines (each can have 1 shader of each type) */ - FFVulkanPipeline **pipelines; - int pipelines_num; - - void *scratch; /* Scratch memory used only in functions */ - unsigned int scratch_size; -} FFVulkanContext; - -/* Identity mapping - r = r, b = b, g = g, a = a */ -extern const VkComponentMapping ff_comp_identity_map; +#include "libavutil/vulkan.h" /** * General lavfi IO functions @@ -218,194 +29,5 @@ int ff_vk_filter_init (AVFilterContext *avctx); int ff_vk_filter_config_input (AVFilterLink *inlink); int ff_vk_filter_config_output (AVFilterLink *outlink); int ff_vk_filter_config_output_inplace(AVFilterLink *outlink); -void ff_vk_filter_uninit (AVFilterContext *avctx); - -/** - * Converts Vulkan return values to strings - */ -const char *ff_vk_ret2str(VkResult res); - -/** - * Returns 1 if the image is any sort of supported RGB - */ -int ff_vk_mt_is_np_rgb(enum AVPixelFormat pix_fmt); - -/** - * Gets the glsl format string for a pixel format - */ -const char *ff_vk_shader_rep_fmt(enum AVPixelFormat pixfmt); - -/** - * Initialize a queue family with a specific number of queues. - * If nb_queues == 0, use however many queues the queue family has. - */ -void ff_vk_qf_init(AVFilterContext *avctx, FFVkQueueFamilyCtx *qf, - VkQueueFlagBits dev_family, int nb_queues); - -/** - * Rotate through the queues in a queue family. - */ -void ff_vk_qf_rotate(FFVkQueueFamilyCtx *qf); - -/** - * Create a Vulkan sampler, will be auto-freed in ff_vk_filter_uninit() - */ -FFVkSampler *ff_vk_init_sampler(AVFilterContext *avctx, int unnorm_coords, - VkFilter filt); - -/** - * Create an imageview. - * Guaranteed to remain alive until the queue submission has finished executing, - * and will be destroyed after that. - */ -int ff_vk_create_imageview(AVFilterContext *avctx, FFVkExecContext *e, - VkImageView *v, VkImage img, VkFormat fmt, - const VkComponentMapping map); - -/** - * Define a push constant for a given stage into a pipeline. - * Must be called before the pipeline layout has been initialized. - */ -int ff_vk_add_push_constant(AVFilterContext *avctx, FFVulkanPipeline *pl, - int offset, int size, VkShaderStageFlagBits stage); - -/** - * Inits a pipeline. Everything in it will be auto-freed when calling - * ff_vk_filter_uninit(). - */ -FFVulkanPipeline *ff_vk_create_pipeline(AVFilterContext *avctx, - FFVkQueueFamilyCtx *qf); - -/** - * Inits a shader for a specific pipeline. Will be auto-freed on uninit. - */ -FFSPIRVShader *ff_vk_init_shader(AVFilterContext *avctx, FFVulkanPipeline *pl, - const char *name, VkShaderStageFlags stage); - -/** - * Writes the workgroup size for a shader. - */ -void ff_vk_set_compute_shader_sizes(AVFilterContext *avctx, FFSPIRVShader *shd, - int local_size[3]); - -/** - * Adds a descriptor set to the shader and registers them in the pipeline. - */ -int ff_vk_add_descriptor_set(AVFilterContext *avctx, FFVulkanPipeline *pl, - FFSPIRVShader *shd, FFVulkanDescriptorSetBinding *desc, - int num, int only_print_to_shader); - -/** - * Compiles the shader, entrypoint must be set to "main". 
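For reference, a rough sketch of the call order these shader helpers imply,
using the lavfi-era signatures being removed here (identifiers such as
"main_shader" and the workgroup size below are illustrative, not from the
patch):

    FFVkQueueFamilyCtx qf;
    FFVulkanPipeline *pl;
    FFSPIRVShader *shd;
    int wg_size[3] = { 16, 16, 1 }; /* hypothetical workgroup size */
    int err;

    ff_vk_qf_init(avctx, &qf, VK_QUEUE_COMPUTE_BIT, 0);

    pl = ff_vk_create_pipeline(avctx, &qf);
    if (!pl)
        return AVERROR(ENOMEM);

    shd = ff_vk_init_shader(avctx, pl, "main_shader",
                            VK_SHADER_STAGE_COMPUTE_BIT);
    if (!shd)
        return AVERROR(ENOMEM);

    ff_vk_set_compute_shader_sizes(avctx, shd, wg_size);

    /* GLSLC()/GLSLF() calls and ff_vk_add_descriptor_set() would fill in
     * the shader body and its bindings here */

    if ((err = ff_vk_compile_shader(avctx, shd, "main")) < 0)
        return err;
    if ((err = ff_vk_init_pipeline_layout(avctx, pl)) < 0)
        return err;
    if ((err = ff_vk_init_compute_pipeline(avctx, pl)) < 0)
        return err;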
- */ -int ff_vk_compile_shader(AVFilterContext *avctx, FFSPIRVShader *shd, - const char *entrypoint); - -/** - * Pretty print shader, mainly used by shader compilers. - */ -void ff_vk_print_shader(AVFilterContext *avctx, FFSPIRVShader *shd, int prio); - -/** - * Initializes the pipeline layout after all shaders and descriptor sets have - * been finished. - */ -int ff_vk_init_pipeline_layout(AVFilterContext *avctx, FFVulkanPipeline *pl); - -/** - * Initializes a compute pipeline. Will pick the first shader with the - * COMPUTE flag set. - */ -int ff_vk_init_compute_pipeline(AVFilterContext *avctx, FFVulkanPipeline *pl); - -/** - * Updates a descriptor set via the updaters defined. - * Can be called immediately after pipeline creation, but must be called - * at least once before queue submission. - */ -void ff_vk_update_descriptor_set(AVFilterContext *avctx, FFVulkanPipeline *pl, - int set_id); - -/** - * Init an execution context for command recording and queue submission. - * WIll be auto-freed on uninit. - */ -int ff_vk_create_exec_ctx(AVFilterContext *avctx, FFVkExecContext **ctx, - FFVkQueueFamilyCtx *qf); - -/** - * Begin recording to the command buffer. Previous execution must have been - * completed, which ff_vk_submit_exec_queue() will ensure. - */ -int ff_vk_start_exec_recording(AVFilterContext *avctx, FFVkExecContext *e); - -/** - * Add a command to bind the completed pipeline and its descriptor sets. - * Must be called after ff_vk_start_exec_recording() and before submission. - */ -void ff_vk_bind_pipeline_exec(AVFilterContext *avctx, FFVkExecContext *e, - FFVulkanPipeline *pl); - -/** - * Updates push constants. - * Must be called after binding a pipeline if any push constants were defined. - */ -void ff_vk_update_push_exec(AVFilterContext *avctx, FFVkExecContext *e, - VkShaderStageFlagBits stage, int offset, - size_t size, void *src); - -/** - * Gets the command buffer to use for this submission from the exe context. - */ -VkCommandBuffer ff_vk_get_exec_buf(AVFilterContext *avctx, FFVkExecContext *e); - -/** - * Adds a generic AVBufferRef as a queue depenency. - */ -int ff_vk_add_dep_exec_ctx(AVFilterContext *avctx, FFVkExecContext *e, - AVBufferRef **deps, int nb_deps); - -/** - * Discards all queue dependencies - */ -void ff_vk_discard_exec_deps(AVFilterContext *avctx, FFVkExecContext *e); - -/** - * Adds a frame as a queue dependency. This also manages semaphore signalling. - * Must be called before submission. - */ -int ff_vk_add_exec_dep(AVFilterContext *avctx, FFVkExecContext *e, - AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag); - -/** - * Submits a command buffer to the queue for execution. - * Will block until execution has finished in order to simplify resource - * management. - */ -int ff_vk_submit_exec_queue(AVFilterContext *avctx, FFVkExecContext *e); - -/** - * Create a VkBuffer with the specified parameters. - */ -int ff_vk_create_buf(AVFilterContext *avctx, FFVkBuffer *buf, size_t size, - VkBufferUsageFlags usage, VkMemoryPropertyFlagBits flags); - -/** - * Maps the buffer to userspace. Set invalidate to 1 if reading the contents - * is necessary. - */ -int ff_vk_map_buffers(AVFilterContext *avctx, FFVkBuffer *buf, uint8_t *mem[], - int nb_buffers, int invalidate); - -/** - * Unmaps the buffer from userspace. Set flush to 1 to write and sync. - */ -int ff_vk_unmap_buffers(AVFilterContext *avctx, FFVkBuffer *buf, int nb_buffers, - int flush); - -/** - * Frees a buffer. 
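A rough sketch of the buffer lifecycle these buffer helpers describe
(the size, usage and memory flags are illustrative only):

    FFVkBuffer buf;
    uint8_t *mem[1];
    int err;

    err = ff_vk_create_buf(avctx, &buf, 4096,
                           VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
                           VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT);
    if (err < 0)
        return err;

    /* invalidate = 0: we only intend to write to the mapping */
    err = ff_vk_map_buffers(avctx, &buf, mem, 1, 0);
    if (err < 0)
        return err;

    memset(mem[0], 0, 4096);

    /* flush = 1 so writes to non-coherent memory reach the device */
    err = ff_vk_unmap_buffers(avctx, &buf, 1, 1);
    if (err < 0)
        return err;

    ff_vk_free_buf(avctx, &buf);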
- */ -void ff_vk_free_buf(AVFilterContext *avctx, FFVkBuffer *buf); #endif /* AVFILTER_VULKAN_H */ diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c index b218b6c411..1e352d5ef7 100644 --- a/libavutil/hwcontext_vulkan.c +++ b/libavutil/hwcontext_vulkan.c @@ -38,6 +38,7 @@ #include "hwcontext_internal.h" #include "hwcontext_vulkan.h" +#include "vulkan.h" #include "vulkan_loader.h" #if CONFIG_LIBDRM @@ -131,11 +132,6 @@ typedef struct AVVkFrameInternal { #endif } AVVkFrameInternal; -#define DEFAULT_USAGE_FLAGS (VK_IMAGE_USAGE_SAMPLED_BIT | \ - VK_IMAGE_USAGE_STORAGE_BIT | \ - VK_IMAGE_USAGE_TRANSFER_SRC_BIT | \ - VK_IMAGE_USAGE_TRANSFER_DST_BIT) - #define ADD_VAL_TO_LIST(list, count, val) \ do { \ list = av_realloc_array(list, sizeof(*list), ++count); \ @@ -251,7 +247,7 @@ static int pixfmt_is_supported(AVHWDeviceContext *dev_ctx, enum AVPixelFormat p, vk->GetPhysicalDeviceFormatProperties2(hwctx->phys_dev, fmt[i], &prop); flags = linear ? prop.formatProperties.linearTilingFeatures : prop.formatProperties.optimalTilingFeatures; - if (!(flags & DEFAULT_USAGE_FLAGS)) + if (!(flags & FF_VK_DEFAULT_USAGE_FLAGS)) return 0; } @@ -2041,7 +2037,7 @@ static int vulkan_frames_init(AVHWFramesContext *hwfc) VK_IMAGE_TILING_LINEAR : VK_IMAGE_TILING_OPTIMAL; if (!hwctx->usage) - hwctx->usage = DEFAULT_USAGE_FLAGS; + hwctx->usage = FF_VK_DEFAULT_USAGE_FLAGS; err = create_exec_ctx(hwfc, &fp->conv_ctx, dev_hwctx->queue_family_comp_index, diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c new file mode 100644 index 0000000000..745e366713 --- /dev/null +++ b/libavutil/vulkan.c @@ -0,0 +1,1399 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "vulkan.h" +#include "vulkan_glslang.h" + +#include "avassert.h" +#include "vulkan_loader.h" + +/* Generic macro for creating contexts which need to keep their addresses + * if another context is created. 
*/ +#define FN_CREATING(ctx, type, shortname, array, num) \ +static av_always_inline type *create_ ##shortname(ctx *dctx) \ +{ \ + type **array, *sctx = av_mallocz(sizeof(*sctx)); \ + if (!sctx) \ + return NULL; \ + \ + array = av_realloc_array(dctx->array, sizeof(*dctx->array), dctx->num + 1);\ + if (!array) { \ + av_free(sctx); \ + return NULL; \ + } \ + \ + dctx->array = array; \ + dctx->array[dctx->num++] = sctx; \ + \ + return sctx; \ +} + +const VkComponentMapping ff_comp_identity_map = { + .r = VK_COMPONENT_SWIZZLE_IDENTITY, + .g = VK_COMPONENT_SWIZZLE_IDENTITY, + .b = VK_COMPONENT_SWIZZLE_IDENTITY, + .a = VK_COMPONENT_SWIZZLE_IDENTITY, +}; + +/* Converts return values to strings */ +const char *ff_vk_ret2str(VkResult res) +{ +#define CASE(VAL) case VAL: return #VAL + switch (res) { + CASE(VK_SUCCESS); + CASE(VK_NOT_READY); + CASE(VK_TIMEOUT); + CASE(VK_EVENT_SET); + CASE(VK_EVENT_RESET); + CASE(VK_INCOMPLETE); + CASE(VK_ERROR_OUT_OF_HOST_MEMORY); + CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY); + CASE(VK_ERROR_INITIALIZATION_FAILED); + CASE(VK_ERROR_DEVICE_LOST); + CASE(VK_ERROR_MEMORY_MAP_FAILED); + CASE(VK_ERROR_LAYER_NOT_PRESENT); + CASE(VK_ERROR_EXTENSION_NOT_PRESENT); + CASE(VK_ERROR_FEATURE_NOT_PRESENT); + CASE(VK_ERROR_INCOMPATIBLE_DRIVER); + CASE(VK_ERROR_TOO_MANY_OBJECTS); + CASE(VK_ERROR_FORMAT_NOT_SUPPORTED); + CASE(VK_ERROR_FRAGMENTED_POOL); + CASE(VK_ERROR_SURFACE_LOST_KHR); + CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR); + CASE(VK_SUBOPTIMAL_KHR); + CASE(VK_ERROR_OUT_OF_DATE_KHR); + CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR); + CASE(VK_ERROR_VALIDATION_FAILED_EXT); + CASE(VK_ERROR_INVALID_SHADER_NV); + CASE(VK_ERROR_OUT_OF_POOL_MEMORY); + CASE(VK_ERROR_INVALID_EXTERNAL_HANDLE); + CASE(VK_ERROR_NOT_PERMITTED_EXT); + default: return "Unknown error"; + } +#undef CASE +} + +void ff_vk_qf_init(FFVulkanContext *s, FFVkQueueFamilyCtx *qf, + VkQueueFlagBits dev_family, int nb_queues) +{ + switch (dev_family) { + case VK_QUEUE_GRAPHICS_BIT: + qf->queue_family = s->hwctx->queue_family_index; + qf->actual_queues = s->hwctx->nb_graphics_queues; + break; + case VK_QUEUE_COMPUTE_BIT: + qf->queue_family = s->hwctx->queue_family_comp_index; + qf->actual_queues = s->hwctx->nb_comp_queues; + break; + case VK_QUEUE_TRANSFER_BIT: + qf->queue_family = s->hwctx->queue_family_tx_index; + qf->actual_queues = s->hwctx->nb_tx_queues; + break; + case VK_QUEUE_VIDEO_ENCODE_BIT_KHR: + qf->queue_family = s->hwctx->queue_family_encode_index; + qf->actual_queues = s->hwctx->nb_encode_queues; + break; + case VK_QUEUE_VIDEO_DECODE_BIT_KHR: + qf->queue_family = s->hwctx->queue_family_decode_index; + qf->actual_queues = s->hwctx->nb_decode_queues; + break; + default: + av_assert0(0); /* Should never happen */ + } + + if (!nb_queues) + qf->nb_queues = qf->actual_queues; + else + qf->nb_queues = nb_queues; + + return; +} + +void ff_vk_qf_rotate(FFVkQueueFamilyCtx *qf) +{ + qf->cur_queue = (qf->cur_queue + 1) % qf->nb_queues; +} + +static int vk_alloc_mem(FFVulkanContext *s, VkMemoryRequirements *req, + VkMemoryPropertyFlagBits req_flags, void *alloc_extension, + VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory *mem) +{ + VkResult ret; + int index = -1; + FFVulkanFunctions *vk = &s->vkfn; + + VkMemoryAllocateInfo alloc_info = { + .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, + .pNext = alloc_extension, + }; + + /* Align if we need to */ + if (req_flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) + req->size = FFALIGN(req->size, s->props.limits.minMemoryMapAlignment); + + alloc_info.allocationSize = req->size; + + /* The 
vulkan spec requires memory types to be sorted in the "optimal" + * order, so the first matching type we find will be the best/fastest one */ + for (int i = 0; i < s->mprops.memoryTypeCount; i++) { + /* The memory type must be supported by the requirements (bitfield) */ + if (!(req->memoryTypeBits & (1 << i))) + continue; + + /* The memory type flags must include our properties */ + if ((s->mprops.memoryTypes[i].propertyFlags & req_flags) != req_flags) + continue; + + /* Found a suitable memory type */ + index = i; + break; + } + + if (index < 0) { + av_log(s, AV_LOG_ERROR, "No memory type found for flags 0x%x\n", + req_flags); + return AVERROR(EINVAL); + } + + alloc_info.memoryTypeIndex = index; + + ret = vk->AllocateMemory(s->hwctx->act_dev, &alloc_info, + s->hwctx->alloc, mem); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to allocate memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR(ENOMEM); + } + + *mem_flags |= s->mprops.memoryTypes[index].propertyFlags; + + return 0; +} + +int ff_vk_create_buf(FFVulkanContext *s, + FFVkBuffer *buf, size_t size, + VkBufferUsageFlags usage, + VkMemoryPropertyFlagBits flags) +{ + int err; + VkResult ret; + int use_ded_mem; + FFVulkanFunctions *vk = &s->vkfn; + + VkBufferCreateInfo buf_spawn = { + .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO, + .pNext = NULL, + .usage = usage, + .sharingMode = VK_SHARING_MODE_EXCLUSIVE, + .size = size, /* Gets FFALIGNED during alloc if host visible + but should be ok */ + }; + + VkBufferMemoryRequirementsInfo2 req_desc = { + .sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2, + }; + VkMemoryDedicatedAllocateInfo ded_alloc = { + .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO, + .pNext = NULL, + }; + VkMemoryDedicatedRequirements ded_req = { + .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS, + }; + VkMemoryRequirements2 req = { + .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2, + .pNext = &ded_req, + }; + + ret = vk->CreateBuffer(s->hwctx->act_dev, &buf_spawn, NULL, &buf->buf); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to create buffer: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + req_desc.buffer = buf->buf; + + vk->GetBufferMemoryRequirements2(s->hwctx->act_dev, &req_desc, &req); + + /* In case the implementation prefers/requires dedicated allocation */ + use_ded_mem = ded_req.prefersDedicatedAllocation | + ded_req.requiresDedicatedAllocation; + if (use_ded_mem) + ded_alloc.buffer = buf->buf; + + err = vk_alloc_mem(s, &req.memoryRequirements, flags, + use_ded_mem ? 
&ded_alloc : (void *)ded_alloc.pNext, + &buf->flags, &buf->mem); + if (err) + return err; + + ret = vk->BindBufferMemory(s->hwctx->act_dev, buf->buf, buf->mem, 0); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to bind memory to buffer: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + return 0; +} + +int ff_vk_map_buffers(FFVulkanContext *s, + FFVkBuffer *buf, uint8_t *mem[], + int nb_buffers, int invalidate) +{ + VkResult ret; + FFVulkanFunctions *vk = &s->vkfn; + VkMappedMemoryRange *inval_list = NULL; + int inval_count = 0; + + for (int i = 0; i < nb_buffers; i++) { + ret = vk->MapMemory(s->hwctx->act_dev, buf[i].mem, 0, + VK_WHOLE_SIZE, 0, (void **)&mem[i]); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to map buffer memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + if (!invalidate) + return 0; + + for (int i = 0; i < nb_buffers; i++) { + const VkMappedMemoryRange ival_buf = { + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, + .memory = buf[i].mem, + .size = VK_WHOLE_SIZE, + }; + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) + continue; + inval_list = av_fast_realloc(s->scratch, &s->scratch_size, + (++inval_count)*sizeof(*inval_list)); + if (!inval_list) + return AVERROR(ENOMEM); + inval_list[inval_count - 1] = ival_buf; + } + + if (inval_count) { + ret = vk->InvalidateMappedMemoryRanges(s->hwctx->act_dev, inval_count, + inval_list); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to invalidate memory: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + } + + return 0; +} + +int ff_vk_unmap_buffers(FFVulkanContext *s, + FFVkBuffer *buf, int nb_buffers, int flush) +{ + int err = 0; + VkResult ret; + FFVulkanFunctions *vk = &s->vkfn; + VkMappedMemoryRange *flush_list = NULL; + int flush_count = 0; + + if (flush) { + for (int i = 0; i < nb_buffers; i++) { + const VkMappedMemoryRange flush_buf = { + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE, + .memory = buf[i].mem, + .size = VK_WHOLE_SIZE, + }; + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) + continue; + flush_list = av_fast_realloc(s->scratch, &s->scratch_size, + (++flush_count)*sizeof(*flush_list)); + if (!flush_list) + return AVERROR(ENOMEM); + flush_list[flush_count - 1] = flush_buf; + } + } + + if (flush_count) { + ret = vk->FlushMappedMemoryRanges(s->hwctx->act_dev, flush_count, + flush_list); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to flush memory: %s\n", + ff_vk_ret2str(ret)); + err = AVERROR_EXTERNAL; /* We still want to try to unmap them */ + } + } + + for (int i = 0; i < nb_buffers; i++) + vk->UnmapMemory(s->hwctx->act_dev, buf[i].mem); + + return err; +} + +void ff_vk_free_buf(FFVulkanContext *s, FFVkBuffer *buf) +{ + FFVulkanFunctions *vk = &s->vkfn; + + if (!buf) + return; + + vk->DeviceWaitIdle(s->hwctx->act_dev); + + if (buf->buf != VK_NULL_HANDLE) + vk->DestroyBuffer(s->hwctx->act_dev, buf->buf, s->hwctx->alloc); + if (buf->mem != VK_NULL_HANDLE) + vk->FreeMemory(s->hwctx->act_dev, buf->mem, s->hwctx->alloc); +} + +int ff_vk_add_push_constant(FFVulkanPipeline *pl, + int offset, int size, VkShaderStageFlagBits stage) +{ + VkPushConstantRange *pc; + + pl->push_consts = av_realloc_array(pl->push_consts, sizeof(*pl->push_consts), + pl->push_consts_num + 1); + if (!pl->push_consts) + return AVERROR(ENOMEM); + + pc = &pl->push_consts[pl->push_consts_num++]; + memset(pc, 0, sizeof(*pc)); + + pc->stageFlags = stage; + pc->offset = offset; + pc->size = size; + + return 0; +} + 
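A quick sketch of how a caller might use the push-constant helpers in the
new lavu API (the payload struct is hypothetical, and ff_vk_update_push_exec()
is assumed to take the same FFVulkanContext argument as the functions above):

    /* Must be registered before ff_vk_init_pipeline_layout() */
    struct { int32_t w, h; } pc_data = { 1920, 1080 };
    int err = ff_vk_add_push_constant(pl, 0, sizeof(pc_data),
                                      VK_SHADER_STAGE_COMPUTE_BIT);
    if (err < 0)
        return err;

    /* ...later, after ff_vk_bind_pipeline_exec() during recording... */
    ff_vk_update_push_exec(s, e, VK_SHADER_STAGE_COMPUTE_BIT,
                           0, sizeof(pc_data), &pc_data);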
+FN_CREATING(FFVulkanContext, FFVkExecContext, exec_ctx, exec_ctx, exec_ctx_num)
+int ff_vk_create_exec_ctx(FFVulkanContext *s,
+                          FFVkExecContext **ctx, FFVkQueueFamilyCtx *qf)
+{
+    VkResult ret;
+    FFVkExecContext *e;
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    VkCommandPoolCreateInfo cqueue_create = {
+        .sType            = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO,
+        .flags            = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT,
+        .queueFamilyIndex = qf->queue_family,
+    };
+    VkCommandBufferAllocateInfo cbuf_create = {
+        .sType              = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
+        .level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
+        .commandBufferCount = qf->nb_queues,
+    };
+
+    e = create_exec_ctx(s);
+    if (!e)
+        return AVERROR(ENOMEM);
+
+    e->qf = qf;
+
+    e->queues = av_mallocz(qf->nb_queues * sizeof(*e->queues));
+    if (!e->queues)
+        return AVERROR(ENOMEM);
+
+    e->bufs = av_mallocz(qf->nb_queues * sizeof(*e->bufs));
+    if (!e->bufs)
+        return AVERROR(ENOMEM);
+
+    /* Create command pool */
+    ret = vk->CreateCommandPool(s->hwctx->act_dev, &cqueue_create,
+                                s->hwctx->alloc, &e->pool);
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Command pool creation failure: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    cbuf_create.commandPool = e->pool;
+
+    /* Allocate command buffer */
+    ret = vk->AllocateCommandBuffers(s->hwctx->act_dev, &cbuf_create, e->bufs);
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Command buffer alloc failure: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    for (int i = 0; i < qf->nb_queues; i++) {
+        FFVkQueueCtx *q = &e->queues[i];
+        vk->GetDeviceQueue(s->hwctx->act_dev, qf->queue_family,
+                           i % qf->actual_queues, &q->queue);
+    }
+
+    *ctx = e;
+
+    return 0;
+}
+
+void ff_vk_discard_exec_deps(FFVkExecContext *e)
+{
+    FFVkQueueCtx *q = &e->queues[e->qf->cur_queue];
+
+    for (int j = 0; j < q->nb_buf_deps; j++)
+        av_buffer_unref(&q->buf_deps[j]);
+    q->nb_buf_deps = 0;
+
+    for (int j = 0; j < q->nb_frame_deps; j++)
+        av_frame_free(&q->frame_deps[j]);
+    q->nb_frame_deps = 0;
+
+    e->sem_wait_cnt = 0;
+    e->sem_sig_cnt = 0;
+}
+
+int ff_vk_start_exec_recording(FFVulkanContext *s,
+                               FFVkExecContext *e)
+{
+    VkResult ret;
+    FFVulkanFunctions *vk = &s->vkfn;
+    FFVkQueueCtx *q = &e->queues[e->qf->cur_queue];
+
+    VkCommandBufferBeginInfo cmd_start = {
+        .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
+        .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
+    };
+
+    /* Create the fence and don't wait for it initially */
+    if (!q->fence) {
+        VkFenceCreateInfo fence_spawn = {
+            .sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO,
+        };
+        ret = vk->CreateFence(s->hwctx->act_dev, &fence_spawn, s->hwctx->alloc,
+                              &q->fence);
+        if (ret != VK_SUCCESS) {
+            av_log(s, AV_LOG_ERROR, "Failed to queue frame fence: %s\n",
+                   ff_vk_ret2str(ret));
+            return AVERROR_EXTERNAL;
+        }
+    } else {
+        vk->WaitForFences(s->hwctx->act_dev, 1, &q->fence, VK_TRUE, UINT64_MAX);
+        vk->ResetFences(s->hwctx->act_dev, 1, &q->fence);
+    }
+
+    /* Discard queue dependencies */
+    ff_vk_discard_exec_deps(e);
+
+    ret = vk->BeginCommandBuffer(e->bufs[e->qf->cur_queue], &cmd_start);
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Failed to start command recording: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    return 0;
+}
+
+VkCommandBuffer ff_vk_get_exec_buf(FFVkExecContext *e)
+{
+    return e->bufs[e->qf->cur_queue];
+}
+
+int ff_vk_add_exec_dep(FFVulkanContext *s, FFVkExecContext *e,
+                       AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag)
+{
+    AVFrame **dst;
+    AVVkFrame *f = (AVVkFrame
+int ff_vk_add_exec_dep(FFVulkanContext *s, FFVkExecContext *e,
+                       AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag)
+{
+    AVFrame **dst;
+    AVVkFrame *f = (AVVkFrame *)frame->data[0];
+    FFVkQueueCtx *q = &e->queues[e->qf->cur_queue];
+    AVHWFramesContext *fc = (AVHWFramesContext *)frame->hw_frames_ctx->data;
+    int planes = av_pix_fmt_count_planes(fc->sw_format);
+
+    for (int i = 0; i < planes; i++) {
+        e->sem_wait = av_fast_realloc(e->sem_wait, &e->sem_wait_alloc,
+                                      (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait));
+        if (!e->sem_wait) {
+            ff_vk_discard_exec_deps(e);
+            return AVERROR(ENOMEM);
+        }
+
+        e->sem_wait_dst = av_fast_realloc(e->sem_wait_dst, &e->sem_wait_dst_alloc,
+                                          (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait_dst));
+        if (!e->sem_wait_dst) {
+            ff_vk_discard_exec_deps(e);
+            return AVERROR(ENOMEM);
+        }
+
+        e->sem_wait_val = av_fast_realloc(e->sem_wait_val, &e->sem_wait_val_alloc,
+                                          (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait_val));
+        if (!e->sem_wait_val) {
+            ff_vk_discard_exec_deps(e);
+            return AVERROR(ENOMEM);
+        }
+
+        e->sem_sig = av_fast_realloc(e->sem_sig, &e->sem_sig_alloc,
+                                     (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig));
+        if (!e->sem_sig) {
+            ff_vk_discard_exec_deps(e);
+            return AVERROR(ENOMEM);
+        }
+
+        e->sem_sig_val = av_fast_realloc(e->sem_sig_val, &e->sem_sig_val_alloc,
+                                         (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig_val));
+        if (!e->sem_sig_val) {
+            ff_vk_discard_exec_deps(e);
+            return AVERROR(ENOMEM);
+        }
+
+        e->sem_sig_val_dst = av_fast_realloc(e->sem_sig_val_dst, &e->sem_sig_val_dst_alloc,
+                                             (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig_val_dst));
+        if (!e->sem_sig_val_dst) {
+            ff_vk_discard_exec_deps(e);
+            return AVERROR(ENOMEM);
+        }
+
+        e->sem_wait[e->sem_wait_cnt]     = f->sem[i];
+        e->sem_wait_dst[e->sem_wait_cnt] = in_wait_dst_flag;
+        e->sem_wait_val[e->sem_wait_cnt] = f->sem_value[i];
+        e->sem_wait_cnt++;
+
+        e->sem_sig[e->sem_sig_cnt]         = f->sem[i];
+        e->sem_sig_val[e->sem_sig_cnt]     = f->sem_value[i] + 1;
+        e->sem_sig_val_dst[e->sem_sig_cnt] = &f->sem_value[i];
+        e->sem_sig_cnt++;
+    }
+
+    dst = av_fast_realloc(q->frame_deps, &q->frame_deps_alloc_size,
+                          (q->nb_frame_deps + 1) * sizeof(*dst));
+    if (!dst) {
+        ff_vk_discard_exec_deps(e);
+        return AVERROR(ENOMEM);
+    }
+
+    q->frame_deps = dst;
+    q->frame_deps[q->nb_frame_deps] = av_frame_clone(frame);
+    if (!q->frame_deps[q->nb_frame_deps]) {
+        ff_vk_discard_exec_deps(e);
+        return AVERROR(ENOMEM);
+    }
+    q->nb_frame_deps++;
+
+    return 0;
+}
+
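[Editorial note with a sketch, not part of the patch: for every plane,
ff_vk_add_exec_dep() waits on the frame's timeline semaphore at its current
value and signals it at value + 1, which serializes successive users of the
same AVVkFrame. A caller would plausibly use it like this; in_frame is a
placeholder name.]

    /* Wait for prior users before the compute stage touches the frame */
    err = ff_vk_add_exec_dep(s, exec, in_frame,
                             VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT);
    if (err < 0)
        return err;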
+int ff_vk_submit_exec_queue(FFVulkanContext *s, FFVkExecContext *e)
+{
+    VkResult ret;
+    FFVulkanFunctions *vk = &s->vkfn;
+    FFVkQueueCtx *q = &e->queues[e->qf->cur_queue];
+
+    VkTimelineSemaphoreSubmitInfo s_timeline_sem_info = {
+        .sType                     = VK_STRUCTURE_TYPE_TIMELINE_SEMAPHORE_SUBMIT_INFO,
+        .pWaitSemaphoreValues      = e->sem_wait_val,
+        .pSignalSemaphoreValues    = e->sem_sig_val,
+        .waitSemaphoreValueCount   = e->sem_wait_cnt,
+        .signalSemaphoreValueCount = e->sem_sig_cnt,
+    };
+
+    VkSubmitInfo s_info = {
+        .sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO,
+        .pNext                = &s_timeline_sem_info,
+
+        .commandBufferCount   = 1,
+        .pCommandBuffers      = &e->bufs[e->qf->cur_queue],
+
+        .pWaitSemaphores      = e->sem_wait,
+        .pWaitDstStageMask    = e->sem_wait_dst,
+        .waitSemaphoreCount   = e->sem_wait_cnt,
+
+        .pSignalSemaphores    = e->sem_sig,
+        .signalSemaphoreCount = e->sem_sig_cnt,
+    };
+
+    ret = vk->EndCommandBuffer(e->bufs[e->qf->cur_queue]);
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Unable to finish command buffer: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    ret = vk->QueueSubmit(q->queue, 1, &s_info, q->fence);
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Unable to submit command buffer: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    for (int i = 0; i < e->sem_sig_cnt; i++)
+        *e->sem_sig_val_dst[i] += 1;
+
+    return 0;
+}
+
+int ff_vk_add_dep_exec_ctx(FFVulkanContext *s, FFVkExecContext *e,
+                           AVBufferRef **deps, int nb_deps)
+{
+    AVBufferRef **dst;
+    FFVkQueueCtx *q = &e->queues[e->qf->cur_queue];
+
+    if (!deps || !nb_deps)
+        return 0;
+
+    dst = av_fast_realloc(q->buf_deps, &q->buf_deps_alloc_size,
+                          (q->nb_buf_deps + nb_deps) * sizeof(*dst));
+    if (!dst)
+        goto err;
+
+    q->buf_deps = dst;
+
+    for (int i = 0; i < nb_deps; i++) {
+        q->buf_deps[q->nb_buf_deps] = deps[i];
+        if (!q->buf_deps[q->nb_buf_deps])
+            goto err;
+        q->nb_buf_deps++;
+    }
+
+    return 0;
+
+err:
+    ff_vk_discard_exec_deps(e);
+    return AVERROR(ENOMEM);
+}
+
+FN_CREATING(FFVulkanContext, FFVkSampler, sampler, samplers, samplers_num)
+FFVkSampler *ff_vk_init_sampler(FFVulkanContext *s,
+                                int unnorm_coords, VkFilter filt)
+{
+    VkResult ret;
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    VkSamplerCreateInfo sampler_info = {
+        .sType                   = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO,
+        .magFilter               = filt,
+        .minFilter               = sampler_info.magFilter,
+        .mipmapMode              = unnorm_coords ? VK_SAMPLER_MIPMAP_MODE_NEAREST :
+                                                   VK_SAMPLER_MIPMAP_MODE_LINEAR,
+        .addressModeU            = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE,
+        .addressModeV            = sampler_info.addressModeU,
+        .addressModeW            = sampler_info.addressModeU,
+        .anisotropyEnable        = VK_FALSE,
+        .compareOp               = VK_COMPARE_OP_NEVER,
+        .borderColor             = VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK,
+        .unnormalizedCoordinates = unnorm_coords,
+    };
+
+    FFVkSampler *sctx = create_sampler(s);
+    if (!sctx)
+        return NULL;
+
+    ret = vk->CreateSampler(s->hwctx->act_dev, &sampler_info,
+                            s->hwctx->alloc, &sctx->sampler[0]);
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Unable to init sampler: %s\n",
+               ff_vk_ret2str(ret));
+        return NULL;
+    }
+
+    for (int i = 1; i < 4; i++)
+        sctx->sampler[i] = sctx->sampler[0];
+
+    return sctx;
+}
+
+int ff_vk_mt_is_np_rgb(enum AVPixelFormat pix_fmt)
+{
+    if (pix_fmt == AV_PIX_FMT_ABGR   || pix_fmt == AV_PIX_FMT_BGRA   ||
+        pix_fmt == AV_PIX_FMT_RGBA   || pix_fmt == AV_PIX_FMT_RGB24  ||
+        pix_fmt == AV_PIX_FMT_BGR24  || pix_fmt == AV_PIX_FMT_RGB48  ||
+        pix_fmt == AV_PIX_FMT_RGBA64 || pix_fmt == AV_PIX_FMT_RGB565 ||
+        pix_fmt == AV_PIX_FMT_BGR565 || pix_fmt == AV_PIX_FMT_BGR0   ||
+        pix_fmt == AV_PIX_FMT_0BGR   || pix_fmt == AV_PIX_FMT_RGB0)
+        return 1;
+    return 0;
+}
+
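[Editorial sketch, not part of the patch: how a recording plausibly ends.
The fence handed to QueueSubmit() is what ff_vk_start_exec_recording() waits
on the next time around, so submission itself returns without blocking.]

    err = ff_vk_submit_exec_queue(s, exec);
    if (err < 0)
        return err;

    /* Move on to the next queue in the family for the next submission */
    ff_vk_qf_rotate(&qf);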
"rgba16f" : "rgba8"; +} + +typedef struct ImageViewCtx { + VkImageView view; +} ImageViewCtx; + +static void destroy_imageview(void *opaque, uint8_t *data) +{ + FFVulkanContext *s = opaque; + FFVulkanFunctions *vk = &s->vkfn; + ImageViewCtx *iv = (ImageViewCtx *)data; + + vk->DestroyImageView(s->hwctx->act_dev, iv->view, s->hwctx->alloc); + av_free(iv); +} + +int ff_vk_create_imageview(FFVulkanContext *s, FFVkExecContext *e, + VkImageView *v, VkImage img, VkFormat fmt, + const VkComponentMapping map) +{ + int err; + AVBufferRef *buf; + FFVulkanFunctions *vk = &s->vkfn; + + VkImageViewCreateInfo imgview_spawn = { + .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO, + .pNext = NULL, + .image = img, + .viewType = VK_IMAGE_VIEW_TYPE_2D, + .format = fmt, + .components = map, + .subresourceRange = { + .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT, + .baseMipLevel = 0, + .levelCount = 1, + .baseArrayLayer = 0, + .layerCount = 1, + }, + }; + + ImageViewCtx *iv = av_mallocz(sizeof(*iv)); + + VkResult ret = vk->CreateImageView(s->hwctx->act_dev, &imgview_spawn, + s->hwctx->alloc, &iv->view); + if (ret != VK_SUCCESS) { + av_log(s, AV_LOG_ERROR, "Failed to create imageview: %s\n", + ff_vk_ret2str(ret)); + return AVERROR_EXTERNAL; + } + + buf = av_buffer_create((uint8_t *)iv, sizeof(*iv), destroy_imageview, s, 0); + if (!buf) { + destroy_imageview(s, (uint8_t *)iv); + return AVERROR(ENOMEM); + } + + /* Add to queue dependencies */ + err = ff_vk_add_dep_exec_ctx(s, e, &buf, 1); + if (err) { + av_buffer_unref(&buf); + return err; + } + + *v = iv->view; + + return 0; +} + +FN_CREATING(FFVulkanPipeline, FFSPIRVShader, shader, shaders, shaders_num) +FFSPIRVShader *ff_vk_init_shader(FFVulkanPipeline *pl, const char *name, + VkShaderStageFlags stage) +{ + FFSPIRVShader *shd = create_shader(pl); + if (!shd) + return NULL; + + av_bprint_init(&shd->src, 0, AV_BPRINT_SIZE_UNLIMITED); + + shd->shader.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO; + shd->shader.stage = stage; + + shd->name = name; + + GLSLF(0, #version %i ,460); + GLSLC(0, #define IS_WITHIN(v1, v2) ((v1.x < v2.x) && (v1.y < v2.y)) ); + GLSLC(0, ); + + return shd; +} + +void ff_vk_set_compute_shader_sizes(FFSPIRVShader *shd, int local_size[3]) +{ + shd->local_size[0] = local_size[0]; + shd->local_size[1] = local_size[1]; + shd->local_size[2] = local_size[2]; + + av_bprintf(&shd->src, "layout (local_size_x = %i, " + "local_size_y = %i, local_size_z = %i) in;\n\n", + shd->local_size[0], shd->local_size[1], shd->local_size[2]); +} + +void ff_vk_print_shader(void *ctx, FFSPIRVShader *shd, int prio) +{ + int line = 0; + const char *p = shd->src.str; + const char *start = p; + + AVBPrint buf; + av_bprint_init(&buf, 0, AV_BPRINT_SIZE_UNLIMITED); + + for (int i = 0; i < strlen(p); i++) { + if (p[i] == '\n') { + av_bprintf(&buf, "%i\t", ++line); + av_bprint_append_data(&buf, start, &p[i] - start + 1); + start = &p[i + 1]; + } + } + + av_log(ctx, prio, "Shader %s: \n%s", shd->name, buf.str); + av_bprint_finalize(&buf, NULL); +} + +int ff_vk_compile_shader(FFVulkanContext *s, FFSPIRVShader *shd, + const char *entrypoint) +{ +#if CONFIG_LIBGLSLANG + int err; + VkResult ret; + FFVulkanFunctions *vk = &s->vkfn; + VkShaderModuleCreateInfo shader_create; + uint8_t *spirv; + size_t spirv_size; + void *priv; + + shd->shader.pName = entrypoint; + + err = ff_vk_glslang_shader_compile(s, shd, &spirv, &spirv_size, &priv); + if (err < 0) + return err; + + ff_vk_print_shader(s, shd, AV_LOG_VERBOSE); + + av_log(s, AV_LOG_VERBOSE, "Shader %s compiled! 
+int ff_vk_compile_shader(FFVulkanContext *s, FFSPIRVShader *shd,
+                         const char *entrypoint)
+{
+#if CONFIG_LIBGLSLANG
+    int err;
+    VkResult ret;
+    FFVulkanFunctions *vk = &s->vkfn;
+    VkShaderModuleCreateInfo shader_create;
+    uint8_t *spirv;
+    size_t spirv_size;
+    void *priv;
+
+    shd->shader.pName = entrypoint;
+
+    err = ff_vk_glslang_shader_compile(s, shd, &spirv, &spirv_size, &priv);
+    if (err < 0)
+        return err;
+
+    ff_vk_print_shader(s, shd, AV_LOG_VERBOSE);
+
+    av_log(s, AV_LOG_VERBOSE, "Shader %s compiled! Size: %zu bytes\n",
+           shd->name, spirv_size);
+
+    shader_create.sType    = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
+    shader_create.pNext    = NULL;
+    shader_create.codeSize = spirv_size;
+    shader_create.flags    = 0;
+    shader_create.pCode    = (void *)spirv;
+
+    ret = vk->CreateShaderModule(s->hwctx->act_dev, &shader_create, NULL,
+                                 &shd->shader.module);
+
+    ff_vk_glslang_shader_free(priv);
+
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Unable to create shader module: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    return 0;
+#else
+    return AVERROR(ENOSYS);
+#endif
+}
+
+static const struct descriptor_props {
+    size_t struct_size; /* Size of the opaque which updates the descriptor */
+    const char *type;
+    int is_uniform;
+    int mem_quali;   /* Can use a memory qualifier */
+    int dim_needed;  /* Must indicate dimension */
+    int buf_content; /* Must indicate buffer contents */
+} descriptor_props[] = {
+    [VK_DESCRIPTOR_TYPE_SAMPLER]                = { sizeof(VkDescriptorImageInfo),  "sampler",       1, 0, 0, 0, },
+    [VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE]          = { sizeof(VkDescriptorImageInfo),  "texture",       1, 0, 1, 0, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_IMAGE]          = { sizeof(VkDescriptorImageInfo),  "image",         1, 1, 1, 0, },
+    [VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT]       = { sizeof(VkDescriptorImageInfo),  "subpassInput",  1, 0, 0, 0, },
+    [VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER] = { sizeof(VkDescriptorImageInfo),  "sampler",       1, 0, 1, 0, },
+    [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER]         = { sizeof(VkDescriptorBufferInfo), NULL,            1, 0, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER]         = { sizeof(VkDescriptorBufferInfo), "buffer",        0, 1, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), NULL,            1, 0, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC] = { sizeof(VkDescriptorBufferInfo), "buffer",        0, 1, 0, 1, },
+    [VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER]   = { sizeof(VkBufferView),           "samplerBuffer", 1, 0, 0, 0, },
+    [VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER]   = { sizeof(VkBufferView),           "imageBuffer",   1, 0, 0, 0, },
+};
+
+int ff_vk_add_descriptor_set(FFVulkanContext *s, FFVulkanPipeline *pl,
+                             FFSPIRVShader *shd, FFVulkanDescriptorSetBinding *desc,
+                             int num, int only_print_to_shader)
+{
+    VkResult ret;
+    VkDescriptorSetLayout *layout;
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    if (only_print_to_shader)
+        goto print;
+
+    pl->desc_layout = av_realloc_array(pl->desc_layout, sizeof(*pl->desc_layout),
+                                       pl->desc_layout_num + pl->qf->nb_queues);
+    if (!pl->desc_layout)
+        return AVERROR(ENOMEM);
+
+    pl->desc_set_initialized = av_realloc_array(pl->desc_set_initialized,
+                                                sizeof(*pl->desc_set_initialized),
+                                                pl->descriptor_sets_num + 1);
+    if (!pl->desc_set_initialized)
+        return AVERROR(ENOMEM);
+
+    pl->desc_set_initialized[pl->descriptor_sets_num] = 0;
+    layout = &pl->desc_layout[pl->desc_layout_num];
+
+    { /* Create descriptor set layout descriptions */
+        VkDescriptorSetLayoutCreateInfo desc_create_layout = { 0 };
+        VkDescriptorSetLayoutBinding *desc_binding;
+
+        desc_binding = av_mallocz(sizeof(*desc_binding)*num);
+        if (!desc_binding)
+            return AVERROR(ENOMEM);
+
+        for (int i = 0; i < num; i++) {
+            desc_binding[i].binding         = i;
+            desc_binding[i].descriptorType  = desc[i].type;
+            desc_binding[i].descriptorCount = FFMAX(desc[i].elems, 1);
+            desc_binding[i].stageFlags      = desc[i].stages;
+            desc_binding[i].pImmutableSamplers = desc[i].sampler ?
+                                                 desc[i].sampler->sampler :
+                                                 NULL;
+        }
+
+        desc_create_layout.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
+        desc_create_layout.pBindings    = desc_binding;
+        desc_create_layout.bindingCount = num;
+
+        for (int i = 0; i < pl->qf->nb_queues; i++) {
+            ret = vk->CreateDescriptorSetLayout(s->hwctx->act_dev, &desc_create_layout,
+                                                s->hwctx->alloc, &layout[i]);
+            if (ret != VK_SUCCESS) {
+                av_log(s, AV_LOG_ERROR, "Unable to init descriptor set "
+                       "layout: %s\n", ff_vk_ret2str(ret));
+                av_free(desc_binding);
+                return AVERROR_EXTERNAL;
+            }
+        }
+
+        av_free(desc_binding);
+    }
+
+    { /* Pool each descriptor by type and update pool counts */
+        for (int i = 0; i < num; i++) {
+            int j;
+            for (j = 0; j < pl->pool_size_desc_num; j++)
+                if (pl->pool_size_desc[j].type == desc[i].type)
+                    break;
+            if (j >= pl->pool_size_desc_num) {
+                pl->pool_size_desc = av_realloc_array(pl->pool_size_desc,
+                                                      sizeof(*pl->pool_size_desc),
+                                                      ++pl->pool_size_desc_num);
+                if (!pl->pool_size_desc)
+                    return AVERROR(ENOMEM);
+                memset(&pl->pool_size_desc[j], 0, sizeof(VkDescriptorPoolSize));
+            }
+            pl->pool_size_desc[j].type             = desc[i].type;
+            pl->pool_size_desc[j].descriptorCount += FFMAX(desc[i].elems, 1)*pl->qf->nb_queues;
+        }
+    }
+
+    { /* Create template creation struct */
+        VkDescriptorUpdateTemplateCreateInfo *dt;
+        VkDescriptorUpdateTemplateEntry *des_entries;
+
+        /* Freed after descriptor set initialization */
+        des_entries = av_mallocz(num*sizeof(VkDescriptorUpdateTemplateEntry));
+        if (!des_entries)
+            return AVERROR(ENOMEM);
+
+        for (int i = 0; i < num; i++) {
+            des_entries[i].dstBinding      = i;
+            des_entries[i].descriptorType  = desc[i].type;
+            des_entries[i].descriptorCount = FFMAX(desc[i].elems, 1);
+            des_entries[i].dstArrayElement = 0;
+            des_entries[i].offset          = ((uint8_t *)desc[i].updater) - (uint8_t *)s;
+            des_entries[i].stride          = descriptor_props[desc[i].type].struct_size;
+        }
+
+        pl->desc_template_info = av_realloc_array(pl->desc_template_info,
+                                                  sizeof(*pl->desc_template_info),
+                                                  pl->total_descriptor_sets + pl->qf->nb_queues);
+        if (!pl->desc_template_info)
+            return AVERROR(ENOMEM);
+
+        dt = &pl->desc_template_info[pl->total_descriptor_sets];
+        memset(dt, 0, sizeof(*dt)*pl->qf->nb_queues);
+
+        for (int i = 0; i < pl->qf->nb_queues; i++) {
+            dt[i].sType                      = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO;
+            dt[i].templateType               = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET;
+            dt[i].descriptorSetLayout        = layout[i];
+            dt[i].pDescriptorUpdateEntries   = des_entries;
+            dt[i].descriptorUpdateEntryCount = num;
+        }
+    }
+
+    pl->descriptor_sets_num++;
+
+    pl->desc_layout_num       += pl->qf->nb_queues;
+    pl->total_descriptor_sets += pl->qf->nb_queues;
+
+print:
+    /* Write shader info */
+    for (int i = 0; i < num; i++) {
+        const struct descriptor_props *prop = &descriptor_props[desc[i].type];
+        GLSLA("layout (set = %i, binding = %i", pl->descriptor_sets_num - 1, i);
+
+        if (desc[i].mem_layout)
+            GLSLA(", %s", desc[i].mem_layout);
+        GLSLA(")");
+
+        if (prop->is_uniform)
+            GLSLA(" uniform");
+
+        if (prop->mem_quali && desc[i].mem_quali)
+            GLSLA(" %s", desc[i].mem_quali);
+
+        if (prop->type)
+            GLSLA(" %s", prop->type);
+
+        if (prop->dim_needed)
+            GLSLA("%iD", desc[i].dimensions);
+
+        GLSLA(" %s", desc[i].name);
+
+        if (prop->buf_content)
+            GLSLA(" {\n    %s\n}", desc[i].buf_content);
+        else if (desc[i].elems > 0)
+            GLSLA("[%i]", desc[i].elems);
+
+        GLSLA(";\n");
+    }
+    GLSLA("\n");
+
+    return 0;
+}
+
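[Editorial sketch, not part of the patch: what a caller's binding description
plausibly looks like. input_images_info/output_images_info stand for
VkDescriptorImageInfo arrays inside the caller's own context (the `updater`
pointers the template offsets are computed from); `planes` and `sampler` are
likewise placeholders.]

    FFVulkanDescriptorSetBinding desc[] = {
        {
            .name       = "input_img",
            .type       = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
            .dimensions = 2,
            .elems      = planes,
            .stages     = VK_SHADER_STAGE_COMPUTE_BIT,
            .sampler    = sampler, /* from ff_vk_init_sampler() */
            .updater    = input_images_info,
        },
        {
            .name       = "output_img",
            .type       = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
            .mem_layout = ff_vk_shader_rep_fmt(s->output_format),
            .mem_quali  = "writeonly",
            .dimensions = 2,
            .elems      = planes,
            .stages     = VK_SHADER_STAGE_COMPUTE_BIT,
            .updater    = output_images_info,
        },
    };

    err = ff_vk_add_descriptor_set(s, pl, shd, desc, FF_ARRAY_ELEMS(desc), 0);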
+void ff_vk_update_descriptor_set(FFVulkanContext *s, FFVulkanPipeline *pl,
+                                 int set_id)
+{
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    /* If a set has never been updated, update all queues' sets. */
+    if (!pl->desc_set_initialized[set_id]) {
+        for (int i = 0; i < pl->qf->nb_queues; i++) {
+            int idx = set_id*pl->qf->nb_queues + i;
+            vk->UpdateDescriptorSetWithTemplate(s->hwctx->act_dev,
+                                                pl->desc_set[idx],
+                                                pl->desc_template[idx],
+                                                s);
+        }
+        pl->desc_set_initialized[set_id] = 1;
+        return;
+    }
+
+    set_id = set_id*pl->qf->nb_queues + pl->qf->cur_queue;
+
+    vk->UpdateDescriptorSetWithTemplate(s->hwctx->act_dev,
+                                        pl->desc_set[set_id],
+                                        pl->desc_template[set_id],
+                                        s);
+}
+
+void ff_vk_update_push_exec(FFVulkanContext *s, FFVkExecContext *e,
+                            VkShaderStageFlagBits stage, int offset,
+                            size_t size, void *src)
+{
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    vk->CmdPushConstants(e->bufs[e->qf->cur_queue], e->bound_pl->pipeline_layout,
+                         stage, offset, size, src);
+}
+
+int ff_vk_init_pipeline_layout(FFVulkanContext *s,
+                               FFVulkanPipeline *pl)
+{
+    VkResult ret;
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    pl->desc_staging = av_malloc(pl->descriptor_sets_num*sizeof(*pl->desc_staging));
+    if (!pl->desc_staging)
+        return AVERROR(ENOMEM);
+
+    { /* Init descriptor set pool */
+        VkDescriptorPoolCreateInfo pool_create_info = {
+            .sType         = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO,
+            .poolSizeCount = pl->pool_size_desc_num,
+            .pPoolSizes    = pl->pool_size_desc,
+            .maxSets       = pl->total_descriptor_sets,
+        };
+
+        ret = vk->CreateDescriptorPool(s->hwctx->act_dev, &pool_create_info,
+                                       s->hwctx->alloc, &pl->desc_pool);
+        av_freep(&pl->pool_size_desc);
+        if (ret != VK_SUCCESS) {
+            av_log(s, AV_LOG_ERROR, "Unable to init descriptor set "
+                   "pool: %s\n", ff_vk_ret2str(ret));
+            return AVERROR_EXTERNAL;
+        }
+    }
+
+    { /* Allocate descriptor sets */
+        VkDescriptorSetAllocateInfo alloc_info = {
+            .sType              = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO,
+            .descriptorPool     = pl->desc_pool,
+            .descriptorSetCount = pl->total_descriptor_sets,
+            .pSetLayouts        = pl->desc_layout,
+        };
+
+        pl->desc_set = av_malloc(pl->total_descriptor_sets*sizeof(*pl->desc_set));
+        if (!pl->desc_set)
+            return AVERROR(ENOMEM);
+
+        ret = vk->AllocateDescriptorSets(s->hwctx->act_dev, &alloc_info,
+                                         pl->desc_set);
+        if (ret != VK_SUCCESS) {
+            av_log(s, AV_LOG_ERROR, "Unable to allocate descriptor set: %s\n",
+                   ff_vk_ret2str(ret));
+            return AVERROR_EXTERNAL;
+        }
+    }
+
+    { /* Finally create the pipeline layout */
+        VkPipelineLayoutCreateInfo spawn_pipeline_layout = {
+            .sType                  = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
+            .pSetLayouts            = (VkDescriptorSetLayout *)pl->desc_staging,
+            .pushConstantRangeCount = pl->push_consts_num,
+            .pPushConstantRanges    = pl->push_consts,
+        };
+
+        for (int i = 0; i < pl->total_descriptor_sets; i += pl->qf->nb_queues)
+            pl->desc_staging[spawn_pipeline_layout.setLayoutCount++] = pl->desc_layout[i];
+
+        ret = vk->CreatePipelineLayout(s->hwctx->act_dev, &spawn_pipeline_layout,
+                                       s->hwctx->alloc, &pl->pipeline_layout);
+        av_freep(&pl->push_consts);
+        pl->push_consts_num = 0;
+        if (ret != VK_SUCCESS) {
+            av_log(s, AV_LOG_ERROR, "Unable to init pipeline layout: %s\n",
+                   ff_vk_ret2str(ret));
+            return AVERROR_EXTERNAL;
+        }
+    }
+
+    { /* Descriptor template (for tightly packed descriptors) */
+        VkDescriptorUpdateTemplateCreateInfo *dt;
+
+        pl->desc_template = av_malloc(pl->total_descriptor_sets*sizeof(*pl->desc_template));
+        if (!pl->desc_template)
+            return AVERROR(ENOMEM);
+
+        /* Create update templates for the descriptor sets */
+        for (int i = 0; i < pl->total_descriptor_sets; i++) {
+            dt = &pl->desc_template_info[i];
+            dt->pipelineLayout = pl->pipeline_layout;
+            ret = vk->CreateDescriptorUpdateTemplate(s->hwctx->act_dev,
+                                                     dt, s->hwctx->alloc,
+                                                     &pl->desc_template[i]);
+            if (ret != VK_SUCCESS) {
+                av_log(s, AV_LOG_ERROR, "Unable to init descriptor "
+                       "template: %s\n", ff_vk_ret2str(ret));
+                return AVERROR_EXTERNAL;
+            }
+        }
+
+        /* Free the duplicated memory used for the template entries */
+        for (int i = 0; i < pl->total_descriptor_sets; i += pl->qf->nb_queues) {
+            dt = &pl->desc_template_info[i];
+            av_free((void *)dt->pDescriptorUpdateEntries);
+        }
+
+        av_freep(&pl->desc_template_info);
+    }
+
+    return 0;
+}
+
+FN_CREATING(FFVulkanContext, FFVulkanPipeline, pipeline, pipelines, pipelines_num)
+FFVulkanPipeline *ff_vk_create_pipeline(FFVulkanContext *s,
+                                        FFVkQueueFamilyCtx *qf)
+{
+    FFVulkanPipeline *pl = create_pipeline(s);
+    if (pl)
+        pl->qf = qf;
+
+    return pl;
+}
+
+int ff_vk_init_compute_pipeline(FFVulkanContext *s,
+                                FFVulkanPipeline *pl)
+{
+    int i;
+    VkResult ret;
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    VkComputePipelineCreateInfo pipe = {
+        .sType  = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
+        .layout = pl->pipeline_layout,
+    };
+
+    for (i = 0; i < pl->shaders_num; i++) {
+        if (pl->shaders[i]->shader.stage & VK_SHADER_STAGE_COMPUTE_BIT) {
+            pipe.stage = pl->shaders[i]->shader;
+            break;
+        }
+    }
+    if (i == pl->shaders_num) {
+        av_log(s, AV_LOG_ERROR, "Can't init compute pipeline, no shader\n");
+        return AVERROR(EINVAL);
+    }
+
+    ret = vk->CreateComputePipelines(s->hwctx->act_dev, VK_NULL_HANDLE, 1, &pipe,
+                                     s->hwctx->alloc, &pl->pipeline);
+    if (ret != VK_SUCCESS) {
+        av_log(s, AV_LOG_ERROR, "Unable to init compute pipeline: %s\n",
+               ff_vk_ret2str(ret));
+        return AVERROR_EXTERNAL;
+    }
+
+    pl->bind_point = VK_PIPELINE_BIND_POINT_COMPUTE;
+
+    return 0;
+}
+
+void ff_vk_bind_pipeline_exec(FFVulkanContext *s, FFVkExecContext *e,
+                              FFVulkanPipeline *pl)
+{
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    vk->CmdBindPipeline(e->bufs[e->qf->cur_queue], pl->bind_point, pl->pipeline);
+
+    for (int i = 0; i < pl->descriptor_sets_num; i++)
+        pl->desc_staging[i] = pl->desc_set[i*pl->qf->nb_queues + pl->qf->cur_queue];
+
+    vk->CmdBindDescriptorSets(e->bufs[e->qf->cur_queue], pl->bind_point,
+                              pl->pipeline_layout, 0,
+                              pl->descriptor_sets_num,
+                              (VkDescriptorSet *)pl->desc_staging,
+                              0, NULL);
+
+    e->bound_pl = pl;
+}
+
+static void free_exec_ctx(FFVulkanContext *s, FFVkExecContext *e)
+{
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    /* Make sure all queues have finished executing */
+    for (int i = 0; i < e->qf->nb_queues; i++) {
+        FFVkQueueCtx *q = &e->queues[i];
+
+        if (q->fence) {
+            vk->WaitForFences(s->hwctx->act_dev, 1, &q->fence, VK_TRUE, UINT64_MAX);
+            vk->ResetFences(s->hwctx->act_dev, 1, &q->fence);
+        }
+
+        /* Free the fence */
+        if (q->fence)
+            vk->DestroyFence(s->hwctx->act_dev, q->fence, s->hwctx->alloc);
+
+        /* Free buffer dependencies */
+        for (int j = 0; j < q->nb_buf_deps; j++)
+            av_buffer_unref(&q->buf_deps[j]);
+        av_free(q->buf_deps);
+
+        /* Free frame dependencies */
+        for (int j = 0; j < q->nb_frame_deps; j++)
+            av_frame_free(&q->frame_deps[j]);
+        av_free(q->frame_deps);
+    }
+
+    if (e->bufs)
+        vk->FreeCommandBuffers(s->hwctx->act_dev, e->pool, e->qf->nb_queues, e->bufs);
+    if (e->pool)
+        vk->DestroyCommandPool(s->hwctx->act_dev, e->pool, s->hwctx->alloc);
+
+    av_freep(&e->bufs);
+    av_freep(&e->queues);
+    av_freep(&e->sem_sig);
+    av_freep(&e->sem_sig_val);
+    av_freep(&e->sem_sig_val_dst);
+    av_freep(&e->sem_wait);
+    av_freep(&e->sem_wait_dst);
+    av_freep(&e->sem_wait_val);
+    av_free(e);
+}
+
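[Editorial sketch, not part of the patch: the overall order the pipeline
helpers are meant to be called in, plus a per-frame dispatch. vk stands for
&s->vkfn; the other names are from the earlier sketches; error checks are
elided.]

    /* One-time setup */
    FFVulkanPipeline *pl = ff_vk_create_pipeline(s, &qf);
    /* ... ff_vk_init_shader(), ff_vk_set_compute_shader_sizes(),
     * ff_vk_add_descriptor_set(), ff_vk_add_push_constant(),
     * GLSL body, ff_vk_compile_shader() ... */
    ff_vk_init_pipeline_layout(s, pl);
    ff_vk_init_compute_pipeline(s, pl);

    /* Per frame */
    ff_vk_update_descriptor_set(s, pl, 0);
    ff_vk_start_exec_recording(s, exec);
    ff_vk_bind_pipeline_exec(s, exec, pl);
    vk->CmdDispatch(ff_vk_get_exec_buf(exec),
                    FFALIGN(s->output_width,  shd->local_size[0]) / shd->local_size[0],
                    FFALIGN(s->output_height, shd->local_size[1]) / shd->local_size[1],
                    1);
    ff_vk_submit_exec_queue(s, exec);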
+static void free_pipeline(FFVulkanContext *s, FFVulkanPipeline *pl)
+{
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    for (int i = 0; i < pl->shaders_num; i++) {
+        FFSPIRVShader *shd = pl->shaders[i];
+        av_bprint_finalize(&shd->src, NULL);
+        vk->DestroyShaderModule(s->hwctx->act_dev, shd->shader.module,
+                                s->hwctx->alloc);
+        av_free(shd);
+    }
+
+    vk->DestroyPipeline(s->hwctx->act_dev, pl->pipeline, s->hwctx->alloc);
+    vk->DestroyPipelineLayout(s->hwctx->act_dev, pl->pipeline_layout,
+                              s->hwctx->alloc);
+
+    for (int i = 0; i < pl->desc_layout_num; i++) {
+        if (pl->desc_template && pl->desc_template[i])
+            vk->DestroyDescriptorUpdateTemplate(s->hwctx->act_dev, pl->desc_template[i],
+                                                s->hwctx->alloc);
+        if (pl->desc_layout && pl->desc_layout[i])
+            vk->DestroyDescriptorSetLayout(s->hwctx->act_dev, pl->desc_layout[i],
+                                           s->hwctx->alloc);
+    }
+
+    /* Also frees the descriptor sets */
+    if (pl->desc_pool)
+        vk->DestroyDescriptorPool(s->hwctx->act_dev, pl->desc_pool,
+                                  s->hwctx->alloc);
+
+    av_freep(&pl->desc_staging);
+    av_freep(&pl->desc_set);
+    av_freep(&pl->shaders);
+    av_freep(&pl->desc_layout);
+    av_freep(&pl->desc_template);
+    av_freep(&pl->desc_set_initialized);
+    av_freep(&pl->push_consts);
+    pl->push_consts_num = 0;
+
+    /* Only freed in case of failure */
+    av_freep(&pl->pool_size_desc);
+    if (pl->desc_template_info) {
+        for (int i = 0; i < pl->total_descriptor_sets; i += pl->qf->nb_queues) {
+            VkDescriptorUpdateTemplateCreateInfo *dt = &pl->desc_template_info[i];
+            av_free((void *)dt->pDescriptorUpdateEntries);
+        }
+        av_freep(&pl->desc_template_info);
+    }
+
+    av_free(pl);
+}
+
+void ff_vk_uninit(FFVulkanContext *s)
+{
+    FFVulkanFunctions *vk = &s->vkfn;
+
+    ff_vk_glslang_uninit();
+
+    for (int i = 0; i < s->exec_ctx_num; i++)
+        free_exec_ctx(s, s->exec_ctx[i]);
+    av_freep(&s->exec_ctx);
+
+    for (int i = 0; i < s->samplers_num; i++) {
+        vk->DestroySampler(s->hwctx->act_dev, s->samplers[i]->sampler[0],
+                           s->hwctx->alloc);
+        av_free(s->samplers[i]);
+    }
+    av_freep(&s->samplers);
+
+    for (int i = 0; i < s->pipelines_num; i++)
+        free_pipeline(s, s->pipelines[i]);
+    av_freep(&s->pipelines);
+
+    av_freep(&s->scratch);
+    s->scratch_size = 0;
+
+    av_buffer_unref(&s->device_ref);
+    av_buffer_unref(&s->frames_ref);
+}
diff --git a/libavutil/vulkan.h b/libavutil/vulkan.h
new file mode 100644
index 0000000000..ae0e28dc0a
--- /dev/null
+++ b/libavutil/vulkan.h
@@ -0,0 +1,414 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_VULKAN_H
+#define AVUTIL_VULKAN_H
+
+#define VK_NO_PROTOTYPES
+#define VK_ENABLE_BETA_EXTENSIONS
+
+#include "pixdesc.h"
+#include "bprint.h"
+#include "hwcontext.h"
+#include "hwcontext_vulkan.h"
+#include "vulkan_functions.h"
+
+#define FF_VK_DEFAULT_USAGE_FLAGS (VK_IMAGE_USAGE_SAMPLED_BIT      | \
+                                   VK_IMAGE_USAGE_STORAGE_BIT      | \
+                                   VK_IMAGE_USAGE_TRANSFER_SRC_BIT | \
+                                   VK_IMAGE_USAGE_TRANSFER_DST_BIT)
+
+/* GLSL management macros */
+#define INDENT(N) INDENT_##N
+#define INDENT_0
+#define INDENT_1 INDENT_0 "    "
+#define INDENT_2 INDENT_1 INDENT_1
+#define INDENT_3 INDENT_2 INDENT_1
+#define INDENT_4 INDENT_3 INDENT_1
+#define INDENT_5 INDENT_4 INDENT_1
+#define INDENT_6 INDENT_5 INDENT_1
+#define C(N, S)          INDENT(N) #S "\n"
+#define GLSLC(N, S)      av_bprintf(&shd->src, C(N, S))
+#define GLSLA(...)       av_bprintf(&shd->src, __VA_ARGS__)
+#define GLSLF(N, S, ...) av_bprintf(&shd->src, C(N, S), __VA_ARGS__)
+#define GLSLD(D)         GLSLC(0, );                                     \
+                         av_bprint_append_data(&shd->src, D, strlen(D)); \
+                         GLSLC(0, )
+
+/* Helper, pretty much every Vulkan return value needs to be checked */
+#define RET(x)               \
+    do {                     \
+        if ((err = (x)) < 0) \
+            goto fail;       \
+    } while (0)
+
+typedef struct FFSPIRVShader {
+    const char *name;                       /* Name for id/debugging purposes */
+    AVBPrint src;
+    int local_size[3];                      /* Compute shader workgroup sizes */
+    VkPipelineShaderStageCreateInfo shader;
+} FFSPIRVShader;
+
+typedef struct FFVkSampler {
+    VkSampler sampler[4];
+} FFVkSampler;
+
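[Editorial note with examples, not part of the patch: C(N, S) stringifies S
and prefixes the indentation macro, so the GLSL helpers append source text to
shd->src. Illustrative expansions, assuming an FFSPIRVShader *shd in scope:]

    GLSLC(1, vec4 px = imageLoad(input_img[0], pos); );
    /* appends "    vec4 px = imageLoad(input_img[0], pos);\n" */

    GLSLF(0, #define FILTER_SIZE %i ,9);
    /* appends "#define FILTER_SIZE 9\n" */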
+typedef struct FFVulkanDescriptorSetBinding {
+    const char         *name;
+    VkDescriptorType    type;
+    const char         *mem_layout;  /* Storage images (rgba8, etc.) and buffers (std430, etc.) */
+    const char         *mem_quali;   /* readonly, writeonly, etc. */
+    const char         *buf_content; /* For buffers */
+    uint32_t            dimensions;  /* Needed for e.g. sampler%iD */
+    uint32_t            elems;       /* 0 - scalar, 1 or more - vector */
+    VkShaderStageFlags  stages;
+    FFVkSampler        *sampler;     /* Sampler to use for all elems */
+    void               *updater;     /* Pointer to VkDescriptor*Info */
+} FFVulkanDescriptorSetBinding;
+
+typedef struct FFVkBuffer {
+    VkBuffer buf;
+    VkDeviceMemory mem;
+    VkMemoryPropertyFlagBits flags;
+} FFVkBuffer;
+
+typedef struct FFVkQueueFamilyCtx {
+    int queue_family;
+    int nb_queues;
+    int cur_queue;
+    int actual_queues;
+} FFVkQueueFamilyCtx;
+
+typedef struct FFVulkanPipeline {
+    FFVkQueueFamilyCtx *qf;
+
+    VkPipelineBindPoint bind_point;
+
+    /* Contexts */
+    VkPipelineLayout pipeline_layout;
+    VkPipeline       pipeline;
+
+    /* Shaders */
+    FFSPIRVShader **shaders;
+    int shaders_num;
+
+    /* Push consts */
+    VkPushConstantRange *push_consts;
+    int push_consts_num;
+
+    /* Descriptors */
+    VkDescriptorSetLayout         *desc_layout;
+    VkDescriptorPool               desc_pool;
+    VkDescriptorSet               *desc_set;
+    void                         **desc_staging;
+    VkDescriptorSetLayoutBinding **desc_binding;
+    VkDescriptorUpdateTemplate    *desc_template;
+    int                           *desc_set_initialized;
+    int                            desc_layout_num;
+    int                            descriptor_sets_num;
+    int                            total_descriptor_sets;
+    int                            pool_size_desc_num;
+
+    /* Temporary, used to store data in between initialization stages */
+    VkDescriptorUpdateTemplateCreateInfo *desc_template_info;
+    VkDescriptorPoolSize *pool_size_desc;
+} FFVulkanPipeline;
+
+typedef struct FFVkQueueCtx {
+    VkFence fence;
+    VkQueue queue;
+
+    /* Buffer dependencies */
+    AVBufferRef **buf_deps;
+    int nb_buf_deps;
+    int buf_deps_alloc_size;
+
+    /* Frame dependencies */
+    AVFrame **frame_deps;
+    int nb_frame_deps;
+    int frame_deps_alloc_size;
+} FFVkQueueCtx;
+
+typedef struct FFVkExecContext {
+    FFVkQueueFamilyCtx *qf;
+
+    VkCommandPool pool;
+    VkCommandBuffer *bufs;
+    FFVkQueueCtx *queues;
+
+    AVBufferRef ***deps;
+    int *nb_deps;
+    int *dep_alloc_size;
+
+    FFVulkanPipeline *bound_pl;
+
+    VkSemaphore *sem_wait;
+    int sem_wait_alloc; /* Allocated sem_wait */
+    int sem_wait_cnt;
+
+    uint64_t *sem_wait_val;
+    int sem_wait_val_alloc;
+
+    VkPipelineStageFlagBits *sem_wait_dst;
+    int sem_wait_dst_alloc; /* Allocated sem_wait_dst */
+
+    VkSemaphore *sem_sig;
+    int sem_sig_alloc; /* Allocated sem_sig */
+    int sem_sig_cnt;
+
+    uint64_t *sem_sig_val;
+    int sem_sig_val_alloc;
+
+    uint64_t **sem_sig_val_dst;
+    int sem_sig_val_dst_alloc;
+} FFVkExecContext;
+
+typedef struct FFVulkanContext {
+    const AVClass *class; /* Filters and encoders use this */
+
+    FFVulkanFunctions vkfn;
+    FFVulkanExtensions extensions;
+    VkPhysicalDeviceProperties props;
+    VkPhysicalDeviceMemoryProperties mprops;
+
+    AVBufferRef           *device_ref;
+    AVHWDeviceContext     *device;
+    AVVulkanDeviceContext *hwctx;
+
+    AVBufferRef           *frames_ref;
+    AVHWFramesContext     *frames;
+    AVVulkanFramesContext *hwfc;
+
+    /* Properties */
+    int                 output_width;
+    int                 output_height;
+    enum AVPixelFormat  output_format;
+    enum AVPixelFormat  input_format;
+
+    /* Samplers */
+    FFVkSampler **samplers;
+    int samplers_num;
+
+    /* Exec contexts */
+    FFVkExecContext **exec_ctx;
+    int exec_ctx_num;
+
+    /* Pipelines (each can have 1 shader of each type) */
+    FFVulkanPipeline **pipelines;
+    int pipelines_num;
+
+    void *scratch; /* Scratch memory used only in functions */
+    unsigned int scratch_size;
+} FFVulkanContext;
+
+/* Identity mapping - r = r, b = b, g = g, a = a */
+extern const VkComponentMapping ff_comp_identity_map;
+
+/**
+ * Converts Vulkan return values to strings
+ */
+const char *ff_vk_ret2str(VkResult res);
+
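[Editorial sketch, not part of the patch: a hypothetical private context for
a library user. FFVulkanContext is kept as the first member so that passing
the priv pointer to av_log() and to these helpers works, and so the `updater`
offsets computed against the context pointer stay valid; field names are
illustrative.]

    typedef struct ExampleVulkanContext {
        FFVulkanContext vkctx;

        FFVkQueueFamilyCtx qf;
        FFVkExecContext *exec;
        FFVulkanPipeline *pl;
        FFVkSampler *sampler;

        /* Updater targets for ff_vk_add_descriptor_set() */
        VkDescriptorImageInfo input_images_info[AV_NUM_DATA_POINTERS];
        VkDescriptorImageInfo output_images_info[AV_NUM_DATA_POINTERS];
    } ExampleVulkanContext;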
+/**
+ * Returns 1 if the image is any sort of supported RGB
+ */
+int ff_vk_mt_is_np_rgb(enum AVPixelFormat pix_fmt);
+
+/**
+ * Gets the glsl format string for a pixel format
+ */
+const char *ff_vk_shader_rep_fmt(enum AVPixelFormat pixfmt);
+
+/**
+ * Initialize a queue family with a specific number of queues.
+ * If nb_queues == 0, use however many queues the queue family has.
+ */
+void ff_vk_qf_init(FFVulkanContext *s, FFVkQueueFamilyCtx *qf,
+                   VkQueueFlagBits dev_family, int nb_queues);
+
+/**
+ * Rotate through the queues in a queue family.
+ */
+void ff_vk_qf_rotate(FFVkQueueFamilyCtx *qf);
+
+/**
+ * Create a Vulkan sampler, will be auto-freed in ff_vk_filter_uninit()
+ */
+FFVkSampler *ff_vk_init_sampler(FFVulkanContext *s, int unnorm_coords,
+                                VkFilter filt);
+
+/**
+ * Create an imageview.
+ * Guaranteed to remain alive until the queue submission has finished executing,
+ * and will be destroyed after that.
+ */
+int ff_vk_create_imageview(FFVulkanContext *s, FFVkExecContext *e,
+                           VkImageView *v, VkImage img, VkFormat fmt,
+                           const VkComponentMapping map);
+
+/**
+ * Define a push constant for a given stage into a pipeline.
+ * Must be called before the pipeline layout has been initialized.
+ */
+int ff_vk_add_push_constant(FFVulkanPipeline *pl, int offset, int size,
+                            VkShaderStageFlagBits stage);
+
+/**
+ * Inits a pipeline. Everything in it will be auto-freed when calling
+ * ff_vk_filter_uninit().
+ */
+FFVulkanPipeline *ff_vk_create_pipeline(FFVulkanContext *s,
+                                        FFVkQueueFamilyCtx *qf);
+
+/**
+ * Inits a shader for a specific pipeline. Will be auto-freed on uninit.
+ */
+FFSPIRVShader *ff_vk_init_shader(FFVulkanPipeline *pl, const char *name,
+                                 VkShaderStageFlags stage);
+
+/**
+ * Writes the workgroup size for a shader.
+ */
+void ff_vk_set_compute_shader_sizes(FFSPIRVShader *shd, int local_size[3]);
+
+/**
+ * Adds a descriptor set to the shader and registers them in the pipeline.
+ */
+int ff_vk_add_descriptor_set(FFVulkanContext *s, FFVulkanPipeline *pl,
+                             FFSPIRVShader *shd, FFVulkanDescriptorSetBinding *desc,
+                             int num, int only_print_to_shader);
+
+/**
+ * Compiles the shader, entrypoint must be set to "main".
+ */
+int ff_vk_compile_shader(FFVulkanContext *s, FFSPIRVShader *shd,
+                         const char *entrypoint);
+
+/**
+ * Pretty print shader, mainly used by shader compilers.
+ */
+void ff_vk_print_shader(void *ctx, FFSPIRVShader *shd, int prio);
+
+/**
+ * Initializes the pipeline layout after all shaders and descriptor sets have
+ * been finished.
+ */
+int ff_vk_init_pipeline_layout(FFVulkanContext *s, FFVulkanPipeline *pl);
+
+/**
+ * Initializes a compute pipeline. Will pick the first shader with the
+ * COMPUTE flag set.
+ */
+int ff_vk_init_compute_pipeline(FFVulkanContext *s, FFVulkanPipeline *pl);
+
+/**
+ * Updates a descriptor set via the updaters defined.
+ * Can be called immediately after pipeline creation, but must be called
+ * at least once before queue submission.
+ */
+void ff_vk_update_descriptor_set(FFVulkanContext *s, FFVulkanPipeline *pl,
+                                 int set_id);
+
+/**
+ * Init an execution context for command recording and queue submission.
+ * Will be auto-freed on uninit.
+ */
+int ff_vk_create_exec_ctx(FFVulkanContext *s, FFVkExecContext **ctx,
+                          FFVkQueueFamilyCtx *qf);
+
+/**
+ * Begin recording to the command buffer. Previous execution must have been
+ * completed, which ff_vk_submit_exec_queue() will ensure.
+ */
+int ff_vk_start_exec_recording(FFVulkanContext *s, FFVkExecContext *e);
+
+/**
+ * Add a command to bind the completed pipeline and its descriptor sets.
+ * Must be called after ff_vk_start_exec_recording() and before submission.
+ */
+void ff_vk_bind_pipeline_exec(FFVulkanContext *s, FFVkExecContext *e,
+                              FFVulkanPipeline *pl);
+
+/**
+ * Updates push constants.
+ * Must be called after binding a pipeline if any push constants were defined.
+ */
+void ff_vk_update_push_exec(FFVulkanContext *s, FFVkExecContext *e,
+                            VkShaderStageFlagBits stage, int offset,
+                            size_t size, void *src);
+
+/**
+ * Gets the command buffer to use for this submission from the exec context.
+ */
+VkCommandBuffer ff_vk_get_exec_buf(FFVkExecContext *e);
+
+/**
+ * Adds a generic AVBufferRef as a queue dependency.
+ */
+int ff_vk_add_dep_exec_ctx(FFVulkanContext *s, FFVkExecContext *e,
+                           AVBufferRef **deps, int nb_deps);
+
+/**
+ * Discards all queue dependencies
+ */
+void ff_vk_discard_exec_deps(FFVkExecContext *e);
+
+/**
+ * Adds a frame as a queue dependency. This also manages semaphore signalling.
+ * Must be called before submission.
+ */
+int ff_vk_add_exec_dep(FFVulkanContext *s, FFVkExecContext *e, AVFrame *frame,
+                       VkPipelineStageFlagBits in_wait_dst_flag);
+
+/**
+ * Submits a command buffer to the queue for execution.
+ * Will block until execution has finished in order to simplify resource
+ * management.
+ */
+int ff_vk_submit_exec_queue(FFVulkanContext *s, FFVkExecContext *e);
+
+/**
+ * Create a VkBuffer with the specified parameters.
+ */
+int ff_vk_create_buf(FFVulkanContext *s, FFVkBuffer *buf, size_t size,
+                     VkBufferUsageFlags usage, VkMemoryPropertyFlagBits flags);
+
+/**
+ * Maps the buffer to userspace. Set invalidate to 1 if reading the contents
+ * is necessary.
+ */
+int ff_vk_map_buffers(FFVulkanContext *s, FFVkBuffer *buf, uint8_t *mem[],
+                      int nb_buffers, int invalidate);
+
+/**
+ * Unmaps the buffer from userspace. Set flush to 1 to write and sync.
+ */
+int ff_vk_unmap_buffers(FFVulkanContext *s, FFVkBuffer *buf, int nb_buffers,
+                        int flush);
+
+/**
+ * Frees a buffer.
+ */
+void ff_vk_free_buf(FFVulkanContext *s, FFVkBuffer *buf);
+
+/**
+ * Frees the main Vulkan context.
+ */
+void ff_vk_uninit(FFVulkanContext *s);
+
+#endif /* AVUTIL_VULKAN_H */
diff --git a/libavutil/vulkan_glslang.c b/libavutil/vulkan_glslang.c
new file mode 100644
index 0000000000..d19a36c5fb
--- /dev/null
+++ b/libavutil/vulkan_glslang.c
@@ -0,0 +1,256 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <pthread.h>
+
+#include <glslang/build_info.h>
+#include <glslang/Include/glslang_c_interface.h>
+
+#include "mem.h"
+#include "avassert.h"
+
+#include "vulkan_glslang.h"
+
+static pthread_mutex_t glslang_mutex = PTHREAD_MUTEX_INITIALIZER;
+static int glslang_refcount = 0;
+
+static const glslang_resource_t glslc_resource_limits = {
+    .max_lights = 32,
+    .max_clip_planes = 6,
+    .max_texture_units = 32,
+    .max_texture_coords = 32,
+    .max_vertex_attribs = 64,
+    .max_vertex_uniform_components = 4096,
+    .max_varying_floats = 64,
+    .max_vertex_texture_image_units = 32,
+    .max_combined_texture_image_units = 80,
+    .max_texture_image_units = 32,
+    .max_fragment_uniform_components = 4096,
+    .max_draw_buffers = 32,
+    .max_vertex_uniform_vectors = 128,
+    .max_varying_vectors = 8,
+    .max_fragment_uniform_vectors = 16,
+    .max_vertex_output_vectors = 16,
+    .max_fragment_input_vectors = 15,
+    .min_program_texel_offset = -8,
+    .max_program_texel_offset = 7,
+    .max_clip_distances = 8,
+    .max_compute_work_group_count_x = 65535,
+    .max_compute_work_group_count_y = 65535,
+    .max_compute_work_group_count_z = 65535,
+    .max_compute_work_group_size_x = 1024,
+    .max_compute_work_group_size_y = 1024,
+    .max_compute_work_group_size_z = 64,
+    .max_compute_uniform_components = 1024,
+    .max_compute_texture_image_units = 16,
+    .max_compute_image_uniforms = 8,
+    .max_compute_atomic_counters = 8,
+    .max_compute_atomic_counter_buffers = 1,
+    .max_varying_components = 60,
+    .max_vertex_output_components = 64,
+    .max_geometry_input_components = 64,
+    .max_geometry_output_components = 128,
+    .max_fragment_input_components = 128,
+    .max_image_units = 8,
+    .max_combined_image_units_and_fragment_outputs = 8,
+    .max_combined_shader_output_resources = 8,
+    .max_image_samples = 0,
+    .max_vertex_image_uniforms = 0,
+    .max_tess_control_image_uniforms = 0,
+    .max_tess_evaluation_image_uniforms = 0,
+    .max_geometry_image_uniforms = 0,
+    .max_fragment_image_uniforms = 8,
+    .max_combined_image_uniforms = 8,
+    .max_geometry_texture_image_units = 16,
+    .max_geometry_output_vertices = 256,
+    .max_geometry_total_output_components = 1024,
+    .max_geometry_uniform_components = 1024,
+    .max_geometry_varying_components = 64,
+    .max_tess_control_input_components = 128,
+    .max_tess_control_output_components = 128,
+    .max_tess_control_texture_image_units = 16,
+    .max_tess_control_uniform_components = 1024,
+    .max_tess_control_total_output_components = 4096,
+    .max_tess_evaluation_input_components = 128,
+    .max_tess_evaluation_output_components = 128,
+    .max_tess_evaluation_texture_image_units = 16,
+    .max_tess_evaluation_uniform_components = 1024,
+    .max_tess_patch_components = 120,
+    .max_patch_vertices = 32,
+    .max_tess_gen_level = 64,
+    .max_viewports = 16,
+    .max_vertex_atomic_counters = 0,
+    .max_tess_control_atomic_counters = 0,
+    .max_tess_evaluation_atomic_counters = 0,
+    .max_geometry_atomic_counters = 0,
+    .max_fragment_atomic_counters = 8,
+    .max_combined_atomic_counters = 8,
+    .max_atomic_counter_bindings = 1,
+    .max_vertex_atomic_counter_buffers = 0,
+    .max_tess_control_atomic_counter_buffers = 0,
+    .max_tess_evaluation_atomic_counter_buffers = 0,
+    .max_geometry_atomic_counter_buffers = 0,
+    .max_fragment_atomic_counter_buffers = 1,
+    .max_combined_atomic_counter_buffers = 1,
+    .max_atomic_counter_buffer_size = 16384,
+    .max_transform_feedback_buffers = 4,
+    .max_transform_feedback_interleaved_components = 64,
+    .max_cull_distances = 8,
+    .max_combined_clip_and_cull_distances = 8,
+    .max_samples = 4,
+    .max_mesh_output_vertices_nv = 256,
+    .max_mesh_output_primitives_nv = 512,
+    .max_mesh_work_group_size_x_nv = 32,
+    .max_mesh_work_group_size_y_nv = 1,
+    .max_mesh_work_group_size_z_nv = 1,
+    .max_task_work_group_size_x_nv = 32,
+    .max_task_work_group_size_y_nv = 1,
+    .max_task_work_group_size_z_nv = 1,
+    .max_mesh_view_count_nv = 4,
+    .maxDualSourceDrawBuffersEXT = 1,
+
+    .limits = {
+        .non_inductive_for_loops = 1,
+        .while_loops = 1,
+        .do_while_loops = 1,
+        .general_uniform_indexing = 1,
+        .general_attribute_matrix_vector_indexing = 1,
+        .general_varying_indexing = 1,
+        .general_sampler_indexing = 1,
+        .general_variable_indexing = 1,
+        .general_constant_matrix_vector_indexing = 1,
+    }
+};
+
+int ff_vk_glslang_shader_compile(void *avctx, FFSPIRVShader *shd,
+                                 uint8_t **data, size_t *size, void **opaque)
+{
+    const char *messages;
+    glslang_shader_t *glslc_shader;
+    glslang_program_t *glslc_program;
+
+    static const glslang_stage_t glslc_stage[] = {
+        [VK_SHADER_STAGE_VERTEX_BIT]   = GLSLANG_STAGE_VERTEX,
+        [VK_SHADER_STAGE_FRAGMENT_BIT] = GLSLANG_STAGE_FRAGMENT,
+        [VK_SHADER_STAGE_COMPUTE_BIT]  = GLSLANG_STAGE_COMPUTE,
+    };
+
+    const glslang_input_t glslc_input = {
+        .language                          = GLSLANG_SOURCE_GLSL,
+        .stage                             = glslc_stage[shd->shader.stage],
+        .client                            = GLSLANG_CLIENT_VULKAN,
+        /* GLSLANG_TARGET_VULKAN_1_2 before 11.6 resulted in targeting 1.0 */
+#if (((GLSLANG_VERSION_MAJOR) > 11) || ((GLSLANG_VERSION_MAJOR) == 11 && \
+     (((GLSLANG_VERSION_MINOR) >  6) || ((GLSLANG_VERSION_MINOR) == 6 && \
+      ((GLSLANG_VERSION_PATCH) > 0)))))
+        .client_version                    = GLSLANG_TARGET_VULKAN_1_2,
+        .target_language_version           = GLSLANG_TARGET_SPV_1_5,
+#else
+        .client_version                    = GLSLANG_TARGET_VULKAN_1_1,
+        .target_language_version           = GLSLANG_TARGET_SPV_1_3,
+#endif
+        .target_language                   = GLSLANG_TARGET_SPV,
+        .code                              = shd->src.str,
+        .default_version                   = 460,
+        .default_profile                   = GLSLANG_NO_PROFILE,
+        .force_default_version_and_profile = false,
+        .forward_compatible                = false,
+        .messages                          = GLSLANG_MSG_DEFAULT_BIT,
+        .resource                          = &glslc_resource_limits,
+    };
+
+    av_assert0(glslang_refcount);
+
+    if (!(glslc_shader = glslang_shader_create(&glslc_input)))
+        return AVERROR(ENOMEM);
+
+    if (!glslang_shader_preprocess(glslc_shader, &glslc_input)) {
+        ff_vk_print_shader(avctx, shd, AV_LOG_WARNING);
+        av_log(avctx, AV_LOG_ERROR, "Unable to preprocess shader: %s (%s)!\n",
+               glslang_shader_get_info_log(glslc_shader),
+               glslang_shader_get_info_debug_log(glslc_shader));
+        glslang_shader_delete(glslc_shader);
+        return AVERROR(EINVAL);
+    }
+
+    if (!glslang_shader_parse(glslc_shader, &glslc_input)) {
+        ff_vk_print_shader(avctx, shd, AV_LOG_WARNING);
+        av_log(avctx, AV_LOG_ERROR, "Unable to parse shader: %s (%s)!\n",
+               glslang_shader_get_info_log(glslc_shader),
+               glslang_shader_get_info_debug_log(glslc_shader));
+        glslang_shader_delete(glslc_shader);
+        return AVERROR(EINVAL);
+    }
+
+    if (!(glslc_program = glslang_program_create())) {
+        glslang_shader_delete(glslc_shader);
+        return AVERROR(EINVAL);
+    }
+
+    glslang_program_add_shader(glslc_program, glslc_shader);
+
+    if (!glslang_program_link(glslc_program, GLSLANG_MSG_SPV_RULES_BIT |
+                                             GLSLANG_MSG_VULKAN_RULES_BIT)) {
+        ff_vk_print_shader(avctx, shd, AV_LOG_WARNING);
+        av_log(avctx, AV_LOG_ERROR, "Unable to link shader: %s (%s)!\n",
+               glslang_program_get_info_log(glslc_program),
+               glslang_program_get_info_debug_log(glslc_program));
+        glslang_program_delete(glslc_program);
+        glslang_shader_delete(glslc_shader);
+        return AVERROR(EINVAL);
+    }
+
+    glslang_program_SPIRV_generate(glslc_program, glslc_input.stage);
+
+    messages = glslang_program_SPIRV_get_messages(glslc_program);
+    if (messages)
+        av_log(avctx, AV_LOG_WARNING, "%s\n", messages);
+
+    glslang_shader_delete(glslc_shader);
+
+    *size   = glslang_program_SPIRV_get_size(glslc_program) * sizeof(unsigned int);
+    *data   = (void *)glslang_program_SPIRV_get_ptr(glslc_program);
+    *opaque = glslc_program;
+
+    return 0;
+}
+
+void ff_vk_glslang_shader_free(void *opaque)
+{
+    glslang_program_delete(opaque);
+}
+
+int ff_vk_glslang_init(void)
+{
+    int ret = 0;
+
+    pthread_mutex_lock(&glslang_mutex);
+    if (glslang_refcount++ == 0)
+        ret = !glslang_initialize_process();
+    pthread_mutex_unlock(&glslang_mutex);
+
+    return ret;
+}
+
+void ff_vk_glslang_uninit(void)
+{
+    pthread_mutex_lock(&glslang_mutex);
+    if (glslang_refcount && (--glslang_refcount == 0))
+        glslang_finalize_process();
+    pthread_mutex_unlock(&glslang_mutex);
+}
diff --git a/libavfilter/glslang.h b/libavutil/vulkan_glslang.h
similarity index 87%
rename from libavfilter/glslang.h
rename to libavutil/vulkan_glslang.h
index 93a077dbfc..6a848ad6c0 100644
--- a/libavfilter/glslang.h
+++ b/libavutil/vulkan_glslang.h
@@ -16,8 +16,8 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-#ifndef AVFILTER_GLSLANG_H
-#define AVFILTER_GLSLANG_H
+#ifndef AVUTIL_GLSLANG_H
+#define AVUTIL_GLSLANG_H
 
 #include "vulkan.h"
 
@@ -30,7 +30,7 @@ void ff_vk_glslang_uninit(void);
 /**
  * Compile GLSL into SPIR-V using glslang.
  */
-int ff_vk_glslang_shader_compile(AVFilterContext *avctx, FFSPIRVShader *shd,
+int ff_vk_glslang_shader_compile(void *avctx, FFSPIRVShader *shd,
                                  uint8_t **data, size_t *size, void **opaque);
 
 /**
@@ -38,4 +38,4 @@ int ff_vk_glslang_shader_compile(AVFilterContext *avctx, FFSPIRVShader *shd,
  */
 void ff_vk_glslang_shader_free(void *opaque);
 
-#endif /* AVFILTER_GLSLANG_H */
+#endif /* AVUTIL_GLSLANG_H */
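[Editorial sketch, not part of the patch: driving the glslang layer directly.
Normally ff_vk_compile_shader() does this internally; error paths are
abbreviated, and s/shd are from the earlier sketches.]

    uint8_t *spirv;
    size_t spirv_size;
    void *priv;

    if (ff_vk_glslang_init()) /* refcounted, process-wide */
        return AVERROR_EXTERNAL;

    err = ff_vk_glslang_shader_compile(s, shd, &spirv, &spirv_size, &priv);
    if (err >= 0) {
        /* ... feed the SPIR-V to vkCreateShaderModule() ... */
        ff_vk_glslang_shader_free(priv); /* frees the program owning spirv */
    }

    ff_vk_glslang_uninit();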