From patchwork Sun Apr 22 16:47:51 2018
X-Patchwork-Submitter: Rostislav Pehlivanov
X-Patchwork-Id: 8590
From: Rostislav Pehlivanov
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 22 Apr 2018 17:47:51 +0100
Message-Id: <20180422164751.22628-9-atomnuker@gmail.com>
In-Reply-To: <20180422164751.22628-1-atomnuker@gmail.com>
References: <20180422164751.22628-1-atomnuker@gmail.com>
Subject: [FFmpeg-devel] [PATCH v2 8/8] lavfi: add a Vulkan overlay filter

Could be done in-plane with the main image but framesync segfaults.

Signed-off-by: Rostislav Pehlivanov
---
 configure                       |   1 +
 libavfilter/Makefile            |   1 +
 libavfilter/allfilters.c        |   1 +
 libavfilter/vf_overlay_vulkan.c | 458 ++++++++++++++++++++++++++++++++
 4 files changed, 461 insertions(+)
 create mode 100644 libavfilter/vf_overlay_vulkan.c

diff --git a/configure b/configure
index 19a231104e..93000a3a4c 100755
--- a/configure
+++ b/configure
@@ -3354,6 +3354,7 @@ ocr_filter_deps="libtesseract"
 ocv_filter_deps="libopencv"
 openclsrc_filter_deps="opencl"
 overlay_opencl_filter_deps="opencl"
+overlay_vulkan_filter_deps="vulkan libshaderc"
 overlay_qsv_filter_deps="libmfx"
 overlay_qsv_filter_select="qsvvpp"
 owdenoise_filter_deps="gpl"
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 83a42563f5..0b858ac917 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -272,6 +272,7 @@ OBJS-$(CONFIG_OSCILLOSCOPE_FILTER)           += vf_datascope.o
 OBJS-$(CONFIG_OVERLAY_FILTER)                += vf_overlay.o framesync.o
 OBJS-$(CONFIG_OVERLAY_OPENCL_FILTER)         += vf_overlay_opencl.o opencl.o \
                                                 opencl/overlay.o framesync.o
+OBJS-$(CONFIG_OVERLAY_VULKAN_FILTER)         += vf_overlay_vulkan.o framesync.o
 OBJS-$(CONFIG_OVERLAY_QSV_FILTER)            += vf_overlay_qsv.o framesync.o
 OBJS-$(CONFIG_OWDENOISE_FILTER)              += vf_owdenoise.o
 OBJS-$(CONFIG_PAD_FILTER)                    += vf_pad.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 31966e9a21..8016fed36c 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -263,6 +263,7 @@ extern AVFilter ff_vf_ocv;
 extern AVFilter ff_vf_oscilloscope;
 extern AVFilter ff_vf_overlay;
 extern AVFilter ff_vf_overlay_opencl;
+extern AVFilter ff_vf_overlay_vulkan;
 extern AVFilter ff_vf_overlay_qsv;
 extern AVFilter ff_vf_owdenoise;
 extern AVFilter ff_vf_pad;
diff --git a/libavfilter/vf_overlay_vulkan.c b/libavfilter/vf_overlay_vulkan.c
new file mode 100644
index 0000000000..549e84e308
--- /dev/null
+++ b/libavfilter/vf_overlay_vulkan.c
@@ -0,0 +1,458 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/opt.h"
+#include "vulkan.h"
+#include "internal.h"
+#include "framesync.h"
+
+typedef struct OverlayVulkanContext {
+    VulkanFilterContext vkctx;
+
+    int initialized;
+    FFVkExecContext exec;
+    FFFrameSync fs;
+    FFVkBuffer params_buf;
+
+    /* Shader updaters, must be in the main filter struct */
+    VkDescriptorImageInfo main_images[3];
+    VkDescriptorImageInfo overlay_images[3];
+    VkDescriptorImageInfo output_images[3];
+    VkDescriptorBufferInfo params_desc;
+
+    int overlay_x;
+    int overlay_y;
+} OverlayVulkanContext;
+
+static const char overlay_noalpha[] = {
+    C(0, void overlay_noalpha(int i, ivec2 pos)                                )
+    C(0, {                                                                     )
+    C(1,     ivec2 overlay_size = imageSize(overlay_img[i]);                   )
+    C(1,     if ((o_offset[i].x <= pos.x) && (o_offset[i].y <= pos.y) &&
+                 (pos.x < (o_offset[i].x + overlay_size.x)) &&
+                 (pos.y < (o_offset[i].y + overlay_size.y))) {                 )
+    C(2,         vec4 res = imageLoad(overlay_img[i], pos - o_offset[i]);      )
+    C(2,         imageStore(output_img[i], pos, res);                          )
+    C(1,     } else {                                                          )
+    C(2,         vec4 res = imageLoad(main_img[i], pos);                       )
+    C(2,         imageStore(output_img[i], pos, res);                          )
+    C(1,     }                                                                 )
+    C(0, }                                                                     )
+};
+
+static av_cold int init_filter(AVFilterContext *ctx)
+{
+    int err;
+    OverlayVulkanContext *s = ctx->priv;
+
+    { /* Create the shader */
+        const int planes = av_pix_fmt_count_planes(s->vkctx.output_format);
+
+        SPIRVShader *shd = ff_vk_init_shader(ctx, "overlay_compute",
+                                             VK_SHADER_STAGE_COMPUTE_BIT);
+        ff_vk_set_compute_shader_sizes(ctx, shd, (int [3]){ 16, 16, 1 });
+
+        VulkanDescriptorSetBinding desc_i[3] = {
+            {
+                .name       = "main_img",
+                .type       = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
+                .mem_layout = "rgba8",
+                .mem_quali  = "readonly",
+                .dimensions = 2,
+                .elems      = planes,
+                .stages     = VK_SHADER_STAGE_COMPUTE_BIT,
+                .updater    = s->main_images,
+            },
+            {
+                .name       = "overlay_img",
+                .type       = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
+                .mem_layout = "rgba8",
+                .mem_quali  = "readonly",
+                .dimensions = 2,
+                .elems      = planes,
+                .stages     = VK_SHADER_STAGE_COMPUTE_BIT,
+                .updater    = s->overlay_images,
+            },
+            {
+                .name       = "output_img",
+                .type       = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
+                .mem_layout = "rgba8",
+                .mem_quali  = "writeonly",
+                .dimensions = 2,
+                .elems      = planes,
+                .stages     = VK_SHADER_STAGE_COMPUTE_BIT,
+                .updater    = s->output_images,
+            },
+        };
+
+        VulkanDescriptorSetBinding desc_b = {
+            .name        = "params",
+            .type        = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
+            .mem_quali   = "readonly",
+            .mem_layout  = "std430",
+            .stages      = VK_SHADER_STAGE_COMPUTE_BIT,
+            .updater     = &s->params_desc,
+            .buf_content = "ivec2 o_offset[3];",
+        };
+
+        RET(ff_vk_add_descriptor_set(ctx, shd, desc_i, 3, 0)); /* set 0 */
+        RET(ff_vk_add_descriptor_set(ctx, shd, &desc_b, 1, 0)); /* set 1 */
+
+        GLSLD(   overlay_noalpha                                          );
+        GLSLC(0, void main()                                              );
+        GLSLC(0, {                                                        );
+        GLSLC(1,     ivec2 pos = ivec2(gl_GlobalInvocationID.xy);         );
+        GLSLF(1,     int planes = %i;                             ,planes);
+        GLSLC(1,     for (int i = 0; i < planes; i++) {                   );
+        GLSLC(2,         overlay_noalpha(i, pos);                         );
+        GLSLC(1,     }                                                    );
+        GLSLC(0, }                                                        );
+
+        RET(ff_vk_compile_shader(ctx, shd, "main"));
+    }
+
+    RET(ff_vk_init_pipeline_layout(ctx));
+
+    {
+        struct {
+            int32_t o_offset[2*3];
+        } *par;
+
+        err = ff_vk_create_buf(ctx, &s->params_buf,
+                               sizeof(*par),
+                               VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
+                               VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT);
+        if (err)
+            return err;
+
+        err = ff_vk_map_buffers(ctx, &s->params_buf, (uint8_t **)&par, 1, 0);
+        if (err)
+            return err;
+
+        par->o_offset[0] = s->overlay_x;
+        par->o_offset[1] = s->overlay_y;
+        par->o_offset[2] = par->o_offset[0]/2;
+        par->o_offset[3] = par->o_offset[1]/2;
+        par->o_offset[4] = par->o_offset[0]/2;
+        par->o_offset[5] = par->o_offset[1]/2;
+
+        err = ff_vk_unmap_buffers(ctx, &s->params_buf, 1, 1);
+        if (err)
+            return err;
+
+        s->params_desc.buffer = s->params_buf.buf;
+        s->params_desc.range  = VK_WHOLE_SIZE;
+
+        ff_vk_update_descriptor_set(ctx, 1);
+    }
+
+    /* Execution context */
+    RET(ff_vk_create_exec_ctx(ctx, &s->exec,
+                              s->vkctx.hwctx->queue_family_comp_index));
+
+    /* The pipeline */
+    RET(ff_vk_init_compute_pipeline(ctx));
+
+    s->initialized = 1;
+
+    return 0;
+
+fail:
+    return err;
+}
+
+static int process_frames(AVFilterContext *avctx, AVFrame *out_f,
+                          AVFrame *main_f, AVFrame *overlay_f)
+{
+    int err;
+    OverlayVulkanContext *s = avctx->priv;
+    int planes = av_pix_fmt_count_planes(s->vkctx.output_format);
+
+    AVVkFrame *out     = (AVVkFrame *)out_f->data[0];
+    AVVkFrame *main    = (AVVkFrame *)main_f->data[0];
+    AVVkFrame *overlay = (AVVkFrame *)overlay_f->data[0];
+
+    AVHWFramesContext *main_fc    = (AVHWFramesContext*)main_f->hw_frames_ctx->data;
+    AVHWFramesContext *overlay_fc = (AVHWFramesContext*)overlay_f->hw_frames_ctx->data;
+
+    VkCommandBufferBeginInfo cmd_start = {
+        .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
+        .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
+    };
+
+    VkComponentMapping null_map = {
+        .r = VK_COMPONENT_SWIZZLE_IDENTITY,
+        .g = VK_COMPONENT_SWIZZLE_IDENTITY,
+        .b = VK_COMPONENT_SWIZZLE_IDENTITY,
+        .a = VK_COMPONENT_SWIZZLE_IDENTITY,
+    };
+
+    for (int i = 0; i < planes; i++) {
+        RET(ff_vk_create_imageview(avctx, &s->main_images[i].imageView, main,
+                                   ff_vk_plane_rep_fmt(main_fc->sw_format, i),
+                                   ff_vk_aspect_flags(main_fc->sw_format, i),
+                                   null_map, NULL));
+
+        RET(ff_vk_create_imageview(avctx, &s->overlay_images[i].imageView, overlay,
+                                   ff_vk_plane_rep_fmt(overlay_fc->sw_format, i),
+                                   ff_vk_aspect_flags(overlay_fc->sw_format, i),
+                                   null_map, NULL));
+
+        RET(ff_vk_create_imageview(avctx, &s->output_images[i].imageView, out,
+                                   ff_vk_plane_rep_fmt(s->vkctx.output_format, i),
+                                   ff_vk_aspect_flags(s->vkctx.output_format, i),
+                                   null_map, NULL));
+
+        s->main_images[i].imageLayout    = VK_IMAGE_LAYOUT_GENERAL;
+        s->overlay_images[i].imageLayout = VK_IMAGE_LAYOUT_GENERAL;
+        s->output_images[i].imageLayout  = VK_IMAGE_LAYOUT_GENERAL;
+    }
+
+    ff_vk_update_descriptor_set(avctx, 0);
+
+    vkBeginCommandBuffer(s->exec.buf, &cmd_start);
+
+    {
+        VkImageMemoryBarrier bar[3] = {
+            {
+                .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
+                .srcAccessMask = 0,
+                .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
+                .oldLayout = main->layout,
+                .newLayout = VK_IMAGE_LAYOUT_GENERAL,
+                .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
+                .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
+                .image = main->img,
+                .subresourceRange.aspectMask = ff_vk_aspect_flags(s->vkctx.input_format, -1),
+                .subresourceRange.levelCount = 1,
+                .subresourceRange.layerCount = 1,
+            },
+            {
+                .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
+                .srcAccessMask = 0,
+                .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
+                .oldLayout = overlay->layout,
+                .newLayout = VK_IMAGE_LAYOUT_GENERAL,
+                .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
+                .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
+                .image = overlay->img,
+                .subresourceRange.aspectMask = ff_vk_aspect_flags(s->vkctx.input_format, -1),
+                .subresourceRange.levelCount = 1,
+                .subresourceRange.layerCount = 1,
+            },
+            {
+                .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
+                .srcAccessMask = 0,
+                .dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT,
+                .oldLayout = out->layout,
+                .newLayout = VK_IMAGE_LAYOUT_GENERAL,
+                .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
+                .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
+                .image = out->img,
+                .subresourceRange.aspectMask = ff_vk_aspect_flags(s->vkctx.input_format, -1),
+                .subresourceRange.levelCount = 1,
+                .subresourceRange.layerCount = 1,
+            },
+        };
+
+        vkCmdPipelineBarrier(s->exec.buf, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
+                             VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, 0,
+                             0, NULL, 0, NULL, 3, bar);
+
+        main->layout = bar[0].newLayout;
+        main->access = bar[0].dstAccessMask;
+
+        overlay->layout = bar[1].newLayout;
+        overlay->access = bar[1].dstAccessMask;
+
+        out->layout = bar[2].newLayout;
+        out->access = bar[2].dstAccessMask;
+    }
+
+    vkCmdBindPipeline(s->exec.buf, VK_PIPELINE_BIND_POINT_COMPUTE, s->vkctx.pipeline);
+    vkCmdBindDescriptorSets(s->exec.buf, VK_PIPELINE_BIND_POINT_COMPUTE, s->vkctx.pipeline_layout, 0, s->vkctx.descriptor_sets_num, s->vkctx.desc_set, 0, 0);
+    vkCmdDispatch(s->exec.buf,
+                  FFALIGN(s->vkctx.output_width,  s->vkctx.shaders[0].local_size[0])/s->vkctx.shaders[0].local_size[0],
+                  FFALIGN(s->vkctx.output_height, s->vkctx.shaders[0].local_size[1])/s->vkctx.shaders[0].local_size[1], 1);
+
+    vkEndCommandBuffer(s->exec.buf);
+
+    VkSubmitInfo s_info = {
+        .sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO,
+        .commandBufferCount = 1,
+        .pCommandBuffers    = &s->exec.buf,
+    };
+
+    VkResult ret = vkQueueSubmit(s->exec.queue, 1, &s_info, s->exec.fence);
+    if (ret != VK_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "Unable to submit command buffer: %s\n",
+               ff_vk_ret2str(ret));
+        err = AVERROR_EXTERNAL;
+    } else {
+        vkWaitForFences(s->vkctx.hwctx->act_dev, 1, &s->exec.fence, VK_TRUE, UINT64_MAX);
+        vkResetFences(s->vkctx.hwctx->act_dev, 1, &s->exec.fence);
+    }
+
+fail:
+
+    for (int i = 0; i < planes; i++) {
+        ff_vk_destroy_imageview(avctx, s->main_images[i].imageView);
+        ff_vk_destroy_imageview(avctx, s->overlay_images[i].imageView);
+        ff_vk_destroy_imageview(avctx, s->output_images[i].imageView);
+    }
+
+    return err;
+}
+
+static int overlay_vulkan_blend(FFFrameSync *fs)
+{
+    int err;
+    AVFilterContext *ctx = fs->parent;
+    OverlayVulkanContext *s = ctx->priv;
+    AVFilterLink *outlink = ctx->outputs[0];
+    AVFrame *input_main, *input_overlay, *out = NULL;
+
+    err = ff_framesync_get_frame(fs, 0, &input_main, 0);
+    if (err < 0)
+        goto fail;
+    err = ff_framesync_get_frame(fs, 1, &input_overlay, 0);
+    if (err < 0)
+        goto fail;
+
+    if (!input_main || !input_overlay)
+        return 0;
+
+    if (!s->initialized) {
+        AVHWFramesContext *main_fc    = (AVHWFramesContext*)input_main->hw_frames_ctx->data;
+        AVHWFramesContext *overlay_fc = (AVHWFramesContext*)input_overlay->hw_frames_ctx->data;
+        if (main_fc->sw_format != overlay_fc->sw_format) {
+            av_log(ctx, AV_LOG_ERROR, "Mismatching sw formats!\n");
+            return AVERROR(EINVAL);
+        }
+        RET(init_filter(ctx));
+    }
+
+    out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+    if (!out) {
+        err = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    RET(process_frames(ctx, out, input_main, input_overlay));
+
+    err = av_frame_copy_props(out, input_main);
+    if (err < 0)
+        goto fail;
+
+    return ff_filter_frame(outlink, out);
+
+fail:
+    av_frame_free(&out);
+    return err;
+}
+
+static int overlay_vulkan_config_output(AVFilterLink *outlink)
+{
+    int err;
+    AVFilterContext *avctx = outlink->src;
+    OverlayVulkanContext *s = avctx->priv;
+
+    err = ff_vk_filter_config_output(outlink);
+    if (err < 0)
+        return err;
+
+    err = ff_framesync_init_dualinput(&s->fs, avctx);
+    if (err < 0)
+        return err;
+
+    return ff_framesync_configure(&s->fs);
+}
+
+static int overlay_vulkan_activate(AVFilterContext *avctx)
+{
+    OverlayVulkanContext *s = avctx->priv;
+
+    return ff_framesync_activate(&s->fs);
+}
+
+static av_cold int overlay_vulkan_init(AVFilterContext *avctx)
+{
+    OverlayVulkanContext *s = avctx->priv;
+
+    s->fs.on_event = &overlay_vulkan_blend;
+
+    return ff_vk_filter_init(avctx);
+}
+
+static void overlay_vulkan_uninit(AVFilterContext *avctx)
+{
+    OverlayVulkanContext *s = avctx->priv;
+
+    ff_vk_free_exec_ctx(avctx, &s->exec);
+    ff_vk_filter_uninit(avctx);
+    ff_framesync_uninit(&s->fs);
+
+    s->initialized = 0;
+}
+
+#define OFFSET(x) offsetof(OverlayVulkanContext, x)
+#define FLAGS (AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
+static const AVOption overlay_vulkan_options[] = {
+    { "x", "Set horizontal offset", OFFSET(overlay_x), AV_OPT_TYPE_INT, {.i64 = 0}, 0, INT_MAX, .flags = FLAGS },
+    { "y", "Set vertical offset",   OFFSET(overlay_y), AV_OPT_TYPE_INT, {.i64 = 0}, 0, INT_MAX, .flags = FLAGS },
+    { NULL },
+};
+
+AVFILTER_DEFINE_CLASS(overlay_vulkan);
+
+static const AVFilterPad overlay_vulkan_inputs[] = {
+    {
+        .name         = "main",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .config_props = &ff_vk_filter_config_input,
+    },
+    {
+        .name         = "overlay",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .config_props = &ff_vk_filter_config_input,
+    },
+    { NULL }
+};
+
+static const AVFilterPad overlay_vulkan_outputs[] = {
+    {
+        .name         = "default",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .config_props = &overlay_vulkan_config_output,
+    },
+    { NULL }
+};
+
+AVFilter ff_vf_overlay_vulkan = {
+    .name           = "overlay_vulkan",
+    .description    = NULL_IF_CONFIG_SMALL("Overlay a source on top of another"),
+    .priv_size      = sizeof(OverlayVulkanContext),
+    .init           = &overlay_vulkan_init,
+    .uninit         = &overlay_vulkan_uninit,
+    .query_formats  = &ff_vk_filter_query_formats,
+    .activate       = &overlay_vulkan_activate,
+    .inputs         = overlay_vulkan_inputs,
+    .outputs        = overlay_vulkan_outputs,
+    .priv_class     = &overlay_vulkan_class,
+    .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE,
+};
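
For anyone checking the filter output, the overlay_noalpha() shader in this patch does a plain rectangle test per plane: samples inside the overlay rectangle come from the overlay image, all others from the main image. A minimal CPU sketch of that selection follows. It is not part of the patch: the Plane struct and overlay_noalpha_ref() are hypothetical names, and one byte per sample is assumed for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-plane image, one byte per sample (an assumption made
 * for this sketch; the filter itself works on Vulkan images). */
typedef struct Plane {
    int w, h;
    uint8_t *data; /* w * h samples, row-major */
} Plane;

/* CPU counterpart of the shader's overlay_noalpha() for one plane:
 * take the overlay sample where (x, y) lies inside the overlay rectangle
 * placed at (off_x, off_y), and the main sample everywhere else. */
static void overlay_noalpha_ref(Plane *out, const Plane *main_p,
                                const Plane *ovr, int off_x, int off_y)
{
    assert(out->w == main_p->w && out->h == main_p->h);

    for (int y = 0; y < out->h; y++) {
        for (int x = 0; x < out->w; x++) {
            int inside = off_x <= x && off_y <= y &&
                         x < off_x + ovr->w && y < off_y + ovr->h;

            out->data[y * out->w + x] = inside
                ? ovr->data[(y - off_y) * ovr->w + (x - off_x)]
                : main_p->data[y * main_p->w + x];
        }
    }
}
```

Running this per plane with the per-plane offsets from the params buffer should give a reference to compare a downloaded output frame against.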
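
Two bits of arithmetic in the patch may be worth spelling out. The dispatch size FFALIGN(size, local)/local is a ceiling division, and the params buffer halves the x/y offsets for planes 1 and 2, which matches 4:2:0 subsampling. The sketch below restates both. group_count() and plane_offsets() are hypothetical helpers, not functions from the patch; the shift arguments are assumed to come from the log2_chroma_w and log2_chroma_h fields of AVPixFmtDescriptor, and every plane after the first is assumed to be a chroma plane.

```c
#include <assert.h>
#include <stdint.h>

/* Number of workgroups needed to cover `size` invocations with groups of
 * `local` invocations, i.e. ceil(size / local). For a power-of-two `local`
 * this equals FFALIGN(size, local) / local as used in vkCmdDispatch(). */
static int group_count(int size, int local)
{
    assert(size >= 0 && local > 0);
    return (size + local - 1) / local;
}

/* Fill o_offset-style (x, y) pairs, one pair per plane. Plane 0 uses the
 * offset as given; later planes are shifted by the chroma subsampling.
 * With both shifts set to 1 this reproduces the patch's hard-coded "/2"
 * for the non-negative offsets the options allow. */
static void plane_offsets(int x, int y, int log2_chroma_w, int log2_chroma_h,
                          int planes, int32_t off[6])
{
    assert(x >= 0 && y >= 0 && planes >= 1 && planes <= 3);

    for (int i = 0; i < planes; i++) {
        int sx = i ? log2_chroma_w : 0;
        int sy = i ? log2_chroma_h : 0;

        off[2 * i + 0] = x >> sx;
        off[2 * i + 1] = y >> sy;
    }
}
```

With a 16x16 local size, a 1920x1080 output needs 120x68 workgroups, so the last row of groups extends past the image height. Passing shifts of 0 gives the offsets a 4:4:4 format would need, which the fixed /2 does not cover.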