From patchwork Tue Aug 31 01:43:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wenbin Chen
X-Patchwork-Id: 29886
From: wenbin.chen@intel.com
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 31 Aug 2021 09:43:31 +0800
Message-Id: <20210831014338.134086-3-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210831014338.134086-1-wenbin.chen@intel.com>
References: <20210831014338.134086-1-wenbin.chen@intel.com>
Subject: [FFmpeg-devel] [PATCH 03/10] libavfilter/vulkan: Fix the way to use sem
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: "Chen,Wenbin"

From: "Chen,Wenbin"

We should set waitSem and signalSem differently. Current ffmpeg-vulkan
uses the same semaphore for both waitSem and signalSem, which does not
work on the latest intel-vulkan-driver. The Mesa commit
a193060221c4df123e26a562949cae5df3e73cde causes this problem: it adds
code that resets the signalSem, which on current ffmpeg-vulkan resets
the waitSem too. Set waitSem and signalSem separately.

With this change the following command runs on the latest Mesa on Intel
platforms:
ffmpeg -v verbose -init_hw_device vulkan=vul:0,linear_images=1 -filter_hw_device vul
-i input1080p.264 -vf "hwupload=extra_hw_frames=16,scale_vulkan=1920:1080,
hwdownload,format=yuv420p" -f rawvideo output.yuv

Signed-off-by: Wenbin Chen
---
 libavfilter/vf_avgblur_vulkan.c   |  4 +--
 libavfilter/vf_chromaber_vulkan.c |  4 +--
 libavfilter/vf_overlay_vulkan.c   |  6 ++--
 libavfilter/vf_scale_vulkan.c     |  4 +--
 libavfilter/vulkan.c              | 55 +++++++++++++++++--------------
 libavfilter/vulkan.h              |  3 +-
 libavutil/hwcontext_vulkan.c      | 14 ++++----
 7 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/libavfilter/vf_avgblur_vulkan.c b/libavfilter/vf_avgblur_vulkan.c
index 5ae487fc8c..d2104c191e 100644
--- a/libavfilter/vf_avgblur_vulkan.c
+++ b/libavfilter/vf_avgblur_vulkan.c
@@ -304,8 +304,8 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *tmp_f
     vkCmdDispatch(cmd_buf, s->vkctx.output_width,
                   FFALIGN(s->vkctx.output_height, CGS)/CGS, 1);
 
-    ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
-    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
+    ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 1);
+    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 0);
 
     err = ff_vk_submit_exec_queue(avctx, s->exec);
     if (err)
diff --git a/libavfilter/vf_chromaber_vulkan.c b/libavfilter/vf_chromaber_vulkan.c
index 96fdd7bd9c..fe66a31cea 100644
--- a/libavfilter/vf_chromaber_vulkan.c
+++ b/libavfilter/vf_chromaber_vulkan.c
@@ -249,8 +249,8 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f)
                   FFALIGN(s->vkctx.output_width, CGROUPS[0])/CGROUPS[0],
                   FFALIGN(s->vkctx.output_height, CGROUPS[1])/CGROUPS[1], 1);
 
-    ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
-    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
+    ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 1);
+    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 0);
 
     err = ff_vk_submit_exec_queue(avctx, s->exec);
     if (err)
diff --git a/libavfilter/vf_overlay_vulkan.c b/libavfilter/vf_overlay_vulkan.c
index 1815709d82..2e5bef5be5 100644
--- a/libavfilter/vf_overlay_vulkan.c
+++ b/libavfilter/vf_overlay_vulkan.c
@@ -331,9 +331,9 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f,
                   FFALIGN(s->vkctx.output_width, CGROUPS[0])/CGROUPS[0],
                   FFALIGN(s->vkctx.output_height, CGROUPS[1])/CGROUPS[1], 1);
 
-    ff_vk_add_exec_dep(avctx, s->exec, main_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
-    ff_vk_add_exec_dep(avctx, s->exec, overlay_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
-    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
+    ff_vk_add_exec_dep(avctx, s->exec, main_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 1);
+    ff_vk_add_exec_dep(avctx, s->exec, overlay_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 1);
+    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 0);
 
     err = ff_vk_submit_exec_queue(avctx, s->exec);
     if (err)
diff --git a/libavfilter/vf_scale_vulkan.c b/libavfilter/vf_scale_vulkan.c
index 4eb4fe5664..0d946e0416 100644
--- a/libavfilter/vf_scale_vulkan.c
+++ b/libavfilter/vf_scale_vulkan.c
@@ -377,8 +377,8 @@ static int process_frames(AVFilterContext *avctx, AVFrame *out_f, AVFrame *in_f)
                   FFALIGN(s->vkctx.output_width, CGROUPS[0])/CGROUPS[0],
                   FFALIGN(s->vkctx.output_height, CGROUPS[1])/CGROUPS[1], 1);
 
-    ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
-    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT);
+    ff_vk_add_exec_dep(avctx, s->exec, in_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 1);
+    ff_vk_add_exec_dep(avctx, s->exec, out_f, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, 0);
 
     err = ff_vk_submit_exec_queue(avctx, s->exec);
     if (err)
diff --git a/libavfilter/vulkan.c b/libavfilter/vulkan.c
index e5b070b3e6..e8cbf66b2b 100644
--- a/libavfilter/vulkan.c
+++ b/libavfilter/vulkan.c
@@ -462,9 +462,10 @@ VkCommandBuffer ff_vk_get_exec_buf(AVFilterContext *avctx, FFVkExecContext *e)
 }
 
 int ff_vk_add_exec_dep(AVFilterContext *avctx, FFVkExecContext *e,
-                       AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag)
+                       AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag, int input_frame)
 {
     AVFrame **dst;
+    VkSemaphore *sem_temp;
     VulkanFilterContext *s = avctx->priv;
     AVVkFrame *f = (AVVkFrame *)frame->data[0];
     FFVkQueueCtx *q = &e->queues[s->cur_queue_idx];
@@ -472,33 +473,39 @@ int ff_vk_add_exec_dep(AVFilterContext *avctx, FFVkExecContext *e,
     int planes = av_pix_fmt_count_planes(fc->sw_format);
 
     for (int i = 0; i < planes; i++) {
-        e->sem_wait = av_fast_realloc(e->sem_wait, &e->sem_wait_alloc,
-                                      (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait));
-        if (!e->sem_wait) {
-            ff_vk_discard_exec_deps(avctx, e);
-            return AVERROR(ENOMEM);
-        }
+        if (input_frame) {
+            sem_temp = av_fast_realloc(e->sem_wait, &e->sem_wait_alloc,
+                                       (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait));
+            if (!sem_temp) {
+                ff_vk_discard_exec_deps(avctx, e);
+                return AVERROR(ENOMEM);
+            }
+            e->sem_wait = sem_temp;
 
-        e->sem_wait_dst = av_fast_realloc(e->sem_wait_dst, &e->sem_wait_dst_alloc,
-                                          (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait_dst));
-        if (!e->sem_wait_dst) {
-            ff_vk_discard_exec_deps(avctx, e);
-            return AVERROR(ENOMEM);
-        }
+            sem_temp = av_fast_realloc(e->sem_wait_dst, &e->sem_wait_dst_alloc,
+                                       (e->sem_wait_cnt + 1)*sizeof(*e->sem_wait_dst));
+            if (!sem_temp) {
+                ff_vk_discard_exec_deps(avctx, e);
+                return AVERROR(ENOMEM);
+            }
+            e->sem_wait_dst = sem_temp;
 
-        e->sem_sig = av_fast_realloc(e->sem_sig, &e->sem_sig_alloc,
-                                     (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig));
-        if (!e->sem_sig) {
-            ff_vk_discard_exec_deps(avctx, e);
-            return AVERROR(ENOMEM);
-        }
+            e->sem_wait[e->sem_wait_cnt] = f->sem[i];
+            e->sem_wait_dst[e->sem_wait_cnt] = in_wait_dst_flag;
+            e->sem_wait_cnt++;
+        } else {
 
-        e->sem_wait[e->sem_wait_cnt] = f->sem[i];
-        e->sem_wait_dst[e->sem_wait_cnt] = in_wait_dst_flag;
-        e->sem_wait_cnt++;
+            sem_temp = av_fast_realloc(e->sem_sig, &e->sem_sig_alloc,
+                                       (e->sem_sig_cnt + 1)*sizeof(*e->sem_sig));
+            if (!sem_temp) {
+                ff_vk_discard_exec_deps(avctx, e);
+                return AVERROR(ENOMEM);
+            }
+            e->sem_sig = sem_temp;
 
-        e->sem_sig[e->sem_sig_cnt] = f->sem[i];
-        e->sem_sig_cnt++;
+            e->sem_sig[e->sem_sig_cnt] = f->sem[i];
+            e->sem_sig_cnt++;
+        }
     }
 
     dst = av_fast_realloc(q->frame_deps, &q->frame_deps_alloc_size,
diff --git a/libavfilter/vulkan.h b/libavfilter/vulkan.h
index f9a4dc5839..2fdc0e1368 100644
--- a/libavfilter/vulkan.h
+++ b/libavfilter/vulkan.h
@@ -340,7 +340,8 @@ void ff_vk_discard_exec_deps(AVFilterContext *avctx, FFVkExecContext *e);
  * Must be called before submission.
  */
 int ff_vk_add_exec_dep(AVFilterContext *avctx, FFVkExecContext *e,
-                       AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag);
+                       AVFrame *frame, VkPipelineStageFlagBits in_wait_dst_flag,
+                       int input_frame);
 
 /**
  * Submits a command buffer to the queue for execution.
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 88db5b8b70..9a29267aed 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -1737,8 +1737,6 @@ static int prepare_frame(AVHWFramesContext *hwfc, VulkanExecCtx *ectx,
 
     VkSubmitInfo s_info = {
         .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
-        .pSignalSemaphores = frame->sem,
-        .signalSemaphoreCount = planes,
     };
 
     VkPipelineStageFlagBits wait_st[AV_NUM_DATA_POINTERS];
@@ -1750,11 +1748,15 @@ static int prepare_frame(AVHWFramesContext *hwfc, VulkanExecCtx *ectx,
         new_layout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
         new_access = VK_ACCESS_TRANSFER_WRITE_BIT;
         dst_qf = VK_QUEUE_FAMILY_IGNORED;
+        s_info.pSignalSemaphores = frame->sem;
+        s_info.signalSemaphoreCount = planes;
         break;
     case PREP_MODE_RO_SHADER:
         new_layout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL;
         new_access = VK_ACCESS_TRANSFER_READ_BIT;
         dst_qf = VK_QUEUE_FAMILY_IGNORED;
+        s_info.pSignalSemaphores = frame->sem;
+        s_info.signalSemaphoreCount = planes;
         break;
     case PREP_MODE_EXTERNAL_EXPORT:
         new_layout = VK_IMAGE_LAYOUT_GENERAL;
@@ -3226,11 +3228,11 @@ static int transfer_image_buf(AVHWFramesContext *hwfc, const AVFrame *f,
 
     VkSubmitInfo s_info = {
         .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
-        .pSignalSemaphores = frame->sem,
-        .pWaitSemaphores = frame->sem,
+        .pSignalSemaphores = to_buf ? NULL: frame->sem,
+        .pWaitSemaphores = to_buf ? frame->sem : NULL,
         .pWaitDstStageMask = sem_wait_dst,
-        .signalSemaphoreCount = planes,
-        .waitSemaphoreCount = planes,
+        .signalSemaphoreCount = to_buf ? 0 : planes,
+        .waitSemaphoreCount = to_buf ? planes : 0,
     };
 
     if ((err = wait_start_exec_ctx(hwfc, ectx)))
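
For readers following the thread, the core idea of the vulkan.c change (input frames feed the wait list, output frames feed the signal list, and each array is grown through a temporary pointer so a failed allocation does not lose the old buffer) can be sketched outside FFmpeg. This is only an illustrative sketch, not the patch itself: `SemHandle`, `ExecDeps`, `push_sem`, `add_frame_dep` and `free_deps` are hypothetical names, plain `realloc` stands in for `av_fast_realloc`, and a 64-bit integer stands in for `VkSemaphore` so that it builds without the Vulkan headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: stand-in for VkSemaphore so the sketch needs no Vulkan headers. */
typedef uint64_t SemHandle;

/* Hypothetical, simplified analogue of the semaphore lists in FFVkExecContext. */
typedef struct ExecDeps {
    SemHandle *wait;
    size_t     wait_cnt;
    SemHandle *signal;
    size_t     signal_cnt;
} ExecDeps;

/* Append one semaphore to a list. The result of realloc goes into a
 * temporary first, so on failure the caller still owns the old buffer. */
static int push_sem(SemHandle **list, size_t *cnt, SemHandle sem)
{
    SemHandle *tmp = realloc(*list, (*cnt + 1) * sizeof(**list));
    if (!tmp)
        return -1;
    tmp[*cnt] = sem;
    *list = tmp;
    (*cnt)++;
    return 0;
}

/* Input frames are only waited on; output frames are only signalled.
 * No semaphore ends up in both lists of the same submission. */
static int add_frame_dep(ExecDeps *e, const SemHandle *sems, int planes,
                         int is_input)
{
    assert(e && sems);
    for (int i = 0; i < planes; i++) {
        int err = is_input ? push_sem(&e->wait,   &e->wait_cnt,   sems[i])
                           : push_sem(&e->signal, &e->signal_cnt, sems[i]);
        if (err)
            return err;
    }
    return 0;
}

/* Release both lists and reset the counters. */
static void free_deps(ExecDeps *e)
{
    free(e->wait);
    free(e->signal);
    *e = (ExecDeps){0};
}
```

A caller would invoke `add_frame_dep` once per frame with `is_input` set to 1 for sources and 0 for the destination, mirroring the new last argument of `ff_vk_add_exec_dep` in the filters above.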