From patchwork Wed Jul 19 04:17:41 2023
X-Patchwork-Submitter: Lynne
X-Patchwork-Id: 42823
Date: Wed, 19 Jul 2023 06:17:41 +0200 (CEST)
From: Lynne
To: Ffmpeg Devel
Subject: [FFmpeg-devel] [PATCH 1/3] lavc/vulkan_decode: use a single execution pool per thread

The spec says command buffer pools must be externally synchronized objects,
which caused us to fail validation when decoding.
This still lets us pool some resources, just not as much.

Patch attached.

From ef7fc2ea266f04467cf8c37a76275ea167b4d5a1 Mon Sep 17 00:00:00 2001
From: Lynne
Date: Wed, 19 Jul 2023 05:39:07 +0200
Subject: [PATCH 1/3] lavc/vulkan_decode: use a single execution pool per thread

The spec says command buffer pools must be externally synchronized
objects.
This still lets us pool some, just not as much.
---
 libavcodec/vulkan_decode.c | 86 ++++++++++++++++++++++++++++----------
 libavcodec/vulkan_decode.h |  3 +-
 2 files changed, 66 insertions(+), 23 deletions(-)

diff --git a/libavcodec/vulkan_decode.c b/libavcodec/vulkan_decode.c
index 973c7ca548..f20733fb39 100644
--- a/libavcodec/vulkan_decode.c
+++ b/libavcodec/vulkan_decode.c
@@ -42,12 +42,53 @@ static const VkExtensionProperties *dec_ext[] = {
 #endif
 };
 
+static const VkVideoProfileInfoKHR *get_video_profile(FFVulkanDecodeShared *ctx, enum AVCodecID codec_id)
+{
+    const VkVideoProfileListInfoKHR *profile_list;
+
+    VkStructureType profile_struct_type =
+        codec_id == AV_CODEC_ID_H264 ? VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_PROFILE_INFO_KHR :
+        codec_id == AV_CODEC_ID_HEVC ? VK_STRUCTURE_TYPE_VIDEO_DECODE_H265_PROFILE_INFO_KHR :
+        codec_id == AV_CODEC_ID_AV1  ? VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_PROFILE_INFO_MESA :
+        0;
+
+    profile_list = ff_vk_find_struct(ctx->s.hwfc->create_pnext,
+                                     VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR);
+    if (!profile_list)
+        return NULL;
+
+    for (int i = 0; i < profile_list->profileCount; i++)
+        if (ff_vk_find_struct(profile_list->pProfiles[i].pNext, profile_struct_type))
+            return &profile_list->pProfiles[i];
+
+    return NULL;
+}
+
 int ff_vk_update_thread_context(AVCodecContext *dst, const AVCodecContext *src)
 {
     int err;
     FFVulkanDecodeContext *src_ctx = src->internal->hwaccel_priv_data;
     FFVulkanDecodeContext *dst_ctx = dst->internal->hwaccel_priv_data;
 
+    if (!dst_ctx->exec_pool.cmd_bufs) {
+        FFVulkanDecodeShared *ctx = (FFVulkanDecodeShared *)src_ctx->shared_ref->data;
+
+        const VkVideoProfileInfoKHR *profile = get_video_profile(ctx, dst->codec_id);
+        if (!profile) {
+            av_log(dst, AV_LOG_ERROR, "Video profile missing from frames context!");
+            return AVERROR(EINVAL);
+        }
+
+        err = ff_vk_exec_pool_init(&ctx->s, &ctx->qf,
+                                   &dst_ctx->exec_pool,
+                                   src_ctx->exec_pool.pool_size,
+                                   src_ctx->exec_pool.nb_queries,
+                                   VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR, 0,
+                                   profile);
+        if (err < 0)
+            return err;
+    }
+
     err = av_buffer_replace(&dst_ctx->shared_ref, src_ctx->shared_ref);
     if (err < 0)
         return err;
@@ -271,7 +312,7 @@ void ff_vk_decode_flush(AVCodecContext *avctx)
     };
 
     VkCommandBuffer cmd_buf;
-    FFVkExecContext *exec = ff_vk_exec_get(&ctx->exec_pool);
+    FFVkExecContext *exec = ff_vk_exec_get(&dec->exec_pool);
 
     ff_vk_exec_start(&ctx->s, exec);
     cmd_buf = exec->buf;
@@ -317,7 +358,7 @@ int ff_vk_decode_frame(AVCodecContext *avctx,
 
     size_t data_size = FFALIGN(vp->slices_size,
                                ctx->caps.minBitstreamBufferSizeAlignment);
-    FFVkExecContext *exec = ff_vk_exec_get(&ctx->exec_pool);
+    FFVkExecContext *exec = ff_vk_exec_get(&dec->exec_pool);
 
     /* The current decoding reference has to be bound as an inactive reference */
     VkVideoReferenceSlotInfoKHR *cur_vk_ref;
@@ -326,7 +367,7 @@ int ff_vk_decode_frame(AVCodecContext *avctx,
     cur_vk_ref[0].slotIndex = -1;
     decode_start.referenceSlotCount++;
 
-    if (ctx->exec_pool.nb_queries) {
+    if (dec->exec_pool.nb_queries) {
         int64_t prev_sub_res = 0;
         ff_vk_exec_wait(&ctx->s, exec);
         ret = ff_vk_exec_get_query(&ctx->s, exec, NULL, &prev_sub_res);
@@ -495,14 +536,14 @@ int ff_vk_decode_frame(AVCodecContext *avctx,
     vk->CmdBeginVideoCodingKHR(cmd_buf, &decode_start);
 
     /* Start status query */
-    if (ctx->exec_pool.nb_queries)
-        vk->CmdBeginQuery(cmd_buf, ctx->exec_pool.query_pool, exec->query_idx + 0, 0);
+    if (dec->exec_pool.nb_queries)
+        vk->CmdBeginQuery(cmd_buf, dec->exec_pool.query_pool, exec->query_idx + 0, 0);
 
     vk->CmdDecodeVideoKHR(cmd_buf, &vp->decode_info);
 
     /* End status query */
-    if (ctx->exec_pool.nb_queries)
-        vk->CmdEndQuery(cmd_buf, ctx->exec_pool.query_pool, exec->query_idx + 0);
+    if (dec->exec_pool.nb_queries)
+        vk->CmdEndQuery(cmd_buf, dec->exec_pool.query_pool, exec->query_idx + 0);
 
     vk->CmdEndVideoCodingKHR(cmd_buf, &decode_end);
 
@@ -555,9 +596,6 @@ static void free_common(void *opaque, uint8_t *data)
     FFVulkanContext *s = &ctx->s;
     FFVulkanFunctions *vk = &ctx->s.vkfn;
 
-    /* Wait on and free execution pool */
-    ff_vk_exec_pool_free(s, &ctx->exec_pool);
-
     /* Destroy layered view */
     if (ctx->layered_view)
         vk->DestroyImageView(s->hwctx->act_dev, ctx->layered_view, s->hwctx->alloc);
@@ -1029,6 +1067,11 @@ void ff_vk_decode_free_params(void *opaque, uint8_t *data)
 int ff_vk_decode_uninit(AVCodecContext *avctx)
 {
     FFVulkanDecodeContext *dec = avctx->internal->hwaccel_priv_data;
+    FFVulkanDecodeShared *ctx = (FFVulkanDecodeShared *)dec->shared_ref->data;
+
+    /* Wait on and free execution pool */
+    ff_vk_exec_pool_free(&ctx->s, &dec->exec_pool);
+
     av_buffer_pool_uninit(&dec->tmp_pool);
     av_buffer_unref(&dec->session_params);
     av_buffer_unref(&dec->shared_ref);
@@ -1044,8 +1087,7 @@ int ff_vk_decode_init(AVCodecContext *avctx)
     FFVulkanDecodeShared *ctx;
     FFVulkanContext *s;
     FFVulkanFunctions *vk;
-    FFVkQueueFamilyCtx qf_dec;
-    const VkVideoProfileListInfoKHR *profile_list;
+    const VkVideoProfileInfoKHR *profile;
 
     VkVideoDecodeH264SessionParametersCreateInfoKHR h264_params = {
         .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_SESSION_PARAMETERS_CREATE_INFO_KHR,
@@ -1089,10 +1131,9 @@ int ff_vk_decode_init(AVCodecContext *avctx)
     s->device = (AVHWDeviceContext *)s->frames->device_ref->data;
     s->hwctx = s->device->hwctx;
 
-    profile_list = ff_vk_find_struct(s->hwfc->create_pnext,
-                                     VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR);
-    if (!profile_list) {
-        av_log(avctx, AV_LOG_ERROR, "Profile list missing from frames context!");
+    profile = get_video_profile(ctx, avctx->codec_id);
+    if (!profile) {
+        av_log(avctx, AV_LOG_ERROR, "Video profile missing from frames context!");
         return AVERROR(EINVAL);
     }
 
@@ -1101,7 +1142,7 @@ int ff_vk_decode_init(AVCodecContext *avctx)
         goto fail;
 
     /* Create queue context */
-    qf = ff_vk_qf_init(s, &qf_dec, VK_QUEUE_VIDEO_DECODE_BIT_KHR);
+    qf = ff_vk_qf_init(s, &ctx->qf, VK_QUEUE_VIDEO_DECODE_BIT_KHR);
 
     /* Check for support */
     if (!(s->video_props[qf].videoCodecOperations &
@@ -1123,14 +1164,14 @@ int ff_vk_decode_init(AVCodecContext *avctx)
     session_create.pictureFormat = s->hwfc->format[0];
     session_create.referencePictureFormat = session_create.pictureFormat;
     session_create.pStdHeaderVersion = dec_ext[avctx->codec_id];
-    session_create.pVideoProfile = &profile_list->pProfiles[0];
+    session_create.pVideoProfile = profile;
 
-    /* Create decode exec context.
+    /* Create decode exec context for this specific main thread.
      * 2 async contexts per thread was experimentally determined to be optimal
      * for a majority of streams. */
-    err = ff_vk_exec_pool_init(s, &qf_dec, &ctx->exec_pool, 2*avctx->thread_count,
+    err = ff_vk_exec_pool_init(s, &ctx->qf, &dec->exec_pool, 2,
                                nb_q, VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR, 0,
-                               session_create.pVideoProfile);
+                               profile);
     if (err < 0)
         goto fail;
 
@@ -1168,7 +1209,8 @@ int ff_vk_decode_init(AVCodecContext *avctx)
     dpb_frames->height = s->frames->height;
 
     dpb_hwfc = dpb_frames->hwctx;
-    dpb_hwfc->create_pnext = (void *)profile_list;
+    dpb_hwfc->create_pnext = (void *)ff_vk_find_struct(ctx->s.hwfc->create_pnext,
+                                                       VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR);
     dpb_hwfc->format[0] = s->hwfc->format[0];
     dpb_hwfc->tiling = VK_IMAGE_TILING_OPTIMAL;
     dpb_hwfc->usage = VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR |
diff --git a/libavcodec/vulkan_decode.h b/libavcodec/vulkan_decode.h
index 4e45cbde71..1b4e1cc712 100644
--- a/libavcodec/vulkan_decode.h
+++ b/libavcodec/vulkan_decode.h
@@ -37,7 +37,7 @@ typedef struct FFVulkanDecodeProfileData {
 typedef struct FFVulkanDecodeShared {
     FFVulkanContext s;
     FFVkVideoCommon common;
-    FFVkExecPool exec_pool;
+    FFVkQueueFamilyCtx qf;
 
     VkVideoCapabilitiesKHR caps;
     VkVideoDecodeCapabilitiesKHR dec_caps;
@@ -56,6 +56,7 @@ typedef struct FFVulkanDecodeShared {
 typedef struct FFVulkanDecodeContext {
     AVBufferRef *shared_ref;
     AVBufferRef *session_params;
+    FFVkExecPool exec_pool;
 
     int dedicated_dpb; /* Oddity #1 - separate DPB images */
     int layered_dpb;   /* Madness #1 - layered DPB images */
-- 
2.40.1
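
Note on the profile lookup: get_video_profile() above picks the profile whose pNext chain carries the codec-specific struct, instead of always taking pProfiles[0]. The sketch below illustrates that lookup in isolation, without the Vulkan headers. Every name in it (DemoStruct, DemoStructType, demo_find_struct, demo_get_profile, demo_selected_index) is a hypothetical stand-in invented for illustration. It is not the FFmpeg or Vulkan API, and it only assumes that ff_vk_find_struct() walks the chain comparing sType, which the patch itself does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for VkStructureType and a Vulkan-style
 * { sType, pNext } header; real code would use <vulkan/vulkan.h>. */
typedef enum DemoStructType {
    DEMO_TYPE_PROFILE_INFO = 1,
    DEMO_TYPE_H264_PROFILE = 2,
    DEMO_TYPE_H265_PROFILE = 3
} DemoStructType;

typedef struct DemoStruct {
    DemoStructType sType;
    const void    *pNext;
} DemoStruct;

/* Walk a pNext-style chain and return the first node of the given type,
 * or NULL if the chain does not contain one. */
static const DemoStruct *demo_find_struct(const void *chain, DemoStructType type)
{
    const DemoStruct *in = chain;
    while (in) {
        if (in->sType == type)
            return in;
        in = in->pNext;
    }
    return NULL;
}

/* Return the first profile whose chain carries the codec-specific struct. */
static const DemoStruct *demo_get_profile(const DemoStruct *profiles, int count,
                                          DemoStructType codec_type)
{
    assert(profiles || count == 0);
    for (int i = 0; i < count; i++)
        if (demo_find_struct(profiles[i].pNext, codec_type))
            return &profiles[i];
    return NULL;
}

/* Build a two-entry profile list (H.264 first, H.265 second) and report
 * the index of the profile selected for codec_type, or -1 if none matches. */
int demo_selected_index(DemoStructType codec_type)
{
    static const DemoStruct h264 = { DEMO_TYPE_H264_PROFILE, NULL };
    static const DemoStruct h265 = { DEMO_TYPE_H265_PROFILE, NULL };
    static const DemoStruct profiles[2] = {
        { DEMO_TYPE_PROFILE_INFO, &h264 },
        { DEMO_TYPE_PROFILE_INFO, &h265 }
    };
    const DemoStruct *p = demo_get_profile(profiles, 2, codec_type);
    return p ? (int)(p - profiles) : -1;
}
```

With this toy list, asking for the H.265 type selects the second entry, whereas unconditionally taking the first entry would hand an H.264 profile to an HEVC decoder. An unmatched type yields no profile, which corresponds to the "Video profile missing from frames context!" error path in the patch.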