From patchwork Tue Aug 10 10:49:15 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lynne
X-Patchwork-Id: 29393
Date: Tue, 10 Aug 2021 12:49:15 +0200 (CEST)
From: Lynne
To: Ffmpeg Devel
Subject: [FFmpeg-devel] [PATCH 1/2] imgutils: expose av_image_copy_plane_uc_from()
List-Id: FFmpeg development discussions and patches

The generic av_image_copy_uc_from() doesn't fit the Vulkan case: some
planes may be copied via other methods (such as mapping GPU memory),
and planes that don't satisfy the strict alignment requirements go
through a GPU image -> GPU buffer -> CPU RAM copy instead.

We need this for hwcontext_vulkan, and I think it will also be useful
to API users like libplacebo who would rather not write a custom SIMD
memcpy.
---
 doc/APIchanges       |  3 +++
 libavutil/imgutils.c |  8 ++++----
 libavutil/imgutils.h | 18 ++++++++++++++++++
 libavutil/version.h  |  2 +-
 4 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index 6eefc7fc33..0367d39923 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -14,6 +14,9 @@ libavutil:     2021-04-27
 
 API changes, most recent first:
 
+2021-08-10 - xxxxxxxxxx - lavu 57.3.101 - imgutils.h
+  Add av_image_copy_plane_uc_from()
+
 2021-08-02 - xxxxxxxxxx - lavc 59.4.100 - packet.h
   Add AVPacket.opaque, AVPacket.opaque_ref, AVPacket.time_base.
 
diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 6c32a71cc5..9ab5757cf6 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -356,9 +356,9 @@ static void image_copy_plane(uint8_t *dst, ptrdiff_t dst_linesize,
     }
 }
 
-static void image_copy_plane_uc_from(uint8_t *dst, ptrdiff_t dst_linesize,
-                                     const uint8_t *src, ptrdiff_t src_linesize,
-                                     ptrdiff_t bytewidth, int height)
+void av_image_copy_plane_uc_from(uint8_t *dst, ptrdiff_t dst_linesize,
+                                 const uint8_t *src, ptrdiff_t src_linesize,
+                                 ptrdiff_t bytewidth, int height)
 {
     int ret = -1;
 
@@ -440,7 +440,7 @@ void av_image_copy_uc_from(uint8_t *dst_data[4], const ptrdiff_t dst_linesizes[4
                            enum AVPixelFormat pix_fmt, int width, int height)
 {
     image_copy(dst_data, dst_linesizes, src_data, src_linesizes, pix_fmt,
-               width, height, image_copy_plane_uc_from);
+               width, height, av_image_copy_plane_uc_from);
 }
 
 int av_image_fill_arrays(uint8_t *dst_data[4], int dst_linesize[4],
diff --git a/libavutil/imgutils.h b/libavutil/imgutils.h
index 5eccbf0bf7..cb2d74728e 100644
--- a/libavutil/imgutils.h
+++ b/libavutil/imgutils.h
@@ -124,6 +124,24 @@ void av_image_copy_plane(uint8_t       *dst, int dst_linesize,
                          const uint8_t *src, int src_linesize,
                          int bytewidth, int height);
 
+/**
+ * Copy image data located in uncacheable (e.g. GPU mapped) memory. Where
+ * available, this function will use special functionality for reading from such
+ * memory, which may result in greatly improved performance compared to plain
+ * av_image_copy_plane().
+ *
+ * bytewidth must be contained by both absolute values of dst_linesize
+ * and src_linesize, otherwise the function behavior is undefined.
+ *
+ * @note The linesize parameters have the type ptrdiff_t here, while they are
+ *       int for av_image_copy_plane().
+ * @note On x86, the linesizes currently need to be aligned to the cacheline
+ *       size (i.e. 64) to get improved performance.
+ */
+void av_image_copy_plane_uc_from(uint8_t *dst, ptrdiff_t dst_linesize,
+                                 const uint8_t *src, ptrdiff_t src_linesize,
+                                 ptrdiff_t bytewidth, int height);
+
 /**
  * Copy image in src_data to dst_data.
  *
diff --git a/libavutil/version.h b/libavutil/version.h
index 6b4a265457..678cb9bfa6 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -80,7 +80,7 @@
 
 #define LIBAVUTIL_VERSION_MAJOR  57
 #define LIBAVUTIL_VERSION_MINOR   3
-#define LIBAVUTIL_VERSION_MICRO 100
+#define LIBAVUTIL_VERSION_MICRO 101
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
                                                LIBAVUTIL_VERSION_MINOR, \

From patchwork Tue Aug 10 10:50:16 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lynne
X-Patchwork-Id: 29392
Date: Tue, 10 Aug 2021 12:50:16 +0200 (CEST)
From: Lynne
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 2/2] hwcontext_vulkan: use GPU memcpy when copying to system RAM
List-Id: FFmpeg development discussions and patches

This should speed it up significantly on systems where it matters.
---
 libavutil/hwcontext_vulkan.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index b7da6a7e32..92b2c236c8 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -3424,7 +3424,7 @@ static int vulkan_transfer_data(AVHWFramesContext *hwfc, const AVFrame *vkf,
     }
 
     if (!from) {
-        /* Map, copy image to buffer, unmap */
+        /* Map, copy image from buffer, unmap */
         if ((err = map_buffers(dev_ctx, bufs, tmp.data, planes, 0)))
             goto end;
 
@@ -3434,10 +3434,10 @@ static int vulkan_transfer_data(AVHWFramesContext *hwfc, const AVFrame *vkf,
 
             get_plane_wh(&p_w, &p_h, swf->format, swf->width, swf->height, i);
 
-            av_image_copy_plane(tmp.data[i], tmp.linesize[i],
-                                (const uint8_t *)swf->data[i], swf->linesize[i],
-                                FFMIN(tmp.linesize[i], FFABS(swf->linesize[i])),
-                                p_h);
+            av_image_copy_plane_uc_from(tmp.data[i], tmp.linesize[i],
+                                        (const uint8_t *)swf->data[i], swf->linesize[i],
+                                        FFMIN(tmp.linesize[i], FFABS(swf->linesize[i])),
+                                        p_h);
         }
 
         if ((err = unmap_buffers(dev_ctx, bufs, planes, 1)))
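
For readers who want to see the calling contract in isolation, here is a
minimal sketch. It is not FFmpeg code: copy_plane_rows() and
clamp_bytewidth() are hypothetical names made up for this illustration.
copy_plane_rows() is a plain memcpy-per-row stand-in that takes the same
parameter list as av_image_copy_plane_uc_from() and asserts the
documented requirement that bytewidth fits within the absolute value of
both linesizes. clamp_bytewidth() mirrors the FFMIN(dst_linesize,
FFABS(src_linesize)) expression used at the call site in 2/2. No SIMD or
uncached-read path is modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Absolute value of a linesize; negative linesizes mean bottom-up rows. */
static ptrdiff_t linesize_abs(ptrdiff_t linesize)
{
    return linesize < 0 ? -linesize : linesize;
}

/* Hypothetical helper: largest bytewidth that is valid for both planes,
 * computed like FFMIN(dst_linesize, FFABS(src_linesize)) in patch 2/2. */
static ptrdiff_t clamp_bytewidth(ptrdiff_t dst_linesize, ptrdiff_t src_linesize)
{
    ptrdiff_t src_abs = linesize_abs(src_linesize);
    return dst_linesize < src_abs ? dst_linesize : src_abs;
}

/* Hypothetical stand-in with the same parameter list as
 * av_image_copy_plane_uc_from(): copies `height` rows of `bytewidth`
 * bytes each, using plain memcpy only. */
static void copy_plane_rows(uint8_t *dst, ptrdiff_t dst_linesize,
                            const uint8_t *src, ptrdiff_t src_linesize,
                            ptrdiff_t bytewidth, int height)
{
    /* Documented contract: bytewidth must fit in |linesize| of both planes. */
    assert(bytewidth >= 0);
    assert(bytewidth <= linesize_abs(dst_linesize));
    assert(bytewidth <= linesize_abs(src_linesize));

    for (int y = 0; y < height; y++) {
        /* Index each row from the base pointer so that no out-of-range
         * pointer is formed after the last row. */
        memcpy(dst + y * dst_linesize, src + y * src_linesize,
               (size_t)bytewidth);
    }
}
```

A caller with a bottom-up source would pass a pointer to the last row
together with a negative src_linesize, and use clamp_bytewidth() to pick
a width that is valid for both planes.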