From patchwork Wed Aug 10 20:47:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 37222
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:02 +0200
Message-Id: <20220810204712.3123-1-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH 01/11] lavu/pixfmt: add packed RGBA float16 format
Cc: Timo Rothenpieler

This is the default format of the Windows compositor and what DXGI
Desktop Duplication will give you for any kind of HDR output.
---
 libavutil/pixdesc.c              | 28 ++++++++++++++++++++++++++++
 libavutil/pixfmt.h               |  5 +++++
 libavutil/version.h              |  4 ++--
 tests/ref/fate/imgutils          |  2 ++
 tests/ref/fate/sws-pixdesc-query | 13 +++++++++++++
 5 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
index e078fd5320..f7558ff8b9 100644
--- a/libavutil/pixdesc.c
+++ b/libavutil/pixdesc.c
@@ -2504,6 +2504,34 @@ static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
         },
         .flags = AV_PIX_FMT_FLAG_ALPHA,
     },
+    [AV_PIX_FMT_RGBAF16BE] = {
+        .name = "rgbaf16be",
+        .nb_components = 4,
+        .log2_chroma_w = 0,
+        .log2_chroma_h = 0,
+        .comp = {
+            { 0, 8, 0, 0, 16 }, /* R */
+            { 0, 8, 2, 0, 16 }, /* G */
+            { 0, 8, 4, 0, 16 }, /* B */
+            { 0, 8, 6, 0, 16 }, /* A */
+        },
+        .flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_RGB |
+                 AV_PIX_FMT_FLAG_ALPHA | AV_PIX_FMT_FLAG_FLOAT,
+    },
+    [AV_PIX_FMT_RGBAF16LE] = {
+        .name = "rgbaf16le",
+        .nb_components = 4,
+        .log2_chroma_w = 0,
+        .log2_chroma_h = 0,
+        .comp = {
+            { 0, 8, 0, 0, 16 }, /* R */
+            { 0, 8, 2, 0, 16 }, /* G */
+            { 0, 8, 4, 0, 16 }, /* B */
+            { 0, 8, 6, 0, 16 }, /* A */
+        },
+        .flags = AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_ALPHA |
+                 AV_PIX_FMT_FLAG_FLOAT,
+    },
 };
 
 static const char * const color_range_names[] = {
diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
index 9d1fdaf82d..86c9bdefeb 100644
--- a/libavutil/pixfmt.h
+++ b/libavutil/pixfmt.h
@@ -369,6 +369,9 @@ enum AVPixelFormat {
 
     AV_PIX_FMT_VUYA, ///< packed VUYA 4:4:4, 32bpp, VUYAVUYA...
 
+    AV_PIX_FMT_RGBAF16BE, ///< IEEE-754 half precision packed RGBA 16:16:16:16, 64bpp, RGBARGBA..., big-endian
+    AV_PIX_FMT_RGBAF16LE, ///< IEEE-754 half precision packed RGBA 16:16:16:16, 64bpp, RGBARGBA..., little-endian
+
     AV_PIX_FMT_NB ///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions
 };
 
@@ -466,6 +469,8 @@ enum AVPixelFormat {
 #define AV_PIX_FMT_P216 AV_PIX_FMT_NE(P216BE, P216LE)
 #define AV_PIX_FMT_P416 AV_PIX_FMT_NE(P416BE, P416LE)
 
+#define AV_PIX_FMT_RGBAF16 AV_PIX_FMT_NE(RGBAF16BE, RGBAF16LE)
+
 /**
  * Chromaticity coordinates of the source primaries.
  * These values match the ones defined by ISO/IEC 23091-2_2019 subclause 8.1 and ITU-T H.273.
diff --git a/libavutil/version.h b/libavutil/version.h
index ee43526dc6..f0a8b5c098 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,8 +79,8 @@
  */
 
 #define LIBAVUTIL_VERSION_MAJOR 57
-#define LIBAVUTIL_VERSION_MINOR 32
-#define LIBAVUTIL_VERSION_MICRO 101
+#define LIBAVUTIL_VERSION_MINOR 33
+#define LIBAVUTIL_VERSION_MICRO 100
 
 #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
                                                LIBAVUTIL_VERSION_MINOR, \
diff --git a/tests/ref/fate/imgutils b/tests/ref/fate/imgutils
index 4ec66febb8..01c9877de5 100644
--- a/tests/ref/fate/imgutils
+++ b/tests/ref/fate/imgutils
@@ -247,3 +247,5 @@ p216le planes: 2, linesizes: 128 128 0 0, plane_sizes: 6144 6144
 p416be planes: 2, linesizes: 128 256 0 0, plane_sizes: 6144 12288 0 0, plane_offsets: 6144 0 0, total_size: 18432
 p416le planes: 2, linesizes: 128 256 0 0, plane_sizes: 6144 12288 0 0, plane_offsets: 6144 0 0, total_size: 18432
 vuya planes: 1, linesizes: 256 0 0 0, plane_sizes: 12288 0 0 0, plane_offsets: 0 0 0, total_size: 12288
+rgbaf16be planes: 1, linesizes: 512 0 0 0, plane_sizes: 24576 0 0 0, plane_offsets: 0 0 0, total_size: 24576
+rgbaf16le planes: 1, linesizes: 512 0 0 0, plane_sizes: 24576 0 0 0, plane_offsets: 0 0 0, total_size: 24576
diff --git a/tests/ref/fate/sws-pixdesc-query b/tests/ref/fate/sws-pixdesc-query
index bd0f1fcb82..f79d99e513 100644
--- a/tests/ref/fate/sws-pixdesc-query
+++ b/tests/ref/fate/sws-pixdesc-query
@@ -21,6 +21,8 @@ is16BPS:
   rgb48le
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   ya16be
   ya16le
   yuv420p16be
@@ -157,6 +159,7 @@ isBE:
   rgb555be
   rgb565be
   rgba64be
+  rgbaf16be
   x2bgr10be
   x2rgb10be
   xyz12be
@@ -479,6 +482,8 @@ isRGB:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   x2bgr10be
   x2bgr10le
   x2rgb10be
@@ -629,6 +634,8 @@ AnyRGB:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   x2bgr10be
   x2bgr10le
   x2rgb10be
@@ -655,6 +662,8 @@ ALPHA:
   rgb32_1
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   vuya
   ya16be
   ya16le
@@ -739,6 +748,8 @@ Packed:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   uyvy422
   uyyvyy411
   vuya
@@ -918,6 +929,8 @@ PackedRGB:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   x2bgr10be
   x2bgr10le
   x2rgb10be

From patchwork Wed Aug 10 20:47:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 37220
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:03 +0200
Message-Id: <20220810204712.3123-2-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
Subject: [FFmpeg-devel] [PATCH 02/11] avutil/hwcontext_d3d11va: add support for rgbaf16 pixel format
Cc: Timo Rothenpieler

---
 libavutil/hwcontext_d3d11va.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavutil/hwcontext_d3d11va.c b/libavutil/hwcontext_d3d11va.c
index 27c0c80413..363ec6a47d 100644
--- a/libavutil/hwcontext_d3d11va.c
+++ b/libavutil/hwcontext_d3d11va.c
@@ -88,6 +88,7 @@ static const struct {
     { DXGI_FORMAT_P010, AV_PIX_FMT_P010 },
     { DXGI_FORMAT_B8G8R8A8_UNORM, AV_PIX_FMT_BGRA },
     { DXGI_FORMAT_R10G10B10A2_UNORM, AV_PIX_FMT_X2BGR10 },
+    { DXGI_FORMAT_R16G16B16A16_FLOAT, AV_PIX_FMT_RGBAF16 },
     // Special opaque formats. The pix_fmt is merely a place holder, as the
     // opaque format cannot be accessed directly.
     { DXGI_FORMAT_420_OPAQUE, AV_PIX_FMT_YUV420P },

From patchwork Wed Aug 10 20:47:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 37221
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:04 +0200
Message-Id: <20220810204712.3123-3-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
Subject: [FFmpeg-devel] [PATCH 03/11] avfilter/vsrc_ddagrab: add rgbaf16 output support
Cc: Timo Rothenpieler

---
 libavfilter/version.h      |  2 +-
 libavfilter/vsrc_ddagrab.c | 13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/libavfilter/version.h b/libavfilter/version.h
index 19a009c110..fa67606495 100644
--- a/libavfilter/version.h
+++ b/libavfilter/version.h
@@ -32,7 +32,7 @@
 #include "version_major.h"
 
 #define LIBAVFILTER_VERSION_MINOR 46
-#define LIBAVFILTER_VERSION_MICRO 101
+#define LIBAVFILTER_VERSION_MICRO 102
 
 
 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
diff --git a/libavfilter/vsrc_ddagrab.c b/libavfilter/vsrc_ddagrab.c
index ce36716281..252505b96d 100644
--- a/libavfilter/vsrc_ddagrab.c
+++ b/libavfilter/vsrc_ddagrab.c
@@ -115,6 +115,8 @@ static const AVOption ddagrab_options[] = {
     { "bgra", "only output 8 Bit BGRA", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_B8G8R8A8_UNORM }, 0, INT_MAX, FLAGS, "output_fmt" },
     { "10bit", "only output default 10 Bit format", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_R10G10B10A2_UNORM }, 0, INT_MAX, FLAGS, "output_fmt" },
     { "x2bgr10", "only output 10 Bit X2BGR10", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_R10G10B10A2_UNORM }, 0, INT_MAX, FLAGS, "output_fmt" },
+    { "16bit", "only output default 16 Bit format", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
+    { "rgbaf16", "only output 16 Bit RGBAF16", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
     { NULL }
 };
 
@@ -212,6 +214,7 @@ static av_cold int init_dxgi_dda(AVFilterContext *avctx)
     if (set_thread_dpi && SUCCEEDED(hr)) {
         DPI_AWARENESS_CONTEXT prev_dpi_ctx;
         DXGI_FORMAT formats[] = {
+            DXGI_FORMAT_R16G16B16A16_FLOAT,
             DXGI_FORMAT_R10G10B10A2_UNORM,
             DXGI_FORMAT_B8G8R8A8_UNORM
         };
@@ -665,6 +668,10 @@ static av_cold int init_hwframes_ctx(AVFilterContext *avctx)
         av_log(avctx, AV_LOG_VERBOSE, "Probed 10 bit RGB frame format\n");
         dda->frames_ctx->sw_format = AV_PIX_FMT_X2BGR10;
         break;
+    case DXGI_FORMAT_R16G16B16A16_FLOAT:
+        av_log(avctx, AV_LOG_VERBOSE, "Probed 16 bit float RGB frame format\n");
+        dda->frames_ctx->sw_format = AV_PIX_FMT_RGBAF16;
+        break;
     default:
         av_log(avctx, AV_LOG_ERROR, "Unexpected texture output format!\n");
         return AVERROR_BUG;
@@ -990,6 +997,12 @@ static int ddagrab_request_frame(AVFilterLink *outlink)
         frame->color_primaries = AVCOL_PRI_BT709;
         frame->color_trc = AVCOL_TRC_IEC61966_2_1;
         frame->colorspace = AVCOL_SPC_RGB;
+    } else if(desc.Format == DXGI_FORMAT_R16G16B16A16_FLOAT) {
+        // According to MSDN, all floating point formats contain sRGB image data with linear 1.0 gamma.
+        frame->color_range = AVCOL_RANGE_JPEG;
+        frame->color_primaries = AVCOL_PRI_BT709;
+        frame->color_trc = AVCOL_TRC_LINEAR;
+        frame->colorspace = AVCOL_SPC_RGB;
     } else {
         ret = AVERROR_BUG;
         goto fail;

From patchwork Wed Aug 10 20:47:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 37224
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:05 +0200
Message-Id: <20220810204712.3123-4-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
Subject: [FFmpeg-devel] [PATCH 04/11] avfilter/vsrc_ddagrab: add options for more control over output format fallback
Cc: Timo Rothenpieler

---
 libavfilter/vsrc_ddagrab.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/libavfilter/vsrc_ddagrab.c b/libavfilter/vsrc_ddagrab.c
index 252505b96d..00c72187ea 100644
--- a/libavfilter/vsrc_ddagrab.c
+++ b/libavfilter/vsrc_ddagrab.c
@@ -98,6 +98,8 @@ typedef struct DdagrabContext {
     int offset_x;
     int offset_y;
     int out_fmt;
+    int allow_fallback;
+    int force_fmt;
 } DdagrabContext;
 
 #define OFFSET(x) offsetof(DdagrabContext, x)
@@ -117,6 +119,10 @@ static const AVOption ddagrab_options[] = {
     { "x2bgr10", "only output 10 Bit X2BGR10", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_R10G10B10A2_UNORM }, 0, INT_MAX, FLAGS, "output_fmt" },
     { "16bit", "only output default 16 Bit format", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
     { "rgbaf16", "only output 16 Bit RGBAF16", 0, AV_OPT_TYPE_CONST, { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
+    { "allow_fallback", "don't error on fallback to default 8 Bit format",
+        OFFSET(allow_fallback), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS },
+    { "force_fmt", "exclude BGRA from format list (experimental, discouraged by Microsoft)",
+        OFFSET(force_fmt), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS },
     { NULL }
 };
 
@@ -226,7 +232,7 @@ static av_cold int init_dxgi_dda(AVFilterContext *avctx)
         } else if (dda->out_fmt) {
             formats[0] = dda->out_fmt;
             formats[1] = DXGI_FORMAT_B8G8R8A8_UNORM;
-            nb_formats = 2;
+            nb_formats = dda->force_fmt ? 1 : 2;
         }
 
         IDXGIOutput_Release(dxgi_output);
@@ -262,7 +268,7 @@ static av_cold int init_dxgi_dda(AVFilterContext *avctx)
 #else
     {
 #endif
-        if (dda->out_fmt && dda->out_fmt != DXGI_FORMAT_B8G8R8A8_UNORM) {
+        if (dda->out_fmt && dda->out_fmt != DXGI_FORMAT_B8G8R8A8_UNORM && (!dda->allow_fallback || dda->force_fmt)) {
             av_log(avctx, AV_LOG_ERROR, "Only 8 bit output supported with legacy API\n");
             return AVERROR(ENOTSUP);
         }
@@ -733,7 +739,7 @@ static int ddagrab_config_props(AVFilterLink *outlink)
     if (ret < 0)
         return ret;
 
-    if (dda->out_fmt && dda->raw_format != dda->out_fmt) {
+    if (dda->out_fmt && dda->raw_format != dda->out_fmt && (!dda->allow_fallback || dda->force_fmt)) {
         av_log(avctx, AV_LOG_ERROR, "Requested output format unavailable.\n");
         return AVERROR(ENOTSUP);
     }

From patchwork Wed Aug 10 20:47:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 37223
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:06 +0200
Message-Id: <20220810204712.3123-5-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
Subject: [FFmpeg-devel] [PATCH 05/11] avutil: move half-precision float helper to avutil
Cc: Timo Rothenpieler

---
 libavcodec/exr.c                       | 2 +-
 libavcodec/exrenc.c                    | 2 +-
 libavcodec/pnmdec.c                    | 3 ++-
 libavcodec/pnmenc.c                    | 2 +-
 {libavcodec => libavutil}/float2half.h | 6 +++---
 {libavcodec => libavutil}/half2float.h | 6 +++---
 6 files changed, 11 insertions(+), 10 deletions(-)
 rename {libavcodec => libavutil}/float2half.h (96%)
 rename {libavcodec => libavutil}/half2float.h (96%)

diff --git a/libavcodec/exr.c b/libavcodec/exr.c
index 3a6b9c3014..5c6ca9adbf 100644
--- a/libavcodec/exr.c
+++ b/libavcodec/exr.c
@@ -41,6 +41,7 @@
 #include "libavutil/avstring.h"
 #include "libavutil/opt.h"
 #include "libavutil/color_utils.h"
+#include "libavutil/half2float.h"
 
 #include "avcodec.h"
 #include "bytestream.h"
@@ -53,7 +54,6 @@
 #include "exrdsp.h"
 #include "get_bits.h"
 #include "internal.h"
-#include "half2float.h"
 #include "mathops.h"
 #include "thread.h"
 
diff --git a/libavcodec/exrenc.c b/libavcodec/exrenc.c
index 8cf7827bb6..56c084d483 100644
--- a/libavcodec/exrenc.c
+++ b/libavcodec/exrenc.c
@@ -31,11 +31,11 @@
 #include "libavutil/intreadwrite.h"
 #include "libavutil/imgutils.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/float2half.h"
 #include "avcodec.h"
 #include "bytestream.h"
 #include "codec_internal.h"
 #include "encode.h"
-#include "float2half.h"
 
 enum ExrCompr {
     EXR_RAW,
diff --git a/libavcodec/pnmdec.c b/libavcodec/pnmdec.c
index 130407df25..9383dc8e60 100644
--- a/libavcodec/pnmdec.c
+++ b/libavcodec/pnmdec.c
@@ -21,12 +21,13 @@
 
 #include "config_components.h"
 
+#include "libavutil/half2float.h" + #include "avcodec.h" #include "codec_internal.h" #include "internal.h" #include "put_bits.h" #include "pnm.h" -#include "half2float.h" static void samplecpy(uint8_t *dst, const uint8_t *src, int n, int maxval) { diff --git a/libavcodec/pnmenc.c b/libavcodec/pnmenc.c index b16c93c88f..7ce534d06e 100644 --- a/libavcodec/pnmenc.c +++ b/libavcodec/pnmenc.c @@ -24,10 +24,10 @@ #include "libavutil/intreadwrite.h" #include "libavutil/imgutils.h" #include "libavutil/pixdesc.h" +#include "libavutil/float2half.h" #include "avcodec.h" #include "codec_internal.h" #include "encode.h" -#include "float2half.h" typedef struct PHMEncContext { uint16_t basetable[512]; diff --git a/libavcodec/float2half.h b/libavutil/float2half.h similarity index 96% rename from libavcodec/float2half.h rename to libavutil/float2half.h index e05125088c..d6aaab8278 100644 --- a/libavcodec/float2half.h +++ b/libavutil/float2half.h @@ -16,8 +16,8 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ -#ifndef AVCODEC_FLOAT2HALF_H -#define AVCODEC_FLOAT2HALF_H +#ifndef AVUTIL_FLOAT2HALF_H +#define AVUTIL_FLOAT2HALF_H #include @@ -64,4 +64,4 @@ static uint16_t float2half(uint32_t f, uint16_t *basetable, uint8_t *shifttable) return h; } -#endif /* AVCODEC_FLOAT2HALF_H */ +#endif /* AVUTIL_FLOAT2HALF_H */ diff --git a/libavcodec/half2float.h b/libavutil/half2float.h similarity index 96% rename from libavcodec/half2float.h rename to libavutil/half2float.h index 7df6747e50..1f6deade07 100644 --- a/libavcodec/half2float.h +++ b/libavutil/half2float.h @@ -16,8 +16,8 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ -#ifndef AVCODEC_HALF2FLOAT_H -#define AVCODEC_HALF2FLOAT_H +#ifndef AVUTIL_HALF2FLOAT_H +#define AVUTIL_HALF2FLOAT_H #include @@ -71,4 +71,4 @@ static uint32_t half2float(uint16_t h, const uint32_t *mantissatable, const uint return f; } -#endif /* AVCODEC_HALF2FLOAT_H */ +#endif /* 
AVUTIL_HALF2FLOAT_H */ From patchwork Wed Aug 10 20:47:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Timo Rothenpieler X-Patchwork-Id: 37229 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142868pzi; Wed, 10 Aug 2022 13:49:06 -0700 (PDT) X-Google-Smtp-Source: AA6agR7k4PEOOySPo6k0EFX5IvytqTiLdMGWeurg54kqYBdKT4OH+/OlmUSmMzfN/SBjmJKxX6bJ X-Received: by 2002:a17:907:94c7:b0:730:d5bc:14c with SMTP id dn7-20020a17090794c700b00730d5bc014cmr22015938ejc.68.1660164545869; Wed, 10 Aug 2022 13:49:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660164545; cv=none; d=google.com; s=arc-20160816; b=Y5wiBEag9c2/PMTMvrvLur4kXq89mTuCiqHn0BKkx4V6kAW6MhoXVQD+MnY3O+MCHp 8mITVgOAk3E25R3cCBdc/jzEiUeGF5mkuNeRriUYdpN762xPDiaPFj6IG5Ef/sifrxSm 74pClWGize/8jcgAO2imsk/EA+uhnt6MjQDtLC4q4MYWSnzC337kX0VC0EJqntXkJc47 U+TSTBDBWhpKnOXMAoLInknBngyuh9tldBvLnKU+cw+KcOoClt3EaqUBu8sXog8zKLwg 0FbBcCm5vUDtFsYgalPStRyLk1R9Dy+2J63SZF58REf35gzgIJsN8DMZC5YOSY13zqrf YarQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=tTi/43BreJYp/zvBD7SDdVv4Gvwy3AH8gmSa/2rSAr4=; b=y+rhN7el1ikfMX5xnP1Bv7sx89k10ycXpLPa4mSdxG58Rwg5fBPesO4/pFGY5K8jNG Wufj6bAMqZcfvNNFXmRSpFi27ieXWunzxpXZ1yb8TyQ+yBoKo9KzO0rs3e1mJexXDHZR zVg7EczwX57MxbHP/G2MYVH6f7x8uDaS7nQ7l20BhPp7D8patElNcDCqL0KWOlCbD0VE c36GtJetf9qQ3iNIukVCS57NK49f0lbpsbv8+4PqRJQq5Ukx2U22xxiVD5UfZ16zYqT3 QW1vpAG1fSCLZUxI2LtW8h4aU9SLkG0g3lF/kERjyv5zQTvGR6o0z59JbCcw33zxYki+ g9xA== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=azJMuu5n; spf=pass (google.com: domain of 
ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id hr10-20020a1709073f8a00b0073069573e2esi5257762ejc.667.2022.08.10.13.49.05; Wed, 10 Aug 2022 13:49:05 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=azJMuu5n; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 239F368B93E; Wed, 10 Aug 2022 23:47:41 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from btbn.de (btbn.de [136.243.74.85]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 238B268B89B for ; Wed, 10 Aug 2022 23:47:31 +0300 (EEST) Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id 0E3182F260C; Wed, 10 Aug 2022 22:47:25 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org; s=mail; t=1660164445; bh=H2csBUjFdQ2Aipfsh/Ovz+x8Wry9D0rysCl51yVY7JY=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=azJMuu5n+i/JHawR/wGptM9rWi4Eq0znBGWWPVxDPGD4OD71SPqzrgmBAvVqKfpjQ J1+5lDXBbb4xsFZWgj+aQQY17rBxpf5ZtCM960bChQ12xD3dE3gAxhFr4Vr4mLu00R wZhwHt33ZKahh2vuifPpCr+HxbHsyNZ6cp19/v0KeYTt8oXExg0JUWTP7xrvzB5tHE c7vWyG9r2gj1bSYLGrjfxwLCtxJ0C2O0+EHu+gsbxbZfed0aZObvWl+ZXl0FSVqjqw KPFyS4qdN91Tcac0G3+CDp8jwSzEl/wQJUAUq7XLTuCHV37KWPDwiEXl3EkfeVWt/G c2f4yyOqqjLUw== From: Timo 
Rothenpieler To: ffmpeg-devel@ffmpeg.org Date: Wed, 10 Aug 2022 22:47:07 +0200 Message-Id: <20220810204712.3123-6-timo@rothenpieler.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org> References: <20220810204712.3123-1-timo@rothenpieler.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 06/11] avutil/half2float: adjust conversion of NaN X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Timo Rothenpieler Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: rJ9fBeAzk25e

IEEE-754 distinguishes two kinds of NaN: quiet and signaling. They are differentiated by the MSB of the mantissa. For whatever reason, actual hardware conversion of half to single always sets this bit to 1 if the mantissa is != 0, and to 0 if it's 0. So our code has to follow suit, or fate-testing hardware float16 will be impossible.
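The NaN rule described in the commit message can be illustrated without the lookup tables. The sketch below is not part of the patch: `half_special_to_float_bits` and `half_special_self_check` are hypothetical names, and the helper only covers half values whose exponent field is all ones (Inf and NaN). Under those assumptions it should produce the same bits that the additional 1024 `mantissatable` entries, reached through `offsettable[31]` and `offsettable[63]`, are meant to encode.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): converts a half-precision
 * Inf/NaN bit pattern to single-precision bits without tables.
 * Only meaningful when the half exponent field is all ones (0x1F).
 */
static uint32_t half_special_to_float_bits(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000) << 16; /* move sign to bit 31 */
    uint32_t mant = (uint32_t)(h & 0x03ff) << 13; /* 10-bit mantissa into the top of 23 bits */
    uint32_t f    = sign | 0x7f800000u | mant;    /* float exponent field all ones */

    if (mant)                /* NaN: force the mantissa MSB (bit 22) to 1 */
        f |= 0x00400000u;
    return f;                /* Inf (mant == 0) keeps an all-zero mantissa */
}

/* Small self-check with values worked out by hand. */
static void half_special_self_check(void)
{
    assert(half_special_to_float_bits(0x7c00) == 0x7f800000u); /* +Inf stays +Inf */
    assert(half_special_to_float_bits(0xfc00) == 0xff800000u); /* -Inf stays -Inf */
    assert(half_special_to_float_bits(0x7c01) == 0x7fc02000u); /* NaN with MSB clear gets it set */
    assert(half_special_to_float_bits(0x7e00) == 0x7fc00000u); /* NaN with MSB set is unchanged */
}
```

The payload bits of the NaN are preserved; only bit 22 changes, which is why the FATE reference checksum for the 0x0-0xFFFF EXR sample differs after this patch.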
--- libavcodec/exr.c | 2 +- libavcodec/pnm.h | 2 +- libavutil/half2float.h | 5 +++++ tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF | 2 +- 4 files changed, 8 insertions(+), 3 deletions(-) diff --git a/libavcodec/exr.c b/libavcodec/exr.c index 5c6ca9adbf..47f4786491 100644 --- a/libavcodec/exr.c +++ b/libavcodec/exr.c @@ -191,7 +191,7 @@ typedef struct EXRContext { float gamma; union av_intfloat32 gamma_table[65536]; - uint32_t mantissatable[2048]; + uint32_t mantissatable[3072]; uint32_t exponenttable[64]; uint16_t offsettable[64]; } EXRContext; diff --git a/libavcodec/pnm.h b/libavcodec/pnm.h index 5bf2eaa4d9..7e5445f529 100644 --- a/libavcodec/pnm.h +++ b/libavcodec/pnm.h @@ -34,7 +34,7 @@ typedef struct PNMContext { int half; float scale; - uint32_t mantissatable[2048]; + uint32_t mantissatable[3072]; uint32_t exponenttable[64]; uint16_t offsettable[64]; } PNMContext; diff --git a/libavutil/half2float.h b/libavutil/half2float.h index 1f6deade07..5af4690cfe 100644 --- a/libavutil/half2float.h +++ b/libavutil/half2float.h @@ -45,6 +45,9 @@ static void half2float_table(uint32_t *mantissatable, uint32_t *exponenttable, mantissatable[i] = convertmantissa(i); for (int i = 1024; i < 2048; i++) mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL); + for (int i = 2048; i < 3072; i++) + mantissatable[i] = mantissatable[i - 1024] | 0x400000UL; + mantissatable[2048] = mantissatable[1024]; exponenttable[0] = 0; for (int i = 1; i < 31; i++) @@ -58,7 +61,9 @@ static void half2float_table(uint32_t *mantissatable, uint32_t *exponenttable, offsettable[0] = 0; for (int i = 1; i < 64; i++) offsettable[i] = 1024; + offsettable[31] = 2048; offsettable[32] = 0; + offsettable[63] = 2048; } static uint32_t half2float(uint16_t h, const uint32_t *mantissatable, const uint32_t *exponenttable, diff --git a/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF b/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF index b6201116fe..e45a40b498 100644 --- 
a/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF +++ b/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF @@ -3,4 +3,4 @@ #codec_id 0: rawvideo #dimensions 0: 256x256 #sar 0: 1/1 -0, 0, 0, 1, 786432, 0x1445e411 +0, 0, 0, 1, 786432, 0xce9be2be From patchwork Wed Aug 10 20:47:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Timo Rothenpieler X-Patchwork-Id: 37230 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142912pzi; Wed, 10 Aug 2022 13:49:15 -0700 (PDT) X-Google-Smtp-Source: AA6agR6KELa+47kRBF/z/fi16JMYqbfBOIywLmE1UZAwKVsq9X5eOWaJc8yhXExHRGBtxzRE5Cr8 X-Received: by 2002:a17:906:6a03:b0:730:a20e:cf33 with SMTP id qw3-20020a1709066a0300b00730a20ecf33mr22027182ejc.620.1660164554831; Wed, 10 Aug 2022 13:49:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660164554; cv=none; d=google.com; s=arc-20160816; b=SLhzmO33ZjsTDHEfO/WscYddskHmPpiexxQJIScik2o/KwAybYk6HHP7vjG4XrpZdX xF10Fq3b4XCqYa1jawgDrDJVk9epk2CM6/kJrAyTrSTt008c8DI/h7icwgcZyoXe3yVi aPOJxyEuD4CCYQG9GQA/6LUs8iyuyA6Rcl3G5fyuSQukxImte+6n/ZNpPou3JSk3TckT rhLQ+lbgqtMe4x0dNZRCUDzAtjzPTNdg5utvSM+wnJQcUl5sOF6oQY9TYYXE705OeMt4 Q7o7McdwlJ6oDVd+eRsCUjUUqRmCQLzrMsBWmU/D1fCIjw0/k+7MUGOytLB+J76RAbkh 819w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=2RQnI+IoqISndY2AhSh1iW87QvyVVkBdU8W2lBugoXw=; b=IR5JNYxO1nno3fLENvFDUtdDYW60HtjK4lpW1JvgIgbsZHkCs6ahwlWNRQzmd2f9eu /D133zFyrmlzs5zpdQ19borBpUFVD4jVmYn0J01Lgahc/si9THKiqcXIc/y5HJuOA3Kp tVRO4SPkXLX+9w2SFs/j9Uq9cByAKntJghLt9+kqn6gbjSHjZDZkWVPT77Hzb2yb+hdt pEBYgxcTsOozed4W6LRt1NcB37CR1r7ieMAvXvglZIfrTT3QVohhqv5cUuyjGhYSjEWT 
2C1sxNIUBBZDkgRqUqqcniJoBtTgDeCj5m9Z/JlUG1RSKBZAGwnAqDKqFaPnR/6c7zWK QlMw== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b="Wr/FVQGp"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id js12-20020a17090797cc00b00730a1069b72si5558886ejc.684.2022.08.10.13.49.14; Wed, 10 Aug 2022 13:49:14 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b="Wr/FVQGp"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 39B8468B947; Wed, 10 Aug 2022 23:47:42 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from btbn.de (btbn.de [136.243.74.85]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 207A568B896 for ; Wed, 10 Aug 2022 23:47:31 +0300 (EEST) Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id 294B62F260D; Wed, 10 Aug 2022 22:47:25 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org; s=mail; t=1660164445; bh=5KB+vvK9crHgt1cgefZLtqp3WZ95ATxoJ7UVHhxSVsY=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=Wr/FVQGpT2zmyuYPynfYtAJUzO+/MwWqPSZ8IKcukM0Sj1/vNvKYr1wFw4MeuS0RW 
rSWm6GhTC0BbILlBbfMMmTjKTJlMIO1VCfMDhOkmrixoy+luMjM5bVPBMuSylow/hJ mjXnXarl1qDrML5cQVofIoXs5OPEk6WU2AvADCR3GFZh6cGjKI4UjxFcXPzukTbj6p O3YkGBrPRw6gg9Z4dKVjlIQwjfc55BXXHAqlLmcvcqUnTWwjqJSAhieWDzZ73WRmie aqbDasQdSz0H2mfW0V75Dx4xjB3kwakG/h0vi3VoldQsSsnz0kWplgE/2pWRZD2J7u eK2KbbAwCI0mA== From: Timo Rothenpieler To: ffmpeg-devel@ffmpeg.org Date: Wed, 10 Aug 2022 22:47:08 +0200 Message-Id: <20220810204712.3123-7-timo@rothenpieler.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org> References: <20220810204712.3123-1-timo@rothenpieler.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 07/11] avutil/half2float: move tables to header-internal structs X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Timo Rothenpieler Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: UGkGuxst2r+K Having to put the knowledge of the size of those arrays into a multitude of places is rather smelly. 
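To show what the struct-based interface looks like from a caller's side, here is a minimal standalone sketch. It assumes a condensed copy of `float2half_tables`, `init_float2half_tables()` and `float2half()` as they appear in this patch (the two sign halves of the table are filled in one step, which should yield the same contents); `DemoEncContext`, `demo_enc_init`, `demo_float_to_half` and `demo_self_check` are hypothetical names standing in for a codec context such as `EXRContext` or `PHMEncContext`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Condensed copy of the struct introduced by the patch. */
typedef struct float2half_tables {
    uint16_t basetable[512];
    uint8_t  shifttable[512];
} float2half_tables;

/* Same table contents as the patch, with the sign handling folded together. */
static void init_float2half_tables(float2half_tables *t)
{
    for (int i = 0; i < 256; i++) {
        int e = i - 127;
        uint16_t base;
        uint8_t shift;

        if (e < -24) {          /* very small numbers map to zero */
            base = 0x0000; shift = 24;
        } else if (e < -14) {   /* small numbers map to denormals */
            base = 0x0400 >> (-e - 14); shift = -e - 1;
        } else if (e <= 15) {   /* normal numbers just lose precision */
            base = (e + 15) << 10; shift = 13;
        } else if (e < 128) {   /* large numbers map to infinity */
            base = 0x7C00; shift = 24;
        } else {                /* infinity and NaN stay infinity and NaN */
            base = 0x7C00; shift = 13;
        }
        t->basetable[i]          = base;
        t->basetable[i | 0x100]  = base | 0x8000; /* negative half of the table */
        t->shifttable[i]         = shift;
        t->shifttable[i | 0x100] = shift;
    }
}

static uint16_t float2half(uint32_t f, const float2half_tables *t)
{
    uint32_t idx = (f >> 23) & 0x1ff; /* sign bit plus 8 exponent bits */
    return t->basetable[idx] + ((f & 0x007fffff) >> t->shifttable[idx]);
}

/* Hypothetical caller context: it no longer needs to know the array sizes. */
typedef struct DemoEncContext {
    float2half_tables f2h_tables;
} DemoEncContext;

static void demo_enc_init(DemoEncContext *s)
{
    init_float2half_tables(&s->f2h_tables);
}

static uint16_t demo_float_to_half(const DemoEncContext *s, float v)
{
    uint32_t bits;
    memcpy(&bits, &v, sizeof(bits)); /* portable stand-in for av_float2int() */
    return float2half(bits, &s->f2h_tables);
}

/* Small self-check with values worked out by hand. */
static void demo_self_check(void)
{
    DemoEncContext s;
    demo_enc_init(&s);
    assert(demo_float_to_half(&s, 1.0f)     == 0x3C00);
    assert(demo_float_to_half(&s, -2.0f)    == 0xC000);
    assert(demo_float_to_half(&s, 65536.0f) == 0x7C00); /* overflows to +Inf */
}
```

With this layout, a later change to a table size (as in the previous patch, which grew `mantissatable` on the decoding side) would only touch the header, not every context struct that embeds the tables.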
--- libavcodec/exr.c | 27 ++++++++-------------- libavcodec/exrenc.c | 11 +++++---- libavcodec/pnm.h | 5 ++--- libavcodec/pnmdec.c | 42 ++++++++-------------------------- libavcodec/pnmenc.c | 13 +++++------ libavutil/float2half.h | 51 +++++++++++++++++++++++------------------- libavutil/half2float.h | 46 ++++++++++++++++++++----------------- 7 files changed, 84 insertions(+), 111 deletions(-) diff --git a/libavcodec/exr.c b/libavcodec/exr.c index 47f4786491..825354873d 100644 --- a/libavcodec/exr.c +++ b/libavcodec/exr.c @@ -191,9 +191,7 @@ typedef struct EXRContext { float gamma; union av_intfloat32 gamma_table[65536]; - uint32_t mantissatable[3072]; - uint32_t exponenttable[64]; - uint16_t offsettable[64]; + half2float_tables h2f_tables; } EXRContext; static int zip_uncompress(const EXRContext *s, const uint8_t *src, int compressed_size, @@ -899,10 +897,7 @@ static int ac_uncompress(const EXRContext *s, GetByteContext *gb, float *block) n += val & 0xff; } else { ret = n; - block[ff_zigzag_direct[n]] = av_int2float(half2float(val, - s->mantissatable, - s->exponenttable, - s->offsettable)); + block[ff_zigzag_direct[n]] = av_int2float(half2float(val, &s->h2f_tables)); n++; } } @@ -1120,8 +1115,7 @@ static int dwa_uncompress(const EXRContext *s, const uint8_t *src, int compresse uint16_t *dc = (uint16_t *)td->dc_data; union av_intfloat32 dc_val; - dc_val.i = half2float(dc[idx], s->mantissatable, - s->exponenttable, s->offsettable); + dc_val.i = half2float(dc[idx], &s->h2f_tables); block[0] = dc_val.f; ac_uncompress(s, &agb, block); @@ -1171,7 +1165,7 @@ static int dwa_uncompress(const EXRContext *s, const uint8_t *src, int compresse for (int x = 0; x < td->xsize; x++) { uint16_t ha = ai0[x] | (ai1[x] << 8); - ao[x] = half2float(ha, s->mantissatable, s->exponenttable, s->offsettable); + ao[x] = half2float(ha, &s->h2f_tables); } } @@ -1427,10 +1421,7 @@ static int decode_block(AVCodecContext *avctx, void *tdata, } } else { for (x = 0; x < xsize; x++) { - ptr_x[0].i = 
half2float(bytestream_get_le16(&src), - s->mantissatable, - s->exponenttable, - s->offsettable); + ptr_x[0].i = half2float(bytestream_get_le16(&src), &s->h2f_tables); ptr_x++; } } @@ -2217,7 +2208,7 @@ static av_cold int decode_init(AVCodecContext *avctx) float one_gamma = 1.0f / s->gamma; avpriv_trc_function trc_func = NULL; - half2float_table(s->mantissatable, s->exponenttable, s->offsettable); + init_half2float_tables(&s->h2f_tables); s->avctx = avctx; @@ -2230,18 +2221,18 @@ static av_cold int decode_init(AVCodecContext *avctx) trc_func = avpriv_get_trc_function_from_trc(s->apply_trc_type); if (trc_func) { for (i = 0; i < 65536; ++i) { - t.i = half2float(i, s->mantissatable, s->exponenttable, s->offsettable); + t.i = half2float(i, &s->h2f_tables); t.f = trc_func(t.f); s->gamma_table[i] = t; } } else { if (one_gamma > 0.9999f && one_gamma < 1.0001f) { for (i = 0; i < 65536; ++i) { - s->gamma_table[i].i = half2float(i, s->mantissatable, s->exponenttable, s->offsettable); + s->gamma_table[i].i = half2float(i, &s->h2f_tables); } } else { for (i = 0; i < 65536; ++i) { - t.i = half2float(i, s->mantissatable, s->exponenttable, s->offsettable); + t.i = half2float(i, &s->h2f_tables); /* If negative value we reuse half value */ if (t.f <= 0.0f) { s->gamma_table[i] = t; diff --git a/libavcodec/exrenc.c b/libavcodec/exrenc.c index 56c084d483..6ab9400b7c 100644 --- a/libavcodec/exrenc.c +++ b/libavcodec/exrenc.c @@ -87,15 +87,14 @@ typedef struct EXRContext { EXRScanlineData *scanline; - uint16_t basetable[512]; - uint8_t shifttable[512]; + float2half_tables f2h_tables; } EXRContext; static av_cold int encode_init(AVCodecContext *avctx) { EXRContext *s = avctx->priv_data; - float2half_tables(s->basetable, s->shifttable); + init_float2half_tables(&s->f2h_tables); switch (avctx->pix_fmt) { case AV_PIX_FMT_GBRPF32: @@ -256,7 +255,7 @@ static int encode_scanline_rle(EXRContext *s, const AVFrame *frame) const uint32_t *src = (const uint32_t *)(frame->data[ch] + y * 
frame->linesize[ch]); for (int x = 0; x < frame->width; x++) - dst[x] = float2half(src[x], s->basetable, s->shifttable); + dst[x] = float2half(src[x], &s->f2h_tables); } break; } @@ -324,7 +323,7 @@ static int encode_scanline_zip(EXRContext *s, const AVFrame *frame) const uint32_t *src = (const uint32_t *)(frame->data[ch] + (y * s->scanline_height + l) * frame->linesize[ch]); for (int x = 0; x < frame->width; x++) - dst[x] = float2half(src[x], s->basetable, s->shifttable); + dst[x] = float2half(src[x], &s->f2h_tables); } } break; @@ -482,7 +481,7 @@ static int encode_frame(AVCodecContext *avctx, AVPacket *pkt, const uint32_t *src = (const uint32_t *)(frame->data[ch] + y * frame->linesize[ch]); for (int x = 0; x < frame->width; x++) - bytestream2_put_le16(pb, float2half(src[x], s->basetable, s->shifttable)); + bytestream2_put_le16(pb, float2half(src[x], &s->f2h_tables)); } } } diff --git a/libavcodec/pnm.h b/libavcodec/pnm.h index 7e5445f529..25251d9e4a 100644 --- a/libavcodec/pnm.h +++ b/libavcodec/pnm.h @@ -22,6 +22,7 @@ #ifndef AVCODEC_PNM_H #define AVCODEC_PNM_H +#include "libavutil/half2float.h" #include "avcodec.h" typedef struct PNMContext { @@ -34,9 +35,7 @@ typedef struct PNMContext { int half; float scale; - uint32_t mantissatable[3072]; - uint32_t exponenttable[64]; - uint16_t offsettable[64]; + half2float_tables h2f_tables; } PNMContext; int ff_pnm_decode_header(AVCodecContext *avctx, PNMContext * const s); diff --git a/libavcodec/pnmdec.c b/libavcodec/pnmdec.c index 9383dc8e60..6adc348ec8 100644 --- a/libavcodec/pnmdec.c +++ b/libavcodec/pnmdec.c @@ -313,18 +313,9 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p, b = (float *)p->data[1]; for (int i = 0; i < avctx->height; i++) { for (int j = 0; j < avctx->width; j++) { - r[j] = av_int2float(half2float(AV_RL16(s->bytestream+0), - s->mantissatable, - s->exponenttable, - s->offsettable)) * scale; - g[j] = av_int2float(half2float(AV_RL16(s->bytestream+2), - s->mantissatable, - 
s->exponenttable, - s->offsettable)) * scale; - b[j] = av_int2float(half2float(AV_RL16(s->bytestream+4), - s->mantissatable, - s->exponenttable, - s->offsettable)) * scale; + r[j] = av_int2float(half2float(AV_RL16(s->bytestream+0), &s->h2f_tables)) * scale; + g[j] = av_int2float(half2float(AV_RL16(s->bytestream+2), &s->h2f_tables)) * scale; + b[j] = av_int2float(half2float(AV_RL16(s->bytestream+4), &s->h2f_tables)) * scale; s->bytestream += 6; } @@ -340,18 +331,9 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p, b = (float *)p->data[1]; for (int i = 0; i < avctx->height; i++) { for (int j = 0; j < avctx->width; j++) { - r[j] = av_int2float(half2float(AV_RB16(s->bytestream+0), - s->mantissatable, - s->exponenttable, - s->offsettable)) * scale; - g[j] = av_int2float(half2float(AV_RB16(s->bytestream+2), - s->mantissatable, - s->exponenttable, - s->offsettable)) * scale; - b[j] = av_int2float(half2float(AV_RB16(s->bytestream+4), - s->mantissatable, - s->exponenttable, - s->offsettable)) * scale; + r[j] = av_int2float(half2float(AV_RB16(s->bytestream+0), &s->h2f_tables)) * scale; + g[j] = av_int2float(half2float(AV_RB16(s->bytestream+2), &s->h2f_tables)) * scale; + b[j] = av_int2float(half2float(AV_RB16(s->bytestream+4), &s->h2f_tables)) * scale; s->bytestream += 6; } @@ -394,10 +376,7 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p, float *g = (float *)p->data[0]; for (int i = 0; i < avctx->height; i++) { for (int j = 0; j < avctx->width; j++) { - g[j] = av_int2float(half2float(AV_RL16(s->bytestream), - s->mantissatable, - s->exponenttable, - s->offsettable)) * scale; + g[j] = av_int2float(half2float(AV_RL16(s->bytestream), &s->h2f_tables)) * scale; s->bytestream += 2; } g += p->linesize[0] / 4; @@ -406,10 +385,7 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p, float *g = (float *)p->data[0]; for (int i = 0; i < avctx->height; i++) { for (int j = 0; j < avctx->width; j++) { - g[j] = 
av_int2float(half2float(AV_RB16(s->bytestream), - s->mantissatable, - s->exponenttable, - s->offsettable)) * scale; + g[j] = av_int2float(half2float(AV_RB16(s->bytestream), &s->h2f_tables)) * scale; s->bytestream += 2; } g += p->linesize[0] / 4; @@ -501,7 +477,7 @@ static av_cold int phm_dec_init(AVCodecContext *avctx) { PNMContext *s = avctx->priv_data; - half2float_table(s->mantissatable, s->exponenttable, s->offsettable); + init_half2float_tables(&s->h2f_tables); return 0; } diff --git a/libavcodec/pnmenc.c b/libavcodec/pnmenc.c index 7ce534d06e..70992531bf 100644 --- a/libavcodec/pnmenc.c +++ b/libavcodec/pnmenc.c @@ -30,8 +30,7 @@ #include "encode.h" typedef struct PHMEncContext { - uint16_t basetable[512]; - uint8_t shifttable[512]; + float2half_tables f2h_tables; } PHMEncContext; static int pnm_encode_frame(AVCodecContext *avctx, AVPacket *pkt, @@ -169,9 +168,9 @@ static int pnm_encode_frame(AVCodecContext *avctx, AVPacket *pkt, for (int i = 0; i < avctx->height; i++) { for (int j = 0; j < avctx->width; j++) { - AV_WN16(bytestream + 0, float2half(av_float2int(r[j]), s->basetable, s->shifttable)); - AV_WN16(bytestream + 2, float2half(av_float2int(g[j]), s->basetable, s->shifttable)); - AV_WN16(bytestream + 4, float2half(av_float2int(b[j]), s->basetable, s->shifttable)); + AV_WN16(bytestream + 0, float2half(av_float2int(r[j]), &s->f2h_tables)); + AV_WN16(bytestream + 2, float2half(av_float2int(g[j]), &s->f2h_tables)); + AV_WN16(bytestream + 4, float2half(av_float2int(b[j]), &s->f2h_tables)); bytestream += 6; } @@ -184,7 +183,7 @@ static int pnm_encode_frame(AVCodecContext *avctx, AVPacket *pkt, for (int i = 0; i < avctx->height; i++) { for (int j = 0; j < avctx->width; j++) { - AV_WN16(bytestream, float2half(av_float2int(g[j]), s->basetable, s->shifttable)); + AV_WN16(bytestream, float2half(av_float2int(g[j]), &s->f2h_tables)); bytestream += 2; } @@ -295,7 +294,7 @@ static av_cold int phm_enc_init(AVCodecContext *avctx) { PHMEncContext *s = avctx->priv_data; - 
float2half_tables(s->basetable, s->shifttable); + init_float2half_tables(&s->f2h_tables); return 0; } diff --git a/libavutil/float2half.h b/libavutil/float2half.h index d6aaab8278..9252560649 100644 --- a/libavutil/float2half.h +++ b/libavutil/float2half.h @@ -21,45 +21,50 @@ #include -static void float2half_tables(uint16_t *basetable, uint8_t *shifttable) +typedef struct float2half_tables { + uint16_t basetable[512]; + uint8_t shifttable[512]; +} float2half_tables; + +static void init_float2half_tables(float2half_tables *t) { for (int i = 0; i < 256; i++) { int e = i - 127; if (e < -24) { // Very small numbers map to zero - basetable[i|0x000] = 0x0000; - basetable[i|0x100] = 0x8000; - shifttable[i|0x000] = 24; - shifttable[i|0x100] = 24; + t->basetable[i|0x000] = 0x0000; + t->basetable[i|0x100] = 0x8000; + t->shifttable[i|0x000] = 24; + t->shifttable[i|0x100] = 24; } else if (e < -14) { // Small numbers map to denorms - basetable[i|0x000] = (0x0400>>(-e-14)); - basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000; - shifttable[i|0x000] = -e-1; - shifttable[i|0x100] = -e-1; + t->basetable[i|0x000] = (0x0400>>(-e-14)); + t->basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000; + t->shifttable[i|0x000] = -e-1; + t->shifttable[i|0x100] = -e-1; } else if (e <= 15) { // Normal numbers just lose precision - basetable[i|0x000] = ((e + 15) << 10); - basetable[i|0x100] = ((e + 15) << 10) | 0x8000; - shifttable[i|0x000] = 13; - shifttable[i|0x100] = 13; + t->basetable[i|0x000] = ((e + 15) << 10); + t->basetable[i|0x100] = ((e + 15) << 10) | 0x8000; + t->shifttable[i|0x000] = 13; + t->shifttable[i|0x100] = 13; } else if (e < 128) { // Large numbers map to Infinity - basetable[i|0x000] = 0x7C00; - basetable[i|0x100] = 0xFC00; - shifttable[i|0x000] = 24; - shifttable[i|0x100] = 24; + t->basetable[i|0x000] = 0x7C00; + t->basetable[i|0x100] = 0xFC00; + t->shifttable[i|0x000] = 24; + t->shifttable[i|0x100] = 24; } else { // Infinity and NaN's stay Infinity and NaN's - basetable[i|0x000] = 
0x7C00; - basetable[i|0x100] = 0xFC00; - shifttable[i|0x000] = 13; - shifttable[i|0x100] = 13; + t->basetable[i|0x000] = 0x7C00; + t->basetable[i|0x100] = 0xFC00; + t->shifttable[i|0x000] = 13; + t->shifttable[i|0x100] = 13; } } } -static uint16_t float2half(uint32_t f, uint16_t *basetable, uint8_t *shifttable) +static uint16_t float2half(uint32_t f, const float2half_tables *t) { uint16_t h; - h = basetable[(f >> 23) & 0x1ff] + ((f & 0x007fffff) >> shifttable[(f >> 23) & 0x1ff]); + h = t->basetable[(f >> 23) & 0x1ff] + ((f & 0x007fffff) >> t->shifttable[(f >> 23) & 0x1ff]); return h; } diff --git a/libavutil/half2float.h b/libavutil/half2float.h index 5af4690cfe..10b6fef4e6 100644 --- a/libavutil/half2float.h +++ b/libavutil/half2float.h @@ -21,6 +21,12 @@ #include +typedef struct half2float_tables { + uint32_t mantissatable[3072]; + uint32_t exponenttable[64]; + uint16_t offsettable[64]; +} half2float_tables; + static uint32_t convertmantissa(uint32_t i) { int32_t m = i << 13; // Zero pad mantissa bits @@ -37,41 +43,39 @@ static uint32_t convertmantissa(uint32_t i) return m | e; // Return combined number } -static void half2float_table(uint32_t *mantissatable, uint32_t *exponenttable, - uint16_t *offsettable) +static void init_half2float_tables(half2float_tables *t) { - mantissatable[0] = 0; + t->mantissatable[0] = 0; for (int i = 1; i < 1024; i++) - mantissatable[i] = convertmantissa(i); + t->mantissatable[i] = convertmantissa(i); for (int i = 1024; i < 2048; i++) - mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL); + t->mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL); for (int i = 2048; i < 3072; i++) - mantissatable[i] = mantissatable[i - 1024] | 0x400000UL; - mantissatable[2048] = mantissatable[1024]; + t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000UL; + t->mantissatable[2048] = t->mantissatable[1024]; - exponenttable[0] = 0; + t->exponenttable[0] = 0; for (int i = 1; i < 31; i++) - exponenttable[i] = i << 23; + t->exponenttable[i] = i 
<< 23; for (int i = 33; i < 63; i++) - exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL); - exponenttable[31]= 0x47800000UL; - exponenttable[32]= 0x80000000UL; - exponenttable[63]= 0xC7800000UL; + t->exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL); + t->exponenttable[31]= 0x47800000UL; + t->exponenttable[32]= 0x80000000UL; + t->exponenttable[63]= 0xC7800000UL; - offsettable[0] = 0; + t->offsettable[0] = 0; for (int i = 1; i < 64; i++) - offsettable[i] = 1024; - offsettable[31] = 2048; - offsettable[32] = 0; - offsettable[63] = 2048; + t->offsettable[i] = 1024; + t->offsettable[31] = 2048; + t->offsettable[32] = 0; + t->offsettable[63] = 2048; } -static uint32_t half2float(uint16_t h, const uint32_t *mantissatable, const uint32_t *exponenttable, - const uint16_t *offsettable) +static uint32_t half2float(uint16_t h, const half2float_tables *t) { uint32_t f; - f = mantissatable[offsettable[h >> 10] + (h & 0x3ff)] + exponenttable[h >> 10]; + f = t->mantissatable[t->offsettable[h >> 10] + (h & 0x3ff)] + t->exponenttable[h >> 10]; return f; } From patchwork Wed Aug 10 20:47:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Timo Rothenpieler X-Patchwork-Id: 37227 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142710pzi; Wed, 10 Aug 2022 13:48:47 -0700 (PDT) X-Google-Smtp-Source: AA6agR4NSLG8xklyYBEY9gN/yI+MUrFXUUixkw4jalxYOr+w0ncjOMNMHpRbh4LCa9cLz1m0bJXL X-Received: by 2002:a05:6402:3220:b0:43d:ca4f:d2b9 with SMTP id g32-20020a056402322000b0043dca4fd2b9mr27936753eda.177.1660164526998; Wed, 10 Aug 2022 13:48:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660164526; cv=none; d=google.com; s=arc-20160816; b=T+6G8vxpf9ZLEl3WdCJ56v3LEfCiH1t42VsTphccSAOy78FIAQWrVrZHiNjWGFHKtx L76+CITfcaf9soG0GnCeCYVgZVE2sgqf437WvpRrmHuQ6P5Z8Yz3crNtK3gIqFqXdozV RFPkxyf1IB7H7G3Du9w/MqXU8MFCaY75OKvvkYcHIkodIQRGeu8S/zvSocaklLliqF/V 
KDAJaK8+tN1Ctbx+veKZJBpHN1g0lT8/N5A2MWfhfe4zKwsPO/tqiRLveoCnKGF5WHPC YCcRJ1UXGHHY1ZYF8AGgp6M7gPUBJ4lu5ylBT+3L0+B7XHgjodBwW/n6qzCrDIv9+SBd wHyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=VzCY+C4+LMrFrtHF21V6NKxiRcHRk3TGsnlQrWw1o4w=; b=jC/8jeFIh4msD0BEt5xBC4zE/ccqpqVLtV12mSZAWxMemEx3FlIGbM4WIKMHweuyGg NsFbuzM2ezUO3M0hQgN2z8lqjSqv2bY2DnO5c6vArqSHx3qoASviaFVvevCprIp6ZpYh rO7N5RkFSJI4oayuhvCDPoxTGl+31QMakA3jIThDDWAaDxIGmjMhZWCO2MrFJiaROr+C n2KpEYIRu92SSxkNlNohrb0fQY7BHraDdHu22ruStzdf9C5u9BzOHI/jVXKkM4QaoFzU yFvQqRFKrYm0BX41Q3lAsfvbcnYuqrIHiJ2PqVJKpChH3CIpipD8txX9VJKNuCIwQw4b Ipnw== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=UCBUG2PJ; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id z11-20020a05640235cb00b004377e09fb70si13252708edc.551.2022.08.10.13.48.46; Wed, 10 Aug 2022 13:48:46 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=UCBUG2PJ; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2FF0E68B871; Wed, 10 Aug 2022 23:47:39 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from btbn.de (btbn.de [136.243.74.85]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1B14568B810 for ; Wed, 10 Aug 2022 23:47:31 +0300 (EEST) Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id 45AC92F260E; Wed, 10 Aug 2022 22:47:25 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org; s=mail; t=1660164445; bh=XwRiEcr4iHYNfeISTznrmlj6RGS4De+KGp7K6Gmz+h0=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=UCBUG2PJ/yUQIklsfkbQ0M4fcy8enHrODGtd8HPG9NJAHeGUGlgHD8nRS5mAJlkgS ZZtwn7s6YLWuuwI2Sgj5RD+qCf0WeHxRm8EaCav0hg8ndcZWnedUAVwOolawHAj20K QSHyCXnaP4SFo0YXvYcSvy93dCt8wdcaITEiucx6Jc7x5TzCyCRN0XtO+AQFmLmWqi sy9XxGugxlMEoBehOuJuJGil6dZXAcUSJ4kbl19C7Xdq/KDkVOlzbHnHrwsNAYC0HZ an4sGsnzoiuzXhdj5haF/fZAMNXgzzGEfRHhDUFtfHZk5RSM4BXEwir3qtXmRjwDIv EgNA/uVRRB0vQ== From: Timo Rothenpieler To: ffmpeg-devel@ffmpeg.org Date: Wed, 10 Aug 2022 22:47:09 +0200 Message-Id: <20220810204712.3123-8-timo@rothenpieler.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org> References: 
<20220810204712.3123-1-timo@rothenpieler.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 08/11] avutil/half2float: move non-inline init code out of header X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Timo Rothenpieler Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: +rWCTh6TxK17 --- libavcodec/Makefile | 8 +++--- libavcodec/exr.c | 2 +- libavcodec/exrenc.c | 2 +- libavcodec/float2half.c | 19 +++++++++++++ libavcodec/half2float.c | 19 +++++++++++++ libavcodec/pnmdec.c | 2 +- libavcodec/pnmenc.c | 2 +- libavutil/float2half.c | 53 ++++++++++++++++++++++++++++++++++ libavutil/float2half.h | 36 ++--------------------- libavutil/half2float.c | 63 +++++++++++++++++++++++++++++++++++++++++ libavutil/half2float.h | 46 ++---------------------------- 11 files changed, 166 insertions(+), 86 deletions(-) create mode 100644 libavcodec/float2half.c create mode 100644 libavcodec/half2float.c create mode 100644 libavutil/float2half.c create mode 100644 libavutil/half2float.c diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 029f1bad3d..cb80f73d99 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -337,8 +337,8 @@ OBJS-$(CONFIG_EIGHTSVX_FIB_DECODER) += 8svx.o OBJS-$(CONFIG_ESCAPE124_DECODER) += escape124.o OBJS-$(CONFIG_ESCAPE130_DECODER) += escape130.o OBJS-$(CONFIG_EVRC_DECODER) += evrcdec.o acelp_vectors.o lsp.o -OBJS-$(CONFIG_EXR_DECODER) += exr.o exrdsp.o -OBJS-$(CONFIG_EXR_ENCODER) += exrenc.o +OBJS-$(CONFIG_EXR_DECODER) += exr.o exrdsp.o half2float.o +OBJS-$(CONFIG_EXR_ENCODER) += exrenc.o float2half.o OBJS-$(CONFIG_FASTAUDIO_DECODER) += fastaudio.o OBJS-$(CONFIG_FFV1_DECODER) += ffv1dec.o ffv1.o OBJS-$(CONFIG_FFV1_ENCODER) += ffv1enc.o ffv1.o @@ -570,8 +570,8 @@ OBJS-$(CONFIG_PGMYUV_DECODER) += pnmdec.o pnm.o 
OBJS-$(CONFIG_PGMYUV_ENCODER) += pnmenc.o OBJS-$(CONFIG_PGSSUB_DECODER) += pgssubdec.o OBJS-$(CONFIG_PGX_DECODER) += pgxdec.o -OBJS-$(CONFIG_PHM_DECODER) += pnmdec.o pnm.o -OBJS-$(CONFIG_PHM_ENCODER) += pnmenc.o +OBJS-$(CONFIG_PHM_DECODER) += pnmdec.o pnm.o half2float.o +OBJS-$(CONFIG_PHM_ENCODER) += pnmenc.o float2half.o OBJS-$(CONFIG_PHOTOCD_DECODER) += photocd.o OBJS-$(CONFIG_PICTOR_DECODER) += pictordec.o cga_data.o OBJS-$(CONFIG_PIXLET_DECODER) += pixlet.o diff --git a/libavcodec/exr.c b/libavcodec/exr.c index 825354873d..a3582bfdd6 100644 --- a/libavcodec/exr.c +++ b/libavcodec/exr.c @@ -2208,7 +2208,7 @@ static av_cold int decode_init(AVCodecContext *avctx) float one_gamma = 1.0f / s->gamma; avpriv_trc_function trc_func = NULL; - init_half2float_tables(&s->h2f_tables); + ff_init_half2float_tables(&s->h2f_tables); s->avctx = avctx; diff --git a/libavcodec/exrenc.c b/libavcodec/exrenc.c index 6ab9400b7c..77b1ce052b 100644 --- a/libavcodec/exrenc.c +++ b/libavcodec/exrenc.c @@ -94,7 +94,7 @@ static av_cold int encode_init(AVCodecContext *avctx) { EXRContext *s = avctx->priv_data; - init_float2half_tables(&s->f2h_tables); + ff_init_float2half_tables(&s->f2h_tables); switch (avctx->pix_fmt) { case AV_PIX_FMT_GBRPF32: diff --git a/libavcodec/float2half.c b/libavcodec/float2half.c new file mode 100644 index 0000000000..90a6f63fac --- /dev/null +++ b/libavcodec/float2half.c @@ -0,0 +1,19 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/float2half.c" diff --git a/libavcodec/half2float.c b/libavcodec/half2float.c new file mode 100644 index 0000000000..1b023f96a5 --- /dev/null +++ b/libavcodec/half2float.c @@ -0,0 +1,19 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/half2float.c" diff --git a/libavcodec/pnmdec.c b/libavcodec/pnmdec.c index 6adc348ec8..fbed282e93 100644 --- a/libavcodec/pnmdec.c +++ b/libavcodec/pnmdec.c @@ -477,7 +477,7 @@ static av_cold int phm_dec_init(AVCodecContext *avctx) { PNMContext *s = avctx->priv_data; - init_half2float_tables(&s->h2f_tables); + ff_init_half2float_tables(&s->h2f_tables); return 0; } diff --git a/libavcodec/pnmenc.c b/libavcodec/pnmenc.c index 70992531bf..50f55bb1b9 100644 --- a/libavcodec/pnmenc.c +++ b/libavcodec/pnmenc.c @@ -294,7 +294,7 @@ static av_cold int phm_enc_init(AVCodecContext *avctx) { PHMEncContext *s = avctx->priv_data; - init_float2half_tables(&s->f2h_tables); + ff_init_float2half_tables(&s->f2h_tables); return 0; } diff --git a/libavutil/float2half.c 
b/libavutil/float2half.c new file mode 100644 index 0000000000..dba14cef5d --- /dev/null +++ b/libavutil/float2half.c @@ -0,0 +1,53 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/float2half.h" + +void ff_init_float2half_tables(float2half_tables *t) +{ + for (int i = 0; i < 256; i++) { + int e = i - 127; + + if (e < -24) { // Very small numbers map to zero + t->basetable[i|0x000] = 0x0000; + t->basetable[i|0x100] = 0x8000; + t->shifttable[i|0x000] = 24; + t->shifttable[i|0x100] = 24; + } else if (e < -14) { // Small numbers map to denorms + t->basetable[i|0x000] = (0x0400>>(-e-14)); + t->basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000; + t->shifttable[i|0x000] = -e-1; + t->shifttable[i|0x100] = -e-1; + } else if (e <= 15) { // Normal numbers just lose precision + t->basetable[i|0x000] = ((e + 15) << 10); + t->basetable[i|0x100] = ((e + 15) << 10) | 0x8000; + t->shifttable[i|0x000] = 13; + t->shifttable[i|0x100] = 13; + } else if (e < 128) { // Large numbers map to Infinity + t->basetable[i|0x000] = 0x7C00; + t->basetable[i|0x100] = 0xFC00; + t->shifttable[i|0x000] = 24; + t->shifttable[i|0x100] = 24; + } else { // Infinity and NaN's stay Infinity and NaN's + t->basetable[i|0x000] = 0x7C00; + t->basetable[i|0x100] = 0xFC00; 
+ t->shifttable[i|0x000] = 13; + t->shifttable[i|0x100] = 13; + } + } +} diff --git a/libavutil/float2half.h b/libavutil/float2half.h index 9252560649..b8c9cdfc4f 100644 --- a/libavutil/float2half.h +++ b/libavutil/float2half.h @@ -26,41 +26,9 @@ typedef struct float2half_tables { uint8_t shifttable[512]; } float2half_tables; -static void init_float2half_tables(float2half_tables *t) -{ - for (int i = 0; i < 256; i++) { - int e = i - 127; - - if (e < -24) { // Very small numbers map to zero - t->basetable[i|0x000] = 0x0000; - t->basetable[i|0x100] = 0x8000; - t->shifttable[i|0x000] = 24; - t->shifttable[i|0x100] = 24; - } else if (e < -14) { // Small numbers map to denorms - t->basetable[i|0x000] = (0x0400>>(-e-14)); - t->basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000; - t->shifttable[i|0x000] = -e-1; - t->shifttable[i|0x100] = -e-1; - } else if (e <= 15) { // Normal numbers just lose precision - t->basetable[i|0x000] = ((e + 15) << 10); - t->basetable[i|0x100] = ((e + 15) << 10) | 0x8000; - t->shifttable[i|0x000] = 13; - t->shifttable[i|0x100] = 13; - } else if (e < 128) { // Large numbers map to Infinity - t->basetable[i|0x000] = 0x7C00; - t->basetable[i|0x100] = 0xFC00; - t->shifttable[i|0x000] = 24; - t->shifttable[i|0x100] = 24; - } else { // Infinity and NaN's stay Infinity and NaN's - t->basetable[i|0x000] = 0x7C00; - t->basetable[i|0x100] = 0xFC00; - t->shifttable[i|0x000] = 13; - t->shifttable[i|0x100] = 13; - } - } -} +void ff_init_float2half_tables(float2half_tables *t); -static uint16_t float2half(uint32_t f, const float2half_tables *t) +static inline uint16_t float2half(uint32_t f, const float2half_tables *t) { uint16_t h; diff --git a/libavutil/half2float.c b/libavutil/half2float.c new file mode 100644 index 0000000000..baac8e4093 --- /dev/null +++ b/libavutil/half2float.c @@ -0,0 +1,63 @@ +/* + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/half2float.h" + +static uint32_t convertmantissa(uint32_t i) +{ + int32_t m = i << 13; // Zero pad mantissa bits + int32_t e = 0; // Zero exponent + + while (!(m & 0x00800000)) { // While not normalized + e -= 0x00800000; // Decrement exponent (1<<23) + m <<= 1; // Shift mantissa + } + + m &= ~0x00800000; // Clear leading 1 bit + e += 0x38800000; // Adjust bias ((127-14)<<23) + + return m | e; // Return combined number +} + +void ff_init_half2float_tables(half2float_tables *t) +{ + t->mantissatable[0] = 0; + for (int i = 1; i < 1024; i++) + t->mantissatable[i] = convertmantissa(i); + for (int i = 1024; i < 2048; i++) + t->mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL); + for (int i = 2048; i < 3072; i++) + t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000UL; + t->mantissatable[2048] = t->mantissatable[1024]; + + t->exponenttable[0] = 0; + for (int i = 1; i < 31; i++) + t->exponenttable[i] = i << 23; + for (int i = 33; i < 63; i++) + t->exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL); + t->exponenttable[31]= 0x47800000UL; + t->exponenttable[32]= 0x80000000UL; + t->exponenttable[63]= 0xC7800000UL; + + t->offsettable[0] = 0; + for (int i = 1; i < 64; i++) + t->offsettable[i] = 1024; + 
t->offsettable[31] = 2048; + t->offsettable[32] = 0; + t->offsettable[63] = 2048; +} diff --git a/libavutil/half2float.h b/libavutil/half2float.h index 10b6fef4e6..cb58e44a1c 100644 --- a/libavutil/half2float.h +++ b/libavutil/half2float.h @@ -27,51 +27,9 @@ typedef struct half2float_tables { uint16_t offsettable[64]; } half2float_tables; -static uint32_t convertmantissa(uint32_t i) -{ - int32_t m = i << 13; // Zero pad mantissa bits - int32_t e = 0; // Zero exponent - - while (!(m & 0x00800000)) { // While not normalized - e -= 0x00800000; // Decrement exponent (1<<23) - m <<= 1; // Shift mantissa - } - - m &= ~0x00800000; // Clear leading 1 bit - e += 0x38800000; // Adjust bias ((127-14)<<23) - - return m | e; // Return combined number -} - -static void init_half2float_tables(half2float_tables *t) -{ - t->mantissatable[0] = 0; - for (int i = 1; i < 1024; i++) - t->mantissatable[i] = convertmantissa(i); - for (int i = 1024; i < 2048; i++) - t->mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL); - for (int i = 2048; i < 3072; i++) - t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000UL; - t->mantissatable[2048] = t->mantissatable[1024]; - - t->exponenttable[0] = 0; - for (int i = 1; i < 31; i++) - t->exponenttable[i] = i << 23; - for (int i = 33; i < 63; i++) - t->exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL); - t->exponenttable[31]= 0x47800000UL; - t->exponenttable[32]= 0x80000000UL; - t->exponenttable[63]= 0xC7800000UL; - - t->offsettable[0] = 0; - for (int i = 1; i < 64; i++) - t->offsettable[i] = 1024; - t->offsettable[31] = 2048; - t->offsettable[32] = 0; - t->offsettable[63] = 2048; -} +void ff_init_half2float_tables(half2float_tables *t); -static uint32_t half2float(uint16_t h, const half2float_tables *t) +static inline uint32_t half2float(uint16_t h, const half2float_tables *t) { uint32_t f; From patchwork Wed Aug 10 20:47:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Timo Rothenpieler X-Patchwork-Id: 37225 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142564pzi; Wed, 10 Aug 2022 13:48:28 -0700 (PDT) X-Google-Smtp-Source: AA6agR6YYMhHQI9O5TUIvjIXVpkNHsLOddvPiAThsfMVG2Hd6BtCQaBjea4Q4SHitZW3K10Jvigs X-Received: by 2002:a05:6402:28cb:b0:43b:c6d7:ef92 with SMTP id ef11-20020a05640228cb00b0043bc6d7ef92mr28596360edb.333.1660164507779; Wed, 10 Aug 2022 13:48:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660164507; cv=none; d=google.com; s=arc-20160816; b=CqiWzxS1/gad+fg8w4e958ML/54bE9HemjkIYSuJjLCV3q1x0YtJu5DWlWwGJ+B3al WrDhnHdRptIZeYzgEjqLtkpV0cFLJ+Ehvy6XRaUFeRH5tQCIv0eVzookujqbYpS2YPaS zDalHGNtu7fBSrNK1MkFcaFaurmtcKjTgZnJfiRx7O+ypw++oZjCXt7+4bLrgCrYClIO P+dgQWSkCduQXqazcwILPJpmwDVKaLrmQ/tOts0C+Dq8kXgXmz/ZYWohEsrBtoxa21dJ OZAO/r80tJFItjhVvkNRbDBkH5RBXv4Fr/u1sNMNQ/pEI0DD+Gf1HWe0YlrChdlkQzQU nzcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=vyBzX5RFZ+UPP8R+9UYcXtZFIOM8p6glwQnetq1J19c=; b=T3SzEsCcfoApsAQ5YNtXyPJeF+ipUxo1ZC+8jGpmpOYQx1wgLCn3S4St6JWQn60OpN G7GCuJV7nNIlwX+j1z4bvNMekwxF5sNAGgwiZULyIspGyIjdWKvP1XtpcpDX1lr6/aSI fo9GP3zi8y6EIFDcTg5y+rB3Z6pvumMf2rKPrHfvzowNjZy+cxnWlyfC7sATAGOJBzy1 5CGKVhFEmUrT/bLJ0kF9rYfiaQdp/UCzPfSXZQ7UCH+2Wk4SMGOZUd5ezjg1TgZ/Zfqc hCTpTrsEtoJLfwxb5C3c9FIsKT68BBl7hcRJLDCrWON5UzZRKXPbxJpVAKgjmtptdUvL p7Hg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=ex07BzQf; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE 
dis=NONE) header.from=rothenpieler.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id gt38-20020a1709072da600b0073127725a02si5633887ejc.770.2022.08.10.13.48.27; Wed, 10 Aug 2022 13:48:27 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=ex07BzQf; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4125E68B8FC; Wed, 10 Aug 2022 23:47:37 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from btbn.de (btbn.de [136.243.74.85]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1B48668B883 for ; Wed, 10 Aug 2022 23:47:31 +0300 (EEST) Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id 62E922F260F; Wed, 10 Aug 2022 22:47:25 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org; s=mail; t=1660164445; bh=tRI0BK5KQlVJV8F94kb8xqVuesqZ4ZNFe9JELoi9AP8=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=ex07BzQfo1nq6On/ltbEpp9jqBqjhH3NOr4X9ZhaSZzJ7X65TVmCteRkmrTuhlViT 15uMgBF/zX+hg9BNeEC00OVRjsRFtfmND/8kDDzDkSVNEtVwzle6Bolwj5IUboyzOD +hdEJTTaHy/ICHQKD2lmtM8L2AdYV1KMsL1uGAPEziHI7pDTk6ZrO1ikDQXrUgRpLc v8/3Vp+81ViM5Xtq/YecuQlMKaEZS/o5NclkydOWyPjilQ9JWhKDSkXmhH7ZCRK8dd dbfVNquFK7njPUBUzY9jOt0mQG+WpQVs3wiLPE1eef3Ao+5ccsLt7prVAqfoppXkCT A0CYKBC0zWADA== From: Timo Rothenpieler To: ffmpeg-devel@ffmpeg.org Date: Wed, 10 Aug 2022 22:47:10 +0200 Message-Id: <20220810204712.3123-9-timo@rothenpieler.org> X-Mailer: 
git-send-email 2.34.1 In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org> References: <20220810204712.3123-1-timo@rothenpieler.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 09/11] avutil/half2float: use native _Float16 if available X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Timo Rothenpieler Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 3+VkvMpABZ+U _Float16 support was available on arm/aarch64 for a while, and with gcc 12 was enabled on x86 as long as SSE2 is supported. If the target arch supports f16c, gcc emits fairly efficient assembly, taking advantage of it. This is the case on x86-64-v3 or higher. Without f16c, it emulates it in software using sse2 instructions. --- configure | 4 ++++ libavutil/float2half.c | 2 ++ libavutil/float2half.h | 16 ++++++++++++++++ libavutil/half2float.c | 4 ++++ libavutil/half2float.h | 16 ++++++++++++++++ 5 files changed, 42 insertions(+) diff --git a/configure b/configure index 6761d0cb32..2536ae012d 100755 --- a/configure +++ b/configure @@ -2143,6 +2143,7 @@ ARCH_FEATURES=" fast_64bit fast_clz fast_cmov + float16 local_aligned simd_align_16 simd_align_32 @@ -5125,6 +5126,8 @@ elif enabled arm; then ;; esac + test_cflags -mfp16-format=ieee && add_cflags -mfp16-format=ieee + elif enabled avr32; then case $cpu in @@ -6228,6 +6231,7 @@ check_builtin MemoryBarrier windows.h "MemoryBarrier()" check_builtin sync_val_compare_and_swap "" "int *ptr; int oldval, newval; __sync_val_compare_and_swap(ptr, oldval, newval)" check_builtin gmtime_r time.h "time_t *time; struct tm *tm; gmtime_r(time, tm)" check_builtin localtime_r time.h "time_t *time; struct tm *tm; localtime_r(time, tm)" +check_builtin float16 "" "_Float16 f16var" case "$custom_allocator" in jemalloc) diff --git 
a/libavutil/float2half.c b/libavutil/float2half.c index dba14cef5d..1390d3acc0 100644 --- a/libavutil/float2half.c +++ b/libavutil/float2half.c @@ -20,6 +20,7 @@ void ff_init_float2half_tables(float2half_tables *t) { +#if !HAVE_FLOAT16 for (int i = 0; i < 256; i++) { int e = i - 127; @@ -50,4 +51,5 @@ void ff_init_float2half_tables(float2half_tables *t) t->shifttable[i|0x100] = 13; } } +#endif } diff --git a/libavutil/float2half.h b/libavutil/float2half.h index b8c9cdfc4f..8c1fb804b7 100644 --- a/libavutil/float2half.h +++ b/libavutil/float2half.h @@ -20,21 +20,37 @@ #define AVUTIL_FLOAT2HALF_H #include +#include "intfloat.h" + +#include "config.h" typedef struct float2half_tables { +#if HAVE_FLOAT16 + uint8_t dummy; +#else uint16_t basetable[512]; uint8_t shifttable[512]; +#endif } float2half_tables; void ff_init_float2half_tables(float2half_tables *t); static inline uint16_t float2half(uint32_t f, const float2half_tables *t) { +#if HAVE_FLOAT16 + union { + _Float16 f; + uint16_t i; + } u; + u.f = av_int2float(f); + return u.i; +#else uint16_t h; h = t->basetable[(f >> 23) & 0x1ff] + ((f & 0x007fffff) >> t->shifttable[(f >> 23) & 0x1ff]); return h; +#endif } #endif /* AVUTIL_FLOAT2HALF_H */ diff --git a/libavutil/half2float.c b/libavutil/half2float.c index baac8e4093..873226d3a0 100644 --- a/libavutil/half2float.c +++ b/libavutil/half2float.c @@ -18,6 +18,7 @@ #include "libavutil/half2float.h" +#if !HAVE_FLOAT16 static uint32_t convertmantissa(uint32_t i) { int32_t m = i << 13; // Zero pad mantissa bits @@ -33,9 +34,11 @@ static uint32_t convertmantissa(uint32_t i) return m | e; // Return combined number } +#endif void ff_init_half2float_tables(half2float_tables *t) { +#if !HAVE_FLOAT16 t->mantissatable[0] = 0; for (int i = 1; i < 1024; i++) t->mantissatable[i] = convertmantissa(i); @@ -60,4 +63,5 @@ void ff_init_half2float_tables(half2float_tables *t) t->offsettable[31] = 2048; t->offsettable[32] = 0; t->offsettable[63] = 2048; +#endif } diff --git 
a/libavutil/half2float.h b/libavutil/half2float.h index cb58e44a1c..b2a7c934a6 100644 --- a/libavutil/half2float.h +++ b/libavutil/half2float.h @@ -20,22 +20,38 @@ #define AVUTIL_HALF2FLOAT_H #include +#include "intfloat.h" + +#include "config.h" typedef struct half2float_tables { +#if HAVE_FLOAT16 + uint8_t dummy; +#else uint32_t mantissatable[3072]; uint32_t exponenttable[64]; uint16_t offsettable[64]; +#endif } half2float_tables; void ff_init_half2float_tables(half2float_tables *t); static inline uint32_t half2float(uint16_t h, const half2float_tables *t) { +#if HAVE_FLOAT16 + union { + _Float16 f; + uint16_t i; + } u; + u.i = h; + return av_float2int(u.f); +#else uint32_t f; f = t->mantissatable[t->offsettable[h >> 10] + (h & 0x3ff)] + t->exponenttable[h >> 10]; return f; +#endif } #endif /* AVUTIL_HALF2FLOAT_H */ From patchwork Wed Aug 10 20:47:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Timo Rothenpieler X-Patchwork-Id: 37228 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142819pzi; Wed, 10 Aug 2022 13:48:56 -0700 (PDT) X-Google-Smtp-Source: AA6agR5pI0HNDUAA/50DeoXYANZ8Ry6FX1Xds+vyghnyQ9k6WdogmtCfeNYLaaNtEUkhP1Q/8L8u X-Received: by 2002:a17:907:6818:b0:730:d99f:7b91 with SMTP id qz24-20020a170907681800b00730d99f7b91mr20278680ejc.496.1660164536621; Wed, 10 Aug 2022 13:48:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660164536; cv=none; d=google.com; s=arc-20160816; b=bMUH7CrL55RWL2JCO3uLt8j1zzTjCB1nol+eJazzTZ2DFCBSrs0fY6Qaqcbb1uB0fL IU9Ko6NRybNY9zvwWxFhT4VFIPKSy40bPKmohqNVY/4o4AsrFR2NMOENvyKpnS693+Bh 2LBVVH100k1zfzxveq4ML3RIqROkdgWI/CQW9YiiqMhwYn7Co/pYVJ4/q1Qhr5hGCe+U yK4FqvJOdTv2Cp2OVdpO6zWvV7Ty/Ytj3Ngt33+E7n16cSuKfsXNj7fNUkW2eE4d5Ymd 0BB9rIAM0wIkFiqL0hD4wdSvFOpZCDGRtggtYd8WOZ4FDvzb9zhXf/ELcLDkIeXlNnMJ yNMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=gCWZxpenJ2fd1XALd4EY0+VNPNyZlCCYLkKdI6Cq6VE=; b=ZA7fhopGESxLy971xSuOCDHQxgtFCKdxduuMPh0kqqPqSgeFtaZdHHSpuh/b2LWvjL 2N4zBrkpEk9PEhOT1OLZ8639AbT06o8PtA+g//QgrxtWGxnBbChDOCCLFmckYqe5FoRj cqb4LkvibXHA+FYPxYPnYzHgtS9rVQJo0ECZYDobiCFD3X7uFXRvYIJVLDaKdCivyg+j J0hynDQiyh7iFvk+ymORPujGJSc/Xl/sONXxi4ew18RVCBZ54/GPFewMhq1hDnnL8NEz c/d2+rWvGI5lSuR2Vcg0khzBdaImHLrNClO4rLhXWTYARrlS7x9j79eD963ZXwY/6MWT UV4A== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=ih8+3qhq; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id dy10-20020a05640231ea00b0043e01070ae6si10960051edb.512.2022.08.10.13.48.56; Wed, 10 Aug 2022 13:48:56 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@rothenpieler.org header.s=mail header.b=ih8+3qhq; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 256D568B92B; Wed, 10 Aug 2022 23:47:40 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from btbn.de (btbn.de [136.243.74.85]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1AFA768B586 for ; Wed, 10 Aug 2022 23:47:31 +0300 (EEST) Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id 7F4A62F2614; Wed, 10 Aug 2022 22:47:25 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org; s=mail; t=1660164445; bh=10ZztRD3NN895KYxp+7NXQGv8c7KBxN5jx75Qd7mnyI=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=ih8+3qhqSLf8AsnwHjTVot4GoiuWKfcPy0sUegJzaPdybwukw/o2DLdhT3eCT97dU kGKb4w7b8LaaJ4vAsIRK1y2muaevC1iQA0rRs+qCAG9uNbKgP3ZekcRGMsXctOPMNP 0Vv2RuPlaoCbWz16zTmfPDkLUrW3cYHqiyMNXyi/KB2x6UQWUSz2lP7lk9NaumIqc4 5QD4vksIPDII9QhGuE07J+Qe++sexXL4MDeszvgHrBX2JmgKh/YNR1sQJ5xhb48gAJ SgFN7lnj0L3RRzA/7H6dm29hdEZJ8w5SStz0/MzyKFpIgaY4t/hEFSiEPyVKwoh3Um xMy1rpElHrQCg== From: Timo Rothenpieler To: ffmpeg-devel@ffmpeg.org Date: Wed, 10 Aug 2022 22:47:11 +0200 Message-Id: <20220810204712.3123-10-timo@rothenpieler.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org> References: 
<20220810204712.3123-1-timo@rothenpieler.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 10/11] swscale: add SwsContext parameter to input functions X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Timo Rothenpieler Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: QukDH8QDpgXX --- libswscale/hscale.c | 12 +-- libswscale/input.c | 149 ++++++++++++++++++---------------- libswscale/swscale_internal.h | 17 ++-- libswscale/x86/swscale.c | 13 +-- 4 files changed, 106 insertions(+), 85 deletions(-) diff --git a/libswscale/hscale.c b/libswscale/hscale.c index eca0635338..6789ce7540 100644 --- a/libswscale/hscale.c +++ b/libswscale/hscale.c @@ -105,18 +105,18 @@ static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int uint8_t * dst = desc->dst->plane[0].line[i]; if (c->lumToYV12) { - c->lumToYV12(dst, src[0], src[1], src[2], srcW, pal); + c->lumToYV12(dst, src[0], src[1], src[2], srcW, pal, c->input_opaque); } else if (c->readLumPlanar) { - c->readLumPlanar(dst, src, srcW, c->input_rgb2yuv_table); + c->readLumPlanar(dst, src, srcW, c->input_rgb2yuv_table, c->input_opaque); } if (desc->alpha) { dst = desc->dst->plane[3].line[i]; if (c->alpToYV12) { - c->alpToYV12(dst, src[3], src[1], src[2], srcW, pal); + c->alpToYV12(dst, src[3], src[1], src[2], srcW, pal, c->input_opaque); } else if (c->readAlpPlanar) { - c->readAlpPlanar(dst, src, srcW, NULL); + c->readAlpPlanar(dst, src, srcW, NULL, c->input_opaque); } } } @@ -224,9 +224,9 @@ static int chr_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int uint8_t * dst1 = desc->dst->plane[1].line[i]; uint8_t * dst2 = desc->dst->plane[2].line[i]; if (c->chrToYV12) { - c->chrToYV12(dst1, dst2, src[0], src[1], src[2], srcW, pal); + c->chrToYV12(dst1, dst2, 
src[0], src[1], src[2], srcW, pal, c->input_opaque); } else if (c->readChrPlanar) { - c->readChrPlanar(dst1, dst2, src, srcW, c->input_rgb2yuv_table); + c->readChrPlanar(dst1, dst2, src, srcW, c->input_rgb2yuv_table, c->input_opaque); } } return sliceH; diff --git a/libswscale/input.c b/libswscale/input.c index 68abc4d62c..36ef1e43ac 100644 --- a/libswscale/input.c +++ b/libswscale/input.c @@ -88,7 +88,7 @@ rgb64ToUV_half_c_template(uint16_t *dstU, uint16_t *dstV, #define rgb64funcs(pattern, BE_LE, origin) \ static void pattern ## 64 ## BE_LE ## ToY_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0, const uint8_t *unused1,\ - int width, uint32_t *rgb2yuv) \ + int width, uint32_t *rgb2yuv, void *opq) \ { \ const uint16_t *src = (const uint16_t *) _src; \ uint16_t *dst = (uint16_t *) _dst; \ @@ -97,7 +97,7 @@ static void pattern ## 64 ## BE_LE ## ToY_c(uint8_t *_dst, const uint8_t *_src, \ static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \ const uint8_t *unused0, const uint8_t *_src1, const uint8_t *_src2, \ - int width, uint32_t *rgb2yuv) \ + int width, uint32_t *rgb2yuv, void *opq) \ { \ const uint16_t *src1 = (const uint16_t *) _src1, \ *src2 = (const uint16_t *) _src2; \ @@ -107,7 +107,7 @@ static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \ \ static void pattern ## 64 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, \ const uint8_t *unused0, const uint8_t *_src1, const uint8_t *_src2, \ - int width, uint32_t *rgb2yuv) \ + int width, uint32_t *rgb2yuv, void *opq) \ { \ const uint16_t *src1 = (const uint16_t *) _src1, \ *src2 = (const uint16_t *) _src2; \ @@ -192,7 +192,8 @@ static void pattern ## 48 ## BE_LE ## ToY_c(uint8_t *_dst, \ const uint8_t *_src, \ const uint8_t *unused0, const uint8_t *unused1,\ int width, \ - uint32_t *rgb2yuv) \ + uint32_t *rgb2yuv, \ + void *opq) \ { \ const uint16_t *src = (const uint16_t *)_src; \ uint16_t *dst = (uint16_t *)_dst; \ @@ -205,7 +206,8 @@ static 
void pattern ## 48 ## BE_LE ## ToUV_c(uint8_t *_dstU, \ const uint8_t *_src1, \ const uint8_t *_src2, \ int width, \ - uint32_t *rgb2yuv) \ + uint32_t *rgb2yuv, \ + void *opq) \ { \ const uint16_t *src1 = (const uint16_t *)_src1, \ *src2 = (const uint16_t *)_src2; \ @@ -220,7 +222,8 @@ static void pattern ## 48 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, \ const uint8_t *_src1, \ const uint8_t *_src2, \ int width, \ - uint32_t *rgb2yuv) \ + uint32_t *rgb2yuv, \ + void *opq) \ { \ const uint16_t *src1 = (const uint16_t *)_src1, \ *src2 = (const uint16_t *)_src2; \ @@ -345,7 +348,7 @@ static av_always_inline void rgb16_32ToUV_half_c_template(int16_t *dstU, #define rgb16_32_wrapper(fmt, name, shr, shg, shb, shp, maskr, \ maskg, maskb, rsh, gsh, bsh, S) \ static void name ## ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, \ - int width, uint32_t *tab) \ + int width, uint32_t *tab, void *opq) \ { \ rgb16_32ToY_c_template((int16_t*)dst, src, width, fmt, shr, shg, shb, shp, \ maskr, maskg, maskb, rsh, gsh, bsh, S, tab); \ @@ -353,7 +356,7 @@ static void name ## ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unuse \ static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV, \ const uint8_t *unused0, const uint8_t *src, const uint8_t *dummy, \ - int width, uint32_t *tab) \ + int width, uint32_t *tab, void *opq) \ { \ rgb16_32ToUV_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt, \ shr, shg, shb, shp, \ @@ -363,7 +366,7 @@ static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV, \ static void name ## ToUV_half_c(uint8_t *dstU, uint8_t *dstV, \ const uint8_t *unused0, const uint8_t *src, \ const uint8_t *dummy, \ - int width, uint32_t *tab) \ + int width, uint32_t *tab, void *opq) \ { \ rgb16_32ToUV_half_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt, \ shr, shg, shb, shp, \ @@ -392,7 +395,7 @@ rgb16_32_wrapper(AV_PIX_FMT_X2BGR10LE, bgr30le, 0, 6, 16, 0, 0x3FF, 0xFFC00, 0x3 static void gbr24pToUV_half_c(uint8_t 
*_dstU, uint8_t *_dstV, const uint8_t *gsrc, const uint8_t *bsrc, const uint8_t *rsrc, - int width, uint32_t *rgb2yuv) + int width, uint32_t *rgb2yuv, void *opq) { uint16_t *dstU = (uint16_t *)_dstU; uint16_t *dstV = (uint16_t *)_dstV; @@ -411,7 +414,7 @@ static void gbr24pToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, } static void rgba64leToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, - const uint8_t *unused2, int width, uint32_t *unused) + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int16_t *dst = (int16_t *)_dst; const uint16_t *src = (const uint16_t *)_src; @@ -421,7 +424,7 @@ static void rgba64leToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unu } static void rgba64beToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, - const uint8_t *unused2, int width, uint32_t *unused) + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int16_t *dst = (int16_t *)_dst; const uint16_t *src = (const uint16_t *)_src; @@ -430,7 +433,8 @@ static void rgba64beToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unu dst[i] = AV_RB16(src + 4 * i + 3); } -static void abgrToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *unused) +static void abgrToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int16_t *dst = (int16_t *)_dst; int i; @@ -439,7 +443,8 @@ static void abgrToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, } } -static void rgbaToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *unused) +static void rgbaToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int16_t *dst = (int16_t *)_dst; int i; @@ -448,7 +453,8 @@ static void rgbaToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, } } 
-static void palToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *pal) +static void palToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *pal, void *opq) { int16_t *dst = (int16_t *)_dst; int i; @@ -459,7 +465,8 @@ static void palToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, } } -static void palToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *pal) +static void palToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *pal, void *opq) { int16_t *dst = (int16_t *)_dst; int i; @@ -471,8 +478,8 @@ static void palToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, } static void palToUV_c(uint8_t *_dstU, uint8_t *_dstV, - const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, - int width, uint32_t *pal) + const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, + int width, uint32_t *pal, void *opq) { uint16_t *dstU = (uint16_t *)_dstU; int16_t *dstV = (int16_t *)_dstV; @@ -486,7 +493,8 @@ static void palToUV_c(uint8_t *_dstU, uint8_t *_dstV, } } -static void monowhite2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *unused) +static void monowhite2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int16_t *dst = (int16_t *)_dst; int i, j; @@ -503,7 +511,8 @@ static void monowhite2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unus } } -static void monoblack2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *unused) +static void monoblack2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int16_t *dst = 
(int16_t *)_dst; int i, j; @@ -520,8 +529,8 @@ static void monoblack2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unus } } -static void yuy2ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *unused) +static void yuy2ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, + uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) @@ -529,7 +538,7 @@ static void yuy2ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, } static void yuy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1, - const uint8_t *src2, int width, uint32_t *unused) + const uint8_t *src2, int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -540,7 +549,7 @@ static void yuy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, con } static void yvy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1, - const uint8_t *src2, int width, uint32_t *unused) + const uint8_t *src2, int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -551,7 +560,7 @@ static void yvy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, con } static void y210le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src, - const uint8_t *unused1, int width, uint32_t *unused2) + const uint8_t *unused1, int width, uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) { @@ -561,7 +570,7 @@ static void y210le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, co } static void y210le_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, - const uint8_t *unused1, int width, uint32_t *unused2) + const uint8_t *unused1, int width, uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) @@ -569,7 +578,7 @@ static void y210le_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, } static void 
bswap16Y_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *unused) + uint32_t *unused, void *opq) { int i; const uint16_t *src = (const uint16_t *)_src; @@ -579,7 +588,7 @@ static void bswap16Y_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused } static void bswap16UV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *_src1, - const uint8_t *_src2, int width, uint32_t *unused) + const uint8_t *_src2, int width, uint32_t *unused, void *opq) { int i; const uint16_t *src1 = (const uint16_t *)_src1, @@ -592,7 +601,7 @@ static void bswap16UV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, } static void read_ya16le_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *unused) + uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) @@ -600,7 +609,7 @@ static void read_ya16le_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t * } static void read_ya16le_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *unused) + uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) @@ -608,7 +617,7 @@ static void read_ya16le_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t } static void read_ya16be_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *unused) + uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) @@ -616,7 +625,7 @@ static void read_ya16be_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t * } static void read_ya16be_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *unused) + uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) @@ -624,7 +633,7 @@ static void read_ya16be_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t } static void read_ayuv64le_Y_c(uint8_t *dst, 
const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width, - uint32_t *unused2) + uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) @@ -633,7 +642,7 @@ static void read_ayuv64le_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *u static void read_ayuv64le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src, - const uint8_t *unused1, int width, uint32_t *unused2) + const uint8_t *unused1, int width, uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) { @@ -643,7 +652,7 @@ static void read_ayuv64le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unus } static void read_ayuv64le_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width, - uint32_t *unused2) + uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) @@ -651,7 +660,7 @@ static void read_ayuv64le_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *u } static void read_vuya_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src, - const uint8_t *unused1, int width, uint32_t *unused2) + const uint8_t *unused1, int width, uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) { @@ -661,7 +670,7 @@ static void read_vuya_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, } static void read_vuya_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width, - uint32_t *unused2) + uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) @@ -669,7 +678,7 @@ static void read_vuya_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unuse } static void read_vuya_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width, - uint32_t *unused2) + uint32_t *unused2, void *opq) { int i; for (i = 0; i < width; i++) @@ -679,7 +688,7 @@ static void read_vuya_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *unuse /* This is almost identical to the previous, end exists 
only because * yuy2ToY/UV)(dst, src + 1, ...) would have 100% unaligned accesses. */ static void uyvyToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *unused) + uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) @@ -687,7 +696,7 @@ static void uyvyToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, } static void uyvyToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1, - const uint8_t *src2, int width, uint32_t *unused) + const uint8_t *src2, int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -709,20 +718,20 @@ static av_always_inline void nvXXtoUV_c(uint8_t *dst1, uint8_t *dst2, static void nv12ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, - int width, uint32_t *unused) + int width, uint32_t *unused, void *opq) { nvXXtoUV_c(dstU, dstV, src1, width); } static void nv21ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, - int width, uint32_t *unused) + int width, uint32_t *unused, void *opq) { nvXXtoUV_c(dstV, dstU, src1, width); } static void p010LEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, - const uint8_t *unused2, int width, uint32_t *unused) + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -731,7 +740,7 @@ static void p010LEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1 } static void p010BEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, - const uint8_t *unused2, int width, uint32_t *unused) + const uint8_t *unused2, int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -741,7 +750,7 @@ static void p010BEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1 static void p010LEToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1, const 
uint8_t *src2, - int width, uint32_t *unused) + int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -751,8 +760,8 @@ static void p010LEToUV_c(uint8_t *dstU, uint8_t *dstV, } static void p010BEToUV_c(uint8_t *dstU, uint8_t *dstV, - const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, - int width, uint32_t *unused) + const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, + int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -762,8 +771,8 @@ static void p010BEToUV_c(uint8_t *dstU, uint8_t *dstV, } static void p016LEToUV_c(uint8_t *dstU, uint8_t *dstV, - const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, - int width, uint32_t *unused) + const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, + int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -773,8 +782,8 @@ static void p016LEToUV_c(uint8_t *dstU, uint8_t *dstV, } static void p016BEToUV_c(uint8_t *dstU, uint8_t *dstV, - const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, - int width, uint32_t *unused) + const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2, + int width, uint32_t *unused, void *opq) { int i; for (i = 0; i < width; i++) { @@ -786,7 +795,7 @@ static void p016BEToUV_c(uint8_t *dstU, uint8_t *dstV, #define input_pixel(pos) (isBE(origin) ? 
AV_RB16(pos) : AV_RL16(pos)) static void bgr24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, - int width, uint32_t *rgb2yuv) + int width, uint32_t *rgb2yuv, void *opq) { int16_t *dst = (int16_t *)_dst; int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX]; @@ -801,7 +810,7 @@ static void bgr24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1 } static void bgr24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1, - const uint8_t *src2, int width, uint32_t *rgb2yuv) + const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq) { int16_t *dstU = (int16_t *)_dstU; int16_t *dstV = (int16_t *)_dstV; @@ -820,7 +829,7 @@ static void bgr24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, } static void bgr24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1, - const uint8_t *src2, int width, uint32_t *rgb2yuv) + const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq) { int16_t *dstU = (int16_t *)_dstU; int16_t *dstV = (int16_t *)_dstV; @@ -839,7 +848,7 @@ static void bgr24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unus } static void rgb24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, - uint32_t *rgb2yuv) + uint32_t *rgb2yuv, void *opq) { int16_t *dst = (int16_t *)_dst; int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX]; @@ -854,7 +863,7 @@ static void rgb24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1 } static void rgb24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1, - const uint8_t *src2, int width, uint32_t *rgb2yuv) + const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq) { int16_t *dstU = (int16_t *)_dstU; int16_t *dstV = (int16_t *)_dstV; @@ -873,7 +882,7 @@ static void rgb24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, } static void 
rgb24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1, - const uint8_t *src2, int width, uint32_t *rgb2yuv) + const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq) { int16_t *dstU = (int16_t *)_dstU; int16_t *dstV = (int16_t *)_dstV; @@ -891,7 +900,7 @@ static void rgb24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unus } } -static void planar_rgb_to_y(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *rgb2yuv) +static void planar_rgb_to_y(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *rgb2yuv, void *opq) { uint16_t *dst = (uint16_t *)_dst; int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX]; @@ -905,7 +914,7 @@ static void planar_rgb_to_y(uint8_t *_dst, const uint8_t *src[4], int width, int } } -static void planar_rgb_to_a(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *unused) +static void planar_rgb_to_a(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *unused, void *opq) { uint16_t *dst = (uint16_t *)_dst; int i; @@ -913,7 +922,7 @@ static void planar_rgb_to_a(uint8_t *_dst, const uint8_t *src[4], int width, int dst[i] = src[3][i] << 6; } -static void planar_rgb_to_uv(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *src[4], int width, int32_t *rgb2yuv) +static void planar_rgb_to_uv(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *src[4], int width, int32_t *rgb2yuv, void *opq) { uint16_t *dstU = (uint16_t *)_dstU; uint16_t *dstV = (uint16_t *)_dstV; @@ -1049,24 +1058,27 @@ static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src, #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian) \ static void planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4], \ - int w, int32_t *rgb2yuv) \ + int w, int32_t *rgb2yuv, void *opq) \ { \ planar_rgb16_to_y(dst, src, w, nbits, endian, rgb2yuv); \ } \ static void planar_rgb##nbits##endian_name##_to_uv(uint8_t *dstU, uint8_t *dstV, \ - const uint8_t *src[4], int w, 
int32_t *rgb2yuv) \ + const uint8_t *src[4], int w, int32_t *rgb2yuv, \ + void *opq) \ { \ planar_rgb16_to_uv(dstU, dstV, src, w, nbits, endian, rgb2yuv); \ } \ #define rgb9plus_planar_transparency_funcs(nbits) \ static void planar_rgb##nbits##le_to_a(uint8_t *dst, const uint8_t *src[4], \ - int w, int32_t *rgb2yuv) \ + int w, int32_t *rgb2yuv, \ + void *opq) \ { \ planar_rgb16_to_a(dst, src, w, nbits, 0, rgb2yuv); \ } \ static void planar_rgb##nbits##be_to_a(uint8_t *dst, const uint8_t *src[4], \ - int w, int32_t *rgb2yuv) \ + int w, int32_t *rgb2yuv, \ + void *opq) \ { \ planar_rgb16_to_a(dst, src, w, nbits, 1, rgb2yuv); \ } @@ -1087,23 +1099,24 @@ rgb9plus_planar_transparency_funcs(16) #define rgbf32_planar_funcs_endian(endian_name, endian) \ static void planar_rgbf32##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4], \ - int w, int32_t *rgb2yuv) \ + int w, int32_t *rgb2yuv, void *opq) \ { \ planar_rgbf32_to_y(dst, src, w, endian, rgb2yuv); \ } \ static void planar_rgbf32##endian_name##_to_uv(uint8_t *dstU, uint8_t *dstV, \ - const uint8_t *src[4], int w, int32_t *rgb2yuv) \ + const uint8_t *src[4], int w, int32_t *rgb2yuv, \ + void *opq) \ { \ planar_rgbf32_to_uv(dstU, dstV, src, w, endian, rgb2yuv); \ } \ static void planar_rgbf32##endian_name##_to_a(uint8_t *dst, const uint8_t *src[4], \ - int w, int32_t *rgb2yuv) \ + int w, int32_t *rgb2yuv, void *opq) \ { \ planar_rgbf32_to_a(dst, src, w, endian, rgb2yuv); \ } \ static void grayf32##endian_name##ToY16_c(uint8_t *dst, const uint8_t *src, \ const uint8_t *unused1, const uint8_t *unused2, \ - int width, uint32_t *unused) \ + int width, uint32_t *unused, void *opq) \ { \ grayf32ToY16_c(dst, src, unused1, unused2, width, endian, unused); \ } diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index e118b54457..9ab542933f 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -559,26 +559,31 @@ typedef struct SwsContext { yuv2packedX_fn yuv2packedX; 
yuv2anyX_fn yuv2anyX; + /// Opaque data pointer passed to all input functions. + void *input_opaque; + /// Unscaled conversion of luma plane to YV12 for horizontal scaler. void (*lumToYV12)(uint8_t *dst, const uint8_t *src, const uint8_t *src2, const uint8_t *src3, - int width, uint32_t *pal); + int width, uint32_t *pal, void *opq); /// Unscaled conversion of alpha plane to YV12 for horizontal scaler. void (*alpToYV12)(uint8_t *dst, const uint8_t *src, const uint8_t *src2, const uint8_t *src3, - int width, uint32_t *pal); + int width, uint32_t *pal, void *opq); /// Unscaled conversion of chroma planes to YV12 for horizontal scaler. void (*chrToYV12)(uint8_t *dstU, uint8_t *dstV, const uint8_t *src1, const uint8_t *src2, const uint8_t *src3, - int width, uint32_t *pal); + int width, uint32_t *pal, void *opq); /** * Functions to read planar input, such as planar RGB, and convert * internally to Y/UV/A. */ /** @{ */ - void (*readLumPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv); + void (*readLumPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv, + void *opq); void (*readChrPlanar)(uint8_t *dstU, uint8_t *dstV, const uint8_t *src[4], - int width, int32_t *rgb2yuv); - void (*readAlpPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv); + int width, int32_t *rgb2yuv, void *opq); + void (*readAlpPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv, + void *opq); /** @} */ /** diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c index 628f12137c..270798ba3d 100644 --- a/libswscale/x86/swscale.c +++ b/libswscale/x86/swscale.c @@ -299,13 +299,13 @@ VSCALE_FUNCS(avx, avx); #define INPUT_Y_FUNC(fmt, opt) \ void ff_ ## fmt ## ToY_ ## opt(uint8_t *dst, const uint8_t *src, \ const uint8_t *unused1, const uint8_t *unused2, \ - int w, uint32_t *unused) + int w, uint32_t *unused, void *opq) #define INPUT_UV_FUNC(fmt, opt) \ void ff_ ## fmt ## ToUV_ ## opt(uint8_t *dstU, uint8_t 
*dstV, \ const uint8_t *unused0, \ const uint8_t *src1, \ const uint8_t *src2, \ - int w, uint32_t *unused) + int w, uint32_t *unused, void *opq) #define INPUT_FUNC(fmt, opt) \ INPUT_Y_FUNC(fmt, opt); \ INPUT_UV_FUNC(fmt, opt) @@ -373,15 +373,18 @@ YUV2GBRP_DECL(avx2); #define INPUT_PLANAR_RGB_Y_FN_DECL(fmt, opt) \ void ff_planar_##fmt##_to_y_##opt(uint8_t *dst, \ - const uint8_t *src[4], int w, int32_t *rgb2yuv) + const uint8_t *src[4], int w, int32_t *rgb2yuv, \ + void *opq) #define INPUT_PLANAR_RGB_UV_FN_DECL(fmt, opt) \ void ff_planar_##fmt##_to_uv_##opt(uint8_t *dstU, uint8_t *dstV, \ - const uint8_t *src[4], int w, int32_t *rgb2yuv) + const uint8_t *src[4], int w, int32_t *rgb2yuv, \ + void *opq) #define INPUT_PLANAR_RGB_A_FN_DECL(fmt, opt) \ void ff_planar_##fmt##_to_a_##opt(uint8_t *dst, \ - const uint8_t *src[4], int w, int32_t *rgb2yuv) + const uint8_t *src[4], int w, int32_t *rgb2yuv, \ + void *opq) #define INPUT_PLANAR_RGBXX_A_DECL(fmt, opt) \ From patchwork Wed Aug 10 20:47:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Timo Rothenpieler X-Patchwork-Id: 37226
From: Timo Rothenpieler To: ffmpeg-devel@ffmpeg.org Date: Wed, 10 Aug 2022 22:47:12 +0200 Message-Id: <20220810204712.3123-11-timo@rothenpieler.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org> References:
<20220810204712.3123-1-timo@rothenpieler.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 11/11] swscale/input: add rgbaf16 input support X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Timo Rothenpieler Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: HCyXbiRPlmOH This is by no means perfect, since at least ddagrab will return scRGB data with values outside of 0.0f to 1.0f for HDR values. Its primary purpose is to be able to work with the format at all. --- libavutil/Makefile | 1 + libswscale/half2float.c | 19 +++++ libswscale/input.c | 130 ++++++++++++++++++++++++++++++++++ libswscale/slice.c | 9 ++- libswscale/swscale_internal.h | 10 +++ libswscale/utils.c | 2 + libswscale/version.h | 2 +- 7 files changed, 171 insertions(+), 2 deletions(-) create mode 100644 libswscale/half2float.c diff --git a/libavutil/Makefile b/libavutil/Makefile index 3d9c07aea8..1aac1a4cc0 100644 --- a/libavutil/Makefile +++ b/libavutil/Makefile @@ -131,6 +131,7 @@ OBJS = adler32.o \ float_dsp.o \ fixed_dsp.o \ frame.o \ + half2float.o \ hash.o \ hdr_dynamic_metadata.o \ hdr_dynamic_vivid_metadata.o \ diff --git a/libswscale/half2float.c b/libswscale/half2float.c new file mode 100644 index 0000000000..1b023f96a5 --- /dev/null +++ b/libswscale/half2float.c @@ -0,0 +1,19 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/half2float.c"
diff --git a/libswscale/input.c b/libswscale/input.c
index 36ef1e43ac..818b57d2c3 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -1124,6 +1124,112 @@ static void grayf32##endian_name##ToY16_c(uint8_t *dst, const uint8_t *src,
 rgbf32_planar_funcs_endian(le, 0)
 rgbf32_planar_funcs_endian(be, 1)
 
+#define rdpx(src) av_int2float(half2float(is_be ? AV_RB16(&src) : AV_RL16(&src), h2f_tbl))
+
+static av_always_inline void rgbaf16ToUV_half_endian(uint16_t *dstU, uint16_t *dstV, int is_be,
+                                                     const uint16_t *src, int width,
+                                                     int32_t *rgb2yuv, half2float_tables *h2f_tbl)
+{
+    int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
+    int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
+    int i;
+    for (i = 0; i < width; i++) {
+        int r = (lrintf(av_clipf(65535.0f * rdpx(src[i*8+0]), 0.0f, 65535.0f)) +
+                 lrintf(av_clipf(65535.0f * rdpx(src[i*8+4]), 0.0f, 65535.0f))) >> 1;
+        int g = (lrintf(av_clipf(65535.0f * rdpx(src[i*8+1]), 0.0f, 65535.0f)) +
+                 lrintf(av_clipf(65535.0f * rdpx(src[i*8+5]), 0.0f, 65535.0f))) >> 1;
+        int b = (lrintf(av_clipf(65535.0f * rdpx(src[i*8+2]), 0.0f, 65535.0f)) +
+                 lrintf(av_clipf(65535.0f * rdpx(src[i*8+6]), 0.0f, 65535.0f))) >> 1;
+
+        dstU[i] = (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+        dstV[i] = (rv*r + gv*g + bv*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf16ToUV_endian(uint16_t *dstU, uint16_t *dstV, int is_be,
+
const uint16_t *src, int width,
+                                                int32_t *rgb2yuv, half2float_tables *h2f_tbl)
+{
+    int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
+    int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
+    int i;
+    for (i = 0; i < width; i++) {
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[i*4+0]), 0.0f, 65535.0f));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[i*4+1]), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[i*4+2]), 0.0f, 65535.0f));
+
+        dstU[i] = (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+        dstV[i] = (rv*r + gv*g + bv*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf16ToY_endian(uint16_t *dst, const uint16_t *src, int is_be,
+                                               int width, int32_t *rgb2yuv, half2float_tables *h2f_tbl)
+{
+    int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
+    int i;
+    for (i = 0; i < width; i++) {
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[i*4+0]), 0.0f, 65535.0f));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[i*4+1]), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[i*4+2]), 0.0f, 65535.0f));
+
+        dst[i] = (ry*r + gy*g + by*b + (0x2001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf16ToA_endian(uint16_t *dst, const uint16_t *src, int is_be,
+                                               int width, half2float_tables *h2f_tbl)
+{
+    int i;
+    for (i=0; isrcFormat;
@@ -1388,6 +1494,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_X2BGR10LE:
             c->chrToYV12 = bgr30leToUV_half_c;
             break;
+        case AV_PIX_FMT_RGBAF16BE:
+            c->chrToYV12 = rgbaf16beToUV_half_c;
+            break;
+        case AV_PIX_FMT_RGBAF16LE:
+            c->chrToYV12 = rgbaf16leToUV_half_c;
+            break;
         }
     } else {
         switch (srcFormat) {
@@ -1475,6 +1587,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_X2BGR10LE:
             c->chrToYV12 = bgr30leToUV_c;
             break;
+        case AV_PIX_FMT_RGBAF16BE:
+            c->chrToYV12 = rgbaf16beToUV_c;
+            break;
+        case AV_PIX_FMT_RGBAF16LE:
+
c->chrToYV12 = rgbaf16leToUV_c;
+            break;
         }
     }
 
@@ -1763,6 +1881,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
     case AV_PIX_FMT_X2BGR10LE:
         c->lumToYV12 = bgr30leToY_c;
         break;
+    case AV_PIX_FMT_RGBAF16BE:
+        c->lumToYV12 = rgbaf16beToY_c;
+        break;
+    case AV_PIX_FMT_RGBAF16LE:
+        c->lumToYV12 = rgbaf16leToY_c;
+        break;
     }
     if (c->needAlpha) {
         if (is16BPS(srcFormat) || isNBPS(srcFormat)) {
@@ -1782,6 +1906,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_ARGB:
             c->alpToYV12 = abgrToA_c;
             break;
+        case AV_PIX_FMT_RGBAF16BE:
+            c->alpToYV12 = rgbaf16beToA_c;
+            break;
+        case AV_PIX_FMT_RGBAF16LE:
+            c->alpToYV12 = rgbaf16leToA_c;
+            break;
         case AV_PIX_FMT_YA8:
             c->alpToYV12 = uyvyToY_c;
             break;
diff --git a/libswscale/slice.c b/libswscale/slice.c
index b3ee06d632..db1c696727 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -282,7 +282,13 @@ int ff_init_filters(SwsContext * c)
     c->descIndex[0] = num_ydesc + (need_gamma ? 1 : 0);
     c->descIndex[1] = num_ydesc + num_cdesc + (need_gamma ?
1 : 0);
 
-
+    if (isFloat16(c->srcFormat)) {
+        c->h2f_tables = av_malloc(sizeof(*c->h2f_tables));
+        if (!c->h2f_tables)
+            return AVERROR(ENOMEM);
+        ff_init_half2float_tables(c->h2f_tables);
+        c->input_opaque = c->h2f_tables;
+    }
 
     c->desc = av_calloc(c->numDesc, sizeof(*c->desc));
     if (!c->desc)
@@ -393,5 +399,6 @@ int ff_free_filters(SwsContext *c)
             free_slice(&c->slice[i]);
         av_freep(&c->slice);
     }
+    av_freep(&c->h2f_tables);
     return 0;
 }
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 9ab542933f..7d9f785298 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -35,6 +35,7 @@
 #include "libavutil/pixdesc.h"
 #include "libavutil/slicethread.h"
 #include "libavutil/ppc/util_altivec.h"
+#include "libavutil/half2float.h"
 
 #define STR(s) AV_TOSTRING(s) // AV_STRINGIFY is too long
 
@@ -679,6 +680,8 @@ typedef struct SwsContext {
     unsigned int dst_slice_align;
     atomic_int stride_unaligned_warned;
     atomic_int data_unaligned_warned;
+
+    half2float_tables *h2f_tables;
 } SwsContext;
 //FIXME check init (where 0)
 
@@ -840,6 +843,13 @@ static av_always_inline int isFloat(enum AVPixelFormat pix_fmt)
     return desc->flags & AV_PIX_FMT_FLAG_FLOAT;
 }
 
+static av_always_inline int isFloat16(enum AVPixelFormat pix_fmt)
+{
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
+    av_assert0(desc);
+    return (desc->flags & AV_PIX_FMT_FLAG_FLOAT) && desc->comp[0].depth == 16;
+}
+
 static av_always_inline int isALPHA(enum AVPixelFormat pix_fmt)
 {
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 34503e57f4..81646c0d73 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -259,6 +259,8 @@ static const FormatEntry format_entries[] = {
     [AV_PIX_FMT_P416LE] = { 1, 1 },
     [AV_PIX_FMT_NV16] = { 1, 1 },
     [AV_PIX_FMT_VUYA] = { 1, 1 },
+    [AV_PIX_FMT_RGBAF16BE] = { 1, 0 },
+    [AV_PIX_FMT_RGBAF16LE] = { 1, 0 },
 };
 
 int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos,
diff --git a/libswscale/version.h b/libswscale/version.h
index 3193562d18..d8694bb5c0 100644
--- a/libswscale/version.h
+++ b/libswscale/version.h
@@ -29,7 +29,7 @@
 #include "version_major.h"
 
 #define LIBSWSCALE_VERSION_MINOR 8
-#define LIBSWSCALE_VERSION_MICRO 102
+#define LIBSWSCALE_VERSION_MICRO 103
 
 #define LIBSWSCALE_VERSION_INT AV_VERSION_INT(LIBSWSCALE_VERSION_MAJOR, \
                                               LIBSWSCALE_VERSION_MINOR, \