From patchwork Wed Aug 10 20:47:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37222
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:02 +0200
Message-Id: <20220810204712.3123-1-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 01/11] lavu/pixfmt: add packed RGBA float16
 format
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: r9S6C32781jY

Packed RGBA with IEEE-754 half-precision float components is the default
format of the Windows compositor, and it is what DXGI Desktop Duplication
returns for any kind of HDR output.
---
 libavutil/pixdesc.c              | 28 ++++++++++++++++++++++++++++
 libavutil/pixfmt.h               |  5 +++++
 libavutil/version.h              |  4 ++--
 tests/ref/fate/imgutils          |  2 ++
 tests/ref/fate/sws-pixdesc-query | 13 +++++++++++++
 5 files changed, 50 insertions(+), 2 deletions(-)
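
Reviewer note, not part of the commit: a minimal sketch of how one rgbaf16le
pixel could be read on the CPU. It assumes the layout given by the descriptor
added in this patch (a single plane, 8 bytes per pixel, R/G/B/A at byte
offsets 0/2/4/6). RGBAF, half_to_float, rd16le and read_rgbaf16le are
hypothetical helper names used for illustration only; they are not libavutil
API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: one decoded pixel. */
typedef struct RGBAF { float r, g, b, a; } RGBAF;

/* Expand IEEE-754 binary16 bits to a float without relying on libm. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;
    uint32_t mant = h & 0x3ffu;
    uint32_t bits;
    float f;

    if (exp == 0) {
        /* Zero or subnormal: the value is mant * 2^-24. */
        f = (float)mant / 16777216.0f;
        return sign ? -f : f;
    }
    if (exp == 31)
        bits = sign | 0x7f800000u | (mant << 13);          /* Inf or NaN */
    else
        bits = sign | ((exp + 112u) << 23) | (mant << 13); /* rebias 15 -> 127 */

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Read a 16-bit little-endian value regardless of host endianness. */
static uint16_t rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Fetch pixel (x, y) from a packed rgbaf16le plane. */
static RGBAF read_rgbaf16le(const uint8_t *data, ptrdiff_t linesize, int x, int y)
{
    const uint8_t *p;
    RGBAF px;

    assert(x >= 0 && y >= 0);
    p = data + (ptrdiff_t)y * linesize + (ptrdiff_t)x * 8; /* step of 8 bytes */
    px.r = half_to_float(rd16le(p + 0));
    px.g = half_to_float(rd16le(p + 2));
    px.b = half_to_float(rd16le(p + 4));
    px.a = half_to_float(rd16le(p + 6));
    return px;
}
```

The 8-byte step is consistent with the imgutils reference lines in this
patch: a linesize of 512 and a plane size of 24576 match a 64x48 image at
8 bytes per pixel.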

diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
index e078fd5320..f7558ff8b9 100644
--- a/libavutil/pixdesc.c
+++ b/libavutil/pixdesc.c
@@ -2504,6 +2504,34 @@ static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
         },
         .flags = AV_PIX_FMT_FLAG_ALPHA,
     },
+    [AV_PIX_FMT_RGBAF16BE] = {
+        .name = "rgbaf16be",
+        .nb_components = 4,
+        .log2_chroma_w = 0,
+        .log2_chroma_h = 0,
+        .comp = {
+            { 0, 8, 0, 0, 16 },       /* R */
+            { 0, 8, 2, 0, 16 },       /* G */
+            { 0, 8, 4, 0, 16 },       /* B */
+            { 0, 8, 6, 0, 16 },       /* A */
+        },
+        .flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_RGB |
+                 AV_PIX_FMT_FLAG_ALPHA | AV_PIX_FMT_FLAG_FLOAT,
+    },
+    [AV_PIX_FMT_RGBAF16LE] = {
+        .name = "rgbaf16le",
+        .nb_components = 4,
+        .log2_chroma_w = 0,
+        .log2_chroma_h = 0,
+        .comp = {
+            { 0, 8, 0, 0, 16 },       /* R */
+            { 0, 8, 2, 0, 16 },       /* G */
+            { 0, 8, 4, 0, 16 },       /* B */
+            { 0, 8, 6, 0, 16 },       /* A */
+        },
+        .flags = AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_ALPHA |
+                 AV_PIX_FMT_FLAG_FLOAT,
+    },
 };
 
 static const char * const color_range_names[] = {
diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
index 9d1fdaf82d..86c9bdefeb 100644
--- a/libavutil/pixfmt.h
+++ b/libavutil/pixfmt.h
@@ -369,6 +369,9 @@ enum AVPixelFormat {
 
     AV_PIX_FMT_VUYA,        ///< packed VUYA 4:4:4, 32bpp, VUYAVUYA...
 
+    AV_PIX_FMT_RGBAF16BE,   ///< IEEE-754 half precision packed RGBA 16:16:16:16, 64bpp, RGBARGBA..., big-endian
+    AV_PIX_FMT_RGBAF16LE,   ///< IEEE-754 half precision packed RGBA 16:16:16:16, 64bpp, RGBARGBA..., little-endian
+
     AV_PIX_FMT_NB         ///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions
 };
 
@@ -466,6 +469,8 @@ enum AVPixelFormat {
 #define AV_PIX_FMT_P216       AV_PIX_FMT_NE(P216BE, P216LE)
 #define AV_PIX_FMT_P416       AV_PIX_FMT_NE(P416BE, P416LE)
 
+#define AV_PIX_FMT_RGBAF16    AV_PIX_FMT_NE(RGBAF16BE, RGBAF16LE)
+
 /**
   * Chromaticity coordinates of the source primaries.
   * These values match the ones defined by ISO/IEC 23091-2_2019 subclause 8.1 and ITU-T H.273.
diff --git a/libavutil/version.h b/libavutil/version.h
index ee43526dc6..f0a8b5c098 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,8 +79,8 @@
  */
 
 #define LIBAVUTIL_VERSION_MAJOR  57
-#define LIBAVUTIL_VERSION_MINOR  32
-#define LIBAVUTIL_VERSION_MICRO 101
+#define LIBAVUTIL_VERSION_MINOR  33
+#define LIBAVUTIL_VERSION_MICRO 100
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
                                                LIBAVUTIL_VERSION_MINOR, \
diff --git a/tests/ref/fate/imgutils b/tests/ref/fate/imgutils
index 4ec66febb8..01c9877de5 100644
--- a/tests/ref/fate/imgutils
+++ b/tests/ref/fate/imgutils
@@ -247,3 +247,5 @@ p216le          planes: 2, linesizes: 128 128   0   0, plane_sizes:  6144  6144
 p416be          planes: 2, linesizes: 128 256   0   0, plane_sizes:  6144 12288     0     0, plane_offsets:  6144     0     0, total_size: 18432
 p416le          planes: 2, linesizes: 128 256   0   0, plane_sizes:  6144 12288     0     0, plane_offsets:  6144     0     0, total_size: 18432
 vuya            planes: 1, linesizes: 256   0   0   0, plane_sizes: 12288     0     0     0, plane_offsets:     0     0     0, total_size: 12288
+rgbaf16be       planes: 1, linesizes: 512   0   0   0, plane_sizes: 24576     0     0     0, plane_offsets:     0     0     0, total_size: 24576
+rgbaf16le       planes: 1, linesizes: 512   0   0   0, plane_sizes: 24576     0     0     0, plane_offsets:     0     0     0, total_size: 24576
diff --git a/tests/ref/fate/sws-pixdesc-query b/tests/ref/fate/sws-pixdesc-query
index bd0f1fcb82..f79d99e513 100644
--- a/tests/ref/fate/sws-pixdesc-query
+++ b/tests/ref/fate/sws-pixdesc-query
@@ -21,6 +21,8 @@ is16BPS:
   rgb48le
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   ya16be
   ya16le
   yuv420p16be
@@ -157,6 +159,7 @@ isBE:
   rgb555be
   rgb565be
   rgba64be
+  rgbaf16be
   x2bgr10be
   x2rgb10be
   xyz12be
@@ -479,6 +482,8 @@ isRGB:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   x2bgr10be
   x2bgr10le
   x2rgb10be
@@ -629,6 +634,8 @@ AnyRGB:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   x2bgr10be
   x2bgr10le
   x2rgb10be
@@ -655,6 +662,8 @@ ALPHA:
   rgb32_1
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   vuya
   ya16be
   ya16le
@@ -739,6 +748,8 @@ Packed:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   uyvy422
   uyyvyy411
   vuya
@@ -918,6 +929,8 @@ PackedRGB:
   rgb8
   rgba64be
   rgba64le
+  rgbaf16be
+  rgbaf16le
   x2bgr10be
   x2bgr10le
   x2rgb10be

From patchwork Wed Aug 10 20:47:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37220
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:03 +0200
Message-Id: <20220810204712.3123-2-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 02/11] avutil/hwcontext_d3d11va: add support
 for rgbaf16 pixel format
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: pwFPdTHADJoq

---
 libavutil/hwcontext_d3d11va.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavutil/hwcontext_d3d11va.c b/libavutil/hwcontext_d3d11va.c
index 27c0c80413..363ec6a47d 100644
--- a/libavutil/hwcontext_d3d11va.c
+++ b/libavutil/hwcontext_d3d11va.c
@@ -88,6 +88,7 @@ static const struct {
     { DXGI_FORMAT_P010,         AV_PIX_FMT_P010 },
     { DXGI_FORMAT_B8G8R8A8_UNORM,    AV_PIX_FMT_BGRA },
     { DXGI_FORMAT_R10G10B10A2_UNORM, AV_PIX_FMT_X2BGR10 },
+    { DXGI_FORMAT_R16G16B16A16_FLOAT, AV_PIX_FMT_RGBAF16 },
     // Special opaque formats. The pix_fmt is merely a place holder, as the
     // opaque format cannot be accessed directly.
     { DXGI_FORMAT_420_OPAQUE,   AV_PIX_FMT_YUV420P },

From patchwork Wed Aug 10 20:47:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37221
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:04 +0200
Message-Id: <20220810204712.3123-3-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 03/11] avfilter/vsrc_ddagrab: add rgbaf16
 output support
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: /5o+RL/FtKzO

---
 libavfilter/version.h      |  2 +-
 libavfilter/vsrc_ddagrab.c | 13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)
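
Reviewer note, not part of the commit: a rough sketch of the colour tagging
that the last hunk of this patch extends. It only models the three
properties visible in both branches of that hunk (primaries, transfer,
colorspace) and leaves out the colour range. SketchFormat, SketchColor and
sketch_color_tags are hypothetical names, the enum values are stand-ins
rather than real DXGI constants, and treating the first branch as "any
pre-existing UNORM format" is an assumption, since its condition is outside
the diff context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the captured texture formats; NOT the real DXGI values. */
enum SketchFormat {
    SKETCH_FMT_UNORM,    /* assumed: the pre-existing 8/10 bit UNORM formats */
    SKETCH_FMT_FLOAT16,  /* stands in for DXGI_FORMAT_R16G16B16A16_FLOAT */
    SKETCH_FMT_OTHER
};

typedef struct SketchColor {
    const char *primaries;
    const char *trc;
    const char *colorspace;
} SketchColor;

/* Returns 0 on success, -1 for a format the filter does not expect. */
static int sketch_color_tags(enum SketchFormat fmt, SketchColor *out)
{
    assert(out != NULL);

    switch (fmt) {
    case SKETCH_FMT_UNORM:
        out->primaries  = "AVCOL_PRI_BT709";
        out->trc        = "AVCOL_TRC_IEC61966_2_1"; /* sRGB transfer curve */
        out->colorspace = "AVCOL_SPC_RGB";
        return 0;
    case SKETCH_FMT_FLOAT16:
        out->primaries  = "AVCOL_PRI_BT709";
        out->trc        = "AVCOL_TRC_LINEAR";       /* gamma 1.0 */
        out->colorspace = "AVCOL_SPC_RGB";
        return 0;
    default:
        return -1;                                  /* filter returns AVERROR_BUG */
    }
}
```

The only difference between the two branches is the transfer characteristic,
so a consumer that expects sRGB-encoded data has to apply the sRGB curve
itself when it receives float frames.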

diff --git a/libavfilter/version.h b/libavfilter/version.h
index 19a009c110..fa67606495 100644
--- a/libavfilter/version.h
+++ b/libavfilter/version.h
@@ -32,7 +32,7 @@
 #include "version_major.h"
 
 #define LIBAVFILTER_VERSION_MINOR  46
-#define LIBAVFILTER_VERSION_MICRO 101
+#define LIBAVFILTER_VERSION_MICRO 102
 
 
 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
diff --git a/libavfilter/vsrc_ddagrab.c b/libavfilter/vsrc_ddagrab.c
index ce36716281..252505b96d 100644
--- a/libavfilter/vsrc_ddagrab.c
+++ b/libavfilter/vsrc_ddagrab.c
@@ -115,6 +115,8 @@ static const AVOption ddagrab_options[] = {
     { "bgra",       "only output 8 Bit BGRA",            0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_B8G8R8A8_UNORM },    0, INT_MAX, FLAGS, "output_fmt" },
     { "10bit",      "only output default 10 Bit format", 0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_R10G10B10A2_UNORM }, 0, INT_MAX, FLAGS, "output_fmt" },
     { "x2bgr10",    "only output 10 Bit X2BGR10",        0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_R10G10B10A2_UNORM }, 0, INT_MAX, FLAGS, "output_fmt" },
+    { "16bit",      "only output default 16 Bit format", 0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
+    { "rgbaf16",    "only output 16 Bit RGBAF16",        0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
     { NULL }
 };
 
@@ -212,6 +214,7 @@ static av_cold int init_dxgi_dda(AVFilterContext *avctx)
     if (set_thread_dpi && SUCCEEDED(hr)) {
         DPI_AWARENESS_CONTEXT prev_dpi_ctx;
         DXGI_FORMAT formats[] = {
+            DXGI_FORMAT_R16G16B16A16_FLOAT,
             DXGI_FORMAT_R10G10B10A2_UNORM,
             DXGI_FORMAT_B8G8R8A8_UNORM
         };
@@ -665,6 +668,10 @@ static av_cold int init_hwframes_ctx(AVFilterContext *avctx)
         av_log(avctx, AV_LOG_VERBOSE, "Probed 10 bit RGB frame format\n");
         dda->frames_ctx->sw_format = AV_PIX_FMT_X2BGR10;
         break;
+    case DXGI_FORMAT_R16G16B16A16_FLOAT:
+        av_log(avctx, AV_LOG_VERBOSE, "Probed 16 bit float RGB frame format\n");
+        dda->frames_ctx->sw_format = AV_PIX_FMT_RGBAF16;
+        break;
     default:
         av_log(avctx, AV_LOG_ERROR, "Unexpected texture output format!\n");
         return AVERROR_BUG;
@@ -990,6 +997,12 @@ static int ddagrab_request_frame(AVFilterLink *outlink)
         frame->color_primaries = AVCOL_PRI_BT709;
         frame->color_trc       = AVCOL_TRC_IEC61966_2_1;
         frame->colorspace      = AVCOL_SPC_RGB;
+    } else if (desc.Format == DXGI_FORMAT_R16G16B16A16_FLOAT) {
+        // According to MSDN, all floating point formats contain image data with sRGB primaries and linear (1.0) gamma.
+        frame->color_range     = AVCOL_RANGE_JPEG;
+        frame->color_primaries = AVCOL_PRI_BT709;
+        frame->color_trc       = AVCOL_TRC_LINEAR;
+        frame->colorspace      = AVCOL_SPC_RGB;
     } else {
         ret = AVERROR_BUG;
         goto fail;

From patchwork Wed Aug 10 20:47:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37224
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:05 +0200
Message-Id: <20220810204712.3123-4-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 04/11] avfilter/vsrc_ddagrab: add options for
 more control over output format fallback
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: 66eYrUHuaCi5

---
 libavfilter/vsrc_ddagrab.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)
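
Reviewer note, not part of the commit: a small sketch of how the two new
options interact, modelled on the conditions changed in this patch.
SketchOpts, sketch_candidates and sketch_must_fail are hypothetical names,
and the SK_FMT_* values are stand-ins rather than real DXGI constants.
sketch_candidates only models the "explicit format requested" branch that is
visible in the second hunk.

```c
#include <assert.h>

/* Stand-in format identifiers; 0 means "auto", as out_fmt == 0 does in the filter. */
enum { SK_FMT_AUTO = 0, SK_FMT_BGRA8 = 1, SK_FMT_RGB10 = 2, SK_FMT_RGBAF16 = 3 };

typedef struct SketchOpts {
    int out_fmt;         /* requested output format, SK_FMT_AUTO for none */
    int allow_fallback;  /* tolerate ending up with another format */
    int force_fmt;       /* do not offer BGRA as a second candidate */
} SketchOpts;

/* Build the candidate list for an explicit request; returns the entry count. */
static int sketch_candidates(const SketchOpts *o, int formats[2])
{
    assert(o->out_fmt != SK_FMT_AUTO);
    formats[0] = o->out_fmt;
    formats[1] = SK_FMT_BGRA8;
    return o->force_fmt ? 1 : 2;   /* force_fmt hides the BGRA fallback entry */
}

/* Nonzero when configuration should fail with ENOTSUP. */
static int sketch_must_fail(const SketchOpts *o, int raw_format)
{
    return o->out_fmt && raw_format != o->out_fmt &&
           (!o->allow_fallback || o->force_fmt);
}
```

As written, force_fmt takes precedence: with force_fmt=1 a format mismatch
is an error even when allow_fallback=1.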

diff --git a/libavfilter/vsrc_ddagrab.c b/libavfilter/vsrc_ddagrab.c
index 252505b96d..00c72187ea 100644
--- a/libavfilter/vsrc_ddagrab.c
+++ b/libavfilter/vsrc_ddagrab.c
@@ -98,6 +98,8 @@ typedef struct DdagrabContext {
     int        offset_x;
     int        offset_y;
     int        out_fmt;
+    int        allow_fallback;
+    int        force_fmt;
 } DdagrabContext;
 
 #define OFFSET(x) offsetof(DdagrabContext, x)
@@ -117,6 +119,10 @@ static const AVOption ddagrab_options[] = {
     { "x2bgr10",    "only output 10 Bit X2BGR10",        0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_R10G10B10A2_UNORM }, 0, INT_MAX, FLAGS, "output_fmt" },
     { "16bit",      "only output default 16 Bit format", 0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
     { "rgbaf16",    "only output 16 Bit RGBAF16",        0,            AV_OPT_TYPE_CONST,      { .i64 = DXGI_FORMAT_R16G16B16A16_FLOAT },0, INT_MAX, FLAGS, "output_fmt" },
+    { "allow_fallback", "don't error on fallback to default 8 Bit format",
+                                                   OFFSET(allow_fallback), AV_OPT_TYPE_BOOL,   { .i64 = 0    },       0,       1, FLAGS },
+    { "force_fmt",  "exclude BGRA from format list (experimental, discouraged by Microsoft)",
+                                                   OFFSET(force_fmt),  AV_OPT_TYPE_BOOL,       { .i64 = 0    },       0,       1, FLAGS },
     { NULL }
 };
 
@@ -226,7 +232,7 @@ static av_cold int init_dxgi_dda(AVFilterContext *avctx)
         } else if (dda->out_fmt) {
             formats[0] = dda->out_fmt;
             formats[1] = DXGI_FORMAT_B8G8R8A8_UNORM;
-            nb_formats = 2;
+            nb_formats = dda->force_fmt ? 1 : 2;
         }
 
         IDXGIOutput_Release(dxgi_output);
@@ -262,7 +268,7 @@ static av_cold int init_dxgi_dda(AVFilterContext *avctx)
 #else
     {
 #endif
-        if (dda->out_fmt && dda->out_fmt != DXGI_FORMAT_B8G8R8A8_UNORM) {
+        if (dda->out_fmt && dda->out_fmt != DXGI_FORMAT_B8G8R8A8_UNORM && (!dda->allow_fallback || dda->force_fmt)) {
             av_log(avctx, AV_LOG_ERROR, "Only 8 bit output supported with legacy API\n");
             return AVERROR(ENOTSUP);
         }
@@ -733,7 +739,7 @@ static int ddagrab_config_props(AVFilterLink *outlink)
     if (ret < 0)
         return ret;
 
-    if (dda->out_fmt && dda->raw_format != dda->out_fmt) {
+    if (dda->out_fmt && dda->raw_format != dda->out_fmt && (!dda->allow_fallback || dda->force_fmt)) {
         av_log(avctx, AV_LOG_ERROR, "Requested output format unavailable.\n");
         return AVERROR(ENOTSUP);
     }

From patchwork Wed Aug 10 20:47:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37223
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142415pzi;
        Wed, 10 Aug 2022 13:48:08 -0700 (PDT)
X-Google-Smtp-Source: 
 AA6agR7PYVLjNak6zBgwgdjcMq2ahooNAvHXUwfHWUu3S35LAB8JUK+orTXhFVkbeT2AyP5R/FMo
X-Received: by 2002:a17:907:da2:b0:731:60e4:2261 with SMTP id
 go34-20020a1709070da200b0073160e42261mr10803063ejc.679.1660164488654;
        Wed, 10 Aug 2022 13:48:08 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1660164488; cv=none;
        d=google.com; s=arc-20160816;
        b=xdjSDEiXyLvk8iepzCBdP2lTKpQlMsm8rndIBe7J9j1iCwk3QHWCvpd/R4Dp7MBnLA
         DUPH0FC1OSh/HEFMS/0y//GW2C5PxAihkY/OwK7qgPsJFI0na+cjk+Kt3m9cpSFimZC5
         MiOwAFAwnJyOxd5MmOtPd4OZ9iGYoZByLieC+fzlVlDoUCw0ISzwtEKM9kUu8tKY+pzM
         uC+aWwWrXl23+TxFgsuoe/8M2m6Y9yhDptKcCJ959IepLizDMHR7gBFOF+pU7V7lEkTO
         jBkQqJ3PMOC7gdMIcSRNvKIuR3ja8x/RXjIzaqmWtQ4v9LzNkjMR2TdD4BLM2atEwSqP
         7KpA==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=sender:errors-to:content-transfer-encoding:cc:reply-to
         :list-subscribe:list-help:list-post:list-archive:list-unsubscribe
         :list-id:precedence:subject:mime-version:references:in-reply-to
         :message-id:date:to:from:dkim-signature:delivered-to;
        bh=KGzEnEwzbqen9iKksD8wcq87gmGrd6QN9eNV2fBGpUc=;
        b=icNvGxB4IYBYE+B3y73IRVNenGt54kZsfveQt7pMSggB1BiGAAkfiX/M1lDWx0+oB1
         nE9D9Xg5MkJtPiljyD86DLONa+0QkIWSL4UnMb/gQcl/pSEdORNmullYzQ0FrOjR0izS
         ecPY4W5BNplYbXRl9tXfKsL28t6BaN38QkoMQatFxtZmDrm0nurL93YYlX6C9n+VkOl0
         RAVTJ94YXHUM7HlMuq4QeDG7291q0yoN75togSxrSFUwD/lcQG7Sc5/X0nCq+mbZI+Pk
         Jn2Eltkmz/guCgfyfdY186iuWP6qiCkGsnNyD+zjX9oTuEl8pBeaFbGxXKlrRZO4o1Nd
         oekg==
ARC-Authentication-Results: i=1; mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=PVZYwObf;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
        by mx.google.com with ESMTP id
 dd7-20020a1709069b8700b0073045237fd2si5745399ejc.751.2022.08.10.13.48.08;
        Wed, 10 Aug 2022 13:48:08 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=PVZYwObf;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 62D7368B89B;
	Wed, 10 Aug 2022 23:47:35 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from btbn.de (btbn.de [136.243.74.85])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id B3B7968B81D
 for <ffmpeg-devel@ffmpeg.org>; Wed, 10 Aug 2022 23:47:25 +0300 (EEST)
Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id
 E77B12F260B; Wed, 10 Aug 2022 22:47:24 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org;
 s=mail; t=1660164445;
 bh=hIA6fXAMgpIjKVN47QgW6IcamIRVV4SEKeeSRuuBMr0=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References;
 b=PVZYwObfvFSW8nSEU0tr9oJNRbsTLTYyxMdXQj7et4fZ8k/QAHBGL6Oe9wquPOlK+
 P2ijHzZaBN14wq3/jPccEl38Ck4rr/0+iJ18BWL5fRLCN4toaR43ls0ttw8aeEqdKe
 PENeCddnqd68ZUVEncKPJ0/9lL/3G3mVPtYpOaPylvTf4i1KjVYkZn+HQawTWAeEMJ
 N1F1ouC2fpS0/COlvFwYFiDFAPPwIbYBN+dHTIIaYYxkDefwgrpw3g2Hjh+oAiYTQU
 r0da7SwHNfGAmlXhigFbNrp2LkGDO53OX18//fGYQDvqtGZrgoiG3LgVaQMlV93i3D
 58XWXEItjQTPg==
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:06 +0200
Message-Id: <20220810204712.3123-5-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 05/11] avutil: move half-precision float
 helper to avutil
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: 7sSmrOL2dAQX

---
 libavcodec/exr.c                       | 2 +-
 libavcodec/exrenc.c                    | 2 +-
 libavcodec/pnmdec.c                    | 3 ++-
 libavcodec/pnmenc.c                    | 2 +-
 {libavcodec => libavutil}/float2half.h | 6 +++---
 {libavcodec => libavutil}/half2float.h | 6 +++---
 6 files changed, 11 insertions(+), 10 deletions(-)
 rename {libavcodec => libavutil}/float2half.h (96%)
 rename {libavcodec => libavutil}/half2float.h (96%)

diff --git a/libavcodec/exr.c b/libavcodec/exr.c
index 3a6b9c3014..5c6ca9adbf 100644
--- a/libavcodec/exr.c
+++ b/libavcodec/exr.c
@@ -41,6 +41,7 @@
 #include "libavutil/avstring.h"
 #include "libavutil/opt.h"
 #include "libavutil/color_utils.h"
+#include "libavutil/half2float.h"
 
 #include "avcodec.h"
 #include "bytestream.h"
@@ -53,7 +54,6 @@
 #include "exrdsp.h"
 #include "get_bits.h"
 #include "internal.h"
-#include "half2float.h"
 #include "mathops.h"
 #include "thread.h"
 
diff --git a/libavcodec/exrenc.c b/libavcodec/exrenc.c
index 8cf7827bb6..56c084d483 100644
--- a/libavcodec/exrenc.c
+++ b/libavcodec/exrenc.c
@@ -31,11 +31,11 @@
 #include "libavutil/intreadwrite.h"
 #include "libavutil/imgutils.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/float2half.h"
 #include "avcodec.h"
 #include "bytestream.h"
 #include "codec_internal.h"
 #include "encode.h"
-#include "float2half.h"
 
 enum ExrCompr {
     EXR_RAW,
diff --git a/libavcodec/pnmdec.c b/libavcodec/pnmdec.c
index 130407df25..9383dc8e60 100644
--- a/libavcodec/pnmdec.c
+++ b/libavcodec/pnmdec.c
@@ -21,12 +21,13 @@
 
 #include "config_components.h"
 
+#include "libavutil/half2float.h"
+
 #include "avcodec.h"
 #include "codec_internal.h"
 #include "internal.h"
 #include "put_bits.h"
 #include "pnm.h"
-#include "half2float.h"
 
 static void samplecpy(uint8_t *dst, const uint8_t *src, int n, int maxval)
 {
diff --git a/libavcodec/pnmenc.c b/libavcodec/pnmenc.c
index b16c93c88f..7ce534d06e 100644
--- a/libavcodec/pnmenc.c
+++ b/libavcodec/pnmenc.c
@@ -24,10 +24,10 @@
 #include "libavutil/intreadwrite.h"
 #include "libavutil/imgutils.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/float2half.h"
 #include "avcodec.h"
 #include "codec_internal.h"
 #include "encode.h"
-#include "float2half.h"
 
 typedef struct PHMEncContext {
     uint16_t basetable[512];
diff --git a/libavcodec/float2half.h b/libavutil/float2half.h
similarity index 96%
rename from libavcodec/float2half.h
rename to libavutil/float2half.h
index e05125088c..d6aaab8278 100644
--- a/libavcodec/float2half.h
+++ b/libavutil/float2half.h
@@ -16,8 +16,8 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-#ifndef AVCODEC_FLOAT2HALF_H
-#define AVCODEC_FLOAT2HALF_H
+#ifndef AVUTIL_FLOAT2HALF_H
+#define AVUTIL_FLOAT2HALF_H
 
 #include <stdint.h>
 
@@ -64,4 +64,4 @@ static uint16_t float2half(uint32_t f, uint16_t *basetable, uint8_t *shifttable)
     return h;
 }
 
-#endif /* AVCODEC_FLOAT2HALF_H */
+#endif /* AVUTIL_FLOAT2HALF_H */
diff --git a/libavcodec/half2float.h b/libavutil/half2float.h
similarity index 96%
rename from libavcodec/half2float.h
rename to libavutil/half2float.h
index 7df6747e50..1f6deade07 100644
--- a/libavcodec/half2float.h
+++ b/libavutil/half2float.h
@@ -16,8 +16,8 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-#ifndef AVCODEC_HALF2FLOAT_H
-#define AVCODEC_HALF2FLOAT_H
+#ifndef AVUTIL_HALF2FLOAT_H
+#define AVUTIL_HALF2FLOAT_H
 
 #include <stdint.h>
 
@@ -71,4 +71,4 @@ static uint32_t half2float(uint16_t h, const uint32_t *mantissatable, const uint
     return f;
 }
 
-#endif /* AVCODEC_HALF2FLOAT_H */
+#endif /* AVUTIL_HALF2FLOAT_H */

From patchwork Wed Aug 10 20:47:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37229
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142868pzi;
        Wed, 10 Aug 2022 13:49:06 -0700 (PDT)
X-Google-Smtp-Source: 
 AA6agR7k4PEOOySPo6k0EFX5IvytqTiLdMGWeurg54kqYBdKT4OH+/OlmUSmMzfN/SBjmJKxX6bJ
X-Received: by 2002:a17:907:94c7:b0:730:d5bc:14c with SMTP id
 dn7-20020a17090794c700b00730d5bc014cmr22015938ejc.68.1660164545869;
        Wed, 10 Aug 2022 13:49:05 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1660164545; cv=none;
        d=google.com; s=arc-20160816;
        b=Y5wiBEag9c2/PMTMvrvLur4kXq89mTuCiqHn0BKkx4V6kAW6MhoXVQD+MnY3O+MCHp
         8mITVgOAk3E25R3cCBdc/jzEiUeGF5mkuNeRriUYdpN762xPDiaPFj6IG5Ef/sifrxSm
         74pClWGize/8jcgAO2imsk/EA+uhnt6MjQDtLC4q4MYWSnzC337kX0VC0EJqntXkJc47
         U+TSTBDBWhpKnOXMAoLInknBngyuh9tldBvLnKU+cw+KcOoClt3EaqUBu8sXog8zKLwg
         0FbBcCm5vUDtFsYgalPStRyLk1R9Dy+2J63SZF58REf35gzgIJsN8DMZC5YOSY13zqrf
         YarQ==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=sender:errors-to:content-transfer-encoding:cc:reply-to
         :list-subscribe:list-help:list-post:list-archive:list-unsubscribe
         :list-id:precedence:subject:mime-version:references:in-reply-to
         :message-id:date:to:from:dkim-signature:delivered-to;
        bh=tTi/43BreJYp/zvBD7SDdVv4Gvwy3AH8gmSa/2rSAr4=;
        b=y+rhN7el1ikfMX5xnP1Bv7sx89k10ycXpLPa4mSdxG58Rwg5fBPesO4/pFGY5K8jNG
         Wufj6bAMqZcfvNNFXmRSpFi27ieXWunzxpXZ1yb8TyQ+yBoKo9KzO0rs3e1mJexXDHZR
         zVg7EczwX57MxbHP/G2MYVH6f7x8uDaS7nQ7l20BhPp7D8patElNcDCqL0KWOlCbD0VE
         c36GtJetf9qQ3iNIukVCS57NK49f0lbpsbv8+4PqRJQq5Ukx2U22xxiVD5UfZ16zYqT3
         QW1vpAG1fSCLZUxI2LtW8h4aU9SLkG0g3lF/kERjyv5zQTvGR6o0z59JbCcw33zxYki+
         g9xA==
ARC-Authentication-Results: i=1; mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=azJMuu5n;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
        by mx.google.com with ESMTP id
 hr10-20020a1709073f8a00b0073069573e2esi5257762ejc.667.2022.08.10.13.49.05;
        Wed, 10 Aug 2022 13:49:05 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=azJMuu5n;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 239F368B93E;
	Wed, 10 Aug 2022 23:47:41 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from btbn.de (btbn.de [136.243.74.85])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 238B268B89B
 for <ffmpeg-devel@ffmpeg.org>; Wed, 10 Aug 2022 23:47:31 +0300 (EEST)
Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id
 0E3182F260C; Wed, 10 Aug 2022 22:47:25 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org;
 s=mail; t=1660164445;
 bh=H2csBUjFdQ2Aipfsh/Ovz+x8Wry9D0rysCl51yVY7JY=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References;
 b=azJMuu5n+i/JHawR/wGptM9rWi4Eq0znBGWWPVxDPGD4OD71SPqzrgmBAvVqKfpjQ
 J1+5lDXBbb4xsFZWgj+aQQY17rBxpf5ZtCM960bChQ12xD3dE3gAxhFr4Vr4mLu00R
 wZhwHt33ZKahh2vuifPpCr+HxbHsyNZ6cp19/v0KeYTt8oXExg0JUWTP7xrvzB5tHE
 c7vWyG9r2gj1bSYLGrjfxwLCtxJ0C2O0+EHu+gsbxbZfed0aZObvWl+ZXl0FSVqjqw
 KPFyS4qdN91Tcac0G3+CDp8jwSzEl/wQJUAUq7XLTuCHV37KWPDwiEXl3EkfeVWt/G
 c2f4yyOqqjLUw==
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:07 +0200
Message-Id: <20220810204712.3123-6-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 06/11] avutil/half2float: adjust conversion
 of NaN
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: rJ9fBeAzk25e

IEEE-754 distinguishes two kinds of NaN: quiet and signaling.
They are differentiated by the MSB of the mantissa.

For whatever reason, actual hardware conversion of half to single, when
given an Inf/NaN input, always sets that bit to 1 if the mantissa is != 0,
and to 0 if it's 0.
So our code has to follow suit, or FATE-testing hardware float16 will be
impossible.
---
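Note: for reference, the behaviour described above can be written without
tables. The following is only a sketch under that description and is not
part of the patch; the function names are hypothetical. For an all-ones
exponent it returns infinity when the half mantissa is zero and otherwise
copies the payload and sets bit 22, the MSB of the single-precision
mantissa.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference helper: IEEE-754 binary16 bits -> binary32 bits. */
static uint32_t ref_half_to_float_bits(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;
    uint32_t mant = h & 0x3ffu;

    if (exp == 0x1fu) {
        if (!mant)
            return sign | 0x7f800000u;               /* +/-Inf */
        /* NaN: keep the payload, force the mantissa MSB to 1. */
        return sign | 0x7f800000u | 0x00400000u | (mant << 13);
    }
    if (exp == 0) {
        uint32_t shift = 0;
        if (!mant)
            return sign;                             /* +/-0 */
        /* Subnormal half: normalize until the implicit bit appears. */
        while (!(mant & 0x400u)) {
            mant <<= 1;
            shift++;
        }
        mant &= 0x3ffu;
        return sign | ((113u - shift) << 23) | (mant << 13);
    }
    /* Normal number: rebias the exponent from 15 to 127. */
    return sign | ((exp + 112u) << 23) | (mant << 13);
}

/* Usage example with a few hand-checked bit patterns. */
static void ref_half_to_float_selftest(void)
{
    assert(ref_half_to_float_bits(0x3c00) == 0x3f800000u); /* 1.0 */
    assert(ref_half_to_float_bits(0x0001) == 0x33800000u); /* 2^-24 */
    assert(ref_half_to_float_bits(0x7c00) == 0x7f800000u); /* +Inf */
    assert(ref_half_to_float_bits(0xfc00) == 0xff800000u); /* -Inf */
    assert(ref_half_to_float_bits(0x7c01) == 0x7fc02000u); /* NaN, MSB set */
    assert(ref_half_to_float_bits(0x7e00) == 0x7fc00000u); /* NaN, MSB kept */
}
```

The extra 1024 mantissatable entries in the patch are intended to give the
same results for exponent indices 31 and 63.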
 libavcodec/exr.c                                    | 2 +-
 libavcodec/pnm.h                                    | 2 +-
 libavutil/half2float.h                              | 5 +++++
 tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF | 2 +-
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/libavcodec/exr.c b/libavcodec/exr.c
index 5c6ca9adbf..47f4786491 100644
--- a/libavcodec/exr.c
+++ b/libavcodec/exr.c
@@ -191,7 +191,7 @@ typedef struct EXRContext {
     float gamma;
     union av_intfloat32 gamma_table[65536];
 
-    uint32_t mantissatable[2048];
+    uint32_t mantissatable[3072];
     uint32_t exponenttable[64];
     uint16_t offsettable[64];
 } EXRContext;
diff --git a/libavcodec/pnm.h b/libavcodec/pnm.h
index 5bf2eaa4d9..7e5445f529 100644
--- a/libavcodec/pnm.h
+++ b/libavcodec/pnm.h
@@ -34,7 +34,7 @@ typedef struct PNMContext {
     int half;
     float scale;
 
-    uint32_t mantissatable[2048];
+    uint32_t mantissatable[3072];
     uint32_t exponenttable[64];
     uint16_t offsettable[64];
 } PNMContext;
diff --git a/libavutil/half2float.h b/libavutil/half2float.h
index 1f6deade07..5af4690cfe 100644
--- a/libavutil/half2float.h
+++ b/libavutil/half2float.h
@@ -45,6 +45,9 @@ static void half2float_table(uint32_t *mantissatable, uint32_t *exponenttable,
         mantissatable[i] = convertmantissa(i);
     for (int i = 1024; i < 2048; i++)
         mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL);
+    for (int i = 2048; i < 3072; i++)
+        mantissatable[i] = mantissatable[i - 1024] | 0x400000UL;
+    mantissatable[2048] = mantissatable[1024];
 
     exponenttable[0] = 0;
     for (int i = 1; i < 31; i++)
@@ -58,7 +61,9 @@ static void half2float_table(uint32_t *mantissatable, uint32_t *exponenttable,
     offsettable[0] = 0;
     for (int i = 1; i < 64; i++)
         offsettable[i] = 1024;
+    offsettable[31] = 2048;
     offsettable[32] = 0;
+    offsettable[63] = 2048;
 }
 
 static uint32_t half2float(uint16_t h, const uint32_t *mantissatable, const uint32_t *exponenttable,
diff --git a/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF b/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF
index b6201116fe..e45a40b498 100644
--- a/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF
+++ b/tests/ref/fate/exr-rgb-scanline-zip-half-0x0-0xFFFF
@@ -3,4 +3,4 @@
 #codec_id 0: rawvideo
 #dimensions 0: 256x256
 #sar 0: 1/1
-0,          0,          0,        1,   786432, 0x1445e411
+0,          0,          0,        1,   786432, 0xce9be2be

From patchwork Wed Aug 10 20:47:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37230
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142912pzi;
        Wed, 10 Aug 2022 13:49:15 -0700 (PDT)
X-Google-Smtp-Source: 
 AA6agR6KELa+47kRBF/z/fi16JMYqbfBOIywLmE1UZAwKVsq9X5eOWaJc8yhXExHRGBtxzRE5Cr8
X-Received: by 2002:a17:906:6a03:b0:730:a20e:cf33 with SMTP id
 qw3-20020a1709066a0300b00730a20ecf33mr22027182ejc.620.1660164554831;
        Wed, 10 Aug 2022 13:49:14 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1660164554; cv=none;
        d=google.com; s=arc-20160816;
        b=SLhzmO33ZjsTDHEfO/WscYddskHmPpiexxQJIScik2o/KwAybYk6HHP7vjG4XrpZdX
         xF10Fq3b4XCqYa1jawgDrDJVk9epk2CM6/kJrAyTrSTt008c8DI/h7icwgcZyoXe3yVi
         aPOJxyEuD4CCYQG9GQA/6LUs8iyuyA6Rcl3G5fyuSQukxImte+6n/ZNpPou3JSk3TckT
         rhLQ+lbgqtMe4x0dNZRCUDzAtjzPTNdg5utvSM+wnJQcUl5sOF6oQY9TYYXE705OeMt4
         Q7o7McdwlJ6oDVd+eRsCUjUUqRmCQLzrMsBWmU/D1fCIjw0/k+7MUGOytLB+J76RAbkh
         819w==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=sender:errors-to:content-transfer-encoding:cc:reply-to
         :list-subscribe:list-help:list-post:list-archive:list-unsubscribe
         :list-id:precedence:subject:mime-version:references:in-reply-to
         :message-id:date:to:from:dkim-signature:delivered-to;
        bh=2RQnI+IoqISndY2AhSh1iW87QvyVVkBdU8W2lBugoXw=;
        b=IR5JNYxO1nno3fLENvFDUtdDYW60HtjK4lpW1JvgIgbsZHkCs6ahwlWNRQzmd2f9eu
         /D133zFyrmlzs5zpdQ19borBpUFVD4jVmYn0J01Lgahc/si9THKiqcXIc/y5HJuOA3Kp
         tVRO4SPkXLX+9w2SFs/j9Uq9cByAKntJghLt9+kqn6gbjSHjZDZkWVPT77Hzb2yb+hdt
         pEBYgxcTsOozed4W6LRt1NcB37CR1r7ieMAvXvglZIfrTT3QVohhqv5cUuyjGhYSjEWT
         2C1sxNIUBBZDkgRqUqqcniJoBtTgDeCj5m9Z/JlUG1RSKBZAGwnAqDKqFaPnR/6c7zWK
         QlMw==
ARC-Authentication-Results: i=1; mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b="Wr/FVQGp";
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
        by mx.google.com with ESMTP id
 js12-20020a17090797cc00b00730a1069b72si5558886ejc.684.2022.08.10.13.49.14;
        Wed, 10 Aug 2022 13:49:14 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b="Wr/FVQGp";
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 39B8468B947;
	Wed, 10 Aug 2022 23:47:42 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from btbn.de (btbn.de [136.243.74.85])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 207A568B896
 for <ffmpeg-devel@ffmpeg.org>; Wed, 10 Aug 2022 23:47:31 +0300 (EEST)
Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id
 294B62F260D; Wed, 10 Aug 2022 22:47:25 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org;
 s=mail; t=1660164445;
 bh=5KB+vvK9crHgt1cgefZLtqp3WZ95ATxoJ7UVHhxSVsY=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References;
 b=Wr/FVQGpT2zmyuYPynfYtAJUzO+/MwWqPSZ8IKcukM0Sj1/vNvKYr1wFw4MeuS0RW
 rSWm6GhTC0BbILlBbfMMmTjKTJlMIO1VCfMDhOkmrixoy+luMjM5bVPBMuSylow/hJ
 mjXnXarl1qDrML5cQVofIoXs5OPEk6WU2AvADCR3GFZh6cGjKI4UjxFcXPzukTbj6p
 O3YkGBrPRw6gg9Z4dKVjlIQwjfc55BXXHAqlLmcvcqUnTWwjqJSAhieWDzZ73WRmie
 aqbDasQdSz0H2mfW0V75Dx4xjB3kwakG/h0vi3VoldQsSsnz0kWplgE/2pWRZD2J7u
 eK2KbbAwCI0mA==
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:08 +0200
Message-Id: <20220810204712.3123-7-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 07/11] avutil/half2float: move tables to
 header-internal structs
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: UGkGuxst2r+K

Having to hard-code the sizes of those arrays in a multitude of places
is rather smelly.
---
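Note: the idea is to bundle the three lookup tables into one struct so that
callers pass a single pointer and never spell out the array sizes. The
sketch below is illustrative only: all demo_* names are hypothetical, the
field layout is an assumption based on the members removed from the codec
contexts, and the table contents follow the usual table-driven half to
float scheme including the NaN entries from the previous patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the three arrays previously duplicated in each context. */
typedef struct demo_h2f_tables {
    uint32_t mantissatable[3072];
    uint32_t exponenttable[64];
    uint16_t offsettable[64];
} demo_h2f_tables;

/* Normalize a subnormal half mantissa into single-precision bits. */
static uint32_t demo_convertmantissa(uint32_t i)
{
    uint32_t m = i << 13, e = 0;

    while (!(m & 0x00800000u)) {
        e -= 0x00800000u;
        m <<= 1;
    }
    return (m & ~0x00800000u) | (e + 0x38800000u);
}

static void demo_init_h2f_tables(demo_h2f_tables *t)
{
    t->mantissatable[0] = 0;
    for (uint32_t i = 1; i < 1024; i++)
        t->mantissatable[i] = demo_convertmantissa(i);
    for (uint32_t i = 1024; i < 2048; i++)
        t->mantissatable[i] = 0x38000000u + ((i - 1024) << 13);
    for (uint32_t i = 2048; i < 3072; i++)
        t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000u;
    t->mantissatable[2048] = t->mantissatable[1024];

    t->exponenttable[0] = 0;
    for (uint32_t i = 1; i < 31; i++)
        t->exponenttable[i] = i << 23;
    for (uint32_t i = 33; i < 63; i++)
        t->exponenttable[i] = 0x80000000u + ((i - 32) << 23);
    t->exponenttable[31] = 0x47800000u;
    t->exponenttable[32] = 0x80000000u;
    t->exponenttable[63] = 0xC7800000u;

    for (int i = 0; i < 64; i++)
        t->offsettable[i] = 1024;
    t->offsettable[0]  = 0;
    t->offsettable[32] = 0;
    t->offsettable[31] = 2048;
    t->offsettable[63] = 2048;
}

/* Callers only need the value and one pointer. */
static uint32_t demo_half2float(uint16_t h, const demo_h2f_tables *t)
{
    return t->mantissatable[t->offsettable[h >> 10] + (h & 0x3ff)] +
           t->exponenttable[h >> 10];
}

/* Usage example. */
static void demo_h2f_selftest(void)
{
    static demo_h2f_tables t;

    demo_init_h2f_tables(&t);
    assert(demo_half2float(0x3c00, &t) == 0x3f800000u); /* 1.0 */
    assert(demo_half2float(0xc000, &t) == 0xc0000000u); /* -2.0 */
    assert(demo_half2float(0x0001, &t) == 0x33800000u); /* 2^-24 */
    assert(demo_half2float(0x7c00, &t) == 0x7f800000u); /* +Inf */
    assert(demo_half2float(0x7c01, &t) == 0x7fc02000u); /* NaN */
}
```

With this shape, a later change to a table size touches only the struct
definition and its init function.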
 libavcodec/exr.c       | 27 ++++++++--------------
 libavcodec/exrenc.c    | 11 +++++----
 libavcodec/pnm.h       |  5 ++---
 libavcodec/pnmdec.c    | 42 ++++++++--------------------------
 libavcodec/pnmenc.c    | 13 +++++------
 libavutil/float2half.h | 51 +++++++++++++++++++++++-------------------
 libavutil/half2float.h | 46 ++++++++++++++++++++-----------------
 7 files changed, 84 insertions(+), 111 deletions(-)

diff --git a/libavcodec/exr.c b/libavcodec/exr.c
index 47f4786491..825354873d 100644
--- a/libavcodec/exr.c
+++ b/libavcodec/exr.c
@@ -191,9 +191,7 @@ typedef struct EXRContext {
     float gamma;
     union av_intfloat32 gamma_table[65536];
 
-    uint32_t mantissatable[3072];
-    uint32_t exponenttable[64];
-    uint16_t offsettable[64];
+    half2float_tables h2f_tables;
 } EXRContext;
 
 static int zip_uncompress(const EXRContext *s, const uint8_t *src, int compressed_size,
@@ -899,10 +897,7 @@ static int ac_uncompress(const EXRContext *s, GetByteContext *gb, float *block)
             n += val & 0xff;
         } else {
             ret = n;
-            block[ff_zigzag_direct[n]] = av_int2float(half2float(val,
-                                                      s->mantissatable,
-                                                      s->exponenttable,
-                                                      s->offsettable));
+            block[ff_zigzag_direct[n]] = av_int2float(half2float(val, &s->h2f_tables));
             n++;
         }
     }
@@ -1120,8 +1115,7 @@ static int dwa_uncompress(const EXRContext *s, const uint8_t *src, int compresse
                 uint16_t *dc = (uint16_t *)td->dc_data;
                 union av_intfloat32 dc_val;
 
-                dc_val.i = half2float(dc[idx], s->mantissatable,
-                                      s->exponenttable, s->offsettable);
+                dc_val.i = half2float(dc[idx], &s->h2f_tables);
 
                 block[0] = dc_val.f;
                 ac_uncompress(s, &agb, block);
@@ -1171,7 +1165,7 @@ static int dwa_uncompress(const EXRContext *s, const uint8_t *src, int compresse
         for (int x = 0; x < td->xsize; x++) {
             uint16_t ha = ai0[x] | (ai1[x] << 8);
 
-            ao[x] = half2float(ha, s->mantissatable, s->exponenttable, s->offsettable);
+            ao[x] = half2float(ha, &s->h2f_tables);
         }
     }
 
@@ -1427,10 +1421,7 @@ static int decode_block(AVCodecContext *avctx, void *tdata,
                         }
                     } else {
                         for (x = 0; x < xsize; x++) {
-                            ptr_x[0].i = half2float(bytestream_get_le16(&src),
-                                                    s->mantissatable,
-                                                    s->exponenttable,
-                                                    s->offsettable);
+                            ptr_x[0].i = half2float(bytestream_get_le16(&src), &s->h2f_tables);
                             ptr_x++;
                         }
                     }
@@ -2217,7 +2208,7 @@ static av_cold int decode_init(AVCodecContext *avctx)
     float one_gamma = 1.0f / s->gamma;
     avpriv_trc_function trc_func = NULL;
 
-    half2float_table(s->mantissatable, s->exponenttable, s->offsettable);
+    init_half2float_tables(&s->h2f_tables);
 
     s->avctx              = avctx;
 
@@ -2230,18 +2221,18 @@ static av_cold int decode_init(AVCodecContext *avctx)
     trc_func = avpriv_get_trc_function_from_trc(s->apply_trc_type);
     if (trc_func) {
         for (i = 0; i < 65536; ++i) {
-            t.i = half2float(i, s->mantissatable, s->exponenttable, s->offsettable);
+            t.i = half2float(i, &s->h2f_tables);
             t.f = trc_func(t.f);
             s->gamma_table[i] = t;
         }
     } else {
         if (one_gamma > 0.9999f && one_gamma < 1.0001f) {
             for (i = 0; i < 65536; ++i) {
-                s->gamma_table[i].i = half2float(i, s->mantissatable, s->exponenttable, s->offsettable);
+                s->gamma_table[i].i = half2float(i, &s->h2f_tables);
             }
         } else {
             for (i = 0; i < 65536; ++i) {
-                t.i = half2float(i, s->mantissatable, s->exponenttable, s->offsettable);
+                t.i = half2float(i, &s->h2f_tables);
                 /* If negative value we reuse half value */
                 if (t.f <= 0.0f) {
                     s->gamma_table[i] = t;
diff --git a/libavcodec/exrenc.c b/libavcodec/exrenc.c
index 56c084d483..6ab9400b7c 100644
--- a/libavcodec/exrenc.c
+++ b/libavcodec/exrenc.c
@@ -87,15 +87,14 @@ typedef struct EXRContext {
 
     EXRScanlineData *scanline;
 
-    uint16_t basetable[512];
-    uint8_t shifttable[512];
+    float2half_tables f2h_tables;
 } EXRContext;
 
 static av_cold int encode_init(AVCodecContext *avctx)
 {
     EXRContext *s = avctx->priv_data;
 
-    float2half_tables(s->basetable, s->shifttable);
+    init_float2half_tables(&s->f2h_tables);
 
     switch (avctx->pix_fmt) {
     case AV_PIX_FMT_GBRPF32:
@@ -256,7 +255,7 @@ static int encode_scanline_rle(EXRContext *s, const AVFrame *frame)
                 const uint32_t *src = (const uint32_t *)(frame->data[ch] + y * frame->linesize[ch]);
 
                 for (int x = 0; x < frame->width; x++)
-                    dst[x] = float2half(src[x], s->basetable, s->shifttable);
+                    dst[x] = float2half(src[x], &s->f2h_tables);
             }
             break;
         }
@@ -324,7 +323,7 @@ static int encode_scanline_zip(EXRContext *s, const AVFrame *frame)
                     const uint32_t *src = (const uint32_t *)(frame->data[ch] + (y * s->scanline_height + l) * frame->linesize[ch]);
 
                     for (int x = 0; x < frame->width; x++)
-                        dst[x] = float2half(src[x], s->basetable, s->shifttable);
+                        dst[x] = float2half(src[x], &s->f2h_tables);
                 }
             }
             break;
@@ -482,7 +481,7 @@ static int encode_frame(AVCodecContext *avctx, AVPacket *pkt,
                     const uint32_t *src = (const uint32_t *)(frame->data[ch] + y * frame->linesize[ch]);
 
                     for (int x = 0; x < frame->width; x++)
-                        bytestream2_put_le16(pb, float2half(src[x], s->basetable, s->shifttable));
+                        bytestream2_put_le16(pb, float2half(src[x], &s->f2h_tables));
                 }
             }
         }
diff --git a/libavcodec/pnm.h b/libavcodec/pnm.h
index 7e5445f529..25251d9e4a 100644
--- a/libavcodec/pnm.h
+++ b/libavcodec/pnm.h
@@ -22,6 +22,7 @@
 #ifndef AVCODEC_PNM_H
 #define AVCODEC_PNM_H
 
+#include "libavutil/half2float.h"
 #include "avcodec.h"
 
 typedef struct PNMContext {
@@ -34,9 +35,7 @@ typedef struct PNMContext {
     int half;
     float scale;
 
-    uint32_t mantissatable[3072];
-    uint32_t exponenttable[64];
-    uint16_t offsettable[64];
+    half2float_tables h2f_tables;
 } PNMContext;
 
 int ff_pnm_decode_header(AVCodecContext *avctx, PNMContext * const s);
diff --git a/libavcodec/pnmdec.c b/libavcodec/pnmdec.c
index 9383dc8e60..6adc348ec8 100644
--- a/libavcodec/pnmdec.c
+++ b/libavcodec/pnmdec.c
@@ -313,18 +313,9 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p,
                 b = (float *)p->data[1];
                 for (int i = 0; i < avctx->height; i++) {
                     for (int j = 0; j < avctx->width; j++) {
-                        r[j] = av_int2float(half2float(AV_RL16(s->bytestream+0),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
-                        g[j] = av_int2float(half2float(AV_RL16(s->bytestream+2),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
-                        b[j] = av_int2float(half2float(AV_RL16(s->bytestream+4),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
+                        r[j] = av_int2float(half2float(AV_RL16(s->bytestream+0), &s->h2f_tables)) * scale;
+                        g[j] = av_int2float(half2float(AV_RL16(s->bytestream+2), &s->h2f_tables)) * scale;
+                        b[j] = av_int2float(half2float(AV_RL16(s->bytestream+4), &s->h2f_tables)) * scale;
                         s->bytestream += 6;
                     }
 
@@ -340,18 +331,9 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p,
                 b = (float *)p->data[1];
                 for (int i = 0; i < avctx->height; i++) {
                     for (int j = 0; j < avctx->width; j++) {
-                        r[j] = av_int2float(half2float(AV_RB16(s->bytestream+0),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
-                        g[j] = av_int2float(half2float(AV_RB16(s->bytestream+2),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
-                        b[j] = av_int2float(half2float(AV_RB16(s->bytestream+4),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
+                        r[j] = av_int2float(half2float(AV_RB16(s->bytestream+0), &s->h2f_tables)) * scale;
+                        g[j] = av_int2float(half2float(AV_RB16(s->bytestream+2), &s->h2f_tables)) * scale;
+                        b[j] = av_int2float(half2float(AV_RB16(s->bytestream+4), &s->h2f_tables)) * scale;
                         s->bytestream += 6;
                     }
 
@@ -394,10 +376,7 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p,
                 float *g = (float *)p->data[0];
                 for (int i = 0; i < avctx->height; i++) {
                     for (int j = 0; j < avctx->width; j++) {
-                        g[j] = av_int2float(half2float(AV_RL16(s->bytestream),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
+                        g[j] = av_int2float(half2float(AV_RL16(s->bytestream), &s->h2f_tables)) * scale;
                         s->bytestream += 2;
                     }
                     g += p->linesize[0] / 4;
@@ -406,10 +385,7 @@ static int pnm_decode_frame(AVCodecContext *avctx, AVFrame *p,
                 float *g = (float *)p->data[0];
                 for (int i = 0; i < avctx->height; i++) {
                     for (int j = 0; j < avctx->width; j++) {
-                        g[j] = av_int2float(half2float(AV_RB16(s->bytestream),
-                                                       s->mantissatable,
-                                                       s->exponenttable,
-                                                       s->offsettable)) * scale;
+                        g[j] = av_int2float(half2float(AV_RB16(s->bytestream), &s->h2f_tables)) * scale;
                         s->bytestream += 2;
                     }
                     g += p->linesize[0] / 4;
@@ -501,7 +477,7 @@ static av_cold int phm_dec_init(AVCodecContext *avctx)
 {
     PNMContext *s = avctx->priv_data;
 
-    half2float_table(s->mantissatable, s->exponenttable, s->offsettable);
+    init_half2float_tables(&s->h2f_tables);
 
     return 0;
 }
diff --git a/libavcodec/pnmenc.c b/libavcodec/pnmenc.c
index 7ce534d06e..70992531bf 100644
--- a/libavcodec/pnmenc.c
+++ b/libavcodec/pnmenc.c
@@ -30,8 +30,7 @@
 #include "encode.h"
 
 typedef struct PHMEncContext {
-    uint16_t basetable[512];
-    uint8_t shifttable[512];
+    float2half_tables f2h_tables;
 } PHMEncContext;
 
 static int pnm_encode_frame(AVCodecContext *avctx, AVPacket *pkt,
@@ -169,9 +168,9 @@ static int pnm_encode_frame(AVCodecContext *avctx, AVPacket *pkt,
 
         for (int i = 0; i < avctx->height; i++) {
             for (int j = 0; j < avctx->width; j++) {
-                AV_WN16(bytestream + 0, float2half(av_float2int(r[j]), s->basetable, s->shifttable));
-                AV_WN16(bytestream + 2, float2half(av_float2int(g[j]), s->basetable, s->shifttable));
-                AV_WN16(bytestream + 4, float2half(av_float2int(b[j]), s->basetable, s->shifttable));
+                AV_WN16(bytestream + 0, float2half(av_float2int(r[j]), &s->f2h_tables));
+                AV_WN16(bytestream + 2, float2half(av_float2int(g[j]), &s->f2h_tables));
+                AV_WN16(bytestream + 4, float2half(av_float2int(b[j]), &s->f2h_tables));
                 bytestream += 6;
             }
 
@@ -184,7 +183,7 @@ static int pnm_encode_frame(AVCodecContext *avctx, AVPacket *pkt,
 
         for (int i = 0; i < avctx->height; i++) {
             for (int j = 0; j < avctx->width; j++) {
-                AV_WN16(bytestream, float2half(av_float2int(g[j]), s->basetable, s->shifttable));
+                AV_WN16(bytestream, float2half(av_float2int(g[j]), &s->f2h_tables));
                 bytestream += 2;
             }
 
@@ -295,7 +294,7 @@ static av_cold int phm_enc_init(AVCodecContext *avctx)
 {
     PHMEncContext *s = avctx->priv_data;
 
-    float2half_tables(s->basetable, s->shifttable);
+    init_float2half_tables(&s->f2h_tables);
 
     return 0;
 }
diff --git a/libavutil/float2half.h b/libavutil/float2half.h
index d6aaab8278..9252560649 100644
--- a/libavutil/float2half.h
+++ b/libavutil/float2half.h
@@ -21,45 +21,50 @@
 
 #include <stdint.h>
 
-static void float2half_tables(uint16_t *basetable, uint8_t *shifttable)
+typedef struct float2half_tables {
+    uint16_t basetable[512];
+    uint8_t shifttable[512];
+} float2half_tables;
+
+static void init_float2half_tables(float2half_tables *t)
 {
     for (int i = 0; i < 256; i++) {
         int e = i - 127;
 
         if (e < -24) { // Very small numbers map to zero
-            basetable[i|0x000]  = 0x0000;
-            basetable[i|0x100]  = 0x8000;
-            shifttable[i|0x000] = 24;
-            shifttable[i|0x100] = 24;
+            t->basetable[i|0x000]  = 0x0000;
+            t->basetable[i|0x100]  = 0x8000;
+            t->shifttable[i|0x000] = 24;
+            t->shifttable[i|0x100] = 24;
         } else if (e < -14) { // Small numbers map to denorms
-            basetable[i|0x000] = (0x0400>>(-e-14));
-            basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000;
-            shifttable[i|0x000] = -e-1;
-            shifttable[i|0x100] = -e-1;
+            t->basetable[i|0x000] = (0x0400>>(-e-14));
+            t->basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000;
+            t->shifttable[i|0x000] = -e-1;
+            t->shifttable[i|0x100] = -e-1;
         } else if (e <= 15) { // Normal numbers just lose precision
-            basetable[i|0x000] = ((e + 15) << 10);
-            basetable[i|0x100] = ((e + 15) << 10) | 0x8000;
-            shifttable[i|0x000] = 13;
-            shifttable[i|0x100] = 13;
+            t->basetable[i|0x000] = ((e + 15) << 10);
+            t->basetable[i|0x100] = ((e + 15) << 10) | 0x8000;
+            t->shifttable[i|0x000] = 13;
+            t->shifttable[i|0x100] = 13;
         } else if (e < 128) { // Large numbers map to Infinity
-            basetable[i|0x000]  = 0x7C00;
-            basetable[i|0x100]  = 0xFC00;
-            shifttable[i|0x000] = 24;
-            shifttable[i|0x100] = 24;
+            t->basetable[i|0x000]  = 0x7C00;
+            t->basetable[i|0x100]  = 0xFC00;
+            t->shifttable[i|0x000] = 24;
+            t->shifttable[i|0x100] = 24;
         } else { // Infinity and NaN's stay Infinity and NaN's
-            basetable[i|0x000]  = 0x7C00;
-            basetable[i|0x100]  = 0xFC00;
-            shifttable[i|0x000] = 13;
-            shifttable[i|0x100] = 13;
+            t->basetable[i|0x000]  = 0x7C00;
+            t->basetable[i|0x100]  = 0xFC00;
+            t->shifttable[i|0x000] = 13;
+            t->shifttable[i|0x100] = 13;
         }
     }
 }
 
-static uint16_t float2half(uint32_t f, uint16_t *basetable, uint8_t *shifttable)
+static uint16_t float2half(uint32_t f, const float2half_tables *t)
 {
     uint16_t h;
 
-    h = basetable[(f >> 23) & 0x1ff] + ((f & 0x007fffff) >> shifttable[(f >> 23) & 0x1ff]);
+    h = t->basetable[(f >> 23) & 0x1ff] + ((f & 0x007fffff) >> t->shifttable[(f >> 23) & 0x1ff]);
 
     return h;
 }
diff --git a/libavutil/half2float.h b/libavutil/half2float.h
index 5af4690cfe..10b6fef4e6 100644
--- a/libavutil/half2float.h
+++ b/libavutil/half2float.h
@@ -21,6 +21,12 @@
 
 #include <stdint.h>
 
+typedef struct half2float_tables {
+    uint32_t mantissatable[3072];
+    uint32_t exponenttable[64];
+    uint16_t offsettable[64];
+} half2float_tables;
+
 static uint32_t convertmantissa(uint32_t i)
 {
     int32_t m = i << 13; // Zero pad mantissa bits
@@ -37,41 +43,39 @@ static uint32_t convertmantissa(uint32_t i)
     return m | e; // Return combined number
 }
 
-static void half2float_table(uint32_t *mantissatable, uint32_t *exponenttable,
-                             uint16_t *offsettable)
+static void init_half2float_tables(half2float_tables *t)
 {
-    mantissatable[0] = 0;
+    t->mantissatable[0] = 0;
     for (int i = 1; i < 1024; i++)
-        mantissatable[i] = convertmantissa(i);
+        t->mantissatable[i] = convertmantissa(i);
     for (int i = 1024; i < 2048; i++)
-        mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL);
+        t->mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL);
     for (int i = 2048; i < 3072; i++)
-        mantissatable[i] = mantissatable[i - 1024] | 0x400000UL;
-    mantissatable[2048] = mantissatable[1024];
+        t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000UL;
+    t->mantissatable[2048] = t->mantissatable[1024];
 
-    exponenttable[0] = 0;
+    t->exponenttable[0] = 0;
     for (int i = 1; i < 31; i++)
-        exponenttable[i] = i << 23;
+        t->exponenttable[i] = i << 23;
     for (int i = 33; i < 63; i++)
-        exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL);
-    exponenttable[31]= 0x47800000UL;
-    exponenttable[32]= 0x80000000UL;
-    exponenttable[63]= 0xC7800000UL;
+        t->exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL);
+    t->exponenttable[31]= 0x47800000UL;
+    t->exponenttable[32]= 0x80000000UL;
+    t->exponenttable[63]= 0xC7800000UL;
 
-    offsettable[0] = 0;
+    t->offsettable[0] = 0;
     for (int i = 1; i < 64; i++)
-        offsettable[i] = 1024;
-    offsettable[31] = 2048;
-    offsettable[32] = 0;
-    offsettable[63] = 2048;
+        t->offsettable[i] = 1024;
+    t->offsettable[31] = 2048;
+    t->offsettable[32] = 0;
+    t->offsettable[63] = 2048;
 }
 
-static uint32_t half2float(uint16_t h, const uint32_t *mantissatable, const uint32_t *exponenttable,
-                           const uint16_t *offsettable)
+static uint32_t half2float(uint16_t h, const half2float_tables *t)
 {
     uint32_t f;
 
-    f = mantissatable[offsettable[h >> 10] + (h & 0x3ff)] + exponenttable[h >> 10];
+    f = t->mantissatable[t->offsettable[h >> 10] + (h & 0x3ff)] + t->exponenttable[h >> 10];
 
     return f;
 }
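
Reviewer note: a minimal standalone sketch of how a caller uses the
struct-based float2half API after this change. The table setup and the
lookup are copied from the libavutil/float2half.h hunks above.
float_to_half_bits() is a hypothetical helper that stands in for
av_float2int(), so the sketch builds without any FFmpeg headers. It is an
illustration under those assumptions, not code from the tree.

```c
#include <stdint.h>
#include <string.h>

typedef struct float2half_tables {
    uint16_t basetable[512];
    uint8_t  shifttable[512];
} float2half_tables;

/* Fill both tables; index = sign bit (0x100) | 8-bit float exponent. */
static void init_float2half_tables(float2half_tables *t)
{
    for (int i = 0; i < 256; i++) {
        int e = i - 127;

        if (e < -24) {            /* too small: flush to signed zero */
            t->basetable[i|0x000]  = 0x0000;
            t->basetable[i|0x100]  = 0x8000;
            t->shifttable[i|0x000] = 24;
            t->shifttable[i|0x100] = 24;
        } else if (e < -14) {     /* small: half-precision denormals */
            t->basetable[i|0x000]  = (0x0400 >> (-e - 14));
            t->basetable[i|0x100]  = (0x0400 >> (-e - 14)) | 0x8000;
            t->shifttable[i|0x000] = -e - 1;
            t->shifttable[i|0x100] = -e - 1;
        } else if (e <= 15) {     /* normal range: rebias the exponent */
            t->basetable[i|0x000]  = ((e + 15) << 10);
            t->basetable[i|0x100]  = ((e + 15) << 10) | 0x8000;
            t->shifttable[i|0x000] = 13;
            t->shifttable[i|0x100] = 13;
        } else if (e < 128) {     /* too large: signed infinity */
            t->basetable[i|0x000]  = 0x7C00;
            t->basetable[i|0x100]  = 0xFC00;
            t->shifttable[i|0x000] = 24;
            t->shifttable[i|0x100] = 24;
        } else {                  /* Inf/NaN input keeps its mantissa bits */
            t->basetable[i|0x000]  = 0x7C00;
            t->basetable[i|0x100]  = 0xFC00;
            t->shifttable[i|0x000] = 13;
            t->shifttable[i|0x100] = 13;
        }
    }
}

/* One table pointer replaces the former (basetable, shifttable) pair. */
static uint16_t float2half(uint32_t f, const float2half_tables *t)
{
    return t->basetable[(f >> 23) & 0x1ff] +
           ((f & 0x007fffff) >> t->shifttable[(f >> 23) & 0x1ff]);
}

/* Hypothetical helper (not in FFmpeg): reinterpret a float as raw bits. */
static uint16_t float_to_half_bits(float v, const float2half_tables *t)
{
    uint32_t bits;
    memcpy(&bits, &v, sizeof(bits));
    return float2half(bits, t);
}
```

With the tables initialized once, float_to_half_bits(1.0f, &t) gives 0x3C00
and float_to_half_bits(-2.0f, &t) gives 0xC000. Inputs beyond the half range,
such as 65536.0f, land on 0x7C00 (+Inf). The mantissa is shifted down, so
excess precision is truncated rather than rounded.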

From patchwork Wed Aug 10 20:47:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37227
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:09 +0200
Message-Id: <20220810204712.3123-8-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 08/11] avutil/half2float: move non-inline
 init code out of header
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: +rWCTh6TxK17

---
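Reviewer note: a minimal single-file sketch of the split this patch makes.
The table init becomes an ordinary external function defined once in a .c
file, while the per-sample lookup stays static inline in the header. The
code mirrors the half2float hunks below. Collapsing header and source into
one translation unit is an assumption made only to keep the sketch
self-contained.

```c
#include <stdint.h>

/* Header part (libavutil/half2float.h in the patch):
 * table type, init prototype, inline lookup. */
typedef struct half2float_tables {
    uint32_t mantissatable[3072];
    uint32_t exponenttable[64];
    uint16_t offsettable[64];
} half2float_tables;

void ff_init_half2float_tables(half2float_tables *t);

static inline uint32_t half2float(uint16_t h, const half2float_tables *t)
{
    return t->mantissatable[t->offsettable[h >> 10] + (h & 0x3ff)] +
           t->exponenttable[h >> 10];
}

/* Source part (libavutil/half2float.c in the patch):
 * a single out-of-line definition of the init code. */
static uint32_t convertmantissa(uint32_t i)
{
    int32_t m = i << 13;          /* zero-pad mantissa bits */
    int32_t e = 0;                /* zero exponent */

    while (!(m & 0x00800000)) {   /* normalize a half denormal */
        e -= 0x00800000;          /* decrement exponent (1 << 23) */
        m <<= 1;                  /* shift mantissa */
    }

    m &= ~0x00800000;             /* clear the leading 1 bit */
    e +=  0x38800000;             /* adjust bias ((127 - 14) << 23) */

    return m | e;
}

void ff_init_half2float_tables(half2float_tables *t)
{
    t->mantissatable[0] = 0;
    for (int i = 1; i < 1024; i++)
        t->mantissatable[i] = convertmantissa(i);
    for (int i = 1024; i < 2048; i++)
        t->mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL);
    for (int i = 2048; i < 3072; i++)
        t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000UL;
    t->mantissatable[2048] = t->mantissatable[1024];

    t->exponenttable[0] = 0;
    for (int i = 1; i < 31; i++)
        t->exponenttable[i] = i << 23;
    for (int i = 33; i < 63; i++)
        t->exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL);
    t->exponenttable[31] = 0x47800000UL;
    t->exponenttable[32] = 0x80000000UL;
    t->exponenttable[63] = 0xC7800000UL;

    t->offsettable[0] = 0;
    for (int i = 1; i < 64; i++)
        t->offsettable[i] = 1024;
    t->offsettable[31] = 2048;
    t->offsettable[32] = 0;
    t->offsettable[63] = 2048;
}
```

A decoder calls ff_init_half2float_tables() once from its init callback and
then half2float() per sample. For example, half 0x3C00 maps to the float
bit pattern 0x3F800000 (1.0f) and half 0x7C00 maps to 0x7F800000 (+Inf).
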
 libavcodec/Makefile     |  8 +++---
 libavcodec/exr.c        |  2 +-
 libavcodec/exrenc.c     |  2 +-
 libavcodec/float2half.c | 19 +++++++++++++
 libavcodec/half2float.c | 19 +++++++++++++
 libavcodec/pnmdec.c     |  2 +-
 libavcodec/pnmenc.c     |  2 +-
 libavutil/float2half.c  | 53 ++++++++++++++++++++++++++++++++++
 libavutil/float2half.h  | 36 ++---------------------
 libavutil/half2float.c  | 63 +++++++++++++++++++++++++++++++++++++++++
 libavutil/half2float.h  | 46 ++----------------------------
 11 files changed, 166 insertions(+), 86 deletions(-)
 create mode 100644 libavcodec/float2half.c
 create mode 100644 libavcodec/half2float.c
 create mode 100644 libavutil/float2half.c
 create mode 100644 libavutil/half2float.c

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 029f1bad3d..cb80f73d99 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -337,8 +337,8 @@ OBJS-$(CONFIG_EIGHTSVX_FIB_DECODER)    += 8svx.o
 OBJS-$(CONFIG_ESCAPE124_DECODER)       += escape124.o
 OBJS-$(CONFIG_ESCAPE130_DECODER)       += escape130.o
 OBJS-$(CONFIG_EVRC_DECODER)            += evrcdec.o acelp_vectors.o lsp.o
-OBJS-$(CONFIG_EXR_DECODER)             += exr.o exrdsp.o
-OBJS-$(CONFIG_EXR_ENCODER)             += exrenc.o
+OBJS-$(CONFIG_EXR_DECODER)             += exr.o exrdsp.o half2float.o
+OBJS-$(CONFIG_EXR_ENCODER)             += exrenc.o float2half.o
 OBJS-$(CONFIG_FASTAUDIO_DECODER)       += fastaudio.o
 OBJS-$(CONFIG_FFV1_DECODER)            += ffv1dec.o ffv1.o
 OBJS-$(CONFIG_FFV1_ENCODER)            += ffv1enc.o ffv1.o
@@ -570,8 +570,8 @@ OBJS-$(CONFIG_PGMYUV_DECODER)          += pnmdec.o pnm.o
 OBJS-$(CONFIG_PGMYUV_ENCODER)          += pnmenc.o
 OBJS-$(CONFIG_PGSSUB_DECODER)          += pgssubdec.o
 OBJS-$(CONFIG_PGX_DECODER)             += pgxdec.o
-OBJS-$(CONFIG_PHM_DECODER)             += pnmdec.o pnm.o
-OBJS-$(CONFIG_PHM_ENCODER)             += pnmenc.o
+OBJS-$(CONFIG_PHM_DECODER)             += pnmdec.o pnm.o half2float.o
+OBJS-$(CONFIG_PHM_ENCODER)             += pnmenc.o float2half.o
 OBJS-$(CONFIG_PHOTOCD_DECODER)         += photocd.o
 OBJS-$(CONFIG_PICTOR_DECODER)          += pictordec.o cga_data.o
 OBJS-$(CONFIG_PIXLET_DECODER)          += pixlet.o
diff --git a/libavcodec/exr.c b/libavcodec/exr.c
index 825354873d..a3582bfdd6 100644
--- a/libavcodec/exr.c
+++ b/libavcodec/exr.c
@@ -2208,7 +2208,7 @@ static av_cold int decode_init(AVCodecContext *avctx)
     float one_gamma = 1.0f / s->gamma;
     avpriv_trc_function trc_func = NULL;
 
-    init_half2float_tables(&s->h2f_tables);
+    ff_init_half2float_tables(&s->h2f_tables);
 
     s->avctx              = avctx;
 
diff --git a/libavcodec/exrenc.c b/libavcodec/exrenc.c
index 6ab9400b7c..77b1ce052b 100644
--- a/libavcodec/exrenc.c
+++ b/libavcodec/exrenc.c
@@ -94,7 +94,7 @@ static av_cold int encode_init(AVCodecContext *avctx)
 {
     EXRContext *s = avctx->priv_data;
 
-    init_float2half_tables(&s->f2h_tables);
+    ff_init_float2half_tables(&s->f2h_tables);
 
     switch (avctx->pix_fmt) {
     case AV_PIX_FMT_GBRPF32:
diff --git a/libavcodec/float2half.c b/libavcodec/float2half.c
new file mode 100644
index 0000000000..90a6f63fac
--- /dev/null
+++ b/libavcodec/float2half.c
@@ -0,0 +1,19 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/float2half.c"
diff --git a/libavcodec/half2float.c b/libavcodec/half2float.c
new file mode 100644
index 0000000000..1b023f96a5
--- /dev/null
+++ b/libavcodec/half2float.c
@@ -0,0 +1,19 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/half2float.c"
diff --git a/libavcodec/pnmdec.c b/libavcodec/pnmdec.c
index 6adc348ec8..fbed282e93 100644
--- a/libavcodec/pnmdec.c
+++ b/libavcodec/pnmdec.c
@@ -477,7 +477,7 @@ static av_cold int phm_dec_init(AVCodecContext *avctx)
 {
     PNMContext *s = avctx->priv_data;
 
-    init_half2float_tables(&s->h2f_tables);
+    ff_init_half2float_tables(&s->h2f_tables);
 
     return 0;
 }
diff --git a/libavcodec/pnmenc.c b/libavcodec/pnmenc.c
index 70992531bf..50f55bb1b9 100644
--- a/libavcodec/pnmenc.c
+++ b/libavcodec/pnmenc.c
@@ -294,7 +294,7 @@ static av_cold int phm_enc_init(AVCodecContext *avctx)
 {
     PHMEncContext *s = avctx->priv_data;
 
-    init_float2half_tables(&s->f2h_tables);
+    ff_init_float2half_tables(&s->f2h_tables);
 
     return 0;
 }
diff --git a/libavutil/float2half.c b/libavutil/float2half.c
new file mode 100644
index 0000000000..dba14cef5d
--- /dev/null
+++ b/libavutil/float2half.c
@@ -0,0 +1,53 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/float2half.h"
+
+void ff_init_float2half_tables(float2half_tables *t)
+{
+    for (int i = 0; i < 256; i++) {
+        int e = i - 127;
+
+        if (e < -24) { // Very small numbers map to zero
+            t->basetable[i|0x000]  = 0x0000;
+            t->basetable[i|0x100]  = 0x8000;
+            t->shifttable[i|0x000] = 24;
+            t->shifttable[i|0x100] = 24;
+        } else if (e < -14) { // Small numbers map to denorms
+            t->basetable[i|0x000] = (0x0400>>(-e-14));
+            t->basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000;
+            t->shifttable[i|0x000] = -e-1;
+            t->shifttable[i|0x100] = -e-1;
+        } else if (e <= 15) { // Normal numbers just lose precision
+            t->basetable[i|0x000] = ((e + 15) << 10);
+            t->basetable[i|0x100] = ((e + 15) << 10) | 0x8000;
+            t->shifttable[i|0x000] = 13;
+            t->shifttable[i|0x100] = 13;
+        } else if (e < 128) { // Large numbers map to Infinity
+            t->basetable[i|0x000]  = 0x7C00;
+            t->basetable[i|0x100]  = 0xFC00;
+            t->shifttable[i|0x000] = 24;
+            t->shifttable[i|0x100] = 24;
+        } else { // Infinity and NaN's stay Infinity and NaN's
+            t->basetable[i|0x000]  = 0x7C00;
+            t->basetable[i|0x100]  = 0xFC00;
+            t->shifttable[i|0x000] = 13;
+            t->shifttable[i|0x100] = 13;
+        }
+    }
+}
diff --git a/libavutil/float2half.h b/libavutil/float2half.h
index 9252560649..b8c9cdfc4f 100644
--- a/libavutil/float2half.h
+++ b/libavutil/float2half.h
@@ -26,41 +26,9 @@ typedef struct float2half_tables {
     uint8_t shifttable[512];
 } float2half_tables;
 
-static void init_float2half_tables(float2half_tables *t)
-{
-    for (int i = 0; i < 256; i++) {
-        int e = i - 127;
-
-        if (e < -24) { // Very small numbers map to zero
-            t->basetable[i|0x000]  = 0x0000;
-            t->basetable[i|0x100]  = 0x8000;
-            t->shifttable[i|0x000] = 24;
-            t->shifttable[i|0x100] = 24;
-        } else if (e < -14) { // Small numbers map to denorms
-            t->basetable[i|0x000] = (0x0400>>(-e-14));
-            t->basetable[i|0x100] = (0x0400>>(-e-14)) | 0x8000;
-            t->shifttable[i|0x000] = -e-1;
-            t->shifttable[i|0x100] = -e-1;
-        } else if (e <= 15) { // Normal numbers just lose precision
-            t->basetable[i|0x000] = ((e + 15) << 10);
-            t->basetable[i|0x100] = ((e + 15) << 10) | 0x8000;
-            t->shifttable[i|0x000] = 13;
-            t->shifttable[i|0x100] = 13;
-        } else if (e < 128) { // Large numbers map to Infinity
-            t->basetable[i|0x000]  = 0x7C00;
-            t->basetable[i|0x100]  = 0xFC00;
-            t->shifttable[i|0x000] = 24;
-            t->shifttable[i|0x100] = 24;
-        } else { // Infinity and NaN's stay Infinity and NaN's
-            t->basetable[i|0x000]  = 0x7C00;
-            t->basetable[i|0x100]  = 0xFC00;
-            t->shifttable[i|0x000] = 13;
-            t->shifttable[i|0x100] = 13;
-        }
-    }
-}
+void ff_init_float2half_tables(float2half_tables *t);
 
-static uint16_t float2half(uint32_t f, const float2half_tables *t)
+static inline uint16_t float2half(uint32_t f, const float2half_tables *t)
 {
     uint16_t h;
 
diff --git a/libavutil/half2float.c b/libavutil/half2float.c
new file mode 100644
index 0000000000..baac8e4093
--- /dev/null
+++ b/libavutil/half2float.c
@@ -0,0 +1,63 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/half2float.h"
+
+static uint32_t convertmantissa(uint32_t i)
+{
+    int32_t m = i << 13; // Zero pad mantissa bits
+    int32_t e = 0; // Zero exponent
+
+    while (!(m & 0x00800000)) { // While not normalized
+        e -= 0x00800000; // Decrement exponent (1<<23)
+        m <<= 1; // Shift mantissa
+    }
+
+    m &= ~0x00800000; // Clear leading 1 bit
+    e +=  0x38800000; // Adjust bias ((127-14)<<23)
+
+    return m | e; // Return combined number
+}
+
+void ff_init_half2float_tables(half2float_tables *t)
+{
+    t->mantissatable[0] = 0;
+    for (int i = 1; i < 1024; i++)
+        t->mantissatable[i] = convertmantissa(i);
+    for (int i = 1024; i < 2048; i++)
+        t->mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL);
+    for (int i = 2048; i < 3072; i++)
+        t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000UL;
+    t->mantissatable[2048] = t->mantissatable[1024];
+
+    t->exponenttable[0] = 0;
+    for (int i = 1; i < 31; i++)
+        t->exponenttable[i] = i << 23;
+    for (int i = 33; i < 63; i++)
+        t->exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL);
+    t->exponenttable[31]= 0x47800000UL;
+    t->exponenttable[32]= 0x80000000UL;
+    t->exponenttable[63]= 0xC7800000UL;
+
+    t->offsettable[0] = 0;
+    for (int i = 1; i < 64; i++)
+        t->offsettable[i] = 1024;
+    t->offsettable[31] = 2048;
+    t->offsettable[32] = 0;
+    t->offsettable[63] = 2048;
+}
diff --git a/libavutil/half2float.h b/libavutil/half2float.h
index 10b6fef4e6..cb58e44a1c 100644
--- a/libavutil/half2float.h
+++ b/libavutil/half2float.h
@@ -27,51 +27,9 @@ typedef struct half2float_tables {
     uint16_t offsettable[64];
 } half2float_tables;
 
-static uint32_t convertmantissa(uint32_t i)
-{
-    int32_t m = i << 13; // Zero pad mantissa bits
-    int32_t e = 0; // Zero exponent
-
-    while (!(m & 0x00800000)) { // While not normalized
-        e -= 0x00800000; // Decrement exponent (1<<23)
-        m <<= 1; // Shift mantissa
-    }
-
-    m &= ~0x00800000; // Clear leading 1 bit
-    e +=  0x38800000; // Adjust bias ((127-14)<<23)
-
-    return m | e; // Return combined number
-}
-
-static void init_half2float_tables(half2float_tables *t)
-{
-    t->mantissatable[0] = 0;
-    for (int i = 1; i < 1024; i++)
-        t->mantissatable[i] = convertmantissa(i);
-    for (int i = 1024; i < 2048; i++)
-        t->mantissatable[i] = 0x38000000UL + ((i - 1024) << 13UL);
-    for (int i = 2048; i < 3072; i++)
-        t->mantissatable[i] = t->mantissatable[i - 1024] | 0x400000UL;
-    t->mantissatable[2048] = t->mantissatable[1024];
-
-    t->exponenttable[0] = 0;
-    for (int i = 1; i < 31; i++)
-        t->exponenttable[i] = i << 23;
-    for (int i = 33; i < 63; i++)
-        t->exponenttable[i] = 0x80000000UL + ((i - 32) << 23UL);
-    t->exponenttable[31]= 0x47800000UL;
-    t->exponenttable[32]= 0x80000000UL;
-    t->exponenttable[63]= 0xC7800000UL;
-
-    t->offsettable[0] = 0;
-    for (int i = 1; i < 64; i++)
-        t->offsettable[i] = 1024;
-    t->offsettable[31] = 2048;
-    t->offsettable[32] = 0;
-    t->offsettable[63] = 2048;
-}
+void ff_init_half2float_tables(half2float_tables *t);
 
-static uint32_t half2float(uint16_t h, const half2float_tables *t)
+static inline uint32_t half2float(uint16_t h, const half2float_tables *t)
 {
     uint32_t f;
 

From patchwork Wed Aug 10 20:47:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37225
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142564pzi;
        Wed, 10 Aug 2022 13:48:28 -0700 (PDT)
X-Google-Smtp-Source: 
 AA6agR6YYMhHQI9O5TUIvjIXVpkNHsLOddvPiAThsfMVG2Hd6BtCQaBjea4Q4SHitZW3K10Jvigs
X-Received: by 2002:a05:6402:28cb:b0:43b:c6d7:ef92 with SMTP id
 ef11-20020a05640228cb00b0043bc6d7ef92mr28596360edb.333.1660164507779;
        Wed, 10 Aug 2022 13:48:27 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1660164507; cv=none;
        d=google.com; s=arc-20160816;
        b=CqiWzxS1/gad+fg8w4e958ML/54bE9HemjkIYSuJjLCV3q1x0YtJu5DWlWwGJ+B3al
         WrDhnHdRptIZeYzgEjqLtkpV0cFLJ+Ehvy6XRaUFeRH5tQCIv0eVzookujqbYpS2YPaS
         zDalHGNtu7fBSrNK1MkFcaFaurmtcKjTgZnJfiRx7O+ypw++oZjCXt7+4bLrgCrYClIO
         P+dgQWSkCduQXqazcwILPJpmwDVKaLrmQ/tOts0C+Dq8kXgXmz/ZYWohEsrBtoxa21dJ
         OZAO/r80tJFItjhVvkNRbDBkH5RBXv4Fr/u1sNMNQ/pEI0DD+Gf1HWe0YlrChdlkQzQU
         nzcA==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=sender:errors-to:content-transfer-encoding:cc:reply-to
         :list-subscribe:list-help:list-post:list-archive:list-unsubscribe
         :list-id:precedence:subject:mime-version:references:in-reply-to
         :message-id:date:to:from:dkim-signature:delivered-to;
        bh=vyBzX5RFZ+UPP8R+9UYcXtZFIOM8p6glwQnetq1J19c=;
        b=T3SzEsCcfoApsAQ5YNtXyPJeF+ipUxo1ZC+8jGpmpOYQx1wgLCn3S4St6JWQn60OpN
         G7GCuJV7nNIlwX+j1z4bvNMekwxF5sNAGgwiZULyIspGyIjdWKvP1XtpcpDX1lr6/aSI
         fo9GP3zi8y6EIFDcTg5y+rB3Z6pvumMf2rKPrHfvzowNjZy+cxnWlyfC7sATAGOJBzy1
         5CGKVhFEmUrT/bLJ0kF9rYfiaQdp/UCzPfSXZQ7UCH+2Wk4SMGOZUd5ezjg1TgZ/Zfqc
         hCTpTrsEtoJLfwxb5C3c9FIsKT68BBl7hcRJLDCrWON5UzZRKXPbxJpVAKgjmtptdUvL
         p7Hg==
ARC-Authentication-Results: i=1; mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=ex07BzQf;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
        by mx.google.com with ESMTP id
 gt38-20020a1709072da600b0073127725a02si5633887ejc.770.2022.08.10.13.48.27;
        Wed, 10 Aug 2022 13:48:27 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=ex07BzQf;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4125E68B8FC;
	Wed, 10 Aug 2022 23:47:37 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from btbn.de (btbn.de [136.243.74.85])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1B48668B883
 for <ffmpeg-devel@ffmpeg.org>; Wed, 10 Aug 2022 23:47:31 +0300 (EEST)
Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id
 62E922F260F; Wed, 10 Aug 2022 22:47:25 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org;
 s=mail; t=1660164445;
 bh=tRI0BK5KQlVJV8F94kb8xqVuesqZ4ZNFe9JELoi9AP8=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References;
 b=ex07BzQfo1nq6On/ltbEpp9jqBqjhH3NOr4X9ZhaSZzJ7X65TVmCteRkmrTuhlViT
 15uMgBF/zX+hg9BNeEC00OVRjsRFtfmND/8kDDzDkSVNEtVwzle6Bolwj5IUboyzOD
 +hdEJTTaHy/ICHQKD2lmtM8L2AdYV1KMsL1uGAPEziHI7pDTk6ZrO1ikDQXrUgRpLc
 v8/3Vp+81ViM5Xtq/YecuQlMKaEZS/o5NclkydOWyPjilQ9JWhKDSkXmhH7ZCRK8dd
 dbfVNquFK7njPUBUzY9jOt0mQG+WpQVs3wiLPE1eef3Ao+5ccsLt7prVAqfoppXkCT
 A0CYKBC0zWADA==
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:10 +0200
Message-Id: <20220810204712.3123-9-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 09/11] avutil/half2float: use native _Float16
 if available
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: 3+VkvMpABZ+U

_Float16 support has been available on arm/aarch64 for a while, and with
gcc 12 it was also enabled on x86, as long as SSE2 is supported.

If the target arch supports F16C, gcc emits fairly efficient assembly that
takes advantage of it. This is the case on x86-64-v3 or higher.
Without F16C, gcc emulates the conversion in software using SSE2
instructions.
---
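For illustration only, the half-to-float direction described above can be
sketched standalone. This is not the code from the patch: the names
SKETCH_HAVE_FLOAT16 and sketch_half2float are hypothetical, the macro
stands in for the configure-generated HAVE_FLOAT16, a second union
replaces av_float2int(), and the fallback uses plain bit manipulation
rather than the lookup tables.

```c
#include <stdint.h>

/* Hypothetical stand-in for the configure-generated HAVE_FLOAT16.
 * Build with -DSKETCH_HAVE_FLOAT16=1 only if the compiler accepts _Float16. */
#ifndef SKETCH_HAVE_FLOAT16
#define SKETCH_HAVE_FLOAT16 0
#endif

/* Convert IEEE 754 binary16 bits to binary32 bits. */
static inline uint32_t sketch_half2float(uint16_t h)
{
#if SKETCH_HAVE_FLOAT16
    /* Native path: reinterpret the bits as _Float16, let the compiler widen. */
    union { _Float16 f; uint16_t i; } in;
    union { float    f; uint32_t i; } out;
    in.i  = h;
    out.f = (float)in.f;
    return out.i;
#else
    /* Portable path: bit manipulation (assumption: not the patch's tables). */
    uint32_t sign = (uint32_t)(h & 0x8000) << 16;
    int      exp  = (h >> 10) & 0x1f;
    uint32_t man  = h & 0x3ff;

    if (exp == 0) {
        if (man == 0)
            return sign;                        /* signed zero */
        exp = 1;                                /* subnormal: renormalize */
        while (!(man & 0x400)) {
            man <<= 1;
            exp--;
        }
        man &= 0x3ff;                           /* drop the explicit leading 1 */
    } else if (exp == 31) {
        return sign | 0x7f800000 | (man << 13); /* infinity or NaN */
    }
    /* Rebias the exponent from 15 to 127 and widen the mantissa. */
    return sign | ((uint32_t)(exp + 112) << 23) | (man << 13);
#endif
}
```

Half-to-float widening is exact, so both paths should agree bit for bit on
zeros, subnormals, normals and infinities; for example,
sketch_half2float(0x3C00) yields 0x3F800000 (1.0f). NaN payloads may differ
between the two paths.
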
 configure              |  4 ++++
 libavutil/float2half.c |  2 ++
 libavutil/float2half.h | 16 ++++++++++++++++
 libavutil/half2float.c |  4 ++++
 libavutil/half2float.h | 16 ++++++++++++++++
 5 files changed, 42 insertions(+)

diff --git a/configure b/configure
index 6761d0cb32..2536ae012d 100755
--- a/configure
+++ b/configure
@@ -2143,6 +2143,7 @@ ARCH_FEATURES="
     fast_64bit
     fast_clz
     fast_cmov
+    float16
     local_aligned
     simd_align_16
     simd_align_32
@@ -5125,6 +5126,8 @@ elif enabled arm; then
             ;;
     esac
 
+    test_cflags -mfp16-format=ieee && add_cflags -mfp16-format=ieee
+
 elif enabled avr32; then
 
     case $cpu in
@@ -6228,6 +6231,7 @@ check_builtin MemoryBarrier windows.h "MemoryBarrier()"
 check_builtin sync_val_compare_and_swap "" "int *ptr; int oldval, newval; __sync_val_compare_and_swap(ptr, oldval, newval)"
 check_builtin gmtime_r time.h "time_t *time; struct tm *tm; gmtime_r(time, tm)"
 check_builtin localtime_r time.h "time_t *time; struct tm *tm; localtime_r(time, tm)"
+check_builtin float16 "" "_Float16 f16var"
 
 case "$custom_allocator" in
     jemalloc)
diff --git a/libavutil/float2half.c b/libavutil/float2half.c
index dba14cef5d..1390d3acc0 100644
--- a/libavutil/float2half.c
+++ b/libavutil/float2half.c
@@ -20,6 +20,7 @@
 
 void ff_init_float2half_tables(float2half_tables *t)
 {
+#if !HAVE_FLOAT16
     for (int i = 0; i < 256; i++) {
         int e = i - 127;
 
@@ -50,4 +51,5 @@ void ff_init_float2half_tables(float2half_tables *t)
             t->shifttable[i|0x100] = 13;
         }
     }
+#endif
 }
diff --git a/libavutil/float2half.h b/libavutil/float2half.h
index b8c9cdfc4f..8c1fb804b7 100644
--- a/libavutil/float2half.h
+++ b/libavutil/float2half.h
@@ -20,21 +20,37 @@
 #define AVUTIL_FLOAT2HALF_H
 
 #include <stdint.h>
+#include "intfloat.h"
+
+#include "config.h"
 
 typedef struct float2half_tables {
+#if HAVE_FLOAT16
+    uint8_t dummy;
+#else
     uint16_t basetable[512];
     uint8_t shifttable[512];
+#endif
 } float2half_tables;
 
 void ff_init_float2half_tables(float2half_tables *t);
 
 static inline uint16_t float2half(uint32_t f, const float2half_tables *t)
 {
+#if HAVE_FLOAT16
+    union {
+        _Float16 f;
+        uint16_t i;
+    } u;
+    u.f = av_int2float(f);
+    return u.i;
+#else
     uint16_t h;
 
     h = t->basetable[(f >> 23) & 0x1ff] + ((f & 0x007fffff) >> t->shifttable[(f >> 23) & 0x1ff]);
 
     return h;
+#endif
 }
 
 #endif /* AVUTIL_FLOAT2HALF_H */
diff --git a/libavutil/half2float.c b/libavutil/half2float.c
index baac8e4093..873226d3a0 100644
--- a/libavutil/half2float.c
+++ b/libavutil/half2float.c
@@ -18,6 +18,7 @@
 
 #include "libavutil/half2float.h"
 
+#if !HAVE_FLOAT16
 static uint32_t convertmantissa(uint32_t i)
 {
     int32_t m = i << 13; // Zero pad mantissa bits
@@ -33,9 +34,11 @@ static uint32_t convertmantissa(uint32_t i)
 
     return m | e; // Return combined number
 }
+#endif
 
 void ff_init_half2float_tables(half2float_tables *t)
 {
+#if !HAVE_FLOAT16
     t->mantissatable[0] = 0;
     for (int i = 1; i < 1024; i++)
         t->mantissatable[i] = convertmantissa(i);
@@ -60,4 +63,5 @@ void ff_init_half2float_tables(half2float_tables *t)
     t->offsettable[31] = 2048;
     t->offsettable[32] = 0;
     t->offsettable[63] = 2048;
+#endif
 }
diff --git a/libavutil/half2float.h b/libavutil/half2float.h
index cb58e44a1c..b2a7c934a6 100644
--- a/libavutil/half2float.h
+++ b/libavutil/half2float.h
@@ -20,22 +20,38 @@
 #define AVUTIL_HALF2FLOAT_H
 
 #include <stdint.h>
+#include "intfloat.h"
+
+#include "config.h"
 
 typedef struct half2float_tables {
+#if HAVE_FLOAT16
+    uint8_t dummy;
+#else
     uint32_t mantissatable[3072];
     uint32_t exponenttable[64];
     uint16_t offsettable[64];
+#endif
 } half2float_tables;
 
 void ff_init_half2float_tables(half2float_tables *t);
 
 static inline uint32_t half2float(uint16_t h, const half2float_tables *t)
 {
+#if HAVE_FLOAT16
+    union {
+        _Float16 f;
+        uint16_t i;
+    } u;
+    u.i = h;
+    return av_float2int(u.f);
+#else
     uint32_t f;
 
     f = t->mantissatable[t->offsettable[h >> 10] + (h & 0x3ff)] + t->exponenttable[h >> 10];
 
     return f;
+#endif
 }
 
 #endif /* AVUTIL_HALF2FLOAT_H */

From patchwork Wed Aug 10 20:47:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37228
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3d0d:b0:8d:a68e:8a0e with SMTP id y13csp1142819pzi;
        Wed, 10 Aug 2022 13:48:56 -0700 (PDT)
X-Google-Smtp-Source: 
 AA6agR5pI0HNDUAA/50DeoXYANZ8Ry6FX1Xds+vyghnyQ9k6WdogmtCfeNYLaaNtEUkhP1Q/8L8u
X-Received: by 2002:a17:907:6818:b0:730:d99f:7b91 with SMTP id
 qz24-20020a170907681800b00730d99f7b91mr20278680ejc.496.1660164536621;
        Wed, 10 Aug 2022 13:48:56 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1660164536; cv=none;
        d=google.com; s=arc-20160816;
        b=bMUH7CrL55RWL2JCO3uLt8j1zzTjCB1nol+eJazzTZ2DFCBSrs0fY6Qaqcbb1uB0fL
         IU9Ko6NRybNY9zvwWxFhT4VFIPKSy40bPKmohqNVY/4o4AsrFR2NMOENvyKpnS693+Bh
         2LBVVH100k1zfzxveq4ML3RIqROkdgWI/CQW9YiiqMhwYn7Co/pYVJ4/q1Qhr5hGCe+U
         yK4FqvJOdTv2Cp2OVdpO6zWvV7Ty/Ytj3Ngt33+E7n16cSuKfsXNj7fNUkW2eE4d5Ymd
         0BB9rIAM0wIkFiqL0hD4wdSvFOpZCDGRtggtYd8WOZ4FDvzb9zhXf/ELcLDkIeXlNnMJ
         yNMA==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=sender:errors-to:content-transfer-encoding:cc:reply-to
         :list-subscribe:list-help:list-post:list-archive:list-unsubscribe
         :list-id:precedence:subject:mime-version:references:in-reply-to
         :message-id:date:to:from:dkim-signature:delivered-to;
        bh=gCWZxpenJ2fd1XALd4EY0+VNPNyZlCCYLkKdI6Cq6VE=;
        b=ZA7fhopGESxLy971xSuOCDHQxgtFCKdxduuMPh0kqqPqSgeFtaZdHHSpuh/b2LWvjL
         2N4zBrkpEk9PEhOT1OLZ8639AbT06o8PtA+g//QgrxtWGxnBbChDOCCLFmckYqe5FoRj
         cqb4LkvibXHA+FYPxYPnYzHgtS9rVQJo0ECZYDobiCFD3X7uFXRvYIJVLDaKdCivyg+j
         J0hynDQiyh7iFvk+ymORPujGJSc/Xl/sONXxi4ew18RVCBZ54/GPFewMhq1hDnnL8NEz
         c/d2+rWvGI5lSuR2Vcg0khzBdaImHLrNClO4rLhXWTYARrlS7x9j79eD963ZXwY/6MWT
         UV4A==
ARC-Authentication-Results: i=1; mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=ih8+3qhq;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
        by mx.google.com with ESMTP id
 dy10-20020a05640231ea00b0043e01070ae6si10960051edb.512.2022.08.10.13.48.56;
        Wed, 10 Aug 2022 13:48:56 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
       dkim=neutral (body hash did not verify) header.i=@rothenpieler.org
 header.s=mail header.b=ih8+3qhq;
       spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
       dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=rothenpieler.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 256D568B92B;
	Wed, 10 Aug 2022 23:47:40 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from btbn.de (btbn.de [136.243.74.85])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1AFA768B586
 for <ffmpeg-devel@ffmpeg.org>; Wed, 10 Aug 2022 23:47:31 +0300 (EEST)
Received: from [authenticated] by btbn.de (Postfix) with ESMTPSA id
 7F4A62F2614; Wed, 10 Aug 2022 22:47:25 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rothenpieler.org;
 s=mail; t=1660164445;
 bh=10ZztRD3NN895KYxp+7NXQGv8c7KBxN5jx75Qd7mnyI=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References;
 b=ih8+3qhqSLf8AsnwHjTVot4GoiuWKfcPy0sUegJzaPdybwukw/o2DLdhT3eCT97dU
 kGKb4w7b8LaaJ4vAsIRK1y2muaevC1iQA0rRs+qCAG9uNbKgP3ZekcRGMsXctOPMNP
 0Vv2RuPlaoCbWz16zTmfPDkLUrW3cYHqiyMNXyi/KB2x6UQWUSz2lP7lk9NaumIqc4
 5QD4vksIPDII9QhGuE07J+Qe++sexXL4MDeszvgHrBX2JmgKh/YNR1sQJ5xhb48gAJ
 SgFN7lnj0L3RRzA/7H6dm29hdEZJ8w5SStz0/MzyKFpIgaY4t/hEFSiEPyVKwoh3Um
 xMy1rpElHrQCg==
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:11 +0200
Message-Id: <20220810204712.3123-10-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 10/11] swscale: add SwsContext parameter to
 input functions
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Timo Rothenpieler <timo@rothenpieler.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: QukDH8QDpgXX

---
 libswscale/hscale.c           |  12 +--
 libswscale/input.c            | 149 ++++++++++++++++++----------------
 libswscale/swscale_internal.h |  17 ++--
 libswscale/x86/swscale.c      |  13 +--
 4 files changed, 106 insertions(+), 85 deletions(-)

diff --git a/libswscale/hscale.c b/libswscale/hscale.c
index eca0635338..6789ce7540 100644
--- a/libswscale/hscale.c
+++ b/libswscale/hscale.c
@@ -105,18 +105,18 @@ static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int
         uint8_t * dst = desc->dst->plane[0].line[i];
 
         if (c->lumToYV12) {
-            c->lumToYV12(dst, src[0], src[1], src[2], srcW, pal);
+            c->lumToYV12(dst, src[0], src[1], src[2], srcW, pal, c->input_opaque);
         } else if (c->readLumPlanar) {
-            c->readLumPlanar(dst, src, srcW, c->input_rgb2yuv_table);
+            c->readLumPlanar(dst, src, srcW, c->input_rgb2yuv_table, c->input_opaque);
         }
 
 
         if (desc->alpha) {
             dst = desc->dst->plane[3].line[i];
             if (c->alpToYV12) {
-                c->alpToYV12(dst, src[3], src[1], src[2], srcW, pal);
+                c->alpToYV12(dst, src[3], src[1], src[2], srcW, pal, c->input_opaque);
             } else if (c->readAlpPlanar) {
-                c->readAlpPlanar(dst, src, srcW, NULL);
+                c->readAlpPlanar(dst, src, srcW, NULL, c->input_opaque);
             }
         }
     }
@@ -224,9 +224,9 @@ static int chr_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int
         uint8_t * dst1 = desc->dst->plane[1].line[i];
         uint8_t * dst2 = desc->dst->plane[2].line[i];
         if (c->chrToYV12) {
-            c->chrToYV12(dst1, dst2, src[0], src[1], src[2], srcW, pal);
+            c->chrToYV12(dst1, dst2, src[0], src[1], src[2], srcW, pal, c->input_opaque);
         } else if (c->readChrPlanar) {
-            c->readChrPlanar(dst1, dst2, src, srcW, c->input_rgb2yuv_table);
+            c->readChrPlanar(dst1, dst2, src, srcW, c->input_rgb2yuv_table, c->input_opaque);
         }
     }
     return sliceH;
diff --git a/libswscale/input.c b/libswscale/input.c
index 68abc4d62c..36ef1e43ac 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -88,7 +88,7 @@ rgb64ToUV_half_c_template(uint16_t *dstU, uint16_t *dstV,
 
 #define rgb64funcs(pattern, BE_LE, origin) \
 static void pattern ## 64 ## BE_LE ## ToY_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0, const uint8_t *unused1,\
-                                    int width, uint32_t *rgb2yuv) \
+                                    int width, uint32_t *rgb2yuv, void *opq) \
 { \
     const uint16_t *src = (const uint16_t *) _src; \
     uint16_t *dst = (uint16_t *) _dst; \
@@ -97,7 +97,7 @@ static void pattern ## 64 ## BE_LE ## ToY_c(uint8_t *_dst, const uint8_t *_src,
  \
 static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \
                                     const uint8_t *unused0, const uint8_t *_src1, const uint8_t *_src2, \
-                                    int width, uint32_t *rgb2yuv) \
+                                    int width, uint32_t *rgb2yuv, void *opq) \
 { \
     const uint16_t *src1 = (const uint16_t *) _src1, \
                    *src2 = (const uint16_t *) _src2; \
@@ -107,7 +107,7 @@ static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \
  \
 static void pattern ## 64 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, \
                                     const uint8_t *unused0, const uint8_t *_src1, const uint8_t *_src2, \
-                                    int width, uint32_t *rgb2yuv) \
+                                    int width, uint32_t *rgb2yuv, void *opq) \
 { \
     const uint16_t *src1 = (const uint16_t *) _src1, \
                    *src2 = (const uint16_t *) _src2; \
@@ -192,7 +192,8 @@ static void pattern ## 48 ## BE_LE ## ToY_c(uint8_t *_dst,              \
                                             const uint8_t *_src,        \
                                             const uint8_t *unused0, const uint8_t *unused1,\
                                             int width,                  \
-                                            uint32_t *rgb2yuv)          \
+                                            uint32_t *rgb2yuv,          \
+                                            void *opq)                  \
 {                                                                       \
     const uint16_t *src = (const uint16_t *)_src;                       \
     uint16_t *dst       = (uint16_t *)_dst;                             \
@@ -205,7 +206,8 @@ static void pattern ## 48 ## BE_LE ## ToUV_c(uint8_t *_dstU,            \
                                              const uint8_t *_src1,      \
                                              const uint8_t *_src2,      \
                                              int width,                 \
-                                             uint32_t *rgb2yuv)         \
+                                             uint32_t *rgb2yuv,         \
+                                             void *opq)                 \
 {                                                                       \
     const uint16_t *src1 = (const uint16_t *)_src1,                     \
                    *src2 = (const uint16_t *)_src2;                     \
@@ -220,7 +222,8 @@ static void pattern ## 48 ## BE_LE ## ToUV_half_c(uint8_t *_dstU,       \
                                                   const uint8_t *_src1, \
                                                   const uint8_t *_src2, \
                                                   int width,            \
-                                                  uint32_t *rgb2yuv)    \
+                                                  uint32_t *rgb2yuv,    \
+                                                  void *opq)            \
 {                                                                       \
     const uint16_t *src1 = (const uint16_t *)_src1,                     \
                    *src2 = (const uint16_t *)_src2;                     \
@@ -345,7 +348,7 @@ static av_always_inline void rgb16_32ToUV_half_c_template(int16_t *dstU,
 #define rgb16_32_wrapper(fmt, name, shr, shg, shb, shp, maskr,          \
                          maskg, maskb, rsh, gsh, bsh, S)                \
 static void name ## ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,            \
-                          int width, uint32_t *tab)                     \
+                          int width, uint32_t *tab, void *opq)          \
 {                                                                       \
     rgb16_32ToY_c_template((int16_t*)dst, src, width, fmt, shr, shg, shb, shp,    \
                            maskr, maskg, maskb, rsh, gsh, bsh, S, tab); \
@@ -353,7 +356,7 @@ static void name ## ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unuse
                                                                         \
 static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV,                \
                            const uint8_t *unused0, const uint8_t *src, const uint8_t *dummy,    \
-                           int width, uint32_t *tab)                    \
+                           int width, uint32_t *tab, void *opq)         \
 {                                                                       \
     rgb16_32ToUV_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt,                \
                             shr, shg, shb, shp,                         \
@@ -363,7 +366,7 @@ static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV,                \
 static void name ## ToUV_half_c(uint8_t *dstU, uint8_t *dstV,           \
                                 const uint8_t *unused0, const uint8_t *src,                     \
                                 const uint8_t *dummy,                   \
-                                int width, uint32_t *tab)               \
+                                int width, uint32_t *tab, void *opq)    \
 {                                                                       \
     rgb16_32ToUV_half_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt,           \
                                  shr, shg, shb, shp,                    \
@@ -392,7 +395,7 @@ rgb16_32_wrapper(AV_PIX_FMT_X2BGR10LE, bgr30le, 0, 6, 16, 0, 0x3FF, 0xFFC00, 0x3
 
 static void gbr24pToUV_half_c(uint8_t *_dstU, uint8_t *_dstV,
                          const uint8_t *gsrc, const uint8_t *bsrc, const uint8_t *rsrc,
-                         int width, uint32_t *rgb2yuv)
+                         int width, uint32_t *rgb2yuv, void *opq)
 {
     uint16_t *dstU = (uint16_t *)_dstU;
     uint16_t *dstV = (uint16_t *)_dstV;
@@ -411,7 +414,7 @@ static void gbr24pToUV_half_c(uint8_t *_dstU, uint8_t *_dstV,
 }
 
 static void rgba64leToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
-                          const uint8_t *unused2, int width, uint32_t *unused)
+                          const uint8_t *unused2, int width, uint32_t *unused, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     const uint16_t *src = (const uint16_t *)_src;
@@ -421,7 +424,7 @@ static void rgba64leToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unu
 }
 
 static void rgba64beToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
-                          const uint8_t *unused2, int width, uint32_t *unused)
+                          const uint8_t *unused2, int width, uint32_t *unused, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     const uint16_t *src = (const uint16_t *)_src;
@@ -430,7 +433,8 @@ static void rgba64beToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unu
         dst[i] = AV_RB16(src + 4 * i + 3);
 }
 
-static void abgrToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *unused)
+static void abgrToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
+                      const uint8_t *unused2, int width, uint32_t *unused, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int i;
@@ -439,7 +443,8 @@ static void abgrToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
     }
 }
 
-static void rgbaToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *unused)
+static void rgbaToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
+                      const uint8_t *unused2, int width, uint32_t *unused, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int i;
@@ -448,7 +453,8 @@ static void rgbaToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
     }
 }
 
-static void palToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *pal)
+static void palToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
+                     const uint8_t *unused2, int width, uint32_t *pal, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int i;
@@ -459,7 +465,8 @@ static void palToA_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
     }
 }
 
-static void palToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width, uint32_t *pal)
+static void palToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
+                     const uint8_t *unused2, int width, uint32_t *pal, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int i;
@@ -471,8 +478,8 @@ static void palToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
 }
 
 static void palToUV_c(uint8_t *_dstU, uint8_t *_dstV,
-                           const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
-                      int width, uint32_t *pal)
+                      const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
+                      int width, uint32_t *pal, void *opq)
 {
     uint16_t *dstU = (uint16_t *)_dstU;
     int16_t *dstV = (int16_t *)_dstV;
@@ -486,7 +493,8 @@ static void palToUV_c(uint8_t *_dstU, uint8_t *_dstV,
     }
 }
 
-static void monowhite2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,  int width, uint32_t *unused)
+static void monowhite2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
+                          const uint8_t *unused2,  int width, uint32_t *unused, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int i, j;
@@ -503,7 +511,8 @@ static void monowhite2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unus
     }
 }
 
-static void monoblack2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,  int width, uint32_t *unused)
+static void monoblack2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1,
+                          const uint8_t *unused2,  int width, uint32_t *unused, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int i, j;
@@ -520,8 +529,8 @@ static void monoblack2Y_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unus
     }
 }
 
-static void yuy2ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,  int width,
-                      uint32_t *unused)
+static void yuy2ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width,
+                      uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -529,7 +538,7 @@ static void yuy2ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1,
 }
 
 static void yuy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1,
-                       const uint8_t *src2, int width, uint32_t *unused)
+                       const uint8_t *src2, int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -540,7 +549,7 @@ static void yuy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, con
 }
 
 static void yvy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1,
-                       const uint8_t *src2, int width, uint32_t *unused)
+                       const uint8_t *src2, int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -551,7 +560,7 @@ static void yvy2ToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, con
 }
 
 static void y210le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src,
-                        const uint8_t *unused1, int width, uint32_t *unused2)
+                        const uint8_t *unused1, int width, uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -561,7 +570,7 @@ static void y210le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, co
 }
 
 static void y210le_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0,
-                       const uint8_t *unused1, int width, uint32_t *unused2)
+                       const uint8_t *unused1, int width, uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -569,7 +578,7 @@ static void y210le_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0,
 }
 
 static void bswap16Y_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1, const uint8_t *unused2, int width,
-                       uint32_t *unused)
+                       uint32_t *unused, void *opq)
 {
     int i;
     const uint16_t *src = (const uint16_t *)_src;
@@ -579,7 +588,7 @@ static void bswap16Y_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused
 }
 
 static void bswap16UV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *_src1,
-                        const uint8_t *_src2, int width, uint32_t *unused)
+                        const uint8_t *_src2, int width, uint32_t *unused, void *opq)
 {
     int i;
     const uint16_t *src1 = (const uint16_t *)_src1,
@@ -592,7 +601,7 @@ static void bswap16UV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0,
 }
 
 static void read_ya16le_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width,
-                               uint32_t *unused)
+                               uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -600,7 +609,7 @@ static void read_ya16le_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t *
 }
 
 static void read_ya16le_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width,
-                                uint32_t *unused)
+                                uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -608,7 +617,7 @@ static void read_ya16le_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t
 }
 
 static void read_ya16be_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width,
-                               uint32_t *unused)
+                               uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -616,7 +625,7 @@ static void read_ya16be_gray_c(uint8_t *dst, const uint8_t *src, const uint8_t *
 }
 
 static void read_ya16be_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width,
-                                uint32_t *unused)
+                                uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -624,7 +633,7 @@ static void read_ya16be_alpha_c(uint8_t *dst, const uint8_t *src, const uint8_t
 }
 
 static void read_ayuv64le_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width,
-                               uint32_t *unused2)
+                               uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -633,7 +642,7 @@ static void read_ayuv64le_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *u
 
 
 static void read_ayuv64le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src,
-                               const uint8_t *unused1, int width, uint32_t *unused2)
+                               const uint8_t *unused1, int width, uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -643,7 +652,7 @@ static void read_ayuv64le_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unus
 }
 
 static void read_ayuv64le_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width,
-                                uint32_t *unused2)
+                              uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -651,7 +660,7 @@ static void read_ayuv64le_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *u
 }
 
 static void read_vuya_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src,
-                           const uint8_t *unused1, int width, uint32_t *unused2)
+                           const uint8_t *unused1, int width, uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -661,7 +670,7 @@ static void read_vuya_UV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0,
 }
 
 static void read_vuya_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width,
-                          uint32_t *unused2)
+                          uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -669,7 +678,7 @@ static void read_vuya_Y_c(uint8_t *dst, const uint8_t *src, const uint8_t *unuse
 }
 
 static void read_vuya_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused0, const uint8_t *unused1, int width,
-                          uint32_t *unused2)
+                          uint32_t *unused2, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -679,7 +688,7 @@ static void read_vuya_A_c(uint8_t *dst, const uint8_t *src, const uint8_t *unuse
 /* This is almost identical to the previous, end exists only because
  * yuy2ToY/UV)(dst, src + 1, ...) would have 100% unaligned accesses. */
 static void uyvyToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,  int width,
-                      uint32_t *unused)
+                      uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++)
@@ -687,7 +696,7 @@ static void uyvyToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1,
 }
 
 static void uyvyToUV_c(uint8_t *dstU, uint8_t *dstV, const uint8_t *unused0, const uint8_t *src1,
-                       const uint8_t *src2, int width, uint32_t *unused)
+                       const uint8_t *src2, int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -709,20 +718,20 @@ static av_always_inline void nvXXtoUV_c(uint8_t *dst1, uint8_t *dst2,
 
 static void nv12ToUV_c(uint8_t *dstU, uint8_t *dstV,
                        const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
-                       int width, uint32_t *unused)
+                       int width, uint32_t *unused, void *opq)
 {
     nvXXtoUV_c(dstU, dstV, src1, width);
 }
 
 static void nv21ToUV_c(uint8_t *dstU, uint8_t *dstV,
                        const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
-                       int width, uint32_t *unused)
+                       int width, uint32_t *unused, void *opq)
 {
     nvXXtoUV_c(dstV, dstU, src1, width);
 }
 
 static void p010LEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1,
-                        const uint8_t *unused2, int width, uint32_t *unused)
+                        const uint8_t *unused2, int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -731,7 +740,7 @@ static void p010LEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1
 }
 
 static void p010BEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1,
-                        const uint8_t *unused2, int width, uint32_t *unused)
+                        const uint8_t *unused2, int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -741,7 +750,7 @@ static void p010BEToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1
 
 static void p010LEToUV_c(uint8_t *dstU, uint8_t *dstV,
                        const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
-                       int width, uint32_t *unused)
+                       int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -751,8 +760,8 @@ static void p010LEToUV_c(uint8_t *dstU, uint8_t *dstV,
 }
 
 static void p010BEToUV_c(uint8_t *dstU, uint8_t *dstV,
-                       const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
-                       int width, uint32_t *unused)
+                         const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
+                         int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -762,8 +771,8 @@ static void p010BEToUV_c(uint8_t *dstU, uint8_t *dstV,
 }
 
 static void p016LEToUV_c(uint8_t *dstU, uint8_t *dstV,
-                       const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
-                       int width, uint32_t *unused)
+                         const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
+                         int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -773,8 +782,8 @@ static void p016LEToUV_c(uint8_t *dstU, uint8_t *dstV,
 }
 
 static void p016BEToUV_c(uint8_t *dstU, uint8_t *dstV,
-                       const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
-                       int width, uint32_t *unused)
+                         const uint8_t *unused0, const uint8_t *src1, const uint8_t *src2,
+                         int width, uint32_t *unused, void *opq)
 {
     int i;
     for (i = 0; i < width; i++) {
@@ -786,7 +795,7 @@ static void p016BEToUV_c(uint8_t *dstU, uint8_t *dstV,
 #define input_pixel(pos) (isBE(origin) ? AV_RB16(pos) : AV_RL16(pos))
 
 static void bgr24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,
-                       int width, uint32_t *rgb2yuv)
+                       int width, uint32_t *rgb2yuv, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
@@ -801,7 +810,7 @@ static void bgr24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1
 }
 
 static void bgr24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1,
-                        const uint8_t *src2, int width, uint32_t *rgb2yuv)
+                        const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq)
 {
     int16_t *dstU = (int16_t *)_dstU;
     int16_t *dstV = (int16_t *)_dstV;
@@ -820,7 +829,7 @@ static void bgr24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0,
 }
 
 static void bgr24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1,
-                             const uint8_t *src2, int width, uint32_t *rgb2yuv)
+                             const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq)
 {
     int16_t *dstU = (int16_t *)_dstU;
     int16_t *dstV = (int16_t *)_dstV;
@@ -839,7 +848,7 @@ static void bgr24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unus
 }
 
 static void rgb24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, int width,
-                       uint32_t *rgb2yuv)
+                       uint32_t *rgb2yuv, void *opq)
 {
     int16_t *dst = (int16_t *)_dst;
     int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
@@ -854,7 +863,7 @@ static void rgb24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1
 }
 
 static void rgb24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1,
-                        const uint8_t *src2, int width, uint32_t *rgb2yuv)
+                        const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq)
 {
     int16_t *dstU = (int16_t *)_dstU;
     int16_t *dstV = (int16_t *)_dstV;
@@ -873,7 +882,7 @@ static void rgb24ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0,
 }
 
 static void rgb24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused0, const uint8_t *src1,
-                             const uint8_t *src2, int width, uint32_t *rgb2yuv)
+                             const uint8_t *src2, int width, uint32_t *rgb2yuv, void *opq)
 {
     int16_t *dstU = (int16_t *)_dstU;
     int16_t *dstV = (int16_t *)_dstV;
@@ -891,7 +900,7 @@ static void rgb24ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unus
     }
 }
 
-static void planar_rgb_to_y(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *rgb2yuv)
+static void planar_rgb_to_y(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *rgb2yuv, void *opq)
 {
     uint16_t *dst = (uint16_t *)_dst;
     int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
@@ -905,7 +914,7 @@ static void planar_rgb_to_y(uint8_t *_dst, const uint8_t *src[4], int width, int
     }
 }
 
-static void planar_rgb_to_a(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *unused)
+static void planar_rgb_to_a(uint8_t *_dst, const uint8_t *src[4], int width, int32_t *unused, void *opq)
 {
     uint16_t *dst = (uint16_t *)_dst;
     int i;
@@ -913,7 +922,7 @@ static void planar_rgb_to_a(uint8_t *_dst, const uint8_t *src[4], int width, int
         dst[i] = src[3][i] << 6;
 }
 
-static void planar_rgb_to_uv(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *src[4], int width, int32_t *rgb2yuv)
+static void planar_rgb_to_uv(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *src[4], int width, int32_t *rgb2yuv, void *opq)
 {
     uint16_t *dstU = (uint16_t *)_dstU;
     uint16_t *dstV = (uint16_t *)_dstV;
@@ -1049,24 +1058,27 @@ static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src,
 
 #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian)                                    \
 static void planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4],              \
-                                                  int w, int32_t *rgb2yuv)                          \
+                                                  int w, int32_t *rgb2yuv, void *opq)               \
 {                                                                                                   \
     planar_rgb16_to_y(dst, src, w, nbits, endian, rgb2yuv);                                         \
 }                                                                                                   \
 static void planar_rgb##nbits##endian_name##_to_uv(uint8_t *dstU, uint8_t *dstV,                    \
-                                                   const uint8_t *src[4], int w, int32_t *rgb2yuv)  \
+                                                   const uint8_t *src[4], int w, int32_t *rgb2yuv,  \
+                                                   void *opq)                                       \
 {                                                                                                   \
     planar_rgb16_to_uv(dstU, dstV, src, w, nbits, endian, rgb2yuv);                                 \
 }                                                                                                   \
 
 #define rgb9plus_planar_transparency_funcs(nbits)                           \
 static void planar_rgb##nbits##le_to_a(uint8_t *dst, const uint8_t *src[4], \
-                                       int w, int32_t *rgb2yuv)             \
+                                       int w, int32_t *rgb2yuv,             \
+                                       void *opq)                           \
 {                                                                           \
     planar_rgb16_to_a(dst, src, w, nbits, 0, rgb2yuv);                      \
 }                                                                           \
 static void planar_rgb##nbits##be_to_a(uint8_t *dst, const uint8_t *src[4], \
-                                       int w, int32_t *rgb2yuv)             \
+                                       int w, int32_t *rgb2yuv,             \
+                                       void *opq)                           \
 {                                                                           \
     planar_rgb16_to_a(dst, src, w, nbits, 1, rgb2yuv);                      \
 }
@@ -1087,23 +1099,24 @@ rgb9plus_planar_transparency_funcs(16)
 
 #define rgbf32_planar_funcs_endian(endian_name, endian)                                             \
 static void planar_rgbf32##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4],                  \
-                                                  int w, int32_t *rgb2yuv)                          \
+                                                  int w, int32_t *rgb2yuv, void *opq)               \
 {                                                                                                   \
     planar_rgbf32_to_y(dst, src, w, endian, rgb2yuv);                                               \
 }                                                                                                   \
 static void planar_rgbf32##endian_name##_to_uv(uint8_t *dstU, uint8_t *dstV,                        \
-                                                   const uint8_t *src[4], int w, int32_t *rgb2yuv)  \
+                                               const uint8_t *src[4], int w, int32_t *rgb2yuv,      \
+                                               void *opq)                                           \
 {                                                                                                   \
     planar_rgbf32_to_uv(dstU, dstV, src, w, endian, rgb2yuv);                                       \
 }                                                                                                   \
 static void planar_rgbf32##endian_name##_to_a(uint8_t *dst, const uint8_t *src[4],                  \
-                                              int w, int32_t *rgb2yuv)                              \
+                                              int w, int32_t *rgb2yuv, void *opq)                   \
 {                                                                                                   \
     planar_rgbf32_to_a(dst, src, w, endian, rgb2yuv);                                               \
 }                                                                                                   \
 static void grayf32##endian_name##ToY16_c(uint8_t *dst, const uint8_t *src,                         \
                                           const uint8_t *unused1, const uint8_t *unused2,           \
-                                          int width, uint32_t *unused)                              \
+                                          int width, uint32_t *unused, void *opq)                   \
 {                                                                                                   \
     grayf32ToY16_c(dst, src, unused1, unused2, width, endian, unused);                              \
 }
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index e118b54457..9ab542933f 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -559,26 +559,31 @@ typedef struct SwsContext {
     yuv2packedX_fn yuv2packedX;
     yuv2anyX_fn yuv2anyX;
 
+    /// Opaque data pointer passed to all input functions.
+    void *input_opaque;
+
     /// Unscaled conversion of luma plane to YV12 for horizontal scaler.
     void (*lumToYV12)(uint8_t *dst, const uint8_t *src, const uint8_t *src2, const uint8_t *src3,
-                      int width, uint32_t *pal);
+                      int width, uint32_t *pal, void *opq);
     /// Unscaled conversion of alpha plane to YV12 for horizontal scaler.
     void (*alpToYV12)(uint8_t *dst, const uint8_t *src, const uint8_t *src2, const uint8_t *src3,
-                      int width, uint32_t *pal);
+                      int width, uint32_t *pal, void *opq);
     /// Unscaled conversion of chroma planes to YV12 for horizontal scaler.
     void (*chrToYV12)(uint8_t *dstU, uint8_t *dstV,
                       const uint8_t *src1, const uint8_t *src2, const uint8_t *src3,
-                      int width, uint32_t *pal);
+                      int width, uint32_t *pal, void *opq);
 
     /**
      * Functions to read planar input, such as planar RGB, and convert
      * internally to Y/UV/A.
      */
     /** @{ */
-    void (*readLumPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv);
+    void (*readLumPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv,
+                          void *opq);
     void (*readChrPlanar)(uint8_t *dstU, uint8_t *dstV, const uint8_t *src[4],
-                          int width, int32_t *rgb2yuv);
-    void (*readAlpPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv);
+                          int width, int32_t *rgb2yuv, void *opq);
+    void (*readAlpPlanar)(uint8_t *dst, const uint8_t *src[4], int width, int32_t *rgb2yuv,
+                          void *opq);
     /** @} */
 
     /**
diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c
index 628f12137c..270798ba3d 100644
--- a/libswscale/x86/swscale.c
+++ b/libswscale/x86/swscale.c
@@ -299,13 +299,13 @@ VSCALE_FUNCS(avx, avx);
 #define INPUT_Y_FUNC(fmt, opt) \
 void ff_ ## fmt ## ToY_  ## opt(uint8_t *dst, const uint8_t *src, \
                                 const uint8_t *unused1, const uint8_t *unused2, \
-                                int w, uint32_t *unused)
+                                int w, uint32_t *unused, void *opq)
 #define INPUT_UV_FUNC(fmt, opt) \
 void ff_ ## fmt ## ToUV_ ## opt(uint8_t *dstU, uint8_t *dstV, \
                                 const uint8_t *unused0, \
                                 const uint8_t *src1, \
                                 const uint8_t *src2, \
-                                int w, uint32_t *unused)
+                                int w, uint32_t *unused, void *opq)
 #define INPUT_FUNC(fmt, opt) \
     INPUT_Y_FUNC(fmt, opt); \
     INPUT_UV_FUNC(fmt, opt)
@@ -373,15 +373,18 @@ YUV2GBRP_DECL(avx2);
 
 #define INPUT_PLANAR_RGB_Y_FN_DECL(fmt, opt)                               \
 void ff_planar_##fmt##_to_y_##opt(uint8_t *dst,                            \
-                           const uint8_t *src[4], int w, int32_t *rgb2yuv)
+                           const uint8_t *src[4], int w, int32_t *rgb2yuv, \
+                           void *opq)
 
 #define INPUT_PLANAR_RGB_UV_FN_DECL(fmt, opt)                              \
 void ff_planar_##fmt##_to_uv_##opt(uint8_t *dstU, uint8_t *dstV,           \
-                           const uint8_t *src[4], int w, int32_t *rgb2yuv)
+                           const uint8_t *src[4], int w, int32_t *rgb2yuv, \
+                           void *opq)
 
 #define INPUT_PLANAR_RGB_A_FN_DECL(fmt, opt)                               \
 void ff_planar_##fmt##_to_a_##opt(uint8_t *dst,                            \
-                           const uint8_t *src[4], int w, int32_t *rgb2yuv)
+                           const uint8_t *src[4], int w, int32_t *rgb2yuv, \
+                           void *opq)
 
 
 #define INPUT_PLANAR_RGBXX_A_DECL(fmt, opt) \
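
Note on the opaque pointer added above: the following is a minimal,
stand-alone sketch of how a trailing "void *opq" argument lets a reader
callback reach per-context data without globals. It is not code from
this series. DemoTables, DemoContext, demo_lum_fn, copy_y, lut_y and
demo_read_line are hypothetical names; only the argument order mirrors
the patched lumToYV12 prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-context data that a reader may need. */
typedef struct DemoTables {
    uint8_t lut[256];
} DemoTables;

/* Same argument order as the patched lumToYV12 callback. */
typedef void (*demo_lum_fn)(uint8_t *dst, const uint8_t *src,
                            const uint8_t *unused1, const uint8_t *unused2,
                            int width, uint32_t *pal, void *opq);

/* Hypothetical stand-in for the two SwsContext members involved. */
typedef struct DemoContext {
    void       *input_opaque;   /* handed to every input function */
    demo_lum_fn lumToYV12;
} DemoContext;

/* A reader that has no use for the opaque pointer simply ignores it. */
static void copy_y(uint8_t *dst, const uint8_t *src,
                   const uint8_t *unused1, const uint8_t *unused2,
                   int width, uint32_t *pal, void *opq)
{
    int i;
    for (i = 0; i < width; i++)
        dst[i] = src[i];
}

/* A reader that needs tables casts opq back to the expected type. */
static void lut_y(uint8_t *dst, const uint8_t *src,
                  const uint8_t *unused1, const uint8_t *unused2,
                  int width, uint32_t *pal, void *opq)
{
    const DemoTables *t = opq;
    int i;
    assert(t != NULL);
    for (i = 0; i < width; i++)
        dst[i] = t->lut[src[i]];
}

/* The caller forwards the context's pointer unchanged on every call. */
static void demo_read_line(DemoContext *c, uint8_t *dst,
                           const uint8_t *src, int width)
{
    c->lumToYV12(dst, src, NULL, NULL, width, NULL, c->input_opaque);
}
```

Under this sketch, swapping lumToYV12 between copy_y and lut_y needs no
change at the call site, which is presumably why the argument is added
to every input function at once rather than only to those that use it.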

From patchwork Wed Aug 10 20:47:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler <timo@rothenpieler.org>
X-Patchwork-Id: 37226
From: Timo Rothenpieler <timo@rothenpieler.org>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Aug 2022 22:47:12 +0200
Message-Id: <20220810204712.3123-11-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220810204712.3123-1-timo@rothenpieler.org>
References: <20220810204712.3123-1-timo@rothenpieler.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 11/11] swscale/input: add rgbaf16 input
 support
Cc: Timo Rothenpieler <timo@rothenpieler.org>

This is by no means perfect: ddagrab, at least, returns scRGB data whose
HDR values lie outside the 0.0f to 1.0f range, and such values are
clipped on input.
The primary purpose is to make the format usable at all.
---
 libavutil/Makefile            |   1 +
 libswscale/half2float.c       |  19 +++++
 libswscale/input.c            | 130 ++++++++++++++++++++++++++++++++++
 libswscale/slice.c            |   9 ++-
 libswscale/swscale_internal.h |  10 +++
 libswscale/utils.c            |   2 +
 libswscale/version.h          |   2 +-
 7 files changed, 171 insertions(+), 2 deletions(-)
 create mode 100644 libswscale/half2float.c

diff --git a/libavutil/Makefile b/libavutil/Makefile
index 3d9c07aea8..1aac1a4cc0 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -131,6 +131,7 @@ OBJS = adler32.o                                                        \
        float_dsp.o                                                      \
        fixed_dsp.o                                                      \
        frame.o                                                          \
+       half2float.o                                                     \
        hash.o                                                           \
        hdr_dynamic_metadata.o                                           \
        hdr_dynamic_vivid_metadata.o                                     \
diff --git a/libswscale/half2float.c b/libswscale/half2float.c
new file mode 100644
index 0000000000..1b023f96a5
--- /dev/null
+++ b/libswscale/half2float.c
@@ -0,0 +1,19 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/half2float.c"
diff --git a/libswscale/input.c b/libswscale/input.c
index 36ef1e43ac..818b57d2c3 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -1124,6 +1124,112 @@ static void grayf32##endian_name##ToY16_c(uint8_t *dst, const uint8_t *src,
 rgbf32_planar_funcs_endian(le, 0)
 rgbf32_planar_funcs_endian(be, 1)
 
+#define rdpx(src) av_int2float(half2float(is_be ? AV_RB16(&src) : AV_RL16(&src), h2f_tbl))
+
+static av_always_inline void rgbaf16ToUV_half_endian(uint16_t *dstU, uint16_t *dstV, int is_be,
+                                                     const uint16_t *src, int width,
+                                                     int32_t *rgb2yuv, half2float_tables *h2f_tbl)
+{
+    int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
+    int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
+    int i;
+    for (i = 0; i < width; i++) {
+        int r = (lrintf(av_clipf(65535.0f * rdpx(src[i*8+0]), 0.0f, 65535.0f)) +
+                 lrintf(av_clipf(65535.0f * rdpx(src[i*8+4]), 0.0f, 65535.0f))) >> 1;
+        int g = (lrintf(av_clipf(65535.0f * rdpx(src[i*8+1]), 0.0f, 65535.0f)) +
+                 lrintf(av_clipf(65535.0f * rdpx(src[i*8+5]), 0.0f, 65535.0f))) >> 1;
+        int b = (lrintf(av_clipf(65535.0f * rdpx(src[i*8+2]), 0.0f, 65535.0f)) +
+                 lrintf(av_clipf(65535.0f * rdpx(src[i*8+6]), 0.0f, 65535.0f))) >> 1;
+
+        dstU[i] = (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+        dstV[i] = (rv*r + gv*g + bv*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf16ToUV_endian(uint16_t *dstU, uint16_t *dstV, int is_be,
+                                                const uint16_t *src, int width,
+                                                int32_t *rgb2yuv, half2float_tables *h2f_tbl)
+{
+    int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
+    int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
+    int i;
+    for (i = 0; i < width; i++) {
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[i*4+0]), 0.0f, 65535.0f));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[i*4+1]), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[i*4+2]), 0.0f, 65535.0f));
+
+        dstU[i] = (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+        dstV[i] = (rv*r + gv*g + bv*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf16ToY_endian(uint16_t *dst, const uint16_t *src, int is_be,
+                                               int width, int32_t *rgb2yuv, half2float_tables *h2f_tbl)
+{
+    int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
+    int i;
+    for (i = 0; i < width; i++) {
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[i*4+0]), 0.0f, 65535.0f));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[i*4+1]), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[i*4+2]), 0.0f, 65535.0f));
+
+        dst[i] = (ry*r + gy*g + by*b + (0x2001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf16ToA_endian(uint16_t *dst, const uint16_t *src, int is_be,
+                                               int width, half2float_tables *h2f_tbl)
+{
+    int i;
+    for (i=0; i<width; i++) {
+        dst[i] = lrintf(av_clipf(65535.0f * rdpx(src[i*4+3]), 0.0f, 65535.0f));
+    }
+}
+
+#undef rdpx
+
+#define rgbaf16_funcs_endian(endian_name, endian)                                                         \
+static void rgbaf16##endian_name##ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused,      \
+                                              const uint8_t *src1, const uint8_t *src2,                   \
+                                              int width, uint32_t *_rgb2yuv, void *opq)                   \
+{                                                                                                         \
+    const uint16_t *src = (const uint16_t*)src1;                                                          \
+    uint16_t *dstU = (uint16_t*)_dstU;                                                                    \
+    uint16_t *dstV = (uint16_t*)_dstV;                                                                    \
+    int32_t *rgb2yuv = (int32_t*)_rgb2yuv;                                                                \
+    av_assert1(src1==src2);                                                                               \
+    rgbaf16ToUV_half_endian(dstU, dstV, endian, src, width, rgb2yuv, opq);                                \
+}                                                                                                         \
+static void rgbaf16##endian_name##ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused,           \
+                                         const uint8_t *src1, const uint8_t *src2,                        \
+                                         int width, uint32_t *_rgb2yuv, void *opq)                        \
+{                                                                                                         \
+    const uint16_t *src = (const uint16_t*)src1;                                                          \
+    uint16_t *dstU = (uint16_t*)_dstU;                                                                    \
+    uint16_t *dstV = (uint16_t*)_dstV;                                                                    \
+    int32_t *rgb2yuv = (int32_t*)_rgb2yuv;                                                                \
+    av_assert1(src1==src2);                                                                               \
+    rgbaf16ToUV_endian(dstU, dstV, endian, src, width, rgb2yuv, opq);                                     \
+}                                                                                                         \
+static void rgbaf16##endian_name##ToY_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0,       \
+                                        const uint8_t *unused1, int width, uint32_t *_rgb2yuv, void *opq) \
+{                                                                                                         \
+    const uint16_t *src = (const uint16_t*)_src;                                                          \
+    uint16_t *dst = (uint16_t*)_dst;                                                                      \
+    int32_t *rgb2yuv = (int32_t*)_rgb2yuv;                                                                \
+    rgbaf16ToY_endian(dst, src, endian, width, rgb2yuv, opq);                                             \
+}                                                                                                         \
+static void rgbaf16##endian_name##ToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0,       \
+                                        const uint8_t *unused1, int width, uint32_t *unused2, void *opq)  \
+{                                                                                                         \
+    const uint16_t *src = (const uint16_t*)_src;                                                          \
+    uint16_t *dst = (uint16_t*)_dst;                                                                      \
+    rgbaf16ToA_endian(dst, src, endian, width, opq);                                                      \
+}
+
+rgbaf16_funcs_endian(le, 0)
+rgbaf16_funcs_endian(be, 1)
+
 av_cold void ff_sws_init_input_funcs(SwsContext *c)
 {
     enum AVPixelFormat srcFormat = c->srcFormat;
@@ -1388,6 +1494,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_X2BGR10LE:
             c->chrToYV12 = bgr30leToUV_half_c;
             break;
+        case AV_PIX_FMT_RGBAF16BE:
+            c->chrToYV12 = rgbaf16beToUV_half_c;
+            break;
+        case AV_PIX_FMT_RGBAF16LE:
+            c->chrToYV12 = rgbaf16leToUV_half_c;
+            break;
         }
     } else {
         switch (srcFormat) {
@@ -1475,6 +1587,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_X2BGR10LE:
             c->chrToYV12 = bgr30leToUV_c;
             break;
+        case AV_PIX_FMT_RGBAF16BE:
+            c->chrToYV12 = rgbaf16beToUV_c;
+            break;
+        case AV_PIX_FMT_RGBAF16LE:
+            c->chrToYV12 = rgbaf16leToUV_c;
+            break;
         }
     }
 
@@ -1763,6 +1881,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
     case AV_PIX_FMT_X2BGR10LE:
         c->lumToYV12 = bgr30leToY_c;
         break;
+    case AV_PIX_FMT_RGBAF16BE:
+        c->lumToYV12 = rgbaf16beToY_c;
+        break;
+    case AV_PIX_FMT_RGBAF16LE:
+        c->lumToYV12 = rgbaf16leToY_c;
+        break;
     }
     if (c->needAlpha) {
         if (is16BPS(srcFormat) || isNBPS(srcFormat)) {
@@ -1782,6 +1906,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_ARGB:
             c->alpToYV12 = abgrToA_c;
             break;
+        case AV_PIX_FMT_RGBAF16BE:
+            c->alpToYV12 = rgbaf16beToA_c;
+            break;
+        case AV_PIX_FMT_RGBAF16LE:
+            c->alpToYV12 = rgbaf16leToA_c;
+            break;
         case AV_PIX_FMT_YA8:
             c->alpToYV12 = uyvyToY_c;
             break;
diff --git a/libswscale/slice.c b/libswscale/slice.c
index b3ee06d632..db1c696727 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -282,7 +282,13 @@ int ff_init_filters(SwsContext * c)
     c->descIndex[0] = num_ydesc + (need_gamma ? 1 : 0);
     c->descIndex[1] = num_ydesc + num_cdesc + (need_gamma ? 1 : 0);
 
-
+    if (isFloat16(c->srcFormat)) {
+        c->h2f_tables = av_malloc(sizeof(*c->h2f_tables));
+        if (!c->h2f_tables)
+            return AVERROR(ENOMEM);
+        ff_init_half2float_tables(c->h2f_tables);
+        c->input_opaque = c->h2f_tables;
+    }
 
     c->desc  = av_calloc(c->numDesc,  sizeof(*c->desc));
     if (!c->desc)
@@ -393,5 +399,6 @@ int ff_free_filters(SwsContext *c)
             free_slice(&c->slice[i]);
         av_freep(&c->slice);
     }
+    av_freep(&c->h2f_tables);
     return 0;
 }
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 9ab542933f..7d9f785298 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -35,6 +35,7 @@
 #include "libavutil/pixdesc.h"
 #include "libavutil/slicethread.h"
 #include "libavutil/ppc/util_altivec.h"
+#include "libavutil/half2float.h"
 
 #define STR(s) AV_TOSTRING(s) // AV_STRINGIFY is too long
 
@@ -679,6 +680,8 @@ typedef struct SwsContext {
     unsigned int dst_slice_align;
     atomic_int   stride_unaligned_warned;
     atomic_int   data_unaligned_warned;
+
+    half2float_tables *h2f_tables;
 } SwsContext;
 //FIXME check init (where 0)
 
@@ -840,6 +843,13 @@ static av_always_inline int isFloat(enum AVPixelFormat pix_fmt)
     return desc->flags & AV_PIX_FMT_FLAG_FLOAT;
 }
 
+static av_always_inline int isFloat16(enum AVPixelFormat pix_fmt)
+{
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
+    av_assert0(desc);
+    return (desc->flags & AV_PIX_FMT_FLAG_FLOAT) && desc->comp[0].depth == 16;
+}
+
 static av_always_inline int isALPHA(enum AVPixelFormat pix_fmt)
 {
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 34503e57f4..81646c0d73 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -259,6 +259,8 @@ static const FormatEntry format_entries[] = {
     [AV_PIX_FMT_P416LE]      = { 1, 1 },
     [AV_PIX_FMT_NV16]        = { 1, 1 },
     [AV_PIX_FMT_VUYA]        = { 1, 1 },
+    [AV_PIX_FMT_RGBAF16BE]   = { 1, 0 },
+    [AV_PIX_FMT_RGBAF16LE]   = { 1, 0 },
 };
 
 int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos,
diff --git a/libswscale/version.h b/libswscale/version.h
index 3193562d18..d8694bb5c0 100644
--- a/libswscale/version.h
+++ b/libswscale/version.h
@@ -29,7 +29,7 @@
 #include "version_major.h"
 
 #define LIBSWSCALE_VERSION_MINOR   8
-#define LIBSWSCALE_VERSION_MICRO 102
+#define LIBSWSCALE_VERSION_MICRO 103
 
 #define LIBSWSCALE_VERSION_INT  AV_VERSION_INT(LIBSWSCALE_VERSION_MAJOR, \
                                                LIBSWSCALE_VERSION_MINOR, \