From patchwork Sun Nov 14 02:56:52 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Reid
X-Patchwork-Id: 31415
From: mindmark@gmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 13 Nov 2021 18:56:52 -0800
Message-Id: <20211114025653.654-1-mindmark@gmail.com>
X-Mailer: git-send-email 2.31.1.windows.1
Subject: [FFmpeg-devel] [PATCH v2 1/2] swscale/input: unify grayf32 funcs with rgbf32 funcs
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Mark Reid

From: Mark Reid

This is meant to be a cosmetic change.

old timings:
 42780 UNITS in grayf32le,       1 runs,      0 skips
 56720 UNITS in grayf32le,       2 runs,      0 skips
 67265 UNITS in grayf32le,       4 runs,      0 skips
 58082 UNITS in grayf32le,       8 runs,      0 skips
 63512 UNITS in grayf32le,      16 runs,      0 skips
 52720 UNITS in grayf32le,      32 runs,      0 skips
 46491 UNITS in grayf32le,      64 runs,      0 skips

 68500 UNITS in grayf32be,       1 runs,      0 skips
 66930 UNITS in grayf32be,       2 runs,      0 skips
 62305 UNITS in grayf32be,       4 runs,      0 skips
 55510 UNITS in grayf32be,       8 runs,      0 skips
 50216 UNITS in grayf32be,      16 runs,      0 skips
 44480 UNITS in grayf32be,      32 runs,      0 skips
 42394 UNITS in grayf32be,      64 runs,      0 skips

new timings:
 46660 UNITS in grayf32le,       1 runs,      0 skips
 51830 UNITS in grayf32le,       2 runs,      0 skips
 53390 UNITS in grayf32le,       4 runs,      0 skips
 50910 UNITS in grayf32le,       8 runs,      0 skips
 44968 UNITS in grayf32le,      16 runs,      0 skips
 40349 UNITS in grayf32le,      32 runs,      0 skips
 38330 UNITS in grayf32le,      64 runs,      0 skips

 39980 UNITS in grayf32be,       1 runs,      0 skips
 49630 UNITS in grayf32be,       2 runs,      0 skips
 53540 UNITS in grayf32be,       4 runs,      0 skips
 59767 UNITS in grayf32be,       8 runs,      0 skips
 51206 UNITS in grayf32be,      16 runs,      0 skips
 44743 UNITS in grayf32be,      32 runs,      0 skips
 41468 UNITS in grayf32be,      64 runs,      0 skips
---
 libswscale/input.c | 36 +++++++++++-------------------------
 1 file changed, 11 insertions(+), 25 deletions(-)

--
2.31.1.windows.1

diff --git a/libswscale/input.c b/libswscale/input.c
index 336f957c8c..90efdd2ffc 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -1013,31 +1013,19 @@ static av_always_inline void planar_rgbf32_to_y(uint8_t *_dst, const uint8_t *_s
     }
 }
 
-#undef rdpx
-
 static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
-                                            const uint8_t *unused2, int width, uint32_t *unused)
+                                            const uint8_t *unused2, int width, int is_be, uint32_t *unused)
 {
     int i;
     const float *src = (const float *)_src;
     uint16_t *dst = (uint16_t *)_dst;
 
     for (i = 0; i < width; ++i){
-        dst[i] = av_clip_uint16(lrintf(65535.0f * src[i]));
+        dst[i] = av_clip_uint16(lrintf(65535.0f * rdpx(src + i)));
     }
 }
 
-static av_always_inline void grayf32ToY16_bswap_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
-                                                  const uint8_t *unused2, int width, uint32_t *unused)
-{
-    int i;
-    const uint32_t *src = (const uint32_t *)_src;
-    uint16_t *dst = (uint16_t *)_dst;
-
-    for (i = 0; i < width; ++i){
-        dst[i] = av_clip_uint16(lrintf(65535.0f * av_int2float(av_bswap32(src[i]))));
-    }
-}
+#undef rdpx
 
 #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian) \
 static void planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4], \
@@ -1092,6 +1080,12 @@ static void planar_rgbf32##endian_name##_to_a(uint8_t *dst, const uint8_t *src[4
                                                   int w, int32_t *rgb2yuv) \
 { \
     planar_rgbf32_to_a(dst, src, w, endian, rgb2yuv); \
+} \
+static void grayf32##endian_name##ToY16_c(uint8_t *dst, const uint8_t *src, \
+                                          const uint8_t *unused1, const uint8_t *unused2, \
+                                          int width, uint32_t *unused) \
+{ \
+    grayf32ToY16_c(dst, src, unused1, unused2, width, endian, unused); \
 }
 
 rgbf32_planar_funcs_endian(le, 0)
@@ -1699,18 +1693,10 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         c->lumToYV12 = p010BEToY_c;
         break;
     case AV_PIX_FMT_GRAYF32LE:
-#if HAVE_BIGENDIAN
-        c->lumToYV12 = grayf32ToY16_bswap_c;
-#else
-        c->lumToYV12 = grayf32ToY16_c;
-#endif
+        c->lumToYV12 = grayf32leToY16_c;
         break;
     case AV_PIX_FMT_GRAYF32BE:
-#if HAVE_BIGENDIAN
-        c->lumToYV12 = grayf32ToY16_c;
-#else
-        c->lumToYV12 = grayf32ToY16_bswap_c;
-#endif
+        c->lumToYV12 = grayf32beToY16_c;
         break;
     case AV_PIX_FMT_Y210LE:
         c->lumToYV12 = y210le_Y_c;

From patchwork Sun Nov 14 02:56:53 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Reid
X-Patchwork-Id: 31414
From: mindmark@gmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 13 Nov 2021 18:56:53 -0800
Message-Id: <20211114025653.654-2-mindmark@gmail.com>
X-Mailer: git-send-email 2.31.1.windows.1
In-Reply-To: <20211114025653.654-1-mindmark@gmail.com>
References: <20211114025653.654-1-mindmark@gmail.com>
Subject: [FFmpeg-devel] [PATCH v2 2/2] swscale/input: clamp rgbf32 values before lrintf
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Mark Reid

From: Mark Reid

If the float pixel * 65535.0f > 2147483647.0f, lrintf may overflow and
return negative values, depending on the implementation.
The results for nan and +/-inf values may also be implementation-defined.

Clamp the value first so that lrintf always works:

values <=0.0f, -inf, nan = 0.0f
values >=1.0f, +inf      = 1.0f

The clamping adds some performance overhead, but using an inline function
seems to help the compiler optimize on the compilers I tested.

old timings
 213920 UNITS in planar_rgbf32le_to_uv,       1 runs,      0 skips
 218830 UNITS in planar_rgbf32le_to_uv,       2 runs,      0 skips
 223285 UNITS in planar_rgbf32le_to_uv,       4 runs,      0 skips
 215405 UNITS in planar_rgbf32le_to_uv,       8 runs,      0 skips
 208920 UNITS in planar_rgbf32le_to_uv,      16 runs,      0 skips
 205115 UNITS in planar_rgbf32le_to_uv,      32 runs,      0 skips
 212220 UNITS in planar_rgbf32le_to_uv,      64 runs,      0 skips

 216440 UNITS in planar_rgbf32be_to_uv,       1 runs,      0 skips
 222450 UNITS in planar_rgbf32be_to_uv,       2 runs,      0 skips
 228780 UNITS in planar_rgbf32be_to_uv,       4 runs,      0 skips
 226900 UNITS in planar_rgbf32be_to_uv,       8 runs,      0 skips
 223168 UNITS in planar_rgbf32be_to_uv,      16 runs,      0 skips
 249340 UNITS in planar_rgbf32be_to_uv,      32 runs,      0 skips
 233746 UNITS in planar_rgbf32be_to_uv,      64 runs,      0 skips

 173360 UNITS in planar_rgbf32le_to_y,       1 runs,      0 skips
 179970 UNITS in planar_rgbf32le_to_y,       2 runs,      0 skips
 182960 UNITS in planar_rgbf32le_to_y,       4 runs,      0 skips
 177040 UNITS in planar_rgbf32le_to_y,       8 runs,      0 skips
 170351 UNITS in planar_rgbf32le_to_y,      16 runs,      0 skips
 167136 UNITS in planar_rgbf32le_to_y,      32 runs,      0 skips
 165821 UNITS in planar_rgbf32le_to_y,      64 runs,      0 skips

 181040 UNITS in planar_rgbf32be_to_y,       1 runs,      0 skips
 182920 UNITS in planar_rgbf32be_to_y,       2 runs,      0 skips
 180935 UNITS in planar_rgbf32be_to_y,       4 runs,      0 skips
 180897 UNITS in planar_rgbf32be_to_y,       8 runs,      0 skips
 179640 UNITS in planar_rgbf32be_to_y,      16 runs,      0 skips
 178912 UNITS in planar_rgbf32be_to_y,      32 runs,      0 skips
 177983 UNITS in planar_rgbf32be_to_y,      64 runs,      0 skips

new timings
 228860 UNITS in planar_rgbf32le_to_uv,       1 runs,      0 skips
 232400 UNITS in planar_rgbf32le_to_uv,       2 runs,      0 skips
 237270 UNITS in planar_rgbf32le_to_uv,       4 runs,      0 skips
 229992 UNITS in planar_rgbf32le_to_uv,       8 runs,      0 skips
 222270 UNITS in planar_rgbf32le_to_uv,      16 runs,      0 skips
 218896 UNITS in planar_rgbf32le_to_uv,      32 runs,      0 skips
 216938 UNITS in planar_rgbf32le_to_uv,      64 runs,      0 skips

 232340 UNITS in planar_rgbf32be_to_uv,       1 runs,      0 skips
 231830 UNITS in planar_rgbf32be_to_uv,       2 runs,      0 skips
 242235 UNITS in planar_rgbf32be_to_uv,       4 runs,      0 skips
 235210 UNITS in planar_rgbf32be_to_uv,       8 runs,      0 skips
 229040 UNITS in planar_rgbf32be_to_uv,      16 runs,      0 skips
 224996 UNITS in planar_rgbf32be_to_uv,      32 runs,      0 skips
 223581 UNITS in planar_rgbf32be_to_uv,      64 runs,      0 skips

 179220 UNITS in planar_rgbf32le_to_y,       1 runs,      0 skips
 174790 UNITS in planar_rgbf32le_to_y,       2 runs,      0 skips
 182630 UNITS in planar_rgbf32le_to_y,       4 runs,      0 skips
 183002 UNITS in planar_rgbf32le_to_y,       8 runs,      0 skips
 181005 UNITS in planar_rgbf32le_to_y,      16 runs,      0 skips
 179390 UNITS in planar_rgbf32le_to_y,      32 runs,      0 skips
 192476 UNITS in planar_rgbf32le_to_y,      64 runs,      0 skips

 195620 UNITS in planar_rgbf32be_to_y,       1 runs,      0 skips
 195860 UNITS in planar_rgbf32be_to_y,       2 runs,      0 skips
 198700 UNITS in planar_rgbf32be_to_y,       4 runs,      0 skips
 197252 UNITS in planar_rgbf32be_to_y,       8 runs,      0 skips
 195702 UNITS in planar_rgbf32be_to_y,      16 runs,      0 skips
 194853 UNITS in planar_rgbf32be_to_y,      32 runs,      0 skips
 194459 UNITS in planar_rgbf32be_to_y,      64 runs,      0 skips
---
 libswscale/input.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

--
2.31.1.windows.1

diff --git a/libswscale/input.c b/libswscale/input.c
index 90efdd2ffc..2a13846abe 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -966,6 +966,11 @@ static av_always_inline void planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV,
 
 #define rdpx(src) (is_be ? av_int2float(AV_RB32(src)): av_int2float(AV_RL32(src)))
 
+static av_always_inline float clampf(float x, float min, float max)
+{
+    return FFMIN(FFMAX(x, min), max);
+}
+
 static av_always_inline void planar_rgbf32_to_a(uint8_t *_dst, const uint8_t *_src[4], int width, int is_be, int32_t *rgb2yuv)
 {
     int i;
@@ -973,7 +978,7 @@ static av_always_inline void planar_rgbf32_to_a(uint8_t *_dst, const uint8_t *_s
     uint16_t *dst = (uint16_t *)_dst;
 
     for (i = 0; i < width; i++) {
-        dst[i] = av_clip_uint16(lrintf(65535.0f * rdpx(src[3] + i)));
+        dst[i] = lrintf(clampf(65535.0f * rdpx(src[3] + i), 0.0f, 65535.0f));
     }
 }
 
@@ -987,9 +992,9 @@ static av_always_inline void planar_rgbf32_to_uv(uint8_t *_dstU, uint8_t *_dstV,
     int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
 
     for (i = 0; i < width; i++) {
-        int g = av_clip_uint16(lrintf(65535.0f * rdpx(src[0] + i)));
-        int b = av_clip_uint16(lrintf(65535.0f * rdpx(src[1] + i)));
-        int r = av_clip_uint16(lrintf(65535.0f * rdpx(src[2] + i)));
+        int g = lrintf(clampf(65535.0f * rdpx(src[0] + i), 0.0f, 65535.0f));
+        int b = lrintf(clampf(65535.0f * rdpx(src[1] + i), 0.0f, 65535.0f));
+        int r = lrintf(clampf(65535.0f * rdpx(src[2] + i), 0.0f, 65535.0f));
 
         dstU[i] = (ru*r + gu*g + bu*b + (0x10001 << (RGB2YUV_SHIFT - 1))) >> RGB2YUV_SHIFT;
         dstV[i] = (rv*r + gv*g + bv*b + (0x10001 << (RGB2YUV_SHIFT - 1))) >> RGB2YUV_SHIFT;
@@ -1005,9 +1010,9 @@ static av_always_inline void planar_rgbf32_to_y(uint8_t *_dst, const uint8_t *_s
     int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
 
     for (i = 0; i < width; i++) {
-        int g = av_clip_uint16(lrintf(65535.0f * rdpx(src[0] + i)));
-        int b = av_clip_uint16(lrintf(65535.0f * rdpx(src[1] + i)));
-        int r = av_clip_uint16(lrintf(65535.0f * rdpx(src[2] + i)));
+        int g = lrintf(clampf(65535.0f * rdpx(src[0] + i), 0.0f, 65535.0f));
+        int b = lrintf(clampf(65535.0f * rdpx(src[1] + i), 0.0f, 65535.0f));
+        int r = lrintf(clampf(65535.0f * rdpx(src[2] + i), 0.0f, 65535.0f));
 
         dst[i] = (ry*r + gy*g + by*b + (0x2001 << (RGB2YUV_SHIFT - 1))) >> RGB2YUV_SHIFT;
     }
@@ -1021,7 +1026,7 @@ static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src,
     uint16_t *dst = (uint16_t *)_dst;
 
     for (i = 0; i < width; ++i){
-        dst[i] = av_clip_uint16(lrintf(65535.0f * rdpx(src + i)));
+        dst[i] = lrintf(clampf(65535.0f * rdpx(src + i), 0.0f, 65535.0f));
     }
 }
 