From patchwork Mon Nov 15 06:22:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Reid
X-Patchwork-Id: 31423
From: mindmark@gmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: Mark Reid
Date: Sun, 14 Nov 2021 22:22:21 -0800
Message-Id: <20211115062221.1650-2-mindmark@gmail.com>
In-Reply-To: <20211115062221.1650-1-mindmark@gmail.com>
References: <20211115062221.1650-1-mindmark@gmail.com>
X-Mailer: git-send-email 2.31.1.windows.1
Subject: [FFmpeg-devel] [PATCH v3 2/2] swscale/input: clip rgbf32 values before lrintf
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

From: Mark Reid

if the float pixel * 65535.0f > 2147483647.0f, lrintf may overflow and
return negative values, depending on the implementation.
nan and +/-inf values may also be implementation defined

clip the value first so lrintf always works.

values < 0.0f, -inf, nan = 0.0f
values > 65535.0f, +inf = 65535.0f

old timings
195960 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
186120 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
188645 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
183625 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
181157 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
177533 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
175689 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips

232960 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
221380 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
216640 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
213505 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
211558 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
210596 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
210202 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips

161680 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
153540 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
148255 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
140600 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
132935 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
128531 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
140933 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips

190980 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
176080 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
167980 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
164685 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
162751 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
162404 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
167849 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips

new timings
183320 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
175700 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
179570 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
172932 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
168707 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
165224 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
163423 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips

184940 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
185150 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
185790 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
185472 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
185277 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
185813 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
185332 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips

145400 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
145100 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
143490 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
136687 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
131271 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
128698 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
127170 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips

156020 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
146990 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
142020 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
141052 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
138973 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
138027 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
143939 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips
---
 libswscale/input.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

-- 
2.31.1.windows.1

diff --git a/libswscale/input.c b/libswscale/input.c
index 90efdd2ffc..1351ea5bd4 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -973,7 +973,7 @@ static av_always_inline void planar_rgbf32_to_a(uint8_t *_dst, const uint8_t *_s
     uint16_t *dst = (uint16_t *)_dst;
 
     for (i = 0; i < width; i++) {
-        dst[i] = av_clip_uint16(lrintf(65535.0f * rdpx(src[3] + i)));
+        dst[i] = lrintf(av_clipf(65535.0f * rdpx(src[3] + i), 0.0f, 65535.0f));
     }
 }
 
@@ -987,9 +987,9 @@ static av_always_inline void planar_rgbf32_to_uv(uint8_t *_dstU, uint8_t *_dstV,
     int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
 
     for (i = 0; i < width; i++) {
-        int g = av_clip_uint16(lrintf(65535.0f * rdpx(src[0] + i)));
-        int b = av_clip_uint16(lrintf(65535.0f * rdpx(src[1] + i)));
-        int r = av_clip_uint16(lrintf(65535.0f * rdpx(src[2] + i)));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[0] + i), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[1] + i), 0.0f, 65535.0f));
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[2] + i), 0.0f, 65535.0f));
 
         dstU[i] = (ru*r + gu*g + bu*b + (0x10001 << (RGB2YUV_SHIFT - 1))) >> RGB2YUV_SHIFT;
         dstV[i] = (rv*r + gv*g + bv*b + (0x10001 << (RGB2YUV_SHIFT - 1))) >> RGB2YUV_SHIFT;
@@ -1005,9 +1005,9 @@ static av_always_inline void planar_rgbf32_to_y(uint8_t *_dst, const uint8_t *_s
     int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
 
     for (i = 0; i < width; i++) {
-        int g = av_clip_uint16(lrintf(65535.0f * rdpx(src[0] + i)));
-        int b = av_clip_uint16(lrintf(65535.0f * rdpx(src[1] + i)));
-        int r = av_clip_uint16(lrintf(65535.0f * rdpx(src[2] + i)));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[0] + i), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[1] + i), 0.0f, 65535.0f));
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[2] + i), 0.0f, 65535.0f));
 
         dst[i] = (ry*r + gy*g + by*b + (0x2001 << (RGB2YUV_SHIFT - 1))) >> RGB2YUV_SHIFT;
     }
@@ -1021,7 +1021,7 @@ static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src,
     uint16_t *dst = (uint16_t *)_dst;
 
     for (i = 0; i < width; ++i){
-        dst[i] = av_clip_uint16(lrintf(65535.0f * rdpx(src + i)));
+        dst[i] = lrintf(av_clipf(65535.0f * rdpx(src + i), 0.0f, 65535.0f));
     }
 }
 