From: Clément Bœsch
To: ffmpeg-devel@ffmpeg.org
Cc: Clément Bœsch
Date: Mon, 7 May 2018 19:24:19 +0200
Message-Id: <20180507172422.11003-8-u@pkh.me>
In-Reply-To: <20180507172422.11003-1-u@pkh.me>
References: <20180507172422.11003-1-u@pkh.me>
Subject: [FFmpeg-devel] [PATCH v2 07/10] lavfi/nlmeans: switch from double to float

Overall speed appears to be 1.1x faster with no noticeable quality impact.
---
 libavfilter/vf_nlmeans.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index f37f1183f7..aba587f46b 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -40,8 +40,8 @@
 #include "video.h"

 struct weighted_avg {
-    double total_weight;
-    double sum;
+    float total_weight;
+    float sum;
 };

 #define WEIGHT_LUT_NBITS 9
@@ -63,8 +63,8 @@ typedef struct NLMeansContext {
     ptrdiff_t ii_lz_32;                 // linesize in 32-bit units of the integral image
     struct weighted_avg *wa;            // weighted average of every pixel
     ptrdiff_t wa_linesize;              // linesize for wa in struct size unit
-    double weight_lut[WEIGHT_LUT_SIZE]; // lookup table mapping (scaled) patch differences to their associated weights
-    double pdiff_lut_scale;             // scale factor for patch differences before looking into the LUT
+    float weight_lut[WEIGHT_LUT_SIZE];  // lookup table mapping (scaled) patch differences to their associated weights
+    float pdiff_lut_scale;              // scale factor for patch differences before looking into the LUT
     int max_meaningful_diff;            // maximum difference considered (if the patch difference is too high we ignore the pixel)
     NLMeansDSPContext dsp;
 } NLMeansContext;
@@ -402,7 +402,7 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
             const int patch_diff_sq = get_integral_patch_value(td->ii_start, s->ii_lz_32, x, y, td->p);
             if (patch_diff_sq < s->max_meaningful_diff) {
                 const int weight_lut_idx = patch_diff_sq * s->pdiff_lut_scale;
-                const double weight = s->weight_lut[weight_lut_idx]; // exp(-patch_diff_sq * s->pdiff_scale)
+                const float weight = s->weight_lut[weight_lut_idx]; // exp(-patch_diff_sq * s->pdiff_scale)
                 wa[x].total_weight += weight;
                 wa[x].sum += weight * src[x];
             }
@@ -453,8 +453,8 @@ static int nlmeans_plane(AVFilterContext *ctx, int w, int h, int p, int r,
             struct weighted_avg *wa = &s->wa[y*s->wa_linesize + x];

             // Also weight the centered pixel
-            wa->total_weight += 1.0;
-            wa->sum += 1.0 * src[y*src_linesize + x];
+            wa->total_weight += 1.f;
+            wa->sum += 1.f * src[y*src_linesize + x];

             dst[y*dst_linesize + x] = av_clip_uint8(wa->sum / wa->total_weight);
         }
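
Note for reviewers: the sketch below is not part of the patch. It isolates the weighted-average accumulation that the diff touches, so the two accumulator precisions can be compared on the same inputs. The helper names `wavg_double` and `wavg_float`, and the idea of passing the centered pixel separately, are assumptions made for illustration; the filter itself accumulates into `struct weighted_avg` across several passes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from vf_nlmeans.c.
 * Both start from the centered pixel with weight 1, as the last hunk
 * of the patch does, then add n neighbours with weights w[i]. */

/* Accumulate in double precision (behaviour before the patch). */
static double wavg_double(uint8_t center, const uint8_t *px,
                          const float *w, size_t n)
{
    double total_weight = 1.0;
    double sum          = 1.0 * center;

    for (size_t i = 0; i < n; i++) {
        total_weight += w[i];
        sum          += w[i] * px[i];
    }
    return sum / total_weight;
}

/* Accumulate in single precision (behaviour after the patch). */
static float wavg_float(uint8_t center, const uint8_t *px,
                        const float *w, size_t n)
{
    float total_weight = 1.f;
    float sum          = 1.f * center;

    for (size_t i = 0; i < n; i++) {
        total_weight += w[i];
        sum          += w[i] * px[i];
    }
    return sum / total_weight;
}
```

Calling both helpers on the same pixels and weights and comparing the results after conversion to 8 bits is one way to check the "no noticeable quality impact" claim on a given input. It says nothing about speed, which depends on the compiler and target.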