From patchwork Fri Oct 29 15:18:57 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 31256
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 29 Oct 2021 17:18:57 +0200
Message-Id: <20211029151903.1078367-1-onemda@gmail.com>
X-Mailer: git-send-email 2.33.0
Subject: [FFmpeg-devel] [PATCH 1/7] avfilter/vf_nlmeans: use more friendlier 'for (int ...'
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: mB1+GfoNInKU

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_nlmeans.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index 74fc3923b3..b8d8bb2ec0 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -101,14 +101,13 @@ static void compute_safe_ssd_integral_image_c(uint32_t *dst, ptrdiff_t dst_lines
                                               const uint8_t *s2, ptrdiff_t linesize2,
                                               int w, int h)
 {
-    int x, y;
     const uint32_t *dst_top = dst - dst_linesize_32;
 
     /* SIMD-friendly assumptions allowed here */
     av_assert2(!(w & 0xf) && w >= 16 && h >= 1);
 
-    for (y = 0; y < h; y++) {
-        for (x = 0; x < w; x += 4) {
+    for (int y = 0; y < h; y++) {
+        for (int x = 0; x < w; x += 4) {
             const int d0 = s1[x    ] - s2[x    ];
             const int d1 = s1[x + 1] - s2[x + 1];
             const int d2 = s1[x + 2] - s2[x + 2];
@@ -161,14 +160,12 @@ static inline void compute_unsafe_ssd_integral_image(uint32_t *dst, ptrdiff_t ds
                                                      int offx, int offy, int r, int sw, int sh,
                                                      int w, int h)
 {
-    int x, y;
-
-    for (y = starty; y < starty + h; y++) {
+    for (int y = starty; y < starty + h; y++) {
         uint32_t acc = dst[y*dst_linesize_32 + startx - 1] - dst[(y-1)*dst_linesize_32 + startx - 1];
         const int s1y = av_clip(y - r, 0, sh - 1);
         const int s2y = av_clip(y - (r + offy), 0, sh - 1);
 
-        for (x = startx; x < startx + w; x++) {
+        for (int x = startx; x < startx + w; x++) {
             const int s1x = av_clip(x - r, 0, sw - 1);
             const int s2x = av_clip(x - (r + offx), 0, sw - 1);
             const uint8_t v1 = src[s1y*linesize + s1x];
@@ -334,7 +331,6 @@ struct thread_data {
 
 static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    int x, y;
     NLMeansContext *s = ctx->priv;
     const struct thread_data *td = arg;
     const ptrdiff_t src_linesize = td->src_linesize;
@@ -349,10 +345,10 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
     const int dist_d = dist_b * s->ii_lz_32;
     const int dist_e = dist_d + dist_b;
 
-    for (y = starty; y < endy; y++) {
+    for (int y = starty; y < endy; y++) {
         const uint8_t *src = td->src + y*src_linesize;
         struct weighted_avg *wa = s->wa + y*s->wa_linesize;
-        for (x = td->startx; x < td->endx; x++) {
+        for (int x = td->startx; x < td->endx; x++) {
             /*
              * M is a discrete map where every entry contains the sum of all the entries
              * in the rectangle from the top-left origin of M to its coordinate. In the
@@ -404,10 +400,8 @@ static void weight_averages(uint8_t *dst, ptrdiff_t dst_linesize,
                             struct weighted_avg *wa, ptrdiff_t wa_linesize,
                             int w, int h)
 {
-    int x, y;
-
-    for (y = 0; y < h; y++) {
-        for (x = 0; x < w; x++) {
+    for (int y = 0; y < h; y++) {
+        for (int x = 0; x < w; x++) {
             // Also weight the centered pixel
             wa[x].total_weight += 1.f;
             wa[x].sum += 1.f * src[x];
@@ -423,7 +417,6 @@ static int nlmeans_plane(AVFilterContext *ctx, int w, int h, int p, int r,
                          uint8_t *dst, ptrdiff_t dst_linesize,
                          const uint8_t *src, ptrdiff_t src_linesize)
 {
-    int offx, offy;
     NLMeansContext *s = ctx->priv;
     /* patches center points cover the whole research window so the patches
      * themselves overflow the research window */
@@ -433,8 +426,8 @@ static int nlmeans_plane(AVFilterContext *ctx, int w, int h, int p, int r,
 
     memset(s->wa, 0, s->wa_linesize * h * sizeof(*s->wa));
 
-    for (offy = -r; offy <= r; offy++) {
-        for (offx = -r; offx <= r; offx++) {
+    for (int offy = -r; offy <= r; offy++) {
+        for (int offx = -r; offx <= r; offx++) {
             if (offx || offy) {
                 struct thread_data td = {
                     .src = src + offy*src_linesize + offx,
@@ -464,7 +457,6 @@ static int nlmeans_plane(AVFilterContext *ctx, int w, int h, int p, int r,
 
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
-    int i;
     AVFilterContext *ctx = inlink->dst;
     NLMeansContext *s = ctx->priv;
     AVFilterLink *outlink = ctx->outputs[0];
@@ -476,7 +468,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     }
     av_frame_copy_props(out, in);
 
-    for (i = 0; i < s->nb_planes; i++) {
+    for (int i = 0; i < s->nb_planes; i++) {
         const int w = i ? s->chroma_w : inlink->w;
         const int h = i ? s->chroma_h : inlink->h;
         const int p = i ? s->patch_hsize_uv : s->patch_hsize;
@@ -508,7 +500,6 @@ void ff_nlmeans_init(NLMeansDSPContext *dsp)
 
 static av_cold int init(AVFilterContext *ctx)
 {
-    int i;
     NLMeansContext *s = ctx->priv;
     const double h = s->sigma * 10.;
 
@@ -517,7 +508,7 @@ static av_cold int init(AVFilterContext *ctx)
     s->weight_lut = av_calloc(s->max_meaningful_diff, sizeof(*s->weight_lut));
     if (!s->weight_lut)
         return AVERROR(ENOMEM);
-    for (i = 0; i < s->max_meaningful_diff; i++)
+    for (int i = 0; i < s->max_meaningful_diff; i++)
         s->weight_lut[i] = exp(-i * s->pdiff_scale);
 
     CHECK_ODD_FIELD(research_size, "Luma research window");

From patchwork Fri Oct 29 15:18:58 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 31252
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 29 Oct 2021 17:18:58 +0200
Message-Id: <20211029151903.1078367-2-onemda@gmail.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211029151903.1078367-1-onemda@gmail.com>
References: <20211029151903.1078367-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/7] avfilter/vf_nlmeans: make access to pointer to lut faster
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: hdmQi7iIqfQN

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_nlmeans.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index b8d8bb2ec0..0962056a6e 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -344,6 +344,7 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
     const int dist_b = 2*p + 1;
     const int dist_d = dist_b * s->ii_lz_32;
     const int dist_e = dist_d + dist_b;
+    const float *const weight_lut = s->weight_lut;
 
     for (int y = starty; y < endy; y++) {
         const uint8_t *src = td->src + y*src_linesize;
@@ -385,7 +386,7 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
             const uint32_t patch_diff_sq = e - d - b + a;
 
             if (patch_diff_sq < s->max_meaningful_diff) {
-                const float weight = s->weight_lut[patch_diff_sq]; // exp(-patch_diff_sq * s->pdiff_scale)
+                const float weight = weight_lut[patch_diff_sq]; // exp(-patch_diff_sq * s->pdiff_scale)
                 wa[x].total_weight += weight;
                 wa[x].sum += weight * src[x];
             }

From patchwork Fri Oct 29 15:18:59 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 31253
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 29 Oct 2021 17:18:59 +0200
Message-Id: <20211029151903.1078367-3-onemda@gmail.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211029151903.1078367-1-onemda@gmail.com>
References: <20211029151903.1078367-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 3/7] avfilter/vf_nlmeans: no need to print filter
 options at info level
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 5R5nMOtLQ4SR

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_nlmeans.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index 0962056a6e..d5a71291af 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -526,7 +526,7 @@ static av_cold int init(AVFilterContext *ctx)
     s->patch_hsize = s->patch_size / 2;
     s->patch_hsize_uv = s->patch_size_uv / 2;
 
-    av_log(ctx, AV_LOG_INFO, "Research window: %dx%d / %dx%d, patch size: %dx%d / %dx%d\n",
+    av_log(ctx, AV_LOG_DEBUG, "Research window: %dx%d / %dx%d, patch size: %dx%d / %dx%d\n",
            s->research_size, s->research_size, s->research_size_uv, s->research_size_uv,
            s->patch_size, s->patch_size, s->patch_size_uv, s->patch_size_uv);
 

From patchwork Fri Oct 29 15:19:00 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 31254
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 29 Oct 2021 17:19:00 +0200
Message-Id: <20211029151903.1078367-4-onemda@gmail.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211029151903.1078367-1-onemda@gmail.com>
References: <20211029151903.1078367-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 4/7] avfilter/vf_nlmeans: avoid if () to help paralellization
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: X55RPNPjpkXp

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_nlmeans.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index d5a71291af..af165c861c 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -332,6 +332,7 @@ struct thread_data {
 static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
     NLMeansContext *s = ctx->priv;
+    const uint32_t max_meaningful_diff = s->max_meaningful_diff;
     const struct thread_data *td = arg;
     const ptrdiff_t src_linesize = td->src_linesize;
     const int process_h = td->endy - td->starty;
@@ -383,13 +384,11 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
             const uint32_t b = ii[x + dist_b];
             const uint32_t d = ii[x + dist_d];
             const uint32_t e = ii[x + dist_e];
-            const uint32_t patch_diff_sq = e - d - b + a;
+            const uint32_t patch_diff_sq = FFMIN(e - d - b + a, max_meaningful_diff);
+            const float weight = weight_lut[patch_diff_sq]; // exp(-patch_diff_sq * s->pdiff_scale)
 
-            if (patch_diff_sq < s->max_meaningful_diff) {
-                const float weight = weight_lut[patch_diff_sq]; // exp(-patch_diff_sq * s->pdiff_scale)
-                wa[x].total_weight += weight;
-                wa[x].sum += weight * src[x];
-            }
+            wa[x].total_weight += weight;
+            wa[x].sum += weight * src[x];
         }
         ii += s->ii_lz_32;
     }
@@ -506,7 +505,7 @@ static av_cold int init(AVFilterContext *ctx)
 
     s->pdiff_scale = 1. / (h * h);
     s->max_meaningful_diff = log(255.) / s->pdiff_scale;
-    s->weight_lut = av_calloc(s->max_meaningful_diff, sizeof(*s->weight_lut));
+    s->weight_lut = av_calloc(s->max_meaningful_diff + 1, sizeof(*s->weight_lut));
     if (!s->weight_lut)
         return AVERROR(ENOMEM);
     for (int i = 0; i < s->max_meaningful_diff; i++)

From patchwork Fri Oct 29 15:19:01 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 31255
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 29 Oct 2021 17:19:01 +0200
Message-Id: <20211029151903.1078367-5-onemda@gmail.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211029151903.1078367-1-onemda@gmail.com>
References: <20211029151903.1078367-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 5/7] avfilter/vf_nlmeans: refactor line processing in preparation for x86 SIMD assembly

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_nlmeans.c | 109 ++++++++++++++++++++++-----------------
 libavfilter/vf_nlmeans.h |  14 +++++
 2 files changed, 77 insertions(+), 46 deletions(-)

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index af165c861c..93a14bcf19 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -38,11 +38,6 @@
 #include "vf_nlmeans.h"
 #include "video.h"
 
-struct weighted_avg {
-    float total_weight;
-    float sum;
-};
-
 typedef struct NLMeansContext {
     const AVClass *class;
     int nb_planes;
@@ -329,6 +324,58 @@ struct thread_data {
     int p;
 };
 
+static void compute_weights_line_c(const uint32_t *const iia,
+                                   const uint32_t *const iib,
+                                   const uint32_t *const iid,
+                                   const uint32_t *const iie,
+                                   const uint8_t *const src,
+                                   struct weighted_avg *wa,
+                                   const float *const weight_lut,
+                                   int max_meaningful_diff,
+                                   int startx, int endx)
+{
+    for (int x = startx; x < endx; x++) {
+        /*
+         * M is a discrete map where every entry contains the sum of all the entries
+         * in the rectangle from the top-left origin of M to its coordinate. In the
+         * following schema, "i" contains the sum of the whole map:
+         *
+         * M = +----------+-----------------+----+
+         *     |          |                 |    |
+         *     |          |                 |    |
+         *     |         a|                b|   c|
+         *     +----------+-----------------+----+
+         *     |          |                 |    |
+         *     |          |                 |    |
+         *     |          |        X        |    |
+         *     |          |                 |    |
+         *     |         d|                e|   f|
+         *     +----------+-----------------+----+
+         *     |          |                 |    |
+         *     |         g|                h|   i|
+         *     +----------+-----------------+----+
+         *
+         * The sum of the X box can be calculated with:
+         *    X = e-d-b+a
+         *
+         * See https://en.wikipedia.org/wiki/Summed_area_table
+         *
+         * The compute*_ssd functions compute the integral image M where every entry
+         * contains the sum of the squared difference of every corresponding pixels of
+         * two input planes of the same size as M.
+         */
+        const uint32_t a = iia[x];
+        const uint32_t b = iib[x];
+        const uint32_t d = iid[x];
+        const uint32_t e = iie[x];
+        const uint32_t patch_diff_sq = FFMIN(e - d - b + a, max_meaningful_diff);
+        const float weight = weight_lut[patch_diff_sq]; // exp(-patch_diff_sq * s->pdiff_scale)
+
+        wa[x].total_weight += weight;
+        wa[x].sum += weight * src[x];
+    }
+}
+
 static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
     NLMeansContext *s = ctx->priv;
@@ -346,50 +393,19 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
     const int dist_d = dist_b * s->ii_lz_32;
     const int dist_e = dist_d + dist_b;
     const float *const weight_lut = s->weight_lut;
+    NLMeansDSPContext *dsp = &s->dsp;
 
     for (int y = starty; y < endy; y++) {
-        const uint8_t *src = td->src + y*src_linesize;
+        const uint8_t *const src = td->src + y*src_linesize;
         struct weighted_avg *wa = s->wa + y*s->wa_linesize;
-        for (int x = td->startx; x < td->endx; x++) {
-            /*
-             * M is a discrete map where every entry contains the sum of all the entries
-             * in the rectangle from the top-left origin of M to its coordinate. In the
-             * following schema, "i" contains the sum of the whole map:
-             *
-             * M = +----------+-----------------+----+
-             *     |          |                 |    |
-             *     |          |                 |    |
-             *     |         a|                b|   c|
-             *     +----------+-----------------+----+
-             *     |          |                 |    |
-             *     |          |                 |    |
-             *     |          |        X        |    |
-             *     |          |                 |    |
-             *     |         d|                e|   f|
-             *     +----------+-----------------+----+
-             *     |          |                 |    |
-             *     |         g|                h|   i|
-             *     +----------+-----------------+----+
-             *
-             * The sum of the X box can be calculated with:
-             *    X = e-d-b+a
-             *
-             * See https://en.wikipedia.org/wiki/Summed_area_table
-             *
-             * The compute*_ssd functions compute the integral image M where every entry
-             * contains the sum of the squared difference of every corresponding pixels of
-             * two input planes of the same size as M.
-             */
-            const uint32_t a = ii[x];
-            const uint32_t b = ii[x + dist_b];
-            const uint32_t d = ii[x + dist_d];
-            const uint32_t e = ii[x + dist_e];
-            const uint32_t patch_diff_sq = FFMIN(e - d - b + a, max_meaningful_diff);
-            const float weight = weight_lut[patch_diff_sq]; // exp(-patch_diff_sq * s->pdiff_scale)
-
-            wa[x].total_weight += weight;
-            wa[x].sum += weight * src[x];
-        }
+        const uint32_t *const iia = ii;
+        const uint32_t *const iib = ii + dist_b;
+        const uint32_t *const iid = ii + dist_d;
+        const uint32_t *const iie = ii + dist_e;
+
+        dsp->compute_weights_line(iia, iib, iid, iie, src, wa,
+                                  weight_lut, max_meaningful_diff,
+                                  td->startx, td->endx);
         ii += s->ii_lz_32;
     }
     return 0;
@@ -493,6 +509,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 void ff_nlmeans_init(NLMeansDSPContext *dsp)
 {
     dsp->compute_safe_ssd_integral_image = compute_safe_ssd_integral_image_c;
+    dsp->compute_weights_line = compute_weights_line_c;
 
     if (ARCH_AARCH64)
         ff_nlmeans_init_aarch64(dsp);
diff --git a/libavfilter/vf_nlmeans.h b/libavfilter/vf_nlmeans.h
index 0a9aab2928..d0d0056163 100644
--- a/libavfilter/vf_nlmeans.h
+++ b/libavfilter/vf_nlmeans.h
@@ -22,11 +22,25 @@
 #include
 #include
 
+struct weighted_avg {
+    float total_weight;
+    float sum;
+};
+
 typedef struct NLMeansDSPContext {
     void (*compute_safe_ssd_integral_image)(uint32_t *dst, ptrdiff_t dst_linesize_32,
                                             const uint8_t *s1, ptrdiff_t linesize1,
                                             const uint8_t *s2, ptrdiff_t linesize2,
                                             int w, int h);
+    void (*compute_weights_line)(const uint32_t *const iia,
+                                 const uint32_t *const iib,
+                                 const uint32_t *const iid,
+                                 const uint32_t *const iie,
+                                 const uint8_t *const src,
+                                 struct weighted_avg *wa,
+                                 const float *const weight_lut,
+                                 int max_meaningful_diff,
+                                 int startx, int endx);
 } NLMeansDSPContext;
 
 void ff_nlmeans_init(NLMeansDSPContext *dsp);

From patchwork Fri Oct 29 15:19:02 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 31251
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 29 Oct 2021 17:19:02 +0200
Message-Id: <20211029151903.1078367-6-onemda@gmail.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211029151903.1078367-1-onemda@gmail.com>
References: <20211029151903.1078367-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 6/7] avfilter/vf_nlmeans: split wa struct

This will make x86 SIMD simpler and faster.

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_nlmeans.c | 43 +++++++++++++++++++++++----------------
 libavfilter/vf_nlmeans.h |  8 ++------
 2 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index 93a14bcf19..dee1f68101 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -52,8 +52,9 @@ typedef struct NLMeansContext {
     uint32_t *ii;                                       // integral image starting after the 0-line and 0-column
     int ii_w, ii_h;                                     // width and height of the integral image
     ptrdiff_t ii_lz_32;                                 // linesize in 32-bit units of the integral image
-    struct weighted_avg *wa;                            // weighted average of every pixel
-    ptrdiff_t wa_linesize;                              // linesize for wa in struct size unit
+    float *total_weight;                                // total weight for every pixel
+    float *sum;                                         // weighted sum for every pixel
+    int linesize;                                       // sum and total_weight linesize
     float *weight_lut;                                  // lookup table mapping (scaled) patch differences to their associated weights
     uint32_t max_meaningful_diff;                       // maximum difference considered (if the patch difference is too high we ignore the pixel)
     NLMeansDSPContext dsp;
@@ -307,9 +308,10 @@ static int config_input(AVFilterLink *inlink)
     s->ii = s->ii_orig + s->ii_lz_32 + 1;
 
     // allocate weighted average for every pixel
-    s->wa_linesize = inlink->w;
-    s->wa = av_malloc_array(s->wa_linesize, inlink->h * sizeof(*s->wa));
-    if (!s->wa)
+    s->linesize = inlink->w;
+    s->total_weight = av_malloc_array(inlink->w, inlink->h * sizeof(*s->total_weight));
+    s->sum = av_malloc_array(inlink->w, inlink->h * sizeof(*s->sum));
+    if (!s->total_weight || !s->sum)
         return AVERROR(ENOMEM);
 
     return 0;
@@ -329,7 +331,8 @@ static void compute_weights_line_c(const uint32_t *const iia,
                                    const uint32_t *const iid,
                                    const uint32_t *const iie,
                                    const uint8_t *const src,
-                                   struct weighted_avg *wa,
+                                   float *total_weight,
+                                   float *sum,
                                    const float *const weight_lut,
                                    int max_meaningful_diff,
                                    int startx, int endx)
@@ -371,8 +374,8 @@ static void compute_weights_line_c(const uint32_t *const iia,
         const uint32_t patch_diff_sq = FFMIN(e - d - b + a, max_meaningful_diff);
         const float weight = weight_lut[patch_diff_sq]; // exp(-patch_diff_sq * s->pdiff_scale)
 
-        wa[x].total_weight += weight;
-        wa[x].sum += weight * src[x];
+        total_weight[x] += weight;
+        sum[x] += weight * src[x];
     }
 }
 
@@ -397,13 +400,14 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
 
     for (int y = starty; y < endy; y++) {
         const uint8_t *const src = td->src + y*src_linesize;
-        struct weighted_avg *wa = s->wa + y*s->wa_linesize;
+        float *total_weight = s->total_weight + y*s->linesize;
+        float *sum = s->sum + y*s->linesize;
         const uint32_t *const iia = ii;
         const uint32_t *const iib = ii + dist_b;
         const uint32_t *const iid = ii + dist_d;
         const uint32_t *const iie = ii + dist_e;
 
-        dsp->compute_weights_line(iia, iib, iid, iie, src, wa,
+        dsp->compute_weights_line(iia, iib, iid, iie, src, total_weight, sum,
                                   weight_lut, max_meaningful_diff,
                                   td->startx, td->endx);
         ii += s->ii_lz_32;
@@ -413,19 +417,20 @@ static int nlmeans_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs
 
 static void weight_averages(uint8_t *dst, ptrdiff_t dst_linesize,
                             const uint8_t *src, ptrdiff_t src_linesize,
-                            struct weighted_avg *wa, ptrdiff_t wa_linesize,
+                            float *total_weight, float *sum, ptrdiff_t linesize,
                             int w, int h)
 {
     for (int y = 0; y < h; y++) {
         for (int x = 0; x < w; x++) {
             // Also weight the centered pixel
-            wa[x].total_weight += 1.f;
-            wa[x].sum += 1.f * src[x];
-            dst[x] = av_clip_uint8(wa[x].sum / wa[x].total_weight + 0.5f);
+            total_weight[x] += 1.f;
+            sum[x] += 1.f * src[x];
+            dst[x] = av_clip_uint8(sum[x] / total_weight[x] + 0.5f);
         }
         dst += dst_linesize;
         src += src_linesize;
-        wa += wa_linesize;
+        total_weight += linesize;
+        sum += linesize;
     }
 }
 
@@ -440,7 +445,8 @@ static int nlmeans_plane(AVFilterContext *ctx, int w, int h, int p, int r,
     /* focus an integral pointer on the centered image (s1) */
     const uint32_t *centered_ii = s->ii + e*s->ii_lz_32 + e;
 
-    memset(s->wa, 0, s->wa_linesize * h * sizeof(*s->wa));
+    memset(s->total_weight, 0, s->linesize * h * sizeof(*s->total_weight));
+    memset(s->sum, 0, s->linesize * h * sizeof(*s->sum));
 
     for (int offy = -r; offy <= r; offy++) {
         for (int offx = -r; offx <= r; offx++) {
@@ -466,7 +472,7 @@ static int nlmeans_plane(AVFilterContext *ctx, int w, int h, int p, int r,
     }
 
     weight_averages(dst, dst_linesize, src, src_linesize,
-                    s->wa, s->wa_linesize, w, h);
+                    s->total_weight, s->sum, s->linesize, w, h);
 
     return 0;
 }
@@ -556,7 +562,8 @@ static av_cold void uninit(AVFilterContext *ctx)
     NLMeansContext *s = ctx->priv;
     av_freep(&s->weight_lut);
     av_freep(&s->ii_orig);
-    av_freep(&s->wa);
+    av_freep(&s->total_weight);
+    av_freep(&s->sum);
 }
 
 static const AVFilterPad nlmeans_inputs[] = {
diff --git a/libavfilter/vf_nlmeans.h b/libavfilter/vf_nlmeans.h
index d0d0056163..cd1ee7c0bf 100644
--- a/libavfilter/vf_nlmeans.h
+++ b/libavfilter/vf_nlmeans.h
@@ -22,11 +22,6 @@
 #include
 #include
 
-struct weighted_avg {
-    float total_weight;
-    float sum;
-};
-
 typedef struct NLMeansDSPContext {
     void (*compute_safe_ssd_integral_image)(uint32_t *dst, ptrdiff_t dst_linesize_32,
                                             const uint8_t *s1, ptrdiff_t linesize1,
@@ -37,7 +32,8 @@ typedef struct NLMeansDSPContext {
                                  const uint32_t *const iid,
                                  const uint32_t *const iie,
                                  const uint8_t *const src,
-                                 struct weighted_avg *wa,
+                                 float *total_weight,
+                                 float *sum,
                                  const float *const weight_lut,
                                  int max_meaningful_diff,
                                  int startx, int endx);

From patchwork Fri Oct 29 15:19:03 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 31250
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 29 Oct 2021 17:19:03 +0200
Message-Id: <20211029151903.1078367-7-onemda@gmail.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211029151903.1078367-1-onemda@gmail.com>
References: <20211029151903.1078367-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 7/7] avfilter/vf_nlmeans: add x86 SIMD

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_nlmeans.c          |  3 +
 libavfilter/vf_nlmeans.h          |  1 +
 libavfilter/x86/Makefile          |  2 +
 libavfilter/x86/vf_nlmeans.asm    | 92 +++++++++++++++++++++++++++++++
 libavfilter/x86/vf_nlmeans_init.c | 40 ++++++++++++++
 5 files changed, 138 insertions(+)
 create mode 100644 libavfilter/x86/vf_nlmeans.asm
 create mode 100644 libavfilter/x86/vf_nlmeans_init.c

diff --git a/libavfilter/vf_nlmeans.c b/libavfilter/vf_nlmeans.c
index dee1f68101..4d5dcba5cc 100644
--- a/libavfilter/vf_nlmeans.c
+++ b/libavfilter/vf_nlmeans.c
@@ -519,6 +519,9 @@ void ff_nlmeans_init(NLMeansDSPContext *dsp)
 
     if (ARCH_AARCH64)
         ff_nlmeans_init_aarch64(dsp);
+
+    if (ARCH_X86)
+        ff_nlmeans_init_x86(dsp);
 }
 
 static av_cold int init(AVFilterContext *ctx)
diff --git a/libavfilter/vf_nlmeans.h b/libavfilter/vf_nlmeans.h
index cd1ee7c0bf..43611a03bd 100644
--- a/libavfilter/vf_nlmeans.h
+++ b/libavfilter/vf_nlmeans.h
@@ -41,5 +41,6 @@ typedef struct NLMeansDSPContext {
 
 void ff_nlmeans_init(NLMeansDSPContext *dsp);
 void ff_nlmeans_init_aarch64(NLMeansDSPContext *dsp);
+void ff_nlmeans_init_x86(NLMeansDSPContext *dsp);
 
 #endif /* AVFILTER_NLMEANS_H */
diff --git a/libavfilter/x86/Makefile b/libavfilter/x86/Makefile
index a29941eaeb..e87481bd7a 100644
--- a/libavfilter/x86/Makefile
+++ b/libavfilter/x86/Makefile
@@ -20,6 +20,7 @@ OBJS-$(CONFIG_LIMITER_FILTER) += x86/vf_limiter_init.o
 OBJS-$(CONFIG_LUT3D_FILTER)                  += x86/vf_lut3d_init.o
 OBJS-$(CONFIG_MASKEDCLAMP_FILTER)            += x86/vf_maskedclamp_init.o
 OBJS-$(CONFIG_MASKEDMERGE_FILTER)            += x86/vf_maskedmerge_init.o
+OBJS-$(CONFIG_NLMEANS_FILTER)                += x86/vf_nlmeans_init.o
 OBJS-$(CONFIG_NOISE_FILTER)                  += x86/vf_noise.o
 OBJS-$(CONFIG_OVERLAY_FILTER)                += x86/vf_overlay_init.o
 OBJS-$(CONFIG_PP7_FILTER)                    += x86/vf_pp7_init.o
@@ -61,6 +62,7 @@ X86ASM-OBJS-$(CONFIG_LIMITER_FILTER) += x86/vf_limiter.o
 X86ASM-OBJS-$(CONFIG_LUT3D_FILTER)           += x86/vf_lut3d.o
 X86ASM-OBJS-$(CONFIG_MASKEDCLAMP_FILTER)     += x86/vf_maskedclamp.o
 X86ASM-OBJS-$(CONFIG_MASKEDMERGE_FILTER)     += x86/vf_maskedmerge.o
+X86ASM-OBJS-$(CONFIG_NLMEANS_FILTER)         += x86/vf_nlmeans.o
 X86ASM-OBJS-$(CONFIG_OVERLAY_FILTER)         += x86/vf_overlay.o
 X86ASM-OBJS-$(CONFIG_PP7_FILTER)             += x86/vf_pp7.o
 X86ASM-OBJS-$(CONFIG_PSNR_FILTER)            += x86/vf_psnr.o
diff --git a/libavfilter/x86/vf_nlmeans.asm b/libavfilter/x86/vf_nlmeans.asm
new file mode 100644
index 0000000000..1047e43de4
--- /dev/null
+++ b/libavfilter/x86/vf_nlmeans.asm
@@ -0,0 +1,92 @@
+;*****************************************************************************
+;* x86-optimized functions for nlmeans filter
+;*
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free software; you can redistribute it and/or
+;* modify it under the terms of the GNU Lesser General Public
+;* License as published by the Free Software Foundation; either
+;* version 2.1 of the License, or (at your option) any later version.
+;*
+;* FFmpeg is distributed in the hope that it will be useful,
+;* but WITHOUT ANY WARRANTY; without even the implied warranty of
+;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+;* Lesser General Public License for more details.
+;*
+;* You should have received a copy of the GNU Lesser General Public
+;* License along with FFmpeg; if not, write to the Free Software
+;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+;******************************************************************************
+
+
+%include "libavutil/x86/x86util.asm"
+
+%if HAVE_AVX2_EXTERNAL
+
+SECTION_RODATA
+
+SECTION .text
+
+; void ff_compute_weights_line(const uint32_t *const iia,
+;                              const uint32_t *const iib,
+;                              const uint32_t *const iid,
+;                              const uint32_t *const iie,
+;                              const uint8_t *const src,
+;                              float *total_weight, float *sum,
+;                              const float *const lut,
+;                              int max,
+;                              int startx, int endx);
+
+INIT_YMM avx2
+cglobal compute_weights_line, 11, 12, 7, iia, iib, iid, iie, src, total, sum, lut, max, startx, endx, x
+    movsxdifnidn startxq, startxd
+    movsxdifnidn endxq, endxd
+    movsxdifnidn maxq, maxd
+
+    sub endxq, startxq
+    mov xq, mmsize / 4
+    sub xq, 1
+    not xq
+    and endxq, xq
+    add endxq, startxq
+
+    mov xq, startxq
+    xor startxq, startxq
+
+    VPBROADCASTD m4, maxm
+    vpcmpeqd m5, m5
+
+    .loop:
+        movu m0, [iieq + xq * 4]
+        movu m1, [iidq + xq * 4]
+        movu m2, [iibq + xq * 4]
+        movu m3, [iiaq + xq * 4]
+
+        pmovzxbd m6, [srcq + xq]
+        cvtdq2ps m6, m6
+
+        psubd m0, m1
+        psubd m0, m2
+        paddd m0, m3
+        pminud m0, m4
+        pslld m0, 2
+        mova m3, m5
+        vgatherdps m1, [lutq + m0], m3
+
+        mulps m0, m1, m6
+
+        movups m3, [totalq + xq * 4]
+        movups m2, [sumq + xq * 4]
+
+        addps m0, m2
+        addps m1, m3
+
+        movups [totalq + xq * 4], m1
+        movups [sumq + xq * 4], m0
+
+        add xq, mmsize / 4
+        cmp xq, endxq
+        jl .loop
+    RET
+
+%endif
diff --git a/libavfilter/x86/vf_nlmeans_init.c b/libavfilter/x86/vf_nlmeans_init.c
new file mode 100644
index 0000000000..6fbd8f9008
--- /dev/null
+++ b/libavfilter/x86/vf_nlmeans_init.c
@@ -0,0 +1,40 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/attributes.h"
+#include "libavutil/x86/cpu.h"
+#include "libavfilter/vf_nlmeans.h"
+
+void ff_compute_weights_line_avx2(const uint32_t *const iia,
+                                  const uint32_t *const iib,
+                                  const uint32_t *const iid,
+                                  const uint32_t *const iie,
+                                  const uint8_t *const src,
+                                  float *total_weight,
+                                  float *sum,
+                                  const float *const weight_lut,
+                                  int max_meaningful_diff,
+                                  int startx, int endx);
+
+av_cold void ff_nlmeans_init_x86(NLMeansDSPContext *dsp)
+{
+    int cpu_flags = av_get_cpu_flags();
+
+    if (EXTERNAL_AVX2_FAST(cpu_flags))
+        dsp->compute_weights_line = ff_compute_weights_line_avx2;
+}
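
Reader's note, not part of the submitted series: the AVX2 loop in patch 7/7 can be modelled in plain C. In the sketch below, weights_line_scalar, vector_end and weights_line_split are hypothetical names chosen for illustration. The first function repeats the arithmetic of compute_weights_line_c() after patch 6/7. The second reproduces the loop bound the assembly computes, startx + ((endx - startx) & ~7), which stops short of endx by up to 7 pixels. The patches above do not say whether or where that tail is handled, so the split driver is only one possible arrangement, with a scalar call standing in for the SIMD kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar accumulation for pixels [startx, endx); same arithmetic as the
 * loop body of compute_weights_line_c() after the wa struct split. */
static void weights_line_scalar(const uint32_t *iia, const uint32_t *iib,
                                const uint32_t *iid, const uint32_t *iie,
                                const uint8_t *src,
                                float *total_weight, float *sum,
                                const float *weight_lut,
                                uint32_t max_meaningful_diff,
                                int startx, int endx)
{
    for (int x = startx; x < endx; x++) {
        /* Patch SSD from the integral image: X = e - d - b + a. */
        uint32_t diff = iie[x] - iid[x] - iib[x] + iia[x];
        /* Clamp so the last LUT entry serves every large difference. */
        if (diff > max_meaningful_diff)
            diff = max_meaningful_diff;
        const float weight = weight_lut[diff];
        total_weight[x] += weight;
        sum[x]          += weight * src[x];
    }
}

/* Hypothetical helper: end of the range covered by whole vectors of
 * `lanes` pixels (8 floats per YMM register in the AVX2 code). */
static int vector_end(int startx, int endx, int lanes)
{
    assert(startx <= endx);
    assert(lanes > 0 && (lanes & (lanes - 1)) == 0); /* power of two */
    return startx + ((endx - startx) & ~(lanes - 1));
}

/* Hypothetical driver: the first call stands in for a SIMD kernel that
 * only handles whole vectors, the second finishes the remaining pixels. */
static void weights_line_split(const uint32_t *iia, const uint32_t *iib,
                               const uint32_t *iid, const uint32_t *iie,
                               const uint8_t *src,
                               float *total_weight, float *sum,
                               const float *weight_lut,
                               uint32_t max_meaningful_diff,
                               int startx, int endx)
{
    const int vend = vector_end(startx, endx, 8);

    weights_line_scalar(iia, iib, iid, iie, src, total_weight, sum,
                        weight_lut, max_meaningful_diff, startx, vend);
    weights_line_scalar(iia, iib, iid, iie, src, total_weight, sum,
                        weight_lut, max_meaningful_diff, vend, endx);
}
```

The LUT is indexed with the clamped difference, which is why patch 4/7 allocates max_meaningful_diff + 1 entries: the extra, zero-initialised entry gives clamped patches a weight of 0 without a branch.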