From patchwork Wed Aug 30 16:54:07 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ashish Singh <ashk43712@gmail.com>
X-Patchwork-Id: 4910
From: Ashish Pratap Singh <ashk43712@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 30 Aug 2017 22:24:07 +0530
Message-Id: <1504112047-12230-1-git-send-email-ashk43712@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1501239185-4593-1-git-send-email-ashk43712@gmail.com>
References: <1501239185-4593-1-git-send-email-ashk43712@gmail.com>
Subject: [FFmpeg-devel] [PATCH] avfilter: add ADM filter
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <http://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <http://ffmpeg.org/pipermail/ffmpeg-devel/>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <http://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Cc: Ashish Singh <ashk43712@gmail.com>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

From: Ashish Singh <ashk43712@gmail.com>

Hi,
This patch converts the previously posted ADM filter to integer (fixed-point)
arithmetic, which will make later SIMD optimizations easier to write.

Signed-off-by: Ashish Singh <ashk43712@gmail.com>
---
 Changelog                |   1 +
 doc/filters.texi         |  15 +
 libavfilter/Makefile     |   1 +
 libavfilter/adm.h        |  75 +++++
 libavfilter/allfilters.c |   1 +
 libavfilter/vf_adm.c     | 742 +++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 835 insertions(+)
 create mode 100644 libavfilter/adm.h
 create mode 100644 libavfilter/vf_adm.c
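
Note: the DWT and the masking-threshold filter both extend the picture by
mirroring indices at the borders. A small sketch of that index mapping,
assuming the index overshoots the picture by less than its size; the name
reflect_index is hypothetical, the patch inlines this logic with FFABS().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: map an out-of-range index back into [0, n).
 * Valid for -n < i < 2 * n, which covers the short filters used here. */
static int reflect_index(int i, int n)
{
    assert(n > 0);
    i = abs(i);            /* mirror around sample 0: -1 maps to 1 */
    if (i >= n)
        i = 2 * n - i - 1; /* mirror at the far edge: n maps to n - 1 */
    return i;
}
```

The two edges are not treated identically: the near edge reflects around
sample 0 without repeating it, while the far edge repeats the last sample.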

diff --git a/Changelog b/Changelog
index 8309417..c6a775c 100644
--- a/Changelog
+++ b/Changelog
@@ -40,6 +40,7 @@ version <next>:
   They must always be used by name.
 - FITS demuxer and decoder
 - FITS muxer and encoder
+- ADM video filter
 
 version 3.3:
 - CrystalHD decoder moved to new decode API
diff --git a/doc/filters.texi b/doc/filters.texi
index 19e13a1..67b39f9 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -4654,6 +4654,21 @@ build.
 
 Below is a description of the currently available video filters.
 
+@section adm
+
+Obtain the average ADM/DLM (Detail Loss Metric) between two input videos.
+
+This filter takes two input videos.
+
+The obtained average ADM score is printed through the logging system.
+
+In the example below, the input file @file{main.mpg} is compared with the
+reference file @file{ref.mpg}.
+
+@example
+ffmpeg -i main.mpg -i ref.mpg -lavfi adm -f null -
+@end example
+
 @section alphaextract
 
 Extract the alpha component from the input as a grayscale video. This
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index ee840b0..b4b46e1 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -126,6 +126,7 @@ OBJS-$(CONFIG_SINE_FILTER)                   += asrc_sine.o
 OBJS-$(CONFIG_ANULLSINK_FILTER)              += asink_anullsink.o
 
 # video filters
+OBJS-$(CONFIG_ADM_FILTER)                    += vf_adm.o
 OBJS-$(CONFIG_ALPHAEXTRACT_FILTER)           += vf_extractplanes.o
 OBJS-$(CONFIG_ALPHAMERGE_FILTER)             += vf_alphamerge.o
 OBJS-$(CONFIG_ASS_FILTER)                    += vf_subtitles.o
diff --git a/libavfilter/adm.h b/libavfilter/adm.h
new file mode 100644
index 0000000..862bd17
--- /dev/null
+++ b/libavfilter/adm.h
@@ -0,0 +1,75 @@
+/*
+ * Copyright (c) 2017 Ronald S. Bultje <rsbultje@gmail.com>
+ * Copyright (c) 2017 Ashish Pratap Singh <ashk43712@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFILTER_ADM_H
+#define AVFILTER_ADM_H
+/** Formula (1), page 1165 - display visual resolution (DVR),
+ * in pixels/degree of visual angle. This should be 56.55
+ */
+#define R 56.55
+/** Fraction of the frame to discard on each of the 4 sides */
+#define ADM_BORDER_FACTOR (0.1)
+
+#define N 15
+
+typedef struct adm_dwt_band_t {
+    int16_t *band_a; /** Low-pass V + low-pass H. */
+    int16_t *band_v; /** Low-pass V + high-pass H. */
+    int16_t *band_h; /** High-pass V + low-pass H. */
+    int16_t *band_d; /** High-pass V + high-pass H. */
+} adm_dwt_band_t;
+
+static const float dwt2_db2_coeffs_lo[4] = {
+    0.482962913144690,  0.836516303737469,
+    0.224143868041857, -0.129409522550921
+};
+static const float dwt2_db2_coeffs_hi[4] = {
+    -0.129409522550921, -0.224143868041857,
+    0.836516303737469,  -0.482962913144690
+};
+
+static int32_t dwt2_db2_coeffs_lo_int[4];
+static int32_t dwt2_db2_coeffs_hi_int[4];
+
+/**
+ * The following dwt basis function amplitudes, Q(lambda,theta), are taken from
+ * "Visibility of Wavelet Quantization Noise"
+ * by A. B. Watson, G. Y. Yang, J. A. Solomon and J. Villasenor
+ * IEEE Trans. on Image Processing, Vol. 6, No 8, Aug. 1997
+ * Page 1172, Table V
+ * The table has been transposed, i.e. it can be used directly to obtain Q[lambda][theta]
+ * These amplitudes were calculated for the 7-9 biorthogonal wavelet basis
+ */
+static const float Q[4][2] = {
+    { 57.534645,  169.767410, },
+    { 31.265896,  69.937431,  },
+    { 23.056629,  40.990150,  },
+    { 21.895033,  31.936741,  },
+};
+
+/** Compute the ADM score of the main frame against the reference frame. */
+int compute_adm2(const void *ref, const void *main, int w, int h,
+                 ptrdiff_t ref_stride, ptrdiff_t main_stride, double *score,
+                 double *score_num, double *score_den, double *scores,
+                 int16_t *data_buf, int16_t *temp_lo, int16_t *temp_hi,
+                 uint8_t type);
+
+#endif /* AVFILTER_ADM_H */
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 8b9b9a4..39ca9cc 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -138,6 +138,7 @@ static void register_all(void)
 
     REGISTER_FILTER(ANULLSINK,      anullsink,      asink);
 
+    REGISTER_FILTER(ADM,            adm,            vf);
     REGISTER_FILTER(ALPHAEXTRACT,   alphaextract,   vf);
     REGISTER_FILTER(ALPHAMERGE,     alphamerge,     vf);
     REGISTER_FILTER(ASS,            ass,            vf);
diff --git a/libavfilter/vf_adm.c b/libavfilter/vf_adm.c
new file mode 100644
index 0000000..1f05034
--- /dev/null
+++ b/libavfilter/vf_adm.c
@@ -0,0 +1,742 @@
+/*
+ * Copyright (c) 2017 Ronald S. Bultje <rsbultje@gmail.com>
+ * Copyright (c) 2017 Ashish Pratap Singh <ashk43712@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * Calculate the ADM between two input videos.
+ */
+
+#include "libavutil/avstring.h"
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "avfilter.h"
+#include "dualinput.h"
+#include "drawutils.h"
+#include "formats.h"
+#include "internal.h"
+#include "adm.h"
+#include "video.h"
+#include <math.h>
+
+typedef struct ADMContext {
+    const AVClass *class;
+    FFDualInputContext dinput;
+    const AVPixFmtDescriptor *desc;
+    int width;
+    int height;
+    int16_t *data_buf;
+    int16_t *temp_lo;
+    int16_t *temp_hi;
+    double adm_sum;
+    uint64_t nb_frames;
+} ADMContext;
+
+static const AVOption adm_options[] = {
+    { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(adm);
+
+#define MAX_ALIGN 32
+#define ALIGN_CEIL(x) ((x) + ((x) % MAX_ALIGN ? MAX_ALIGN - (x) % MAX_ALIGN : 0))
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
+
+static float rcp(float x)
+{
+    /* Portable reciprocal; avoids x86-only SSE intrinsics in generic C code. */
+    return 1.0f / x;
+}
+
+#define DIVS(n, d) ((n) * rcp(d))
+
+static int64_t get_cube(int val)
+{
+    return (int64_t) val * val * val;
+}
+
+static double adm_sum_cube(const int16_t *x, int w, int h, ptrdiff_t stride,
+                           double border_factor)
+{
+    ptrdiff_t px_stride = stride / sizeof(int16_t);
+    int left = w * border_factor - 0.5;
+    int top = h * border_factor - 0.5;
+    int right = w - left;
+    int bottom = h - top;
+
+    int i, j;
+
+    int64_t sum = 0;
+
+    for (i = top; i < bottom; i++) {
+        for (j = left; j < right; j++) {
+            sum += get_cube(FFABS(x[i * px_stride + j]));
+        }
+    }
+
+    return ceil(cbrt(sum)) + ceil(cbrt((bottom - top) * (right - left) / 32.0));
+}
+
+static void adm_decouple(const adm_dwt_band_t *ref, const adm_dwt_band_t *main,
+                         const adm_dwt_band_t *r, const adm_dwt_band_t *a,
+                         int w, int h, ptrdiff_t ref_stride, ptrdiff_t main_stride,
+                         ptrdiff_t r_stride, ptrdiff_t a_stride)
+{
+    const float cos_1deg_sq = cos(1.0 * M_PI / 180.0) * cos(1.0 * M_PI / 180.0);
+    const float eps = 1e-30;
+
+    ptrdiff_t ref_px_stride = ref_stride / sizeof(int16_t);
+    ptrdiff_t main_px_stride = main_stride / sizeof(int16_t);
+    ptrdiff_t r_px_stride = r_stride / sizeof(int16_t);
+    ptrdiff_t a_px_stride = a_stride / sizeof(int16_t);
+
+    int oh, ov, od, th, tv, td;
+    float kh, kv, kd, tmph, tmpv, tmpd;
+    float ot_dp, o_mag_sq, t_mag_sq;
+    int angle_flag;
+    int i, j;
+
+    for (i = 0; i < h; i++) {
+        for (j = 0; j < w; j++) {
+            oh = ref->band_h[i * ref_px_stride + j];
+            ov = ref->band_v[i * ref_px_stride + j];
+            od = ref->band_d[i * ref_px_stride + j];
+            th = main->band_h[i * main_px_stride + j];
+            tv = main->band_v[i * main_px_stride + j];
+            td = main->band_d[i * main_px_stride + j];
+
+            kh = DIVS(th, oh + eps);
+            kv = DIVS(tv, ov + eps);
+            kd = DIVS(td, od + eps);
+
+            kh = kh < 0 ? 0 : (kh > 1 ? 1 : kh);
+            kv = kv < 0 ? 0 : (kv > 1 ? 1 : kv);
+            kd = kd < 0 ? 0 : (kd > 1 ? 1 : kd);
+
+            tmph = kh * oh;
+            tmpv = kv * ov;
+            tmpd = kd * od;
+
+            ot_dp = oh * th + ov * tv;
+            o_mag_sq = oh * oh + ov * ov;
+            t_mag_sq = th * th + tv * tv;
+
+            angle_flag = (ot_dp >= 0) && (ot_dp * ot_dp >= cos_1deg_sq *
+                                          o_mag_sq * t_mag_sq);
+
+            if (angle_flag) {
+                tmph = th;
+                tmpv = tv;
+                tmpd = td;
+            }
+
+            r->band_h[i * r_px_stride + j] = ceil(tmph);
+            r->band_v[i * r_px_stride + j] = ceil(tmpv);
+            r->band_d[i * r_px_stride + j] = ceil(tmpd);
+
+            a->band_h[i * a_px_stride + j] = ceil(th - tmph);
+            a->band_v[i * a_px_stride + j] = ceil(tv - tmpv);
+            a->band_d[i * a_px_stride + j] = ceil(td - tmpd);
+        }
+    }
+}
+
+static void adm_csf(const adm_dwt_band_t *src, const adm_dwt_band_t *dst,
+                    int orig_h, int scale, int w, int h, ptrdiff_t src_stride,
+                    ptrdiff_t dst_stride)
+{
+    const int16_t *src_angles[3] = { src->band_h, src->band_v, src->band_d };
+    int16_t *dst_angles[3] = { dst->band_h, dst->band_v, dst->band_d };
+
+    const int16_t *src_ptr;
+    int16_t *dst_ptr;
+
+    ptrdiff_t src_px_stride = src_stride / sizeof(int16_t);
+    ptrdiff_t dst_px_stride = dst_stride / sizeof(int16_t);
+
+    uint16_t rfactor[3] = {lrint((1.0 / Q[scale][0]) * (1 << N)),
+        lrint((1.0 / Q[scale][0]) * (1 << N)),
+        lrint((1.0 / Q[scale][1]) * (1 << N))};
+
+    int i, j, theta;
+
+    for (theta = 0; theta < 3; theta++) {
+        src_ptr = src_angles[theta];
+        dst_ptr = dst_angles[theta];
+
+        for (i = 0; i < h; i++) {
+            for (j = 0; j < w; j++) {
+                dst_ptr[i * dst_px_stride + j] = (rfactor[theta] *
+                                                  src_ptr[i * src_px_stride + j]) >> N;
+            }
+        }
+    }
+}
+
+static void adm_cm_thresh(const adm_dwt_band_t *src, int16_t *dst, int w, int h,
+                          ptrdiff_t src_stride, ptrdiff_t dst_stride)
+{
+    const int16_t *angles[3] = { src->band_h, src->band_v, src->band_d };
+    const int16_t *src_ptr;
+
+    ptrdiff_t src_px_stride = src_stride / sizeof(int16_t);
+    ptrdiff_t dst_px_stride = dst_stride / sizeof(int16_t);
+
+    int filt_coeff, img_coeff;
+
+    int theta, i, j, filt_i, filt_j, src_i, src_j;
+
+    for (i = 0; i < h; i++) {
+
+        for (j = 0; j < w; j++) {
+            dst[i * dst_px_stride + j] = 0;
+        }
+
+        for (theta = 0; theta < 3; ++theta) {
+            src_ptr = angles[theta];
+
+            for (j = 0; j < w; j++) {
+                int sum = 0;
+
+                for (filt_i = 0; filt_i < 3; filt_i++) {
+                    for (filt_j = 0; filt_j < 3; filt_j++) {
+                        filt_coeff = lrint(((filt_i == 1 && filt_j == 1) ? 1.0 /
+                                            15.0 : 1.0 / 30.0) * (1 << N));
+
+                        src_i = i - 1 + filt_i;
+                        src_j = j - 1 + filt_j;
+
+                        src_i = FFABS(src_i);
+                        if (src_i >= h) {
+                            src_i = 2 * h - src_i - 1;
+                        }
+                        src_j = FFABS(src_j);
+                        if (src_j >= w) {
+                            src_j = 2 * w - src_j - 1;
+                        }
+                        img_coeff = FFABS(src_ptr[src_i * src_px_stride + src_j]);
+
+                        sum += filt_coeff * img_coeff;
+                    }
+                }
+
+                dst[i * dst_px_stride + j] += sum >> N;
+            }
+        }
+    }
+}
+
+static void adm_cm(const adm_dwt_band_t *src, const adm_dwt_band_t *dst,
+                   const int16_t *thresh, int w, int h, ptrdiff_t src_stride,
+                   ptrdiff_t dst_stride, ptrdiff_t thresh_stride)
+{
+    ptrdiff_t src_px_stride = src_stride / sizeof(int16_t);
+    ptrdiff_t dst_px_stride = dst_stride / sizeof(int16_t);
+    ptrdiff_t thresh_px_stride = thresh_stride / sizeof(int16_t);
+
+    int xh, xv, xd, thr;
+
+    int i, j;
+
+    for (i = 0; i < h; i++) {
+        for (j = 0; j < w; j++) {
+            xh  = src->band_h[i * src_px_stride + j];
+            xv  = src->band_v[i * src_px_stride + j];
+            xd  = src->band_d[i * src_px_stride + j];
+            thr = thresh[i * thresh_px_stride + j];
+
+            xh = FFABS(xh) - thr;
+            xv = FFABS(xv) - thr;
+            xd = FFABS(xd) - thr;
+
+            xh = xh < 0 ? 0 : xh;
+            xv = xv < 0 ? 0 : xv;
+            xd = xd < 0 ? 0 : xd;
+
+            dst->band_h[i * dst_px_stride + j] = xh;
+            dst->band_v[i * dst_px_stride + j] = xv;
+            dst->band_d[i * dst_px_stride + j] = xd;
+        }
+    }
+}
+
+#define adm_dwt2_fn(type, bits) \
+    static void adm_dwt2_##bits##bit(const type *src, const adm_dwt_band_t *dst, \
+                                     int w, int h, ptrdiff_t src_stride, \
+                                     ptrdiff_t dst_stride, int16_t *temp_lo, \
+                                     int16_t* temp_hi) \
+{ \
+    const int32_t *filter_lo = dwt2_db2_coeffs_lo_int; \
+    const int32_t *filter_hi = dwt2_db2_coeffs_hi_int; \
+    int filt_w = FF_ARRAY_ELEMS(dwt2_db2_coeffs_lo_int); \
+    \
+    ptrdiff_t src_px_stride = src_stride / sizeof(type); \
+    ptrdiff_t dst_px_stride = dst_stride / sizeof(int16_t); \
+    \
+    int filt_coeff_lo, filt_coeff_hi, img_coeff; \
+    \
+    int i, j, filt_i, filt_j, src_i, src_j; \
+    \
+    for (i = 0; i < (h + 1) / 2; i++) { \
+        /** Vertical pass. */ \
+        for (j = 0; j < w; j++) { \
+            int sum_lo = 0; \
+            int sum_hi = 0; \
+            \
+            for (filt_i = 0; filt_i < filt_w; filt_i++) { \
+                filt_coeff_lo = filter_lo[filt_i]; \
+                filt_coeff_hi = filter_hi[filt_i]; \
+                \
+                src_i = 2 * i - 1 + filt_i; \
+                \
+                src_i = FFABS(src_i); \
+                if (src_i >= h) { \
+                    src_i = 2 * h - src_i - 1; \
+                } \
+                \
+                img_coeff = src[src_i * src_px_stride + j]; \
+                \
+                sum_lo += filt_coeff_lo * img_coeff; \
+                sum_hi += filt_coeff_hi * img_coeff; \
+            } \
+            \
+            temp_lo[j] = sum_lo >> N; \
+            temp_hi[j] = sum_hi >> N; \
+        } \
+        \
+        /** Horizontal pass (lo). */ \
+        for (j = 0; j < (w + 1) / 2; j++) { \
+            int sum_lo = 0; \
+            int sum_hi = 0; \
+            \
+            for (filt_j = 0; filt_j < filt_w; filt_j++) { \
+                filt_coeff_lo = filter_lo[filt_j]; \
+                filt_coeff_hi = filter_hi[filt_j]; \
+                \
+                src_j = 2 * j - 1 + filt_j; \
+                \
+                src_j = FFABS(src_j); \
+                if (src_j >= w) { \
+                    src_j = 2 * w - src_j - 1; \
+                } \
+                \
+                img_coeff = temp_lo[src_j]; \
+                \
+                sum_lo += filt_coeff_lo * img_coeff; \
+                sum_hi += filt_coeff_hi * img_coeff; \
+            } \
+            \
+            dst->band_a[i * dst_px_stride + j] = sum_lo >> N; \
+            dst->band_v[i * dst_px_stride + j] = sum_hi >> N; \
+        } \
+        \
+        /** Horizontal pass (hi). */ \
+        for (j = 0; j < (w + 1) / 2; j++) { \
+            int sum_lo = 0; \
+            int sum_hi = 0; \
+            \
+            for (filt_j = 0; filt_j < filt_w; filt_j++) { \
+                filt_coeff_lo = filter_lo[filt_j]; \
+                filt_coeff_hi = filter_hi[filt_j]; \
+                \
+                src_j = 2 * j - 1 + filt_j; \
+                \
+                src_j = FFABS(src_j); \
+                if (src_j >= w) { \
+                    src_j = 2 * w - src_j - 1; \
+                } \
+                \
+                img_coeff = temp_hi[src_j]; \
+                \
+                sum_lo += filt_coeff_lo * img_coeff; \
+                sum_hi += filt_coeff_hi * img_coeff; \
+            } \
+            \
+            dst->band_h[i * dst_px_stride + j] = sum_lo >> N; \
+            dst->band_d[i * dst_px_stride + j] = sum_hi >> N; \
+        } \
+    } \
+}
+
+adm_dwt2_fn(uint8_t, 8);
+adm_dwt2_fn(uint16_t, 10);
+adm_dwt2_fn(int16_t, 32);
+
+static void adm_buffer_copy(const void *src, void *dst, int linewidth, int h,
+                            ptrdiff_t src_stride, ptrdiff_t dst_stride)
+{
+    const char *src_p = src;
+    char *dst_p = dst;
+    int i;
+
+    for (i = 0; i < h; i++) {
+        memcpy(dst_p, src_p, linewidth);
+        src_p += src_stride;
+        dst_p += dst_stride;
+    }
+}
+
+static char *init_dwt_band(adm_dwt_band_t *band, char *data_top, size_t buf_sz)
+{
+    band->band_a = (int16_t *) data_top;
+    data_top += buf_sz;
+    band->band_h = (int16_t *) data_top;
+    data_top += buf_sz;
+    band->band_v = (int16_t *) data_top;
+    data_top += buf_sz;
+    band->band_d = (int16_t *) data_top;
+    data_top += buf_sz;
+    return data_top;
+}
+
+int compute_adm2(const void *ref, const void *main, int w, int h,
+                 ptrdiff_t ref_stride, ptrdiff_t main_stride, double *score,
+                 double *score_num, double *score_den, double *scores,
+                 int16_t *data_buf, int16_t *temp_lo, int16_t *temp_hi,
+                 uint8_t type)
+{
+    double numden_limit = 1e-2 * (w * h) / (1920.0 * 1080.0);
+
+    char *data_top;
+
+    int16_t *ref_scale;
+    int16_t *main_scale;
+
+    adm_dwt_band_t ref_dwt2;
+    adm_dwt_band_t main_dwt2;
+
+    adm_dwt_band_t decouple_r;
+    adm_dwt_band_t decouple_a;
+
+    adm_dwt_band_t csf_o;
+    adm_dwt_band_t csf_r;
+    adm_dwt_band_t csf_a;
+
+    int16_t *mta;
+
+    adm_dwt_band_t cm_r;
+
+    const void *curr_ref_scale = ref;
+    const void *curr_main_scale = main;
+    ptrdiff_t curr_ref_stride = ref_stride;
+    ptrdiff_t curr_main_stride = main_stride;
+
+    int orig_h = h;
+
+    ptrdiff_t buf_stride = ALIGN_CEIL(((w + 1) / 2) * sizeof(int16_t));
+    size_t buf_sz = (size_t)buf_stride * ((h + 1) / 2);
+
+    double num = 0;
+    double den = 0;
+
+    int scale;
+    int ret = 1;
+
+    data_top = (char *) (data_buf);
+
+    ref_scale = (int16_t *) data_top;
+    data_top += buf_sz;
+    main_scale = (int16_t *) data_top;
+    data_top += buf_sz;
+
+    data_top = init_dwt_band(&ref_dwt2, data_top, buf_sz);
+    data_top = init_dwt_band(&main_dwt2, data_top, buf_sz);
+    data_top = init_dwt_band(&decouple_r, data_top, buf_sz);
+    data_top = init_dwt_band(&decouple_a, data_top, buf_sz);
+    data_top = init_dwt_band(&csf_o, data_top, buf_sz);
+    data_top = init_dwt_band(&csf_r, data_top, buf_sz);
+    data_top = init_dwt_band(&csf_a, data_top, buf_sz);
+
+    mta = (int16_t *) data_top;
+    data_top += buf_sz;
+
+    data_top = init_dwt_band(&cm_r, data_top, buf_sz);
+
+    for (scale = 0; scale < 4; scale++) {
+        float num_scale = 0.0;
+        float den_scale = 0.0;
+
+        if(!scale) {
+            if(type <= 8) {
+                adm_dwt2_8bit((const uint8_t *) curr_ref_scale, &ref_dwt2, w,
+                              h, curr_ref_stride, buf_stride, temp_lo, temp_hi);
+                adm_dwt2_8bit((const uint8_t *) curr_main_scale, &main_dwt2, w,
+                              h, curr_main_stride, buf_stride, temp_lo, temp_hi);
+            } else {
+                adm_dwt2_10bit((const uint16_t *) curr_ref_scale, &ref_dwt2, w,
+                               h, curr_ref_stride, buf_stride, temp_lo, temp_hi);
+                adm_dwt2_10bit((const uint16_t *) curr_main_scale, &main_dwt2, w,
+                               h, curr_main_stride, buf_stride, temp_lo, temp_hi);
+            }
+        } else{
+            adm_dwt2_32bit((const int16_t *) curr_ref_scale, &ref_dwt2, w, h,
+                           curr_ref_stride, buf_stride, temp_lo, temp_hi);
+            adm_dwt2_32bit((const int16_t *) curr_main_scale, &main_dwt2, w, h,
+                           curr_main_stride, buf_stride, temp_lo, temp_hi);
+        }
+
+        w = (w + 1) / 2;
+        h = (h + 1) / 2;
+
+        adm_decouple(&ref_dwt2, &main_dwt2, &decouple_r, &decouple_a, w, h,
+                     buf_stride, buf_stride, buf_stride, buf_stride);
+
+        adm_csf(&ref_dwt2, &csf_o, orig_h, scale, w, h, buf_stride, buf_stride);
+        adm_csf(&decouple_r, &csf_r, orig_h, scale, w, h, buf_stride, buf_stride);
+        adm_csf(&decouple_a, &csf_a, orig_h, scale, w, h, buf_stride, buf_stride);
+
+        adm_cm_thresh(&csf_a, mta, w, h, buf_stride, buf_stride);
+        adm_cm(&csf_r, &cm_r, mta, w, h, buf_stride, buf_stride, buf_stride);
+
+        num_scale += adm_sum_cube(cm_r.band_h, w, h, buf_stride, ADM_BORDER_FACTOR);
+        num_scale += adm_sum_cube(cm_r.band_v, w, h, buf_stride, ADM_BORDER_FACTOR);
+        num_scale += adm_sum_cube(cm_r.band_d, w, h, buf_stride, ADM_BORDER_FACTOR);
+
+        den_scale += adm_sum_cube(csf_o.band_h, w, h, buf_stride, ADM_BORDER_FACTOR);
+        den_scale += adm_sum_cube(csf_o.band_v, w, h, buf_stride, ADM_BORDER_FACTOR);
+        den_scale += adm_sum_cube(csf_o.band_d, w, h, buf_stride, ADM_BORDER_FACTOR);
+
+        num += num_scale;
+        den += den_scale;
+
+        adm_buffer_copy(ref_dwt2.band_a, ref_scale, w * sizeof(int16_t), h,
+                        buf_stride, buf_stride);
+        adm_buffer_copy(main_dwt2.band_a, main_scale, w * sizeof(int16_t), h,
+                        buf_stride, buf_stride);
+
+        curr_ref_scale = ref_scale;
+        curr_main_scale = main_scale;
+        curr_ref_stride = buf_stride;
+        curr_main_stride = buf_stride;
+
+        scores[2 * scale + 0] = num_scale;
+        scores[2 * scale + 1] = den_scale;
+    }
+
+    num = num < numden_limit ? 0 : num;
+    den = den < numden_limit ? 0 : den;
+
+    if (den == 0.0) {
+        *score = 1.0;
+    } else {
+        *score = num / den;
+    }
+    *score_num = num;
+    *score_den = den;
+
+    ret = 0;
+
+    return ret;
+}
+
+static void set_meta(AVDictionary **metadata, const char *key, float d)
+{
+    char value[128];
+    snprintf(value, sizeof(value), "%0.2f", d);
+    av_dict_set(metadata, key, value, 0);
+}
+
+static AVFrame *do_adm(AVFilterContext *ctx, AVFrame *main, const AVFrame *ref)
+{
+    ADMContext *s = ctx->priv;
+    AVDictionary **metadata = &main->metadata;
+
+    double score = 0.0;
+    double score_num = 0;
+    double score_den = 0;
+    double scores[2 * 4];
+
+    int w = s->width;
+    int h = s->height;
+
+    ptrdiff_t ref_stride, main_stride;
+
+    ref_stride = ref->linesize[0];
+    main_stride = main->linesize[0];
+
+    compute_adm2(ref->data[0], main->data[0], w, h, ref_stride, main_stride,
+                 &score, &score_num, &score_den, scores, s->data_buf, s->temp_lo,
+                 s->temp_hi, s->desc->comp[0].depth);
+
+    set_meta(metadata, "lavfi.adm.score", score);
+
+    s->nb_frames++;
+
+    s->adm_sum += score;
+
+    return main;
+}
+
+static av_cold int init(AVFilterContext *ctx)
+{
+    ADMContext *s = ctx->priv;
+
+    int i;
+    for(i = 0; i < 4; i++) {
+        dwt2_db2_coeffs_lo_int[i] = lrint(dwt2_db2_coeffs_lo[i] * (1 << N));
+        dwt2_db2_coeffs_hi_int[i] = lrint(dwt2_db2_coeffs_hi[i] * (1 << N));
+    }
+
+    s->dinput.process = do_adm;
+
+    return 0;
+}
+
+static int query_formats(AVFilterContext *ctx)
+{
+    static const enum AVPixelFormat pix_fmts[] = {
+        AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV420P,
+        AV_PIX_FMT_YUV444P10LE, AV_PIX_FMT_YUV422P10LE, AV_PIX_FMT_YUV420P10LE,
+        AV_PIX_FMT_NONE
+    };
+
+    AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
+    if (!fmts_list)
+        return AVERROR(ENOMEM);
+    return ff_set_common_formats(ctx, fmts_list);
+}
+
+static int config_input_ref(AVFilterLink *inlink)
+{
+    AVFilterContext *ctx  = inlink->dst;
+    ADMContext *s = ctx->priv;
+    ptrdiff_t buf_stride;
+    size_t buf_sz;
+    ptrdiff_t stride;
+
+    if (ctx->inputs[0]->w != ctx->inputs[1]->w ||
+        ctx->inputs[0]->h != ctx->inputs[1]->h) {
+        av_log(ctx, AV_LOG_ERROR, "Width and height of input videos must be the same.\n");
+        return AVERROR(EINVAL);
+    }
+    if (ctx->inputs[0]->format != ctx->inputs[1]->format) {
+        av_log(ctx, AV_LOG_ERROR, "Inputs must have the same pixel format.\n");
+        return AVERROR(EINVAL);
+    }
+
+    s->desc = av_pix_fmt_desc_get(inlink->format);
+    s->width = ctx->inputs[0]->w;
+    s->height = ctx->inputs[0]->h;
+
+    buf_stride = ALIGN_CEIL(((s->width + 1) / 2) * sizeof(int16_t));
+    buf_sz = (size_t)buf_stride * ((s->height + 1) / 2);
+
+    if (SIZE_MAX / buf_sz < 35) {
+        av_log(ctx, AV_LOG_ERROR, "Frame dimensions are too large for the ADM work buffer.\n");
+        return AVERROR(EINVAL);
+    }
+
+    if (!(s->data_buf = av_malloc(buf_sz * 35))) {
+        return AVERROR(ENOMEM);
+    }
+
+    stride = ALIGN_CEIL(s->width * sizeof(int16_t));
+    if (!(s->temp_lo = av_malloc(stride))) {
+        return AVERROR(ENOMEM);
+    }
+
+    if (!(s->temp_hi = av_malloc(stride))) {
+        return AVERROR(ENOMEM);
+    }
+
+    return 0;
+}
+
+
+static int config_output(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    ADMContext *s = ctx->priv;
+    AVFilterLink *mainlink = ctx->inputs[0];
+    int ret;
+
+    outlink->w = mainlink->w;
+    outlink->h = mainlink->h;
+    outlink->time_base = mainlink->time_base;
+    outlink->sample_aspect_ratio = mainlink->sample_aspect_ratio;
+    outlink->frame_rate = mainlink->frame_rate;
+    if ((ret = ff_dualinput_init(ctx, &s->dinput)) < 0)
+        return ret;
+
+    return 0;
+}
+
+static int filter_frame(AVFilterLink *inlink, AVFrame *inpicref)
+{
+    ADMContext *s = inlink->dst->priv;
+    return ff_dualinput_filter_frame(&s->dinput, inlink, inpicref);
+}
+
+static int request_frame(AVFilterLink *outlink)
+{
+    ADMContext *s = outlink->src->priv;
+    return ff_dualinput_request_frame(&s->dinput, outlink);
+}
+
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    ADMContext *s = ctx->priv;
+
+    if (s->nb_frames > 0) {
+        av_log(ctx, AV_LOG_INFO, "ADM AVG: %.3f\n", s->adm_sum / s->nb_frames);
+    }
+
+    av_free(s->data_buf);
+    av_free(s->temp_lo);
+    av_free(s->temp_hi);
+
+    ff_dualinput_uninit(&s->dinput);
+}
+
+static const AVFilterPad adm_inputs[] = {
+    {
+        .name         = "main",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .filter_frame = filter_frame,
+    },{
+        .name         = "reference",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .filter_frame = filter_frame,
+        .config_props = config_input_ref,
+    },
+    { NULL }
+};
+
+static const AVFilterPad adm_outputs[] = {
+    {
+        .name          = "default",
+        .type          = AVMEDIA_TYPE_VIDEO,
+        .config_props  = config_output,
+        .request_frame = request_frame,
+    },
+    { NULL }
+};
+
+AVFilter ff_vf_adm = {
+    .name          = "adm",
+    .description   = NULL_IF_CONFIG_SMALL("Calculate the ADM score between two video streams."),
+    .init          = init,
+    .uninit        = uninit,
+    .query_formats = query_formats,
+    .priv_size     = sizeof(ADMContext),
+    .priv_class    = &adm_class,
+    .inputs        = adm_inputs,
+    .outputs       = adm_outputs,
+};