From patchwork Thu Apr 14 22:09:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol <onemda@gmail.com>
X-Patchwork-Id: 35321
From: Paul B Mahol <onemda@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 15 Apr 2022 00:09:36 +0200
Message-Id: <20220414220936.71818-1-onemda@gmail.com>
X-Mailer: git-send-email 2.35.1
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH] avfilter: add warp video filter

Signed-off-by: Paul B Mahol <onemda@gmail.com>
---
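Each entry of the points option is four numbers, and the mode option
decides how config_output() turns them into a control point. The sketch
below mirrors that conversion in isolation. SketchMode, SketchPoint and
to_warp_point() are hypothetical names used only in this note, and the
meaning attached to the fields is inferred from the code, not from any
documentation.

```c
#include <assert.h>

/* Hypothetical names for this sketch; the patch stores the mode as an int 0..2. */
enum SketchMode { SKETCH_ABS, SKETCH_ABSREL, SKETCH_REL };

typedef struct SketchPoint {
    float x0, y0; /* control point position in the output frame */
    float x1, y1; /* displacement from that position to the sampled source */
} SketchPoint;

/* Convert one "a b c d" entry of the points option into a control point. */
static SketchPoint to_warp_point(const double p[4], enum SketchMode mode,
                                 int w, int h)
{
    SketchPoint wp;

    assert(w > 0 && h > 0);

    if (mode == SKETCH_ABS) {
        /* (a, b) is the source position, (c, d) the destination position. */
        wp.x0 = p[2];
        wp.y0 = p[3];
        wp.x1 = p[0] - p[2];
        wp.y1 = p[1] - p[3];
    } else {
        /* (a, b) is the source position, (c, d) an offset added to it. */
        wp.x0 = p[0] + p[2];
        wp.y0 = p[1] + p[3];
        wp.x1 = -p[2];
        wp.y1 = -p[3];
        if (mode == SKETCH_REL) {
            /* In rel mode all values are fractions of the frame size. */
            wp.x0 *= w;
            wp.y0 *= h;
            wp.x1 *= w;
            wp.y1 *= h;
        }
    }

    return wp;
}
```

With the default points string "0 0 0 0|1 0 0 0|0 1 0 0|1 1 0 0" and
mode=rel, this yields the four frame corners with zero displacement, so
the default configuration should leave the picture unchanged.
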
 libavfilter/Makefile     |   1 +
 libavfilter/allfilters.c |   1 +
 libavfilter/vf_warp.c    | 628 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 630 insertions(+)
 create mode 100644 libavfilter/vf_warp.c
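
The linear system built in config_output() and the per-pixel sum in
warp_remap() appear to follow the usual thin-plate spline formulation.
The block below restates what the code computes. The symbols (K, P, w,
a, v, U, f) are notation introduced for this note and do not appear in
the patch; n stands for nb_warp_points, (x_i, y_i) for a control point
position (x0, y0) and v for the vector of x1 or y1 displacements.

```latex
% Kernel on the squared distance d, with U(0) taken as 0.
\[
U(d) = \begin{cases} \tfrac{1}{2}\, d \ln d, & d > 0 \\ 0, & d = 0 \end{cases}
\qquad
d_{ij} = (x_i - x_j)^2 + (y_i - y_j)^2
\]
% The (n+3) x (n+3) system; it is solved once for the x and once for the y displacements.
\[
\begin{pmatrix} K & P \\ P^{T} & 0 \end{pmatrix}
\begin{pmatrix} w \\ a \end{pmatrix}
=
\begin{pmatrix} v \\ 0 \end{pmatrix},
\qquad
K_{ij} = U(d_{ij}),
\quad
P_{i} = \begin{pmatrix} 1 & x_i & y_i \end{pmatrix}
\]
% Displacement evaluated at output pixel (x, y).
\[
f(x, y) = a_0 + a_1 x + a_2 y
        + \sum_{i=0}^{n-1} w_i \, U\big((x_i - x)^2 + (y_i - y)^2\big)
\]
```

The source position sampled for output pixel (x, y) is then
(x + f_x(x, y), y + f_y(x, y)), split into an integer part (u, v) and a
15-bit fraction (du, dv).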

diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 5eef334060..4bafa3b16b 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -519,6 +519,7 @@ OBJS-$(CONFIG_VMAFMOTION_FILTER)             += vf_vmafmotion.o framesync.o
 OBJS-$(CONFIG_VPP_QSV_FILTER)                += vf_vpp_qsv.o
 OBJS-$(CONFIG_VSTACK_FILTER)                 += vf_stack.o framesync.o
 OBJS-$(CONFIG_W3FDIF_FILTER)                 += vf_w3fdif.o
+OBJS-$(CONFIG_WARP_FILTER)                   += vf_warp.o
 OBJS-$(CONFIG_WAVEFORM_FILTER)               += vf_waveform.o
 OBJS-$(CONFIG_WEAVE_FILTER)                  += vf_weave.o
 OBJS-$(CONFIG_XBR_FILTER)                    += vf_xbr.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 26e4dbb92a..cbcdcabe24 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -495,6 +495,7 @@ extern const AVFilter ff_vf_vpp_qsv;
 extern const AVFilter ff_vf_vstack;
 extern const AVFilter ff_vf_w3fdif;
 extern const AVFilter ff_vf_waveform;
+extern const AVFilter ff_vf_warp;
 extern const AVFilter ff_vf_weave;
 extern const AVFilter ff_vf_xbr;
 extern const AVFilter ff_vf_xcorrelate;
diff --git a/libavfilter/vf_warp.c b/libavfilter/vf_warp.c
new file mode 100644
index 0000000000..bada8f2d27
--- /dev/null
+++ b/libavfilter/vf_warp.c
@@ -0,0 +1,628 @@
+/*
+ * Copyright (c) 2021 Paul B Mahol
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * Pixel warp filter
+ */
+
+#include "libavutil/eval.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/opt.h"
+#include "avfilter.h"
+#include "formats.h"
+#include "internal.h"
+#include "video.h"
+
+enum WarpEdge {
+    CLIP_EDGE,
+    FIXED_EDGE,
+    NB_WARPEDGE,
+};
+
+typedef struct WarpPoints {
+    float x0, y0, x1, y1;
+} WarpPoints;
+
+typedef struct Matrix {
+    int m, n;
+    float *t;
+} Matrix;
+
+typedef struct WarpContext {
+    const AVClass *class;
+    char *points_str;
+
+    int mode;
+    int interpolation;
+    int edge;
+    int nb_points;
+    int nb_planes;
+
+    double *points;
+    int points_size;
+
+    int nb_warp_points;
+    WarpPoints *warp_points;
+
+    int black[4];
+
+    int elements;
+    int uv_linesize;
+    int16_t *u, *v, *du, *dv;
+
+    int (*warp_slice)(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs);
+
+    void (*remap_line)(uint8_t *dst, int width, int height,
+                       const uint8_t *const src, ptrdiff_t in_linesize,
+                       const int16_t *const u, const int16_t *const v,
+                       const int16_t *const du, const int16_t *const dv,
+                       int fixed);
+} WarpContext;
+
+#define OFFSET(x) offsetof(WarpContext, x)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
+
+static const AVOption warp_options[] = {
+    { "points", "set warp points", OFFSET(points_str), AV_OPT_TYPE_STRING, {.str="0 0 0 0|1 0 0 0|0 1 0 0|1 1 0 0"}, 0, 0, FLAGS },
+    { "mode", "set warp mode", OFFSET(mode), AV_OPT_TYPE_INT, {.i64=2}, 0, 2, FLAGS, "mode" },
+    { "abs", "absolute", 0, AV_OPT_TYPE_CONST, {.i64=0}, 0, 0, FLAGS, "mode" },
+    { "absrel", "absolute + relative", 0, AV_OPT_TYPE_CONST, {.i64=1}, 0, 0, FLAGS, "mode" },
+    { "rel", "relative", 0, AV_OPT_TYPE_CONST, {.i64=2}, 0, 0, FLAGS, "mode" },
+    { "interpolation", "set interpolation", OFFSET(interpolation), AV_OPT_TYPE_INT, {.i64=0}, 0, 1, FLAGS, "interp" },
+    { "nearest",  "nearest neighbor", 0, AV_OPT_TYPE_CONST, {.i64=0}, 0, 0, FLAGS, "interp" },
+    { "bilinear", "bilinear",         0, AV_OPT_TYPE_CONST, {.i64=1}, 0, 0, FLAGS, "interp" },
+    { "edge", "set edge mode", OFFSET(edge), AV_OPT_TYPE_INT, {.i64=FIXED_EDGE}, 0, NB_WARPEDGE-1, FLAGS, "edge" },
+    { "clip",  "clip edge",   0, AV_OPT_TYPE_CONST, {.i64=CLIP_EDGE},  0, 1, FLAGS, "edge" },
+    { "fixed", "fixed color", 0, AV_OPT_TYPE_CONST, {.i64=FIXED_EDGE}, 0, 1, FLAGS, "edge" },
+    { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(warp);
+
+typedef struct ThreadData {
+    AVFrame *in, *out;
+} ThreadData;
+
+static const enum AVPixelFormat pix_fmts[] = {
+    AV_PIX_FMT_YUVA444P,
+    AV_PIX_FMT_YUV444P,
+    AV_PIX_FMT_YUVJ444P,
+    AV_PIX_FMT_GBRP, AV_PIX_FMT_GBRAP,
+    AV_PIX_FMT_YUV444P9, AV_PIX_FMT_YUV444P10, AV_PIX_FMT_YUV444P12,
+    AV_PIX_FMT_YUV444P14, AV_PIX_FMT_YUV444P16,
+    AV_PIX_FMT_YUVA444P9, AV_PIX_FMT_YUVA444P10, AV_PIX_FMT_YUVA444P12, AV_PIX_FMT_YUVA444P16,
+    AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10, AV_PIX_FMT_GBRP12,
+    AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16,
+    AV_PIX_FMT_GBRAP10, AV_PIX_FMT_GBRAP12, AV_PIX_FMT_GBRAP16,
+    AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY9,
+    AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12,
+    AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16,
+    AV_PIX_FMT_NONE
+};
+
+static int create_matrix(Matrix *matrix, int m, int n)
+{
+    matrix->t = av_calloc(m * n, sizeof(*matrix->t));
+    if (!matrix->t)
+        return AVERROR(ENOMEM);
+
+    matrix->n = n;
+    matrix->m = m;
+
+    return 0;
+}
+
+static void free_matrix(Matrix *matrix)
+{
+    av_freep(&matrix->t);
+
+    matrix->n = 0;
+    matrix->m = 0;
+}
+
+static void multiply(Matrix *a, Matrix *b, Matrix *c)
+{
+    for (int i = 0; i < c->m; i++) {
+        for (int j = 0; j < c->n; j++) {
+            c->t[i * c->n + j] = 0.f;
+
+            for (int k = 0; k < a->n; k++)
+                c->t[i * c->n + j] += a->t[i * a->n + k] * b->t[k * b->n + j];
+        }
+    }
+}
+
+static void inverse(Matrix *matrix)
+{
+    float pv, pav, temp, tt;
+    int i, ik, j, jk, k;
+    float det;
+    int n = matrix->n;
+
+    float *pc = av_calloc(n, sizeof(*pc));
+    float *pl = av_calloc(n, sizeof(*pl));
+    float *cs = av_calloc(n, sizeof(*cs));
+
+    if (!pc || !pl || !cs)
+        goto fail;
+
+    det = 1.f;
+
+    for (k = 0; k < n; k++) {
+        pv = matrix->t[k * n + k];
+        ik = k;
+        jk = k;
+        pav = fabsf(pv);
+        for (i = k; i < n; i++) {
+            for (j = k; j < n; j++) {
+                temp = fabsf(matrix->t[i * n + j]);
+                if (temp > pav) {
+                    pv = matrix->t[i * n + j];
+                    pav = fabsf(pv);
+                    ik = i;
+                    jk = j;
+                }
+            }
+        }
+
+        pc[k] = jk;
+        pl[k] = ik;
+
+        if (ik != k)
+            det = -det;
+        if (jk != k)
+            det = -det;
+
+        det  = det * pv;
+        temp = fabsf(det);
+
+        if (ik != k) {
+            for (i = 0; i < n; i++) {
+                tt = matrix->t[ik * n + i];
+                matrix->t[ik * n + i] = matrix->t[k * n + i];
+                matrix->t[k * n + i] = tt;
+            }
+        }
+
+        if (jk != k) {
+            for (i = 0; i < n; i++) {
+                tt = matrix->t[i * n + jk];
+                matrix->t[i * n + jk] = matrix->t[i * n + k];
+                matrix->t[i * n + k] = tt;
+            }
+        }
+
+        for (i = 0; i < n; i++) {
+            cs[i] = matrix->t[i * n + k];
+            matrix->t[i * n + k] = 0;
+        }
+
+        cs[k] = 0;
+        matrix->t[k * n + k] = 1;
+
+        temp = fabsf(pv);
+
+        for (i = 0; i < n; i++)
+            matrix->t[k * n + i] = matrix->t[k * n + i] / pv;
+
+        for (j = 0; j < n; j++) {
+            if (j == k)
+                j++;
+            if (j < n) {
+                for (i = 0; i < n; i++) {
+                    matrix->t[j * n + i] = matrix->t[j * n + i] - cs[j] * matrix->t[k * n + i];
+                }
+            }
+        }
+    }
+
+    for (i = n - 1; i >= 0; i--) {
+        ik = (int)pc[i];
+        if (ik != i) {
+            for (j = 0; j < n; j++) {
+                tt = matrix->t[i * n + j];
+                matrix->t[i * n + j] = matrix->t[ik * n + j];
+                matrix->t[ik * n + j] = tt;
+            }
+        }
+    }
+
+    for (j = n - 1; j >= 0; j--) {
+        jk = (int)pl[j];
+        if (jk != j) {
+            for (i = 0; i < n; i++) {
+                tt = matrix->t[i * n + j];
+                matrix->t[i * n + j] = matrix->t[i * n + jk];
+                matrix->t[i * n + jk] = tt;
+            }
+        }
+    }
+
+fail:
+
+    av_free(pc);
+    av_free(pl);
+    av_free(cs);
+}
+
+static void warp_remap(WarpContext *s, Matrix *vox, Matrix *voy, int w, int h)
+{
+    for (int y = 0; y < h; y++) {
+        for (int x = 0; x < w; x++) {
+            float bx = x;
+            float by = y;
+            float dx, dy;
+
+            float ox = vox->t[s->nb_warp_points] +
+                       vox->t[s->nb_warp_points + 1] * bx +
+                       vox->t[s->nb_warp_points + 2] * by;
+            float oy = voy->t[s->nb_warp_points] +
+                       voy->t[s->nb_warp_points + 1] * bx +
+                       voy->t[s->nb_warp_points + 2] * by;
+
+            for (int i = 0; i < s->nb_warp_points; i++) {
+                float t = s->warp_points[i].x0 - bx;
+                float d = t * t;
+
+                t = s->warp_points[i].y0 - by;
+                d += t * t;
+                if (d > 0)
+                    d = d * logf(d) * 0.5f;
+                ox += vox->t[i] * d;
+                oy += voy->t[i] * d;
+            }
+
+            ox += bx;
+            oy += by;
+
+            dx = ox - floorf(ox);
+            dy = oy - floorf(oy);
+
+            ox -= dx;
+            oy -= dy;
+
+            s->du[y * s->uv_linesize + x] = FFMIN(dx * (1 << 15), INT16_MAX);
+            s->dv[y * s->uv_linesize + x] = FFMIN(dy * (1 << 15), INT16_MAX);
+
+            if (s->edge == CLIP_EDGE) {
+                s->u[y * s->uv_linesize + x] = av_clip(ox, 0, w - 1);
+                s->v[y * s->uv_linesize + x] = av_clip(oy, 0, h - 1);
+            } else if (s->edge == FIXED_EDGE) {
+                s->u[y * s->uv_linesize + x] = ox >= 0 && ox < w - 1 ? ox : -1;
+                s->v[y * s->uv_linesize + x] = oy >= 0 && oy < h - 1 ? oy : -1;
+            }
+        }
+    }
+}
+
+#define DEFINE_REMAP1_LINE(bits, div)                                                    \
+static void remap1_##bits##bit_line_c(uint8_t *dst, int width, int height,               \
+                                      const uint8_t *const src,                          \
+                                      ptrdiff_t in_linesize,                             \
+                                      const int16_t *const u, const int16_t *const v,    \
+                                      const int16_t *const du, const int16_t *const dv,  \
+                                      int fixed)                                         \
+{                                                                                        \
+    const uint##bits##_t *const s = (const uint##bits##_t *const)src;                    \
+    uint##bits##_t *d = (uint##bits##_t *)dst;                                           \
+                                                                                         \
+    in_linesize /= div;                                                                  \
+                                                                                         \
+    for (int x = 0; x < width; x++)                                                      \
+        d[x] = v[x] >= 0 && u[x] >= 0 ? s[v[x] * in_linesize + u[x]] : fixed;            \
+}
+
+DEFINE_REMAP1_LINE( 8, 1)
+DEFINE_REMAP1_LINE(16, 2)
+
+#define DEFINE_REMAP2_LINE(bits, div)                                                    \
+static void remap2_##bits##bit_line_c(uint8_t *dst, int w, int h,                        \
+                                      const uint8_t *const src,                          \
+                                      ptrdiff_t in_linesize,                             \
+                                      const int16_t *const u, const int16_t *const v,    \
+                                      const int16_t *const du, const int16_t *const dv,  \
+                                      int fixed)                                         \
+{                                                                                        \
+    const uint##bits##_t *const s = (const uint##bits##_t *const)src;                    \
+    uint##bits##_t *d = (uint##bits##_t *)dst;                                           \
+                                                                                         \
+    in_linesize /= div;                                                                  \
+                                                                                         \
+    for (int x = 0; x < w; x++) {                                                        \
+        const int mapped = v[x] >= 0 && u[x] >= 0;                                       \
+        int64_t sum = 0;                                                                 \
+                                                                                         \
+        if (!mapped) {                                                                   \
+            d[x] = fixed;                                                                \
+            continue;                                                                    \
+        }                                                                                \
+                                                                                         \
+        {                                                                                \
+            int64_t au = du[x];                                                          \
+            int64_t av = dv[x];                                                          \
+            int64_t zu = (1 << 15) - du[x];                                              \
+            int64_t zv = (1 << 15) - dv[x];                                              \
+            int ax = u[x];                                                               \
+            int ay = v[x];                                                               \
+            int bx = FFMIN(ax + 1, w - 1);                                               \
+            int by = FFMIN(ay + 1, h - 1);                                               \
+            sum += zu * zv * (s[ay * in_linesize + ax]);                                 \
+            sum += au * zv * (s[ay * in_linesize + bx]);                                 \
+            sum += zu * av * (s[by * in_linesize + ax]);                                 \
+            sum += au * av * (s[by * in_linesize + bx]);                                 \
+            d[x] = ((sum + (1LL << 29)) >> 30);                                          \
+        }                                                                                \
+    }                                                                                    \
+}
+
+DEFINE_REMAP2_LINE( 8, 1)
+DEFINE_REMAP2_LINE(16, 2)
+
+#define DEFINE_REMAP(ws)                                                                \
+static int warp##ws##_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)    \
+{                                                                                       \
+    ThreadData *td = arg;                                                               \
+    const WarpContext *s = ctx->priv;                                                   \
+    const AVFrame *in = td->in;                                                         \
+    AVFrame *out = td->out;                                                             \
+                                                                                        \
+    for (int plane = 0; plane < s->nb_planes; plane++) {                                \
+        const int in_linesize  = in->linesize[plane];                                   \
+        const int out_linesize = out->linesize[plane];                                  \
+        const int uv_linesize = s->uv_linesize;                                         \
+        const uint8_t *const src = in->data[plane];                                     \
+        uint8_t *dst = out->data[plane];                                                \
+        const int width = in->width;                                                    \
+        const int height = in->height;                                                  \
+                                                                                        \
+        const int slice_start = (height *  jobnr     ) / nb_jobs;                       \
+        const int slice_end   = (height * (jobnr + 1)) / nb_jobs;                       \
+                                                                                        \
+        for (int y = slice_start; y < slice_end; y++) {                                 \
+            const int16_t *const u = s->u + y * uv_linesize * ws * ws;                  \
+            const int16_t *const v = s->v + y * uv_linesize * ws * ws;                  \
+            const int16_t *const du = s->du + y * uv_linesize * ws * ws;                \
+            const int16_t *const dv = s->dv + y * uv_linesize * ws * ws;                \
+                                                                                        \
+            s->remap_line(dst + y * out_linesize,                                       \
+                          width, height, src, in_linesize,                              \
+                          u, v, du, dv, s->black[plane]);                               \
+        }                                                                               \
+    }                                                                                   \
+                                                                                        \
+    return 0;                                                                           \
+}
+
+DEFINE_REMAP(1)
+
+static int config_output(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    AVFilterLink *inlink = ctx->inputs[0];
+    WarpContext *s = ctx->priv;
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format);
+    const int depth = desc->comp[0].depth;
+    const int rgb = !!(desc->flags & AV_PIX_FMT_FLAG_RGB);
+    char *p = s->points_str;
+    double *new_points;
+    Matrix L = { 0 }, vx = { 0 }, vy = { 0 };
+    Matrix vox = { 0 }, voy = { 0 };
+    int sizeof_uv;
+    int ret;
+
+    s->nb_points = 0;
+    s->nb_planes = av_pix_fmt_count_planes(inlink->format);
+
+    s->black[0] = s->black[3] = 0;
+    s->black[1] = s->black[2] = rgb ? 0 : (1 << (depth - 1));
+
+    av_freep(&s->points);
+    s->points_size = 0;
+    new_points = av_fast_realloc(NULL, &s->points_size, 1 * sizeof(*s->points));
+    if (!new_points)
+        return AVERROR(ENOMEM);
+    s->points = new_points;
+
+    while (p && *p) {
+        s->points[s->nb_points++] = av_strtod(p, &p);
+        if (p && *p)
+            p++;
+
+        new_points = av_fast_realloc(s->points, &s->points_size, (s->nb_points + 1) * sizeof(*s->points));
+        if (!new_points)
+            return AVERROR(ENOMEM);
+        s->points = new_points;
+    }
+
+    if (!s->nb_points || (s->nb_points & 3))
+        return AVERROR(EINVAL);
+
+    s->nb_warp_points = s->nb_points / 4;
+    s->warp_points = av_calloc(s->nb_warp_points, sizeof(*s->warp_points));
+    if (!s->warp_points)
+        return AVERROR(ENOMEM);
+
+    for (int n = 0; n < s->nb_warp_points; n++) {
+        if (s->mode > 0) {
+            s->warp_points[n].x0 = s->points[n * 4 + 0] + s->points[n * 4 + 2];
+            s->warp_points[n].y0 = s->points[n * 4 + 1] + s->points[n * 4 + 3];
+            s->warp_points[n].x1 = -s->points[n * 4 + 2];
+            s->warp_points[n].y1 = -s->points[n * 4 + 3];
+            if (s->mode == 2) {
+                s->warp_points[n].x0 *= inlink->w;
+                s->warp_points[n].y0 *= inlink->h;
+                s->warp_points[n].x1 *= inlink->w;
+                s->warp_points[n].y1 *= inlink->h;
+            }
+        } else {
+            s->warp_points[n].x0 = s->points[n * 4 + 2];
+            s->warp_points[n].y0 = s->points[n * 4 + 3];
+            s->warp_points[n].x1 = s->points[n * 4 + 0] - s->points[n * 4 + 2];
+            s->warp_points[n].y1 = s->points[n * 4 + 1] - s->points[n * 4 + 3];
+        }
+    }
+
+    ret = create_matrix(&L, s->nb_warp_points + 3, s->nb_warp_points + 3);
+    if (ret < 0)
+        goto fail;
+
+    ret = create_matrix(&vx, s->nb_warp_points + 3, 1);
+    if (ret < 0)
+        goto fail;
+
+    ret = create_matrix(&vy, s->nb_warp_points + 3, 1);
+    if (ret < 0)
+        goto fail;
+
+    ret = create_matrix(&vox, s->nb_warp_points + 3, 1);
+    if (ret < 0)
+        goto fail;
+
+    ret = create_matrix(&voy, s->nb_warp_points + 3, 1);
+    if (ret < 0)
+        goto fail;
+
+    for (int i = 0; i < s->nb_warp_points; i++) {
+        for (int j = 0; j < s->nb_warp_points; j++) {
+            float t = s->warp_points[i].x0 - s->warp_points[j].x0;
+            float d = t * t;
+
+            t  = s->warp_points[i].y0 - s->warp_points[j].y0;
+            d += t * t;
+            if (d > 0)
+                L.t[i * L.n + j] = d * logf(d) * 0.5f;
+        }
+
+        L.t[i * L.n + s->nb_warp_points] = 1;
+        L.t[i * L.n + s->nb_warp_points + 1] = s->warp_points[i].x0;
+        L.t[i * L.n + s->nb_warp_points + 2] = s->warp_points[i].y0;
+
+        L.t[s->nb_warp_points * L.n + i] = 1;
+        L.t[(s->nb_warp_points + 1) * L.n + i] = s->warp_points[i].x0;
+        L.t[(s->nb_warp_points + 2) * L.n + i] = s->warp_points[i].y0;
+
+        vx.t[i] = s->warp_points[i].x1;
+        vy.t[i] = s->warp_points[i].y1;
+    }
+
+    inverse(&L);
+
+    multiply(&L, &vx, &vox);
+    multiply(&L, &vy, &voy);
+
+    s->elements = 1;
+    sizeof_uv = sizeof(int16_t) * s->elements;
+    s->uv_linesize = FFALIGN(inlink->w, 8);
+
+    s->u  = av_calloc(s->uv_linesize * inlink->h, sizeof_uv);
+    s->v  = av_calloc(s->uv_linesize * inlink->h, sizeof_uv);
+    s->du = av_calloc(s->uv_linesize * inlink->h, sizeof_uv);
+    s->dv = av_calloc(s->uv_linesize * inlink->h, sizeof_uv);
+    if (!s->u || !s->v || !s->du || !s->dv) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    warp_remap(s, &vox, &voy, inlink->w, inlink->h);
+
+    s->warp_slice = warp1_slice;
+    s->remap_line = depth <= 8 ? remap1_8bit_line_c : remap1_16bit_line_c;
+    if (s->interpolation == 1)
+        s->remap_line = depth <= 8 ? remap2_8bit_line_c : remap2_16bit_line_c;
+
+fail:
+
+    free_matrix(&L);
+    free_matrix(&vx);
+    free_matrix(&vy);
+    free_matrix(&vox);
+    free_matrix(&voy);
+
+    return ret;
+}
+
+static int filter_frame(AVFilterLink *inlink, AVFrame *in)
+{
+    AVFilterContext *ctx = inlink->dst;
+    AVFilterLink *outlink = ctx->outputs[0];
+    WarpContext *s = ctx->priv;
+    AVFrame *out;
+    ThreadData td;
+
+    out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+    if (!out) {
+        av_frame_free(&in);
+        return AVERROR(ENOMEM);
+    }
+    av_frame_copy_props(out, in);
+
+    td.in = in;
+    td.out = out;
+
+    ff_filter_execute(ctx, s->warp_slice, &td, NULL, FFMIN(outlink->h, ff_filter_get_nb_threads(ctx)));
+
+    av_frame_free(&in);
+    return ff_filter_frame(outlink, out);
+}
+
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    WarpContext *s = ctx->priv;
+
+    av_freep(&s->points);
+    s->points_size = 0;
+    av_freep(&s->warp_points);
+    s->nb_warp_points = 0;
+
+    av_freep(&s->u);
+    av_freep(&s->v);
+    av_freep(&s->du);
+    av_freep(&s->dv);
+}
+
+static const AVFilterPad warp_inputs[] = {
+    {
+        .name = "default",
+        .type = AVMEDIA_TYPE_VIDEO,
+        .filter_frame = filter_frame,
+    },
+};
+
+static const AVFilterPad warp_outputs[] = {
+    {
+        .name         = "default",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .config_props = config_output,
+    },
+};
+
+const AVFilter ff_vf_warp = {
+    .name          = "warp",
+    .description   = NULL_IF_CONFIG_SMALL("Warp pixels."),
+    .priv_size     = sizeof(WarpContext),
+    .uninit        = uninit,
+    FILTER_INPUTS(warp_inputs),
+    FILTER_OUTPUTS(warp_outputs),
+    FILTER_PIXFMTS_ARRAY(pix_fmts),
+    .priv_class    = &warp_class,
+    .flags         = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC | AVFILTER_FLAG_SLICE_THREADS,
+};
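
The bilinear path (remap2_*_line_c) blends four neighbours with 15-bit
fixed-point weights, so the four weight products always sum to 1 << 30
and the result is rounded with a single shift. A minimal standalone
sketch of that arithmetic for 8-bit samples follows. bilinear_q15() is a
hypothetical helper written for this note, not a function from the
patch, and it assumes the caller passes in-range coordinates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: blend the 2x2 neighbourhood at (ax, ay) using
 * 15-bit fractions du, dv in [0, 1 << 15), as the remap2 macro does. */
static uint8_t bilinear_q15(const uint8_t *src, ptrdiff_t linesize,
                            int w, int h, int ax, int ay, int du, int dv)
{
    const int64_t au = du;               /* weight of the right column */
    const int64_t av = dv;               /* weight of the bottom row   */
    const int64_t zu = (1 << 15) - au;   /* weight of the left column  */
    const int64_t zv = (1 << 15) - av;   /* weight of the top row      */
    const int bx = ax + 1 < w ? ax + 1 : w - 1; /* clamp to last column */
    const int by = ay + 1 < h ? ay + 1 : h - 1; /* clamp to last row    */
    int64_t sum = 0;

    assert(ax >= 0 && ax < w && ay >= 0 && ay < h);
    assert(du >= 0 && du < (1 << 15) && dv >= 0 && dv < (1 << 15));

    sum += zu * zv * src[ay * linesize + ax];
    sum += au * zv * src[ay * linesize + bx];
    sum += zu * av * src[by * linesize + ax];
    sum += au * av * src[by * linesize + bx];

    /* The weights sum to 1 << 30; add half of that before shifting to round. */
    return (uint8_t)((sum + (1LL << 29)) >> 30);
}
```

For a 2x2 image { 10, 20, 30, 40 } with linesize 2, sampling at (0, 0)
with du = dv = 1 << 14 (one half) averages all four samples, and
sampling at (1, 1) hits the clamped corner and returns 40 regardless of
the fractions.
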