From patchwork Tue Mar 31 13:51:05 2020
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 18543
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 31 Mar 2020 15:51:05 +0200
Message-Id: <20200331135106.32490-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 1/2] avfilter/vf_v360: add lagrange interpolation

Signed-off-by: Paul B Mahol
---
 doc/filters.texi      |  2 ++
 libavfilter/v360.h    |  1 +
 libavfilter/vf_v360.c | 56 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index 44d41a87cf..8827aac316 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -19105,6 +19105,8 @@ Nearest neighbour.
 @item line
 @item linear
 Bilinear interpolation.
+@item lagrange
+Lagrange interpolation.
 @item cube
 @item cubic
 Bicubic interpolation.
diff --git a/libavfilter/v360.h b/libavfilter/v360.h
index f2f1a47144..1d4098e5c1 100644
--- a/libavfilter/v360.h
+++ b/libavfilter/v360.h
@@ -57,6 +57,7 @@ enum Projections {
 enum InterpMethod {
     NEAREST,
     BILINEAR,
+    LAGRANGE,
     BICUBIC,
     LANCZOS,
     SPLINE16,
diff --git a/libavfilter/vf_v360.c b/libavfilter/vf_v360.c
index 54d4d23825..ca8e55e32c 100644
--- a/libavfilter/vf_v360.c
+++ b/libavfilter/vf_v360.c
@@ -112,6 +112,7 @@ static const AVOption v360_options[] = {
     { "nearest", "nearest neighbour", 0, AV_OPT_TYPE_CONST, {.i64=NEAREST}, 0, 0, FLAGS, "interp" },
     { "line", "bilinear interpolation", 0, AV_OPT_TYPE_CONST, {.i64=BILINEAR}, 0, 0, FLAGS, "interp" },
     { "linear", "bilinear interpolation", 0, AV_OPT_TYPE_CONST, {.i64=BILINEAR}, 0, 0, FLAGS, "interp" },
+    { "lagrange", "lagrange interpolation", 0, AV_OPT_TYPE_CONST, {.i64=LAGRANGE}, 0, 0, FLAGS, "interp" },
     { "cube", "bicubic interpolation", 0, AV_OPT_TYPE_CONST, {.i64=BICUBIC}, 0, 0, FLAGS, "interp" },
     { "cubic", "bicubic interpolation", 0, AV_OPT_TYPE_CONST, {.i64=BICUBIC}, 0, 0, FLAGS, "interp" },
     { "lanc", "lanczos interpolation", 0, AV_OPT_TYPE_CONST, {.i64=LANCZOS}, 0, 0, FLAGS, "interp" },
@@ -313,9 +314,11 @@ static int remap##ws##_##bits##bit_slice(AVFilterContext *ctx, void *arg, int jo
 
 DEFINE_REMAP(1, 8)
 DEFINE_REMAP(2, 8)
+DEFINE_REMAP(3, 8)
 DEFINE_REMAP(4, 8)
 DEFINE_REMAP(1, 16)
 DEFINE_REMAP(2, 16)
+DEFINE_REMAP(3, 16)
 DEFINE_REMAP(4, 16)
 
 #define DEFINE_REMAP_LINE(ws, bits, div) \
@@ -346,8 +349,10 @@ static void remap##ws##_##bits##bit_line_c(uint8_t *dst, int width, const uint8_
 }
 
 DEFINE_REMAP_LINE(2, 8, 1)
+DEFINE_REMAP_LINE(3, 8, 1)
 DEFINE_REMAP_LINE(4, 8, 1)
 DEFINE_REMAP_LINE(2, 16, 2)
+DEFINE_REMAP_LINE(3, 16, 2)
 DEFINE_REMAP_LINE(4, 16, 2)
 
 void ff_v360_init(V360Context *s, int depth)
@@ -359,6 +364,9 @@ void ff_v360_init(V360Context *s, int depth)
     case BILINEAR:
         s->remap_line = depth <= 8 ? remap2_8bit_line_c : remap2_16bit_line_c;
         break;
+    case LAGRANGE:
+        s->remap_line = depth <= 8 ? remap3_8bit_line_c : remap3_16bit_line_c;
+        break;
     case BICUBIC:
     case LANCZOS:
     case SPLINE16:
@@ -417,6 +425,47 @@ static void bilinear_kernel(float du, float dv, const XYRemap *rmap,
     ker[3] = lrintf( du * dv * 16385.f);
 }
 
+/**
+ * Calculate 1-dimensional lagrange coefficients.
+ *
+ * @param t relative coordinate
+ * @param coeffs coefficients
+ */
+static inline void calculate_lagrange_coeffs(float t, float *coeffs)
+{
+    coeffs[0] = (t - 1.f) * (t - 2.f) * 0.5f;
+    coeffs[1] = -t * (t - 2.f);
+    coeffs[2] = t * (t - 1.f) * 0.5f;
+}
+
+/**
+ * Calculate kernel for lagrange interpolation.
+ *
+ * @param du horizontal relative coordinate
+ * @param dv vertical relative coordinate
+ * @param rmap calculated 4x4 window
+ * @param u u remap data
+ * @param v v remap data
+ * @param ker ker remap data
+ */
+static void lagrange_kernel(float du, float dv, const XYRemap *rmap,
+                            int16_t *u, int16_t *v, int16_t *ker)
+{
+    float du_coeffs[3];
+    float dv_coeffs[3];
+
+    calculate_lagrange_coeffs(du, du_coeffs);
+    calculate_lagrange_coeffs(dv, dv_coeffs);
+
+    for (int i = 0; i < 3; i++) {
+        for (int j = 0; j < 3; j++) {
+            u[i * 3 + j] = rmap->u[i+1][j+1];
+            v[i * 3 + j] = rmap->v[i+1][j+1];
+            ker[i * 3 + j] = lrintf(du_coeffs[j] * dv_coeffs[i] * 16385.f);
+        }
+    }
+}
+
 /**
  * Calculate 1-dimensional cubic coefficients.
  *
@@ -3689,6 +3738,13 @@ static int config_output(AVFilterLink *outlink)
         sizeof_uv = sizeof(int16_t) * s->elements;
         sizeof_ker = sizeof(int16_t) * s->elements;
         break;
+    case LAGRANGE:
+        s->calculate_kernel = lagrange_kernel;
+        s->remap_slice = depth <= 8 ? remap3_8bit_slice : remap3_16bit_slice;
+        s->elements = 3 * 3;
+        sizeof_uv = sizeof(int16_t) * s->elements;
+        sizeof_ker = sizeof(int16_t) * s->elements;
+        break;
     case BICUBIC:
         s->calculate_kernel = bicubic_kernel;
         s->remap_slice = depth <= 8 ? remap4_8bit_slice : remap4_16bit_slice;

From patchwork Tue Mar 31 13:51:06 2020
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 18542
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 31 Mar 2020 15:51:06 +0200
Message-Id: <20200331135106.32490-2-onemda@gmail.com>
In-Reply-To: <20200331135106.32490-1-onemda@gmail.com>
References: <20200331135106.32490-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/2] avfilter/vf_v360: add SIMD for lagrange interpolation

Signed-off-by: Paul B Mahol
---
 libavfilter/x86/vf_v360.asm    | 46 ++++++++++++++++++++++++++++++++++
 libavfilter/x86/vf_v360_init.c |  6 +++++
 2 files changed, 52 insertions(+)

diff --git a/libavfilter/x86/vf_v360.asm b/libavfilter/x86/vf_v360.asm
index 5b241220d8..e1908e5e71 100644
--- a/libavfilter/x86/vf_v360.asm
+++ b/libavfilter/x86/vf_v360.asm
@@ -165,6 +165,52 @@ DEFINE_ARGS dst, width, src, x, u, v, ker
 
 %if ARCH_X86_64
 
+INIT_YMM avx2
+cglobal remap3_8bit_line, 7, 11, 8, dst, width, src, in_linesize, u, v, ker, x, y, tmp, z
+    movsxdifnidn widthq, widthd
+    xor             zq, zq
+    xor             yq, yq
+    xor             xq, xq
+    movd           xm0, in_linesized
+    pcmpeqw         m7, m7
+    vpbroadcastd    m0, xm0
+    vpbroadcastd    m6, [pd_255]
+
+    .loop:
+        pmovsxwd    m1, [kerq + yq]
+        pmovsxwd    m2, [vq + yq]
+        pmovsxwd    m3, [uq + yq]
+
+        pmulld      m4, m2, m0
+        paddd       m4, m3
+        mova        m3, m7
+        vpgatherdd  m2, [srcq + m4], m3
+        pand        m2, m6
+        pmulld      m2, m1
+        phaddd      m2, m2
+        phaddd      m1, m2, m2
+        vextracti128 xm2, m1, 1
+        paddd       m2, m1
+        movzx     tmpq, word [vq + yq + 16]
+        imul      tmpq, in_linesizeq
+        movzx       zq, word [uq + yq + 16]
+        add       tmpq, zq
+        movzx       zq, byte [srcq + tmpq]
+        movzx     tmpq, word [kerq + yq + 16]
+        imul        zq, tmpq
+        movd       xm1, zd
+        paddd       m2, m1
+        psrld       m2, m2, 0xe
+
+        packuswb    m2, m2
+        pextrb      [dstq+xq], xm2, 0
+
+        add         xq, 1
+        add         yq, 18
+        cmp         xq, widthq
+        jl .loop
+    RET
+
 INIT_YMM avx2
 cglobal remap4_8bit_line, 7, 9, 11, dst, width, src, in_linesize, u, v, ker, x, y
     movsxdifnidn widthq, widthd
diff --git a/libavfilter/x86/vf_v360_init.c b/libavfilter/x86/vf_v360_init.c
index babc6c426a..83f58bb96a 100644
--- a/libavfilter/x86/vf_v360_init.c
+++ b/libavfilter/x86/vf_v360_init.c
@@ -29,6 +29,9 @@ void ff_remap1_8bit_line_avx2(uint8_t *dst, int width, const uint8_t *src, ptrdi
 void ff_remap2_8bit_line_avx2(uint8_t *dst, int width, const uint8_t *src, ptrdiff_t in_linesize,
                               const int16_t *const u, const int16_t *const v, const int16_t *const ker);
 
+void ff_remap3_8bit_line_avx2(uint8_t *dst, int width, const uint8_t *src, ptrdiff_t in_linesize,
+                              const int16_t *const u, const int16_t *const v, const int16_t *const ker);
+
 void ff_remap4_8bit_line_avx2(uint8_t *dst, int width, const uint8_t *src, ptrdiff_t in_linesize,
                               const int16_t *const u, const int16_t *const v, const int16_t *const ker);
 
@@ -48,6 +51,9 @@ av_cold void ff_v360_init_x86(V360Context *s, int depth)
     if (EXTERNAL_AVX2_FAST(cpu_flags) && s->interp == BILINEAR && depth <= 8)
         s->remap_line = ff_remap2_8bit_line_avx2;
 
+    if (EXTERNAL_AVX2_FAST(cpu_flags) && s->interp == LAGRANGE && depth <= 8)
+        s->remap_line = ff_remap3_8bit_line_avx2;
+
     if (EXTERNAL_AVX2_FAST(cpu_flags) && s->interp == NEAREST && depth > 8)
         s->remap_line = ff_remap1_16bit_line_avx2;
 