From patchwork Sun Nov 12 19:15:14 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 6018
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 12 Nov 2017 20:15:14 +0100
Message-Id: <20171112191514.25142-1-onemda@gmail.com>
X-Mailer: git-send-email 2.11.0
Subject: [FFmpeg-devel] [PATCH] avfilter/vf_threshold: add x86 SIMD

Signed-off-by: Paul B Mahol
---
 libavfilter/threshold.h             | 51 +++++++++++++++++++++++++++
 libavfilter/vf_threshold.c          | 32 +++++------------
 libavfilter/x86/Makefile            |  2 ++
 libavfilter/x86/vf_threshold.asm    | 69 +++++++++++++++++++++++++++++++++++++
 libavfilter/x86/vf_threshold_init.c | 41 ++++++++++++++++++++++
 5 files changed, 171 insertions(+), 24 deletions(-)
 create mode 100644 libavfilter/threshold.h
 create mode 100644 libavfilter/x86/vf_threshold.asm
 create mode 100644 libavfilter/x86/vf_threshold_init.c

diff --git a/libavfilter/threshold.h b/libavfilter/threshold.h
new file mode 100644
index 0000000000..8b55ad6ba1
--- /dev/null
+++ b/libavfilter/threshold.h
@@ -0,0 +1,51 @@
+/*
+ * Copyright (c) 2016 Paul B Mahol
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFILTER_THRESHOLD_H
+#define AVFILTER_THRESHOLD_H
+
+#include "avfilter.h"
+#include "framesync.h"
+
+typedef struct ThresholdContext {
+    const AVClass *class;
+
+    int depth;
+    int planes;
+    int bpc;
+
+    int nb_planes;
+    int width[4], height[4];
+
+    void (*threshold)(const uint8_t *in, const uint8_t *threshold,
+                      const uint8_t *min, const uint8_t *max,
+                      uint8_t *out,
+                      ptrdiff_t ilinesize, ptrdiff_t tlinesize,
+                      ptrdiff_t flinesize, ptrdiff_t slinesize,
+                      ptrdiff_t olinesize,
+                      int w, int h);
+
+    AVFrame *frames[4];
+    FFFrameSync fs;
+} ThresholdContext;
+
+void ff_threshold_init_x86(ThresholdContext *s);
+
+#endif /* AVFILTER_THRESHOLD_H */
diff --git a/libavfilter/vf_threshold.c b/libavfilter/vf_threshold.c
index 88f6ef28d7..4183b353d2 100644
--- a/libavfilter/vf_threshold.c
+++ b/libavfilter/vf_threshold.c
@@ -31,27 +31,7 @@
 #include "framesync.h"
 #include "internal.h"
 #include "video.h"
-
-typedef struct ThresholdContext {
-    const AVClass *class;
-
-    int planes;
-    int bpc;
-
-    int nb_planes;
-    int width[4], height[4];
-
-    void (*threshold)(const uint8_t *in, const uint8_t *threshold,
-                      const uint8_t *min, const uint8_t *max,
-                      uint8_t *out,
-                      ptrdiff_t ilinesize, ptrdiff_t tlinesize,
-                      ptrdiff_t flinesize, ptrdiff_t slinesize,
-                      ptrdiff_t olinesize,
-                      int w, int h);
-
-    AVFrame *frames[4];
-    FFFrameSync fs;
-} ThresholdContext;
+#include "threshold.h"
 
 #define OFFSET(x) offsetof(ThresholdContext, x)
 #define FLAGS AV_OPT_FLAG_VIDEO_PARAM|AV_OPT_FLAG_FILTERING_PARAM
@@ -155,7 +135,7 @@ static void threshold8(const uint8_t *in, const uint8_t *threshold,
         in += ilinesize;
         threshold += tlinesize;
         min += flinesize;
-        max += flinesize;
+        max += slinesize;
         out += olinesize;
     }
 }
@@ -183,7 +163,7 @@ static void threshold16(const uint8_t *iin, const uint8_t *tthreshold,
         in += ilinesize / 2;
         threshold += tlinesize / 2;
         min += flinesize / 2;
-        max += flinesize / 2;
+        max += slinesize / 2;
         out += olinesize / 2;
     }
 }
@@ -203,8 +183,9 @@ static int config_input(AVFilterLink *inlink)
     s->height[0] = s->height[3] = inlink->h;
     s->width[1] = s->width[2] = AV_CEIL_RSHIFT(inlink->w, hsub);
     s->width[0] = s->width[3] = inlink->w;
+    s->depth = desc->comp[0].depth;
 
-    if (desc->comp[0].depth == 8) {
+    if (s->depth == 8) {
         s->threshold = threshold8;
         s->bpc = 1;
     } else {
@@ -212,6 +193,9 @@ static int config_input(AVFilterLink *inlink)
         s->bpc = 2;
     }
 
+    if (ARCH_X86)
+        ff_threshold_init_x86(s);
+
     return 0;
 }
 
diff --git a/libavfilter/x86/Makefile b/libavfilter/x86/Makefile
index 3431625883..c10f4d5538 100644
--- a/libavfilter/x86/Makefile
+++ b/libavfilter/x86/Makefile
@@ -20,6 +20,7 @@ OBJS-$(CONFIG_SPP_FILTER) += x86/vf_spp.o
 OBJS-$(CONFIG_SSIM_FILTER) += x86/vf_ssim_init.o
 OBJS-$(CONFIG_STEREO3D_FILTER) += x86/vf_stereo3d_init.o
 OBJS-$(CONFIG_TBLEND_FILTER) += x86/vf_blend_init.o
+OBJS-$(CONFIG_THRESHOLD_FILTER) += x86/vf_threshold_init.o
 OBJS-$(CONFIG_TINTERLACE_FILTER) += x86/vf_tinterlace_init.o
 OBJS-$(CONFIG_VOLUME_FILTER) += x86/af_volume_init.o
 OBJS-$(CONFIG_W3FDIF_FILTER) += x86/vf_w3fdif_init.o
@@ -46,6 +47,7 @@ X86ASM-OBJS-$(CONFIG_SHOWCQT_FILTER) += x86/avf_showcqt.o
 X86ASM-OBJS-$(CONFIG_SSIM_FILTER) += x86/vf_ssim.o
 X86ASM-OBJS-$(CONFIG_STEREO3D_FILTER) += x86/vf_stereo3d.o
 X86ASM-OBJS-$(CONFIG_TBLEND_FILTER) += x86/vf_blend.o
+X86ASM-OBJS-$(CONFIG_THRESHOLD_FILTER) += x86/vf_threshold.o
 X86ASM-OBJS-$(CONFIG_TINTERLACE_FILTER) += x86/vf_interlace.o
 X86ASM-OBJS-$(CONFIG_VOLUME_FILTER) += x86/af_volume.o
 X86ASM-OBJS-$(CONFIG_W3FDIF_FILTER) += x86/vf_w3fdif.o
diff --git a/libavfilter/x86/vf_threshold.asm b/libavfilter/x86/vf_threshold.asm
new file mode 100644
index 0000000000..9db2f89aa8
--- /dev/null
+++ b/libavfilter/x86/vf_threshold.asm
@@ -0,0 +1,69 @@
+;*****************************************************************************
+;* x86-optimized functions for threshold filter
+;*
+;* Copyright (C) 2017 Paul B Mahol
+;*
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free software; you can redistribute it and/or
+;* modify it under the terms of the GNU Lesser General Public
+;* License as published by the Free Software Foundation; either
+;* version 2.1 of the License, or (at your option) any later version.
+;*
+;* FFmpeg is distributed in the hope that it will be useful,
+;* but WITHOUT ANY WARRANTY; without even the implied warranty of
+;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+;* Lesser General Public License for more details.
+;*
+;* You should have received a copy of the GNU Lesser General Public
+;* License along with FFmpeg; if not, write to the Free Software
+;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+;*****************************************************************************
+
+%include "libavutil/x86/x86util.asm"
+
+%if ARCH_X86_64
+
+SECTION_RODATA
+
+pb_128: times 16 db 128
+
+SECTION .text
+
+INIT_XMM sse4
+cglobal threshold8, 13, 13, 8, in, threshold, min, max, out, ilinesize, tlinesize, flinesize, slinesize, olinesize, w, h, x
+    mov wd, dword wm
+    mov hd, dword hm
+    mova m7, [pb_128]
+    add inq, wq
+    add thresholdq, wq
+    add minq, wq
+    add maxq, wq
+    add outq, wq
+    neg wq
+.nextrow:
+    mov xq, wq
+
+    .loop:
+        movu m1, [inq + xq]
+        movu m0, [thresholdq + xq]
+        movu m2, [minq + xq]
+        movu m3, [maxq + xq]
+        pxor m0, m7
+        pxor m1, m7
+        pcmpgtb m0, m1
+        pblendvb m3, m2, m0
+        movu [outq + xq], m3
+        add xq, mmsize
+        jl .loop
+
+    add inq, ilinesizeq
+    add thresholdq, tlinesizeq
+    add minq, flinesizeq
+    add maxq, slinesizeq
+    add outq, olinesizeq
+    sub hd, 1
+    jg .nextrow
+REP_RET
+
+%endif
diff --git a/libavfilter/x86/vf_threshold_init.c b/libavfilter/x86/vf_threshold_init.c
new file mode 100644
index 0000000000..e2bbae11d5
--- /dev/null
+++ b/libavfilter/x86/vf_threshold_init.c
@@ -0,0 +1,41 @@
+/*
+ * Copyright (c) 2015 Paul B Mahol
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/x86/cpu.h"
+#include "libavfilter/threshold.h"
+
+void ff_threshold8_sse4(const uint8_t *in, const uint8_t *threshold,
+                        const uint8_t *min, const uint8_t *max,
+                        uint8_t *out,
+                        ptrdiff_t ilinesize, ptrdiff_t tlinesize,
+                        ptrdiff_t flinesize, ptrdiff_t slinesize,
+                        ptrdiff_t olinesize,
+                        int w, int h);
+
+av_cold void ff_threshold_init_x86(ThresholdContext *s)
+{
+    int cpu_flags = av_get_cpu_flags();
+
+    if (ARCH_X86_64 && EXTERNAL_SSE4(cpu_flags) && s->depth == 8) {
+        s->threshold = ff_threshold8_sse4;
+    }
+}
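
Note on the SSE4 kernel: pcmpgtb only compares signed bytes, so the asm XORs both the input and the threshold with pb_128 (0x80) before comparing, which turns the signed compare into an unsigned one. The row loop also points every pointer at the end of the row and counts a negative index up to zero. The following is a minimal scalar sketch of that per-byte logic, not code from the patch. The names threshold8_ref, threshold8_model, as_signed8 and threshold8_mismatches are hypothetical, and the "in < threshold selects min" reference is an assumption read off the asm, since the C comparison itself is not visible in the hunks. The sketch models one row only and ignores the 16-byte vector width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed reference (inferred from the asm): min where in < threshold, else max. */
static void threshold8_ref(const uint8_t *in, const uint8_t *thr,
                           const uint8_t *min, const uint8_t *max,
                           uint8_t *out, int w)
{
    for (int x = 0; x < w; x++)
        out[x] = in[x] < thr[x] ? min[x] : max[x];
}

/* Reinterpret a byte as a signed 8-bit value without relying on casts. */
static int as_signed8(uint8_t v)
{
    return v < 128 ? v : (int)v - 256;
}

/* Hypothetical scalar model of one row of the SSE4 loop. */
static void threshold8_model(const uint8_t *in, const uint8_t *thr,
                             const uint8_t *min, const uint8_t *max,
                             uint8_t *out, int w)
{
    /* Like the asm: point past the row end, then index from -w up to 0. */
    in += w; thr += w; min += w; max += w; out += w;
    for (ptrdiff_t x = -(ptrdiff_t)w; x < 0; x++) {
        /* pxor with pb_128: flip the top bit of both operands. */
        int t = as_signed8((uint8_t)(thr[x] ^ 0x80));
        int i = as_signed8((uint8_t)(in[x] ^ 0x80));
        /* pcmpgtb m0, m1: all-ones where biased threshold > biased input. */
        uint8_t mask = t > i ? 0xFF : 0x00;
        /* pblendvb m3, m2, m0: take min where the mask's top bit is set. */
        out[x] = (mask & 0x80) ? min[x] : max[x];
    }
}

/* Compare both versions over all 256 x 256 (in, threshold) pairs. */
static int threshold8_mismatches(void)
{
    int bad = 0;
    for (int i = 0; i < 256; i++) {
        for (int t = 0; t < 256; t++) {
            uint8_t in = (uint8_t)i, thr = (uint8_t)t;
            uint8_t lo = 16, hi = 235, a = 0, b = 0;
            threshold8_ref(&in, &thr, &lo, &hi, &a, 1);
            threshold8_model(&in, &thr, &lo, &hi, &b, 1);
            bad += a != b;
        }
    }
    return bad;
}
```

A caller could check the equivalence with assert(threshold8_mismatches() == 0). Dropping the two XORs from the model would make inputs of 128 and above compare as negative, which is the case the pb_128 bias guards against.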