From patchwork Thu Apr 13 02:53:56 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 3385
Delivered-To: ffmpegpatchwork@gmail.com
Received: by 10.103.3.129 with SMTP id 123csp515818vsd;
 Wed, 12 Apr 2017 19:54:10 -0700 (PDT)
X-Received: by 10.28.55.3 with SMTP id e3mr20373974wma.15.1492052050672;
 Wed, 12 Apr 2017 19:54:10 -0700 (PDT)
Return-Path:
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
 by mx.google.com with ESMTP id b67si10851874wmg.95.2017.04.12.19.54.10;
 Wed, 12 Apr 2017 19:54:10 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
 dkim=neutral (body hash did not verify) header.i=@gmail.com;
 spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates
 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org;
 dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=gmail.com
Received: from [127.0.1.1] (localhost [127.0.0.1])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E680C689913;
 Thu, 13 Apr 2017 05:54:00 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mail-qt0-f173.google.com (mail-qt0-f173.google.com
 [209.85.216.173]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id
 5DFA6688283 for ; Thu, 13 Apr 2017 05:53:54 +0300 (EEST)
Received: by mail-qt0-f173.google.com with SMTP id v3so36802267qtd.3 for ;
 Wed, 12 Apr 2017 19:54:00 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:from:message-id:date:user-agent:mime-version
 :in-reply-to:content-language:content-transfer-encoding;
 bh=yVN0o3YfPllP/l7G6g82mPcKV4vSbec0iQDsSy6iJug=;
 b=gLCzXtTiaSjZE0WwyiV3KxR60NZuNDU3f4aP8oA39YZQvPHQ22oTU1tKTaF8gDLkiC
 c3wnS82mVW2hAaOK3eh6bEbNTNpL/UeFLJLTF7dT43sIIDp/NYF75AfOnjooGSxIn1Bw
 Hub8eO3DFodxviCdboj4yioQ8W0FzjXBzplxw95NFqJnkyPmVqMlpfi86R5y6lKG3Psq
 SsGjjN5T6q4uW3OiXu+Bh8JKE6p+AnOCWiyI/YQN7myv1ticpkrifOwhw4jEFwB9UuYK
 zm+Ae0hJoQk3kOTmSacFG0YSqzBbZISqkFPmUFxeEQUNU7Nt7dCT4yRN+wvhCwYv4+94
 l/aQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net;
 s=20161025;
 h=x-gm-message-state:subject:to:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=yVN0o3YfPllP/l7G6g82mPcKV4vSbec0iQDsSy6iJug=;
 b=FJwSPV65mHP15cXS7uaiKSybod+ejxLMnFRvg5ty4Ex7jK492Aoezy2BVs7bQB61Et
 QgRDn/wLN8ru/RTP7bc5LvxgcQssEprHMKC7BOnhGTkSkUoVirWGM6YnJXWysIZKP5o6
 Xa0ngHvtglRQCip2ZQOYGAMz6QI/yKGGOHvwM8Z+UX62N956H/wY06RVgbcfXoCfWTZ2
 1Fmfc3+OhPfthOr3W3PWxAhOujcoi5LZn3OhkOK6eqvdxtBB/PLyuPCWHUTbrAuLxS6h
 W9pO1nPpGu9TX4bHZd1a390hpvXJaFVu/862nYzdiZ38yXjnvx79opoMnGHZIYz1zW6V
 3z6A==
X-Gm-Message-State: AN3rC/4TS5UF0m4Q4G2eR4ruxIKxVdanmLfZWedQR0Ph0TYl+mMWCjN5
 6PzjdK49/8lbu7bu
X-Received: by 10.200.41.75 with SMTP id z11mr660563qtz.183.1492052038902;
 Wed, 12 Apr 2017 19:53:58 -0700 (PDT)
Received: from [192.168.0.4] ([181.231.62.139]) by smtp.googlemail.com with
 ESMTPSA id e9sm1754005qkb.23.2017.04.12.19.53.57 for
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Wed, 12 Apr 2017 19:53:58 -0700 (PDT)
To: FFmpeg development discussions and patches
References: <2076920013.1351932.1492043967488.ref@mail.yahoo.com>
 <2076920013.1351932.1492043967488@mail.yahoo.com>
From: James Almer
Message-ID: <30e1d9c4-0fff-3c36-4b4b-49e662ccda91@gmail.com>
Date: Wed, 12 Apr 2017 23:53:56 -0300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.0
MIME-Version: 1.0
In-Reply-To: <2076920013.1351932.1492043967488@mail.yahoo.com>
Content-Language: en-US
Subject: Re: [FFmpeg-devel] [PATCH v2 2/2] avfilter/interlace: add complex
 vertical low-pass filter
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

On 4/12/2017 9:39 PM, Thomas Mundt wrote:
>>>> Michael Niedermayer wrote on Wed, 12.4.2017:
> On Thu, Mar 30, 2017 at 12:21:58AM +0000, Thomas Mundt wrote:
>>>>> Lou Logan wrote on Thu, 30.3.2017:
>>>> On Mon, 13 Mar 2017 16:23:46 +0000 (UTC)
>>>> Thomas Mundt wrote:
>>>>
>>>> [...]
>>>>> index 09ca4d3..0b5b858 100644
>>>>> --- a/libavfilter/vf_tinterlace.c
>>>>> +++ b/libavfilter/vf_tinterlace.c
>>>> [...]
>>>>> +static void lowpass_line_complex_c(uint8_t *dstp, ptrdiff_t width, const uint8_t *srcp,
>>>>> +                                   ptrdiff_t mref, ptrdiff_t pref)
>>>>
>>>> Trailing whitespace should be avoided. It prevents the patch from being
>>>> applied.
>>>
>>> Oh, didn't notice. Thanks.
>>> New patch set attached.
>>
>> [...]
>>> --- a/libavfilter/x86/vf_interlace.asm
>>> +++ b/libavfilter/x86/vf_interlace.asm
>>> @@ -28,33 +28,28 @@ SECTION_RODATA
>>> SECTION .text
>>>
>>> %macro LOWPASS_LINE 0
>>> -cglobal lowpass_line, 5, 5, 7
>>> -    add r0, r1
>>> -    add r2, r1
>
> [...]
>>> -    add r1, 2*mmsize
>>> -    jl .loop
>>> +    add dstq, 2*mmsize
>>> +    add srcq, 2*mmsize
>>> +    sub hd, 2*mmsize
>>> +    jg .loop
>>
>> this increases the number of instructions in the inner loop by 2
>
> James Almer suggested changing the function prototype, which was easy in C, but for SIMD this is the best I can do.

I didn't check, but I think the reason I told you to change the prototype
here was to share the function pointer with lowpass_line_complex, so you
can do something like

    if (tinterlace->flags & TINTERLACE_FLAG_VLPF)
        tinterlace->lowpass_line = lowpass_line_c;
    else if (tinterlace->flags & TINTERLACE_FLAG_CVLPF)
        tinterlace->lowpass_line = lowpass_line_complex_c;

instead of adding a new one to InterlaceContext and TInterlaceContext.
Otherwise you wouldn't really gain much by changing the prototype for the
linear filter here.

> I asked for help a month ago but got no reply. Can you tell me how to avoid this?

Yes, sorry, I kinda lost track of this since for some reason each of your
emails starts a new thread instead of showing up as a reply.

You just need to turn mref and pref into the equivalent of the old
srcp_above and srcp_below pointers, as in the diff at the end of this mail.

>
>> also can you add a fate test for the -1 2 6 2 -1 filter ?
>
> Sure. I never wrote a fate test and I'm off for a couple of days, so this could take some time. Can you give me a hint or an example?
>
> Regards,
> Thomas
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

diff --git a/libavfilter/x86/vf_interlace.asm b/libavfilter/x86/vf_interlace.asm
index f70c700965..8a0dd3bdea 100644
--- a/libavfilter/x86/vf_interlace.asm
+++ b/libavfilter/x86/vf_interlace.asm
@@ -28,32 +28,32 @@ SECTION_RODATA
 SECTION .text

 %macro LOWPASS_LINE 0
-cglobal lowpass_line, 5, 5, 7
-    add r0, r1
-    add r2, r1
-    add r3, r1
-    add r4, r1
-    neg r1
+cglobal lowpass_line, 5, 5, 7, dst, h, src, mref, pref
+    add dstq, hq
+    add srcq, hq
+    add mrefq, srcq
+    add prefq, srcq
+    neg hq

     pcmpeqb m6, m6

 .loop:
-    mova m0, [r3+r1]
-    mova m1, [r3+r1+mmsize]
-    pavgb m0, [r4+r1]
-    pavgb m1, [r4+r1+mmsize]
+    mova m0, [mrefq+hq]
+    mova m1, [mrefq+hq+mmsize]
+    pavgb m0, [prefq+hq]
+    pavgb m1, [prefq+hq+mmsize]
     pxor m0, m6
     pxor m1, m6
-    pxor m2, m6, [r2+r1]
-    pxor m3, m6, [r2+r1+mmsize]
+    pxor m2, m6, [srcq+hq]
+    pxor m3, m6, [srcq+hq+mmsize]
     pavgb m0, m2
     pavgb m1, m3
     pxor m0, m6
     pxor m1, m6
-    mova [r0+r1], m0
-    mova [r0+r1+mmsize], m1
+    mova [dstq+hq], m0
+    mova [dstq+hq+mmsize], m1

-    add r1, 2*mmsize
+    add hq, 2*mmsize
     jl .loop
     REP_RET
 %endmacro
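
For reference, here is a rough scalar C sketch of why the "add mrefq, srcq"
/ "add prefq, srcq" setup in the asm above is enough. It is only an
illustration under assumptions: the function names (lowpass_line_ptrs,
lowpass_line_offsets, lowpass_prototypes_agree) are made up for this mail
and are not the ones in the tree, and the rounding formula is assumed to be
the usual (1 + 2*cur + above + below) >> 2 for the linear filter. The point
is just that once mref and pref are added to srcp, you are back to the old
srcp_above/srcp_below pointers and the loop body does not change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Old-style prototype (hypothetical name): explicit pointers to the
 * lines above and below the current one. */
static void lowpass_line_ptrs(uint8_t *dstp, ptrdiff_t width,
                              const uint8_t *srcp,
                              const uint8_t *srcp_above,
                              const uint8_t *srcp_below)
{
    assert(width >= 0);
    for (ptrdiff_t i = 0; i < width; i++) {
        /* 0.5 * current + 0.25 * above + 0.25 * below; "1 +" rounds. */
        dstp[i] = (uint8_t)((1 + 2 * srcp[i] + srcp_above[i] + srcp_below[i]) >> 2);
    }
}

/* New-style prototype (hypothetical name): mref and pref are byte
 * offsets relative to srcp. Adding them to srcp once, outside the
 * loop, recovers the old pointers. */
static void lowpass_line_offsets(uint8_t *dstp, ptrdiff_t width,
                                 const uint8_t *srcp,
                                 ptrdiff_t mref, ptrdiff_t pref)
{
    lowpass_line_ptrs(dstp, width, srcp, srcp + mref, srcp + pref);
}

/* Hypothetical self-check: filter the middle line of a 3-line frame
 * through both prototypes; returns 1 if the outputs are identical. */
static int lowpass_prototypes_agree(void)
{
    enum { W = 8 };
    uint8_t frame[3 * W];
    uint8_t out_ptrs[W], out_offs[W];

    for (int i = 0; i < 3 * W; i++)
        frame[i] = (uint8_t)(i * 37 + 11);

    const uint8_t *cur = frame + W;   /* middle line */
    lowpass_line_ptrs(out_ptrs, W, cur, cur - W, cur + W);
    lowpass_line_offsets(out_offs, W, cur, -W, W);

    return memcmp(out_ptrs, out_offs, W) == 0;
}
```

Since the conversion happens once per line before the loop, it costs two
adds in the prologue and nothing in the inner loop.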