From patchwork Tue Jun 6 12:48:54 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ronald S. Bultje"
X-Patchwork-Id: 3857
References: <20170603001809.13960-1-jdarnley@obe.tv> <059f3c4a-8313-098d-001b-fd8077a35801@obe.tv>
From: "Ronald S. Bultje"
Date: Tue, 6 Jun 2017 08:48:54 -0400
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [WIP] [PATCH 0/6] sse2/xmm version of 8-bit simple_idct
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

Hi,

On Mon, Jun 5, 2017 at 8:02 AM, Ronald S. Bultje wrote:

> On Mon, Jun 5, 2017 at 7:23 AM, James Darnley wrote:
>
>> I forgot to mention in my cover letter that although the dct test
>> passes, fate does not.  As I mentioned on IRC, changing them causes
>> errors elsewhere in fate.  I am currently looking into this problem and
>> I'm sure I will speak to you or others about it.
>
>
> I'll have a look at this.
>

This makes the output of dct-test exact:

How the final patch should look (i.e. change coefficients only for mpeg idct and not for prores idct to keep fate happy?
Or change C code for prores so coefficients are identical?) is up to you, I don't have a preference. Michael might have an opinion on that.

Ronald

diff --git a/libavcodec/x86/simple_idct10.asm b/libavcodec/x86/simple_idct10.asm
index ae848b7..0dd1ae5 100644
--- a/libavcodec/x86/simple_idct10.asm
+++ b/libavcodec/x86/simple_idct10.asm
@@ -52,6 +52,9 @@ times 4 dw %2, %3
 %define W6sh2 8867 ; W6 = 35468 = 8867<<2
 %define W7sh2 4520 ; W7 = 18081 = 4520<<2 + 1
 
+pw_round_20_div_w4: times 8 dw ((1 << (20 - 1)) / W4sh2)
+
+
 CONST_DEC w4_plus_w2, W4sh2, +W2sh2
 CONST_DEC w4_min_w2, W4sh2, -W2sh2
 CONST_DEC w4_plus_w6, W4sh2, +W6sh2
@@ -71,7 +74,7 @@ SECTION .text
 
 %macro idct_fn 0
 cglobal simple_idct8, 1, 1, 16, block
-    IDCT_FN "", 11, "", 20
+    IDCT_FN "", 11, pw_round_20_div_w4, 20
     RET
 
 cglobal simple_idct10, 1, 1, 16, block
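
For anyone following along, here is a rough C sketch of what the new constant appears to do. It is not taken from the FFmpeg sources. The function names are made up for illustration, and two things are assumptions on my part: that the rounder passed to IDCT_FN is added to the DC term before the multiply by W4, and that W4sh2 is 16384 (its value is not shown in the hunk above). Only the expression (1 << (20 - 1)) / W4sh2 comes from the patch itself. The idea is that adding round/W4 to DC before multiplying gives roughly the same result as adding the rounding term after the multiply, without needing a separate add on the wide intermediate.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: multiply first, then add the usual half-step rounding
 * term before shifting down. Intended for non-negative dc in this sketch,
 * since right-shifting negative values is implementation-defined in C. */
static int32_t round_after_mul(int32_t dc, int32_t w4, int shift)
{
    assert(w4 > 0 && shift > 0 && shift < 31);
    return (w4 * dc + (1 << (shift - 1))) >> shift;
}

/* Folded form: pre-add (1 << (shift - 1)) / w4 to the DC term, which is
 * the value the patch stores in pw_round_20_div_w4 (with shift = 20).
 * ASSUMPTION: the asm applies this rounder to DC before multiplying. */
static int32_t round_folded_into_dc(int32_t dc, int32_t w4, int shift)
{
    assert(w4 > 0 && shift > 0 && shift < 31);
    int32_t bias = (1 << (shift - 1)) / w4; /* 32 for w4 = 16384, shift = 20 */
    return (w4 * (dc + bias)) >> shift;
}
```

With w4 = 16384 the division is exact (16384 * 32 == 1 << 19), so both forms agree for every input. With a W4 that does not divide 1 << 19 evenly, the folded form carries a small truncation error, which may be relevant to the coefficient question raised above.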