From patchwork Wed Jan 23 02:39:12 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rodger Combs
X-Patchwork-Id: 11832
From: Rodger Combs
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 22 Jan 2019 20:39:12 -0600
Message-Id: <20190123023914.20619-1-rodger.combs@gmail.com>
X-Mailer: git-send-email 2.19.1
Subject: [FFmpeg-devel] [PATCH 1/3] lavu/x86util: make imprecise PMINSD implementation opt-in
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Sender: "ffmpeg-devel"

This caused rounding errors when values that can't be expressed exactly
as 32-bit floats were passed in, which could happen in 16-bit yadif.

swscale's call is opted in, under the assumption that it never uses
values large enough to run into this (i.e.
within 2^24)
---
 libavutil/x86/x86util.asm | 4 ++--
 libswscale/x86/scale.asm  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libavutil/x86/x86util.asm b/libavutil/x86/x86util.asm
index d7cd996842..c96afb6ef1 100644
--- a/libavutil/x86/x86util.asm
+++ b/libavutil/x86/x86util.asm
@@ -799,10 +799,10 @@
     pminsw %1, %3
 %endmacro
 
-%macro PMINSD 3 ; dst, src, tmp/unused
+%macro PMINSD 3-4 ; dst, src, tmp/unused, rounding-allowed
 %if cpuflag(sse4)
     pminsd %1, %2
-%elif cpuflag(sse2)
+%elif cpuflag(sse2) && (%0 > 3)
     cvtdq2ps %1, %1
     minps %1, %2
     cvtps2dq %1, %1
diff --git a/libswscale/x86/scale.asm b/libswscale/x86/scale.asm
index 83cabff722..914fd1ada4 100644
--- a/libswscale/x86/scale.asm
+++ b/libswscale/x86/scale.asm
@@ -364,7 +364,7 @@ cglobal hscale%1to%2_%4, %5, 10, %6, pos0, dst, w, srcmem, filter, fltpos, fltsi
     movd [dstq+wq*2], m0
 %endif ; %3 ==/!= X
 %else ; %2 == 19
-    PMINSD m0, m2, m4
+    PMINSD m0, m2, m4, 1
 %ifnidn %3, X
     mova [dstq+wq*(4>>wshr)], m0
 %else ; %3 == X
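
The rounding problem described in the commit message can be reproduced in scalar C. The following is only a rough sketch, not code from the patch: the function names are hypothetical, and it assumes both operands start out as integers, whereas the macro only converts its dst operand. A 32-bit float has a 24-bit significand, so some integers above 2^24 change value on the round trip through float.

```c
#include <assert.h>
#include <stdint.h>

/* Exact signed 32-bit minimum: what SSE4.1 pminsd computes per lane. */
static int32_t min_exact(int32_t a, int32_t b)
{
    return a < b ? a : b;
}

/* Hypothetical scalar model of the float-based fallback
 * (cvtdq2ps / minps / cvtps2dq). Taking both inputs as integers is an
 * assumption of this sketch, made to keep it self-contained. */
static int32_t min_via_float(int32_t a, int32_t b)
{
    float fa = (float)a;           /* like cvtdq2ps: inexact for some |a| > 2^24 */
    float fb = (float)b;
    float m  = fa < fb ? fa : fb;  /* like minps */

    /* (float)INT32_MAX rounds up to 2^31, which no longer fits in int32_t. */
    assert(m < 2147483648.0f);
    return (int32_t)m;             /* m is integral here, so truncation is exact */
}
```

With the default round-to-nearest mode, min_via_float(16777217, 33554432) returns 16777216, while min_exact returns 16777217. Inputs within 2^24, the range the swscale call is assumed to stay in, come back unchanged.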