From patchwork Wed Feb 17 16:41:04 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 25707
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 17 Feb 2021 13:41:04 -0300
Message-Id: <20210217164106.6370-1-jamrial@gmail.com>
X-Mailer: git-send-email 2.30.0
Subject: [FFmpeg-devel] [PATCH 1/3] x86/vf_gblur: fix postscale_slice prologue
List-Id: FFmpeg development discussions and patches

x86_32 ABI does not pass float arguments directly on xmm regs, and the Win64 ABI uses only the first four regs for this purpose.

Signed-off-by: James Almer
---
 libavfilter/vf_gblur.c       |  3 +--
 libavfilter/x86/vf_gblur.asm | 29 +++++++++++++----------------
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/libavfilter/vf_gblur.c b/libavfilter/vf_gblur.c
index 109a7a95f9..40956e122d 100644
--- a/libavfilter/vf_gblur.c
+++ b/libavfilter/vf_gblur.c
@@ -234,8 +234,7 @@ void ff_gblur_init(GBlurContext *s)
 {
     s->horiz_slice = horiz_slice_c;
     s->postscale_slice = postscale_c;
-    if (ARCH_X86_64)
-        ff_gblur_init_x86(s);
+    ff_gblur_init_x86(s);
 }
 
 static int config_input(AVFilterLink *inlink)
diff --git a/libavfilter/x86/vf_gblur.asm b/libavfilter/x86/vf_gblur.asm
index c29ecba889..c2b2998202 100644
--- a/libavfilter/x86/vf_gblur.asm
+++ b/libavfilter/x86/vf_gblur.asm
@@ -185,27 +185,24 @@ HORIZ_SLICE
 %endif
 
 %macro POSTSCALE_SLICE 0
-%if UNIX64
-cglobal postscale_slice, 2, 2, 4, ptr, length
-%else
-cglobal postscale_slice, 5, 5, 4, ptr, length, postscale, min, max
-%endif
+cglobal postscale_slice, 2, 2, 4, ptr, length, postscale, min, max
     shl lengthd, 2
     add ptrq, lengthq
     neg lengthq
-%if WIN64
+%if ARCH_X86_32
+    VBROADCASTSS m0, postscalem
+    VBROADCASTSS m1, minm
+    VBROADCASTSS m2, maxm
+%elif WIN64
     SWAP 0, 2
     SWAP 1, 3
-    SWAP 2, 4
-%endif
-%if cpuflag(avx2)
-    vbroadcastss m0, xm0
-    vbroadcastss m1, xm1
-    vbroadcastss m2, xm2
-%else
-    shufps xm0, xm0, 0
-    shufps xm1, xm1, 0
-    shufps xm2, xm2, 0
+    VBROADCASTSS m0, xm0
+    VBROADCASTSS m1, xm1
+    VBROADCASTSS m2, maxm
+%else ; UNIX64
+    VBROADCASTSS m0, xm0
+    VBROADCASTSS m1, xm1
+    VBROADCASTSS m2, xm2
 %endif
 
 .loop:

From patchwork Wed Feb 17 16:41:05 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 25712
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 17 Feb 2021 13:41:05 -0300
Message-Id: <20210217164106.6370-2-jamrial@gmail.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210217164106.6370-1-jamrial@gmail.com>
References: <20210217164106.6370-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/3] checkasm/vf_gblur: split off the horiz_slice test into its own function
List-Id: FFmpeg development discussions and patches

Will come in handy for the following commit.

Signed-off-by: James Almer
---
 tests/checkasm/vf_gblur.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/tests/checkasm/vf_gblur.c b/tests/checkasm/vf_gblur.c
index 1d63fc22a0..8ff47a338f 100644
--- a/tests/checkasm/vf_gblur.c
+++ b/tests/checkasm/vf_gblur.c
@@ -33,18 +33,26 @@
             tmp_buf[j] = (float)(rnd() & 0xFF); \
     } while (0)
 
-void checkasm_check_vf_gblur(void)
+static void check_horiz_slice(float *dst_ref, float *dst_new)
 {
-    float *dst_ref = av_malloc(BUF_SIZE);
-    float *dst_new = av_malloc(BUF_SIZE);
-    int w = WIDTH;
-    int h = HEIGHT;
     int steps = 2;
     float nu = 0.101f;
     float bscale = 1.112f;
-    GBlurContext s;
 
     declare_func(void, float *dst, int w, int h, int steps, float nu, float bscale);
+    call_ref(dst_ref, WIDTH, HEIGHT, steps, nu, bscale);
+    call_new(dst_new, WIDTH, HEIGHT, steps, nu, bscale);
+    if (!float_near_abs_eps_array(dst_ref, dst_new, 0.01f, PIXELS)) {
+        fail();
+    }
+    bench_new(dst_new, WIDTH, HEIGHT, 1, nu, bscale);
+}
+
+void checkasm_check_vf_gblur(void)
+{
+    float *dst_ref = av_malloc(BUF_SIZE);
+    float *dst_new = av_malloc(BUF_SIZE);
+    GBlurContext s;
 
     randomize_buffers(dst_ref, PIXELS);
     memcpy(dst_new, dst_ref, BUF_SIZE);
@@ -52,13 +60,7 @@ void checkasm_check_vf_gblur(void)
     ff_gblur_init(&s);
 
     if (check_func(s.horiz_slice, "horiz_slice")) {
-        call_ref(dst_ref, w, h, steps, nu, bscale);
-        call_new(dst_new, w, h, steps, nu, bscale);
-
-        if (!float_near_abs_eps_array(dst_ref, dst_new, 0.01f, PIXELS)) {
-            fail();
-        }
-        bench_new(dst_new, w, h, 1, nu, bscale);
+        check_horiz_slice(dst_ref, dst_new);
     }
     report("horiz_slice");
     av_freep(&dst_ref);

From patchwork Wed Feb 17 16:41:06 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 25708
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 17 Feb 2021 13:41:06 -0300
Message-Id: <20210217164106.6370-3-jamrial@gmail.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210217164106.6370-1-jamrial@gmail.com>
References: <20210217164106.6370-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 3/3] checkasm/vf_gblur: add a test for postscale_slice
List-Id: FFmpeg development discussions and patches

Signed-off-by: James Almer
---
 tests/checkasm/vf_gblur.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tests/checkasm/vf_gblur.c b/tests/checkasm/vf_gblur.c
index 8ff47a338f..b9fe2f9a36 100644
--- a/tests/checkasm/vf_gblur.c
+++ b/tests/checkasm/vf_gblur.c
@@ -16,6 +16,7 @@
  * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
+#include <float.h>
 #include <string.h>
 #include "checkasm.h"
 #include "libavfilter/gblur.h"
@@ -48,6 +49,19 @@ static void check_horiz_slice(float *dst_ref, float *dst_new)
     bench_new(dst_new, WIDTH, HEIGHT, 1, nu, bscale);
 }
 
+static void check_postscale_slice(float *dst_ref, float *dst_new)
+{
+    float postscale = 0.0603f;
+
+    declare_func(void, float *dst, int len, float postscale, float min, float max);
+    call_ref(dst_ref, PIXELS, postscale, -FLT_MAX, FLT_MAX);
+    call_new(dst_new, PIXELS, postscale, -FLT_MAX, FLT_MAX);
+    if (!float_near_abs_eps_array(dst_ref, dst_new, FLT_EPSILON, PIXELS)) {
+        fail();
+    }
+    bench_new(dst_new, PIXELS, postscale, -FLT_MAX, FLT_MAX);
+}
+
 void checkasm_check_vf_gblur(void)
 {
     float *dst_ref = av_malloc(BUF_SIZE);
@@ -63,6 +77,14 @@ void checkasm_check_vf_gblur(void)
         check_horiz_slice(dst_ref, dst_new);
     }
     report("horiz_slice");
+
+    randomize_buffers(dst_ref, PIXELS);
+    memcpy(dst_new, dst_ref, BUF_SIZE);
+    if (check_func(s.postscale_slice, "postscale_slice")) {
+        check_postscale_slice(dst_ref, dst_new);
+    }
+    report("postscale_slice");
+
     av_freep(&dst_ref);
     av_freep(&dst_new);
 }
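
For readers following the series, here is a rough scalar sketch of what postscale_slice is expected to compute, using the argument order from the declare_func() line in patch 3/3. It assumes the routine multiplies each sample by postscale and then clamps it to [min, max]. The body of postscale_c is not part of this series, so that behaviour is an assumption, and postscale_model and postscale_selfcheck are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical scalar model, NOT the FFmpeg implementation.
 * Assumption: each sample is scaled by postscale, then clamped to [min, max]. */
static void postscale_model(float *buf, int len, float postscale,
                            float min, float max)
{
    for (int i = 0; i < len; i++) {
        float v = buf[i] * postscale;
        if (v < min)
            v = min;    /* apply the lower bound */
        if (v > max)
            v = max;    /* apply the upper bound */
        buf[i] = v;
    }
}

/* Small self-check; all values are exactly representable as floats. */
static int postscale_selfcheck(void)
{
    float a[4] = { 0.0f, 100.0f, 255.0f, -50.0f };
    float b[2] = { 255.0f, -255.0f };

    /* Finite bounds: out-of-range products are clamped. */
    postscale_model(a, 4, 0.5f, 0.0f, 100.0f);
    assert(a[0] == 0.0f && a[1] == 50.0f);
    assert(a[2] == 100.0f);     /* 127.5 clamped to max */
    assert(a[3] == 0.0f);       /* -25 clamped to min */

    /* With -FLT_MAX/FLT_MAX, as in the checkasm test, clamping is a no-op. */
    postscale_model(b, 2, 0.5f, -FLT_MAX, FLT_MAX);
    assert(b[0] == 127.5f && b[1] == -127.5f);
    return 0;
}
```

Calling postscale_selfcheck() from a main() exercises both cases. Note that the checkasm test passes -FLT_MAX and FLT_MAX, so it mainly verifies the scaling path rather than the clamping itself.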