From patchwork Wed May 25 01:21:30 2022
X-Patchwork-Submitter: "Swinney, Jonathan"
X-Patchwork-Id: 35914
From: "Swinney, Jonathan"
To: "ffmpeg-devel@ffmpeg.org"
Cc: Martin Storsjö, "Pop, Sebastian"
Date: Wed, 25 May 2022 01:21:30 +0000
Message-ID: <36fdb39238cf499c815f9c2704381656@EX13D07UWB004.ant.amazon.com>
Subject: [FFmpeg-devel] [PATCH v2 1/2] checkasm: added additional dstW tests for hscale

Signed-off-by: Jonathan Swinney
---
 tests/checkasm/sw_scale.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c
index 3c0a083b42..6c223c48f9 100644
--- a/tests/checkasm/sw_scale.c
+++ b/tests/checkasm/sw_scale.c
@@ -148,7 +148,11 @@ static void check_hscale(void)
         { 8, 18 },
     };
 
-    int i, j, fsi, hpi, width;
+#define LARGEST_INPUT_SIZE 512
+#define INPUT_SIZES 6
+    static const int input_sizes[INPUT_SIZES] = {8, 24, 128, 144, 256, 512};
+
+    int i, j, fsi, hpi, width, dstWi;
     struct SwsContext *ctx;
 
     // padded
@@ -183,7 +187,6 @@ static void check_hscale(void)
             ctx->srcBpc = hscale_pairs[hpi][0];
             ctx->dstBpc = hscale_pairs[hpi][1];
             ctx->hLumFilterSize = ctx->hChrFilterSize = width;
-            ctx->dstW = ctx->chrDstW = SRC_PIXELS;
 
             for (i = 0; i < SRC_PIXELS; i++) {
                 filterPos[i] = i;
@@ -215,20 +218,23 @@ static void check_hscale(void)
                 filter[SRC_PIXELS * width + i] = rnd();
             }
 
-            ff_sws_init_scale(ctx);
-            memcpy(filterAvx2, filter, sizeof(uint16_t) * (SRC_PIXELS * MAX_FILTER_WIDTH + MAX_FILTER_WIDTH));
-            if ((cpu_flags & AV_CPU_FLAG_AVX2) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER))
-                ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, SRC_PIXELS);
-
-            if (check_func(ctx->hcScale, "hscale_%d_to_%d_width%d", ctx->srcBpc, ctx->dstBpc + 1, width)) {
-                memset(dst0, 0, SRC_PIXELS * sizeof(dst0[0]));
-                memset(dst1, 0, SRC_PIXELS * sizeof(dst1[0]));
-
-                call_ref(NULL, dst0, SRC_PIXELS, src, filter, filterPos, width);
-                call_new(NULL, dst1, SRC_PIXELS, src, filterAvx2, filterPosAvx, width);
-                if (memcmp(dst0, dst1, SRC_PIXELS * sizeof(dst0[0])))
-                    fail();
-                bench_new(NULL, dst0, SRC_PIXELS, src, filter, filterPosAvx, width);
+            for (dstWi = 0; dstWi < INPUT_SIZES; dstWi++) {
+                ctx->dstW = ctx->chrDstW = input_sizes[dstWi];
+                ff_sws_init_scale(ctx);
+                memcpy(filterAvx2, filter, sizeof(uint16_t) * (SRC_PIXELS * MAX_FILTER_WIDTH + MAX_FILTER_WIDTH));
+                if ((cpu_flags & AV_CPU_FLAG_AVX2) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER))
+                    ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, SRC_PIXELS);
+
+                if (check_func(ctx->hcScale, "hscale_%d_to_%d__fs_%d_dstW_%d", ctx->srcBpc, ctx->dstBpc + 1, width, ctx->dstW)) {
+                    memset(dst0, 0, SRC_PIXELS * sizeof(dst0[0]));
+                    memset(dst1, 0, SRC_PIXELS * sizeof(dst1[0]));
+
+                    call_ref(NULL, dst0, ctx->dstW, src, filter, filterPos, width);
+                    call_new(NULL, dst1, ctx->dstW, src, filterAvx2, filterPosAvx, width);
+                    if (memcmp(dst0, dst1, ctx->dstW * sizeof(dst0[0])))
+                        fail();
+                    bench_new(NULL, dst0, ctx->dstW, src, filter, filterPosAvx, width);
+                }
             }
         }
     }