From patchwork Mon Dec 20 13:57:00 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Kelly
X-Patchwork-Id: 32755
Date: Mon, 20 Dec 2021 14:57:00 +0100
Message-Id: <20211220135700.615644-1-alankelly@google.com>
X-Mailer: git-send-email 2.34.1.173.g76aa8bc2d0-goog
From: Alan Kelly
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH 2/2] libswscale: Test AV_CPU_FLAG_SLOW_GATHER for hscale functions.
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Alan Kelly

Test AV_CPU_FLAG_SLOW_GATHER instead of EXTERNAL_AVX2_FAST so that the
AVX2 hscale functions are only used where they are faster.
---
 libswscale/utils.c        | 2 +-
 libswscale/x86/swscale.c  | 2 +-
 tests/checkasm/sw_scale.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libswscale/utils.c b/libswscale/utils.c
index d4a72d3ce1..9a69b45afe 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -282,7 +282,7 @@ void ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, int filterSiz
 #if ARCH_X86_64
     int i, j, k, l;
     int cpu_flags = av_get_cpu_flags();
-    if (EXTERNAL_AVX2_FAST(cpu_flags)){
+    if (cpu_flags & AV_CPU_FLAG_SLOW_GATHER) {
         if ((c->srcBpc == 8) && (c->dstBpc <= 14)){
             if (dstW % 16 == 0){
                 if (filter != NULL){
diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c
index c49a05c37b..eb5334a2be 100644
--- a/libswscale/x86/swscale.c
+++ b/libswscale/x86/swscale.c
@@ -578,7 +578,7 @@ switch(c->dstBpc){ \
         break; \
     }
 
-    if (EXTERNAL_AVX2_FAST(cpu_flags)) {
+    if (cpu_flags & AV_CPU_FLAG_SLOW_GATHER) {
         if ((c->srcBpc == 8) && (c->dstBpc <= 14)) {
             if (c->chrDstW % 16 == 0)
                 ASSIGN_AVX2_SCALE_FUNC(c->hcScale, c->hChrFilterSize);
diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c
index f4912e6c2c..680562af08 100644
--- a/tests/checkasm/sw_scale.c
+++ b/tests/checkasm/sw_scale.c
@@ -217,7 +217,7 @@ static void check_hscale(void)
             }
             ff_sws_init_scale(ctx);
             memcpy(filterAvx2, filter, sizeof(uint16_t) * (SRC_PIXELS * MAX_FILTER_WIDTH + MAX_FILTER_WIDTH));
-            if (cpu_flags & AV_CPU_FLAG_AVX2)
+            if (cpu_flags & AV_CPU_FLAG_SLOW_GATHER)
                 ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, SRC_PIXELS);
 
             if (check_func(ctx->hcScale, "hscale_%d_to_%d_width%d", ctx->srcBpc, ctx->dstBpc + 1, width)) {
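
For readers less familiar with flag-gated dispatch, here is a minimal standalone sketch of the selection pattern the hunks above touch. It is not FFmpeg code: the flag constants, the enum, and select_hscale() are hypothetical stand-ins, and the real flag values live in libavutil/cpu.h. The sketch copies the condition exactly as it is written in the diff (a single bit test on cpu_flags) together with the bit-depth and width checks visible in the context lines.

```c
/* Hypothetical stand-ins for illustration only; not the libavutil values. */
#define SKETCH_FLAG_AVX2        (1 << 0)
#define SKETCH_FLAG_SLOW_GATHER (1 << 1)

/* Which horizontal-scale implementation would be picked. */
enum hscale_impl { HSCALE_C, HSCALE_AVX2 };

/*
 * Mirrors the shape of the checks in the diff:
 *   - one bit test on cpu_flags, as written in the patch
 *   - 8-bit source, destination of at most 14 bits
 *   - destination width divisible by 16
 * Anything that fails a check keeps the generic C path.
 */
enum hscale_impl select_hscale(int cpu_flags, int src_bpc, int dst_bpc, int dst_w)
{
    enum hscale_impl impl = HSCALE_C;

    if (cpu_flags & SKETCH_FLAG_SLOW_GATHER) {
        if (src_bpc == 8 && dst_bpc <= 14) {
            if (dst_w % 16 == 0)
                impl = HSCALE_AVX2;
        }
    }
    return impl;
}
```

With these placeholder values, select_hscale(SKETCH_FLAG_SLOW_GATHER, 8, 14, 64) selects the AVX2 path, while passing only SKETCH_FLAG_AVX2, or a width of 60, leaves the C path in place. The point of the sketch is that the new condition depends solely on the tested bit, so the dispatch site and the coefficient shuffle in ff_shuffle_filter_coefficients must test the same bit to stay consistent.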