From patchwork Thu Feb 17 10:03:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Kelly X-Patchwork-Id: 34355 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6838:d078:0:0:0:0 with SMTP id x24csp489785nkx; Thu, 17 Feb 2022 02:03:59 -0800 (PST) X-Google-Smtp-Source: ABdhPJwzeCH6RjzL61iRayDD5hT7srOZ8w5HVgLwY4DURTL/V50c1nb9jjBhePNsoPPJdcrfKOQQ X-Received: by 2002:a05:6402:cb:b0:410:8094:872b with SMTP id i11-20020a05640200cb00b004108094872bmr1742078edu.378.1645092239350; Thu, 17 Feb 2022 02:03:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1645092239; cv=none; d=google.com; s=arc-20160816; b=Y1so8HKp7UYwLo5y4sv40VFL1D0fYlfTAX+CIvbbat0Gt665NkXZEHV4vFvbU9p8sh GUhkEFMThNQXCTD4cFrGMYWDDf8RBzq5LjBFrV57wXIEdYziw2Ytbewjc6HzMAO90+c0 WzQHPjHhcdJeqMGMY+Mg4cx7Cho6CkS30xumMIkprUmbiQ8MCvvq94ThOARr1oasHSYX B58me3VGQIN4DcdQ6WDi6FwLRxUYeOCUMDAZhr1h2de7AnsyaaF/BskZnW7P5W5ED70O vArFCtiwtxLM+xXOwr6DtuOmEBp4I6X6uHXzGOOGhYQUNi658xfWPJ1eXAEjdez2lMqq iVaQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:to:from:mime-version:message-id:date :dkim-signature:delivered-to; bh=+NNplIaP0YmUZhzKdcDmapuWKat0wgdmai0RTCXV6EI=; b=sn8XZZRqqC4pvL65iNYKevKeDVWRJ3Q8trU/quuo01sv4jL52SLGKNuhF4KQd/Gevm RZivauotpOn5tNitwnptf4TZbzXdYoLKgNQWILOMqXVPcm2BW2YqfdVsL1Rnr62VsJvJ JaehdiPAe5zHjx1RUjorIjvitYFNyrUDotdgsxjNVzhIjQAlIHFl6PBiHtpbd6/n6jix 1A4H1b2HzYAB2/+VhMhfE16dNJbp8CSxY621r/R0eyt9Eg6IL6nKPfwfGUhORWocvQvu UI6HTHhs/9b9H4JywDTeJDIyZ3Uqj0E6e541oQAIV4IHnzZ9l6vJ59JQT19TStf5AN8e hY0A== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=cD+B5V4J; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted 
sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id b3si1365313ejl.768.2022.02.17.02.03.35; Thu, 17 Feb 2022 02:03:59 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=cD+B5V4J; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 132FF68B1E0; Thu, 17 Feb 2022 12:03:33 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-wr1-f73.google.com (mail-wr1-f73.google.com [209.85.221.73]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1821E68B07E for ; Thu, 17 Feb 2022 12:03:26 +0200 (EET) Received: by mail-wr1-f73.google.com with SMTP id j8-20020adfc688000000b001e3322ced69so2080175wrg.13 for ; Thu, 17 Feb 2022 02:03:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:message-id:mime-version:subject:from:to:cc; bh=z2bT2fPLvDXm891Tc56t27V5/fusQLjdKu3PLiPsC9I=; b=cD+B5V4Jlk8C0MotON8MDeKi3uPjVVU2yi2DKl0vaHkp/jis7o1dK0JYGGOLRJtvEv 9XYag0y4igO6A5pCtdEX9yKfJ8mE45g23DsMvTBGyVVhvc4dQ0Bp+nBkhVJReXhzMYi1 wzkeD0P+RW4NHIs1buG+MR4+wSfpWuiAf4PClF1a7eLV20z1iK21+1xLrXSiyg3u3xBr K5b3G2xdOwmza3QL0x/bla18BYVA+VCxm8bwYMe8XmqEfq/vYeqfWIKtaw+goRnj9W9N 9GzLzO1BDE+JszVRf1G+dfbNBGhiUyD1x/XhnK+eAMUNqTlIX7CkKvbNfdqKvYbl8L5q f7pw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=z2bT2fPLvDXm891Tc56t27V5/fusQLjdKu3PLiPsC9I=; 
b=OCamkTELJti6DU7YRkPLdSX1hAoatVzARZpl/2mnHrXWRRnkPklNDt8E6QNJV87di2 8T0mQPWneOAcUJmTB5IvY00tlTbRnmWA/Dghev+2cPT8wcCiquDObM8Uz+vMXrfW2O8j 5fD/vuqV+FI2DMEoESt4KUrCrY5QRrEPM3W9l3B8E+peFQytvc/w6NHTXx8VMKeSh4B3 QBB0K18+xWPGAJ08rMZHScbVX5Pjp/VXjiCPu7VTHsJSU8o5ocbqJvaMnj+hcrffZ2lk apFY0AN+lJZRfjW1vGwlVgKF7m8Wnh+woak+/bPR6VQ7tgbjVyjExDFZ8Ua/0N/Tye7v dodw== X-Gm-Message-State: AOAM530RJAKz4kfiG598NpoLuSNPO8P2ejwY72uCdj6vZJ4bwo4zP3jn ne8V7J9cfvM2tmUgAnzhO8iyJI9qcTwvjMWQZJSg9FwiLA7ZEMa8AL7MHSj8tsGog5KPgKmYkRk 605prZz5slcXEeyoyazaBSHeijGf9DII+vVST0f9s9yGMvJAIL7w3MbR4wyIvu0kwaeEDZgI= X-Received: from alankelly0.zrh.corp.google.com ([2a00:79e0:61:301:b159:808d:943e:13ba]) (user=alankelly job=sendgmr) by 2002:a5d:6c68:0:b0:1e8:9827:b978 with SMTP id r8-20020a5d6c68000000b001e89827b978mr1568559wrz.633.1645092205341; Thu, 17 Feb 2022 02:03:25 -0800 (PST) Date: Thu, 17 Feb 2022 11:03:21 +0100 Message-Id: <20220217100321.1110443-1-alankelly@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.35.1.265.g69c8d7142f-goog From: Alan Kelly To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] [PATCH v2 1/5] libswscale: Check and propagate memory allocation errors from ff_shuffle_filter_coefficients. 
X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Alan Kelly Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: Nm9l1TNPK/2r --- libswscale/swscale_internal.h | 2 +- libswscale/utils.c | 11 ++++++++--- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 3a78d95ba6..26d28d42e6 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -1144,5 +1144,5 @@ void ff_sws_slice_worker(void *priv, int jobnr, int threadnr, #define MAX_LINES_AHEAD 4 //shuffle filter and filterPos for hyScale and hcScale filters in avx2 -void ff_shuffle_filter_coefficients(SwsContext *c, int* filterPos, int filterSize, int16_t *filter, int dstW); +int ff_shuffle_filter_coefficients(SwsContext *c, int* filterPos, int filterSize, int16_t *filter, int dstW); #endif /* SWSCALE_SWSCALE_INTERNAL_H */ diff --git a/libswscale/utils.c b/libswscale/utils.c index c5ea8853d5..344c87dfdf 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -278,7 +278,7 @@ static const FormatEntry format_entries[] = { [AV_PIX_FMT_P416LE] = { 1, 1 }, }; -void ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, int filterSize, int16_t *filter, int dstW){ +int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, int filterSize, int16_t *filter, int dstW){ #if ARCH_X86_64 int i, j, k, l; int cpu_flags = av_get_cpu_flags(); @@ -292,6 +292,8 @@ void ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, int filterSiz } if (filterSize > 4){ int16_t *tmp2 = av_malloc(dstW * filterSize * 2); + if (!tmp2) + return AVERROR(ENOMEM); memcpy(tmp2, filter, dstW * filterSize * 2); for (i = 0; i < dstW; i += 16){//pixel for (k = 0; k < filterSize / 4; ++k){//fcoeff @@ -310,6 
+312,7 @@ void ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, int filterSiz } } } + return 0; #endif } @@ -1836,7 +1839,8 @@ av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter, get_local_pos(c, 0, 0, 0), get_local_pos(c, 0, 0, 0))) < 0) goto fail; - ff_shuffle_filter_coefficients(c, c->hLumFilterPos, c->hLumFilterSize, c->hLumFilter, dstW); + if (ff_shuffle_filter_coefficients(c, c->hLumFilterPos, c->hLumFilterSize, c->hLumFilter, dstW) < 0) + goto nomem; if ((ret = initFilter(&c->hChrFilter, &c->hChrFilterPos, &c->hChrFilterSize, c->chrXInc, c->chrSrcW, c->chrDstW, filterAlign, 1 << 14, @@ -1846,7 +1850,8 @@ av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter, get_local_pos(c, c->chrSrcHSubSample, c->src_h_chr_pos, 0), get_local_pos(c, c->chrDstHSubSample, c->dst_h_chr_pos, 0))) < 0) goto fail; - ff_shuffle_filter_coefficients(c, c->hChrFilterPos, c->hChrFilterSize, c->hChrFilter, c->chrDstW); + if (ff_shuffle_filter_coefficients(c, c->hChrFilterPos, c->hChrFilterSize, c->hChrFilter, c->chrDstW) < 0) + goto nomem; } } // initialize horizontal stuff From patchwork Thu Feb 17 10:03:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Kelly X-Patchwork-Id: 34356 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6838:d078:0:0:0:0 with SMTP id x24csp489910nkx; Thu, 17 Feb 2022 02:04:07 -0800 (PST) X-Google-Smtp-Source: ABdhPJxcJk8ZFrb2I73yzLDt84uwgws8ZKwmNPMVtwGJWpMBQ3t0pRu11hKWahAOxKkFEwnHrOXL X-Received: by 2002:a05:6402:228f:b0:410:ae4e:d1bf with SMTP id cw15-20020a056402228f00b00410ae4ed1bfmr1716025edb.231.1645092247438; Thu, 17 Feb 2022 02:04:07 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1645092247; cv=none; d=google.com; s=arc-20160816; b=L0Hg5V4xcozLjxO4tL7XvYXCCrk3zITCojcqZxZyU6olHjL+hwIeKB8SmAWKc98es1 /QWqqZ28dKX4VhoJVfkAAVQuPAk2FpcgsmA4ootynLhIgph0GgmPCpAVnxqRSE9emdFO 
T1avi8kWpojBifV/EplOfRsDrl879F3EAy3C0NSVAL7dX4svVhJvDWI5hWaQa+tWHvyJ 9D/W+HiZGe0UGoxC3TdwBvaDC1U9PUK687VSWxmLN9mZ9w9fkZqA2yoS1pCIehPptQ8g pAFKumuhIKFC4LNFb7+QSjBfTBK3+pDwVDKZwahfa5XZpHFHTDm+9AGjKxrZ1/sdG0n6 JK6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:to:from:mime-version:message-id:date :dkim-signature:delivered-to; bh=h/AIunn7k/S2xhmJc6k0HLVVGKMM6R2WY3SvFFDvm4c=; b=KuxXWAqkTAtfLb6dRpzr0xlPoBbAaBz0rlrImKJUiRJxrdhFN8soRakQ5LMGu88JPI vpR7MhJCc71zJpNw/7i2n/H0JMw1B9IbaEm9nW89tQy83ihqwLyqDZ30h0zdAkmezoS4 GXD+ucBBp9GQXEYZLg8aUKdVjZ8zbZ+Gomdv0ujjpBQBz26UVOX3WMepbJubxmgrKtiR mDwaKA3qxUdegAALWD3AuKtzXNAOVUxL2mJ79F7g/Mb+ni6Ila3Gl+Y3/P9VyzXgMqK2 Wsg5J7JmZ8DiHHarDQCv2vLUyp1aTf9rGNOeG8w0rv8M5Onnfoon/AUKrQRxXWYJg0Cr qKlg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=eOnX3w9L; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id v18si3890739edc.420.2022.02.17.02.04.06; Thu, 17 Feb 2022 02:04:06 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=eOnX3w9L; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1664568B1F9; Thu, 17 Feb 2022 12:04:03 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 7E7DD68AAA0 for ; Thu, 17 Feb 2022 12:03:56 +0200 (EET) Received: by mail-yb1-f201.google.com with SMTP id j17-20020a25ec11000000b0061dabf74012so9503113ybh.15 for ; Thu, 17 Feb 2022 02:03:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:message-id:mime-version:subject:from:to:cc; bh=xC1nT2XTjJzG57tuJWzW/nUloQCoei4n24re72ODP40=; b=eOnX3w9Lgh45LDQ8D4hGYCoeUiC4TLtK3dvtnNt09YUr67FMw1FqAY/AjQu3ZlEQ9D i1MmC179IbELjFD6luAXgOq3l5IQcUzocBjbKFUgMImoZgJjdeK/kbRtthmpLy4A0Igm leNpm61tZoATsjfbZREB5YoC2mqoerg6ZgI5/uSgpfFqycGmQ8vES3gkyO0dIT9OO85R tiJr3XiQ0/jyg+KBctB6zdepbhICtq8TzY26boV+qQoPFkoGYRxTDPLnXQtXTZNeI0v1 iGEEugCsH2vJ8ZmvsFJT4jcu6LN3uGJ06X0HXVbByZU5oHaK2cyneGRuM7IZIGjMaN7z DPhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=xC1nT2XTjJzG57tuJWzW/nUloQCoei4n24re72ODP40=; b=YtyNlNwvoGDoHe49rITroM3YhsH7sDCQDWma5nWCest7dPr/o8xxZv6WOzvt3Dqc8C 
nLDHKpe38hn2wjlSlh+K5QmLMWkvRM8sGMDc3DbFXBqsLHs8d/NBqOnosmF6amyCPVX8 AOJZzsZY2s8cOik/re+WOGEMHbNkveQdo7hr678BGW9uZYGLXm0V9h6Napu3FQLLd7FY z7gNS62rlcEfVEuu0tIi4WHgz3zR+JrQmhR6EeAmA1o4TfuU/kjrVTwP0gOZ57VSQ4EH 5lOSBKQ6GGTgbT5NAobJYX2LH/d9aETY3hVlkKlSNkC44osqcA/zFOz6XSd0j2Pu9O2O 4GXQ== X-Gm-Message-State: AOAM533+sfV9pFurYDhqRrww6tZXFymSk+vv98T3xn/f06QiAb5uQDpZ DDtpcf/rfD9/eqHkCBnFvwwt+ihBPRZHOKe1JO9rhpVA9/PqjC7nBs1FqmdBOXeWnHBLeYJlqch yoics0gmwRJiqKTU2aMTuuF0XWlxciuaBKywiYc7h8cSMsc39GPkFTojFSzPrrqr9ST3lizI= X-Received: from alankelly0.zrh.corp.google.com ([2a00:79e0:61:301:b159:808d:943e:13ba]) (user=alankelly job=sendgmr) by 2002:a0d:da45:0:b0:2d0:bd53:b39 with SMTP id c66-20020a0dda45000000b002d0bd530b39mr1755363ywe.463.1645092235258; Thu, 17 Feb 2022 02:03:55 -0800 (PST) Date: Thu, 17 Feb 2022 11:03:52 +0100 Message-Id: <20220217100352.1112032-1-alankelly@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.35.1.265.g69c8d7142f-goog From: Alan Kelly To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] [PATCH v2 2/5] libswscale: Re-factor ff_shuffle_filter_coefficients. X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Alan Kelly Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: FhzbJent0Anh Make the code more readable and follow the style guide. 
--- libswscale/utils.c | 66 +++++++++++++++++++++++++--------------------- 1 file changed, 36 insertions(+), 30 deletions(-) diff --git a/libswscale/utils.c b/libswscale/utils.c index 344c87dfdf..7c8e1bbdde 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -278,42 +278,48 @@ static const FormatEntry format_entries[] = { [AV_PIX_FMT_P416LE] = { 1, 1 }, }; -int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, int filterSize, int16_t *filter, int dstW){ +int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, + int filterSize, int16_t *filter, + int dstW) +{ #if ARCH_X86_64 - int i, j, k, l; + int i, j, k; int cpu_flags = av_get_cpu_flags(); + // avx2 hscale filter processes 16 pixel blocks. + if (!filter || dstW % 16 != 0) + return 0; if (EXTERNAL_AVX2_FAST(cpu_flags) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER)) { - if ((c->srcBpc == 8) && (c->dstBpc <= 14)){ - if (dstW % 16 == 0){ - if (filter != NULL){ - for (i = 0; i < dstW; i += 8){ - FFSWAP(int, filterPos[i + 2], filterPos[i+4]); - FFSWAP(int, filterPos[i + 3], filterPos[i+5]); - } - if (filterSize > 4){ - int16_t *tmp2 = av_malloc(dstW * filterSize * 2); - if (!tmp2) - return AVERROR(ENOMEM); - memcpy(tmp2, filter, dstW * filterSize * 2); - for (i = 0; i < dstW; i += 16){//pixel - for (k = 0; k < filterSize / 4; ++k){//fcoeff - for (j = 0; j < 16; ++j){//inner pixel - for (l = 0; l < 4; ++l){//coeff - int from = i * filterSize + j * filterSize + k * 4 + l; - int to = (i) * filterSize + j * 4 + l + k * 64; - filter[to] = tmp2[from]; - } - } - } - } - av_free(tmp2); - } - } - } + if ((c->srcBpc == 8) && (c->dstBpc <= 14)) { + int16_t *filterCopy = NULL; + if (filterSize > 4) { + if (!FF_ALLOC_TYPED_ARRAY(filterCopy, dstW * filterSize)) + return AVERROR(ENOMEM); + memcpy(filterCopy, filter, dstW * filterSize * sizeof(int16_t)); + } + // Do not swap filterPos for pixels which won't be processed by + // the main loop. 
+ for (i = 0; i + 8 <= dstW; i += 8) { + FFSWAP(int, filterPos[i + 2], filterPos[i + 4]); + FFSWAP(int, filterPos[i + 3], filterPos[i + 5]); + } + if (filterSize > 4) { + // 16 pixels are processed at a time. + for (i = 0; i + 16 <= dstW; i += 16) { + // 4 filter coeffs are processed at a time. + for (k = 0; k + 4 <= filterSize; k += 4) { + for (j = 0; j < 16; ++j) { + int from = (i + j) * filterSize + k; + int to = i * filterSize + j * 4 + k * 16; + memcpy(&filter[to], &filterCopy[from], 4 * sizeof(int16_t)); + } + } + } + } + av_free(filterCopy); } } - return 0; #endif + return 0; } int sws_isSupportedInput(enum AVPixelFormat pix_fmt) From patchwork Thu Feb 17 10:04:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Kelly X-Patchwork-Id: 34357 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6838:d078:0:0:0:0 with SMTP id x24csp490079nkx; Thu, 17 Feb 2022 02:04:20 -0800 (PST) X-Google-Smtp-Source: ABdhPJzCiYv2IddmJ82LKALvxfxUEzLb0FyU8UaMAMb6Ym4fyidlWfJvzq5EO5uZ+1RCkENzF17s X-Received: by 2002:a17:907:3a0f:b0:6cd:5ca7:648d with SMTP id fb15-20020a1709073a0f00b006cd5ca7648dmr1724072ejc.79.1645092260303; Thu, 17 Feb 2022 02:04:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1645092260; cv=none; d=google.com; s=arc-20160816; b=lB9WufLnt/n9Mt7lJo87vIo7hU5ybyxTb2KGnpEFM7S3Zhvs9KpjXaPzxiUU+ceoqG C776mx+mkFnlH0jjs3my9IB4touFcHUjosUE7qFPaSxQa42WjAx0IY3QJ5brgPMka4SY yhwyhf6ryl3zooKNjyXmt9gKXPYeWaX4ShHXvY28faKz9lvAyimeu6atzVMxISpo7m33 Qp6vwGBuAzt6aF9nR6n6eJ0zT94TR5ZtoyzWpL1y5+Q90+L01miG5oUWezqfx6EziIFK InbEgwGveejunL04t/SEw9rRqUZcJNlW+HcHt8ts0WZb2BaY9K9GYeYCRQm6R571ok7L QXlQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:to:from:mime-version:message-id:date :dkim-signature:delivered-to; 
bh=eptn5wwl0rYdfSup79PiEHd62CaDCraMZ9YkPDJER94=; b=TWvA1reGXAw/7NqF3F4ZYCEURWZDhQdbLigaai+FsxiWTbKmB5nSTQBmMQimEVo4bZ 84zxl50gVmlhrIm8bgWpKq3PFSleMQJfP8vrgj5kmgVzj6OPdN+Y4zl463Hoesjoc4Zq ShV9EsB9gRff+piZNM7Zw65SU1M1M3BBKuS5bUpg3/scFbNSuPmLXS7pY0Uhg9fHu7A6 g87clUewWFxjKB3uYdhJSGJWH2HJ9sAn+zLeQ7Tkxl4swmDvlQleXxSl4pb+gnonxrbC s1Dmls0Gpi5gcvqRXDPBr29EnQ7IIzV+y+b90UTouBVKVC+MT1S9bQCHWPv8RNiQ89es 3Ypg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=lIr4Ht+h; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id b12si1532210ejc.97.2022.02.17.02.04.19; Thu, 17 Feb 2022 02:04:20 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=lIr4Ht+h; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1994068B253; Thu, 17 Feb 2022 12:04:17 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id B92FD68AAA0 for ; Thu, 17 Feb 2022 12:04:10 +0200 (EET) Received: by mail-wm1-f73.google.com with SMTP id r11-20020a1c440b000000b0037bb51b549aso3235297wma.4 for ; Thu, 17 Feb 2022 02:04:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; 
h=date:message-id:mime-version:subject:from:to:cc; bh=Fi2fqh6LCwMrm7FieXk9hCnhWxzkIuYcqP9NI7/EJTc=; b=lIr4Ht+hLbvMLrYO/eLssM+dyVTI9tDqbt+/4jpPXlxghbtFR2ftZJpmddYqDaxxc6 I3Qo+13SYct5imiK4MfXsr/rwRLjw0OrOb4eJ9J1W2pacYVUHv5fAKPIsAb7t+N9iP7R vCz6/dAEy+3c84lhZllgpC5rmmKAIJp8gZtcwplrr6d+F1aWnisUxGksWOfKpx65ykdP sKg4UdHZJTN2kBgNNGDkEb5TVp8e/sLHScUGuQEOrCAoqtM4eAiGQGtfVGy3drxDUPpZ 3hKKK6FoU9fY4FNMmFvytIDidukFK64WgqSI+wzznq7LBgmVLaDxOqBhkHXxUA+HxxSm KeVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=Fi2fqh6LCwMrm7FieXk9hCnhWxzkIuYcqP9NI7/EJTc=; b=vlOMhgHRejxmWZfNHazU8h4mZSn0nGuZcwqCbepLQzUVy1bgTIM6MdYbCE1jV40mjc cTXUNPyNZ0xzH1pGxd1Tr7GNetqGCKKwS/4vlm55f1w1VTyTXrXJjd5tHkTs0R35IlCN RZzC5i1KdcH/8s9IDi1SSMa00gabuJ5ArBIXtPEdIuR/ItMgXCCMwFWgA2/WKpBPAsPB gIVSNIkR05MRLkXaHYm3zhAt0x/KSE4rvZHLcR9VjTkQqgtREdG6NVAi9Ncz5lMpT2Ap MbnuTOvuUmqitbBMEVlS7xN/K7vOeAE23DvD4eP2XJzgY6TRBqvuXrfQ7npYn+ll6ruv ALsA== X-Gm-Message-State: AOAM533pxYwX7t2PtGb03fus8neGhCj5ZnEfaw8dgQNLUdMdd2cEDafz ZDZizocVPEkP5SPBm+3qwQ2Hs7rIQToQu/o92MstB+zDSF/G/wSc5OgqwdnLkz8b2MmP4/F7dso VLDKnPqY4DnnRtL2S4GUrRknnR7kcbCkslED/e1jz6uBKJimgxmTE7Bv7ED7mL0RsTc6MMFc= X-Received: from alankelly0.zrh.corp.google.com ([2a00:79e0:61:301:b159:808d:943e:13ba]) (user=alankelly job=sendgmr) by 2002:a5d:59a2:0:b0:1e3:3df1:78a2 with SMTP id p2-20020a5d59a2000000b001e33df178a2mr1714788wrr.312.1645092247609; Thu, 17 Feb 2022 02:04:07 -0800 (PST) Date: Thu, 17 Feb 2022 11:04:04 +0100 Message-Id: <20220217100404.1112755-1-alankelly@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.35.1.265.g69c8d7142f-goog From: Alan Kelly To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] [PATCH v2 3/5] libswscale: Avx2 hscale can process inputs of any size. 
X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Alan Kelly Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: +CfQbep+Uylf The main loop processes blocks of 16 pixels. The tail processes blocks of size 4. --- libswscale/x86/scale_avx2.asm | 48 +++++++++++++++++++++++++++++++++-- 1 file changed, 46 insertions(+), 2 deletions(-) diff --git a/libswscale/x86/scale_avx2.asm b/libswscale/x86/scale_avx2.asm index 20acdbd633..dc42abb100 100644 --- a/libswscale/x86/scale_avx2.asm +++ b/libswscale/x86/scale_avx2.asm @@ -53,6 +53,9 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize, mova m14, [four] shr fltsized, 2 %endif + cmp wq, 16 + jl .tail_loop + mov countq, 0x10 .loop: movu m1, [fltposq] movu m2, [fltposq+32] @@ -97,11 +100,52 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize, vpsrad m6, 7 vpackssdw m5, m5, m6 vpermd m5, m15, m5 - vmovdqu [dstq + countq * 2], m5 + vmovdqu [dstq], m5 + add dstq, 0x20 add fltposq, 0x40 add countq, 0x10 cmp countq, wq - jl .loop + jle .loop + + sub countq, 0x10 + cmp countq, wq + jge .end + +.tail_loop: + movu xm1, [fltposq] +%ifidn %1, X4 + pxor xm9, xm9 + pxor xm10, xm10 + xor innerq, innerq +.tail_innerloop: +%endif + vpcmpeqd xm13, xm13 + vpgatherdd xm3,[srcmemq + xm1], xm13 + vpunpcklbw xm5, xm3, xm0 + vpunpckhbw xm6, xm3, xm0 + vpmaddwd xm5, xm5, [filterq] + vpmaddwd xm6, xm6, [filterq + 16] + add filterq, 0x20 +%ifidn %1, X4 + paddd xm9, xm5 + paddd xm10, xm6 + paddd xm1, xm14 + add innerq, 1 + cmp innerq, fltsizeq + jl .tail_innerloop + vphaddd xm5, xm9, xm10 +%else + vphaddd xm5, xm5, xm6 +%endif + vpsrad xm5, 7 + vpackssdw xm5, xm5, xm5 + vmovq [dstq], xm5 + add dstq, 0x8 + add fltposq, 0x10 + add countq, 0x4 + cmp countq, wq + jl 
.tail_loop +.end: REP_RET %endmacro From patchwork Thu Feb 17 10:04:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Kelly X-Patchwork-Id: 34358 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6838:d078:0:0:0:0 with SMTP id x24csp490252nkx; Thu, 17 Feb 2022 02:04:34 -0800 (PST) X-Google-Smtp-Source: ABdhPJx4fuNhfdqCR4ENcD/CxqnXMV+TdPxH/E2CEWw/LDwLt/e+bi4BKywyiidkghwMJzSmZWp+ X-Received: by 2002:a17:907:1a48:b0:6ce:4aa:30c5 with SMTP id mf8-20020a1709071a4800b006ce04aa30c5mr1701162ejc.559.1645092274209; Thu, 17 Feb 2022 02:04:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1645092274; cv=none; d=google.com; s=arc-20160816; b=M7SBNJFr4HkrrTSXcssU9ZfP3sqK82xp6xDdIlkG04rf3A3G7g5yyjQ1Ki1ZMmBQVj QkCfPKyP2PrSv/5WxJ+oTfB/GE6m1aoCVyj4jHbB/uytI1VAksNcKSkzDdfwo3IHUAYi dQV4hxNnudOuPqOqzz1vrVmdj4Fvh6cemwiP7WbzX0VtlnTLB46zM1XOSdxGcB2hZkto cdfT4LV+Tfv+sx5/85y5dHmJ8jEYzyemtTWdnwc95RlF3yb7Q786p8vTQ1QCVV0oaRl6 ZWHaGYibTqSyBB3cFtScfPs+Ydzf78/h1nuZu7xAGNv22xbG+Ll/TEM6/e0SdA/yJ9s2 Ughw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:to:from:mime-version:message-id:date :dkim-signature:delivered-to; bh=8pOlWr8dVQtm2nlYsE0eKY32m/hvY38QR8mbFDnMZ24=; b=qeBmXkDSJFae+XY78pGROGPjuhv7O1TAK/HbvSm2aQ07uwJTDxrStf7lJMXRbBZZ1/ b6biG8f53Q3C7oxS3SHtfJfXZtViv4IJZv9F2F8zLXp+7lBV6CQCK2scW90cOWwA2e72 4Yv9ALWdnVmOAD4VA9Dwbki1FL8P4jyMmP16L7uaC77P+ifQp+YB/nrLX8NM+WFysqt4 yezas3EcDXKbrZKujfLm3HQTptfTkRiR3F3iHcdRQ2a5lxkJ3IZNCZWZRA7VtrkLZH/P i9St+gqwCOsKl/gR7whhGeyKOiKyBdp6FTlj/k7dIAU/LQeaDd5+FRGj9hCwwSjL8Awv uDKg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=PFfSyZbG; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org 
designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id c12si1283446eja.863.2022.02.17.02.04.33; Thu, 17 Feb 2022 02:04:34 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20210112 header.b=PFfSyZbG; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1D0A168B1CC; Thu, 17 Feb 2022 12:04:30 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-ej1-f74.google.com (mail-ej1-f74.google.com [209.85.218.74]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id AAE5668B1CC for ; Thu, 17 Feb 2022 12:04:23 +0200 (EET) Received: by mail-ej1-f74.google.com with SMTP id d7-20020a1709061f4700b006bbf73a7becso1267927ejk.17 for ; Thu, 17 Feb 2022 02:04:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:message-id:mime-version:subject:from:to:cc; bh=sP+FgB7BKGXQp8fa1HxGM7u8B33V09CCnI8tayWyPN0=; b=PFfSyZbGD86CmFmIb55OEp0/75OvXGmRWswQGK83UgLGthPNuaEmyInOFGK/3G48B1 1h5YkeHKDyJ10vnbGqh3q5Xnp2/NGHVr+H7BuwTKVqhnXnEJ7ZzIsaTjdNy3GOsiWdWb 9uMlfq1w2afAMZY+SD44i8NG9eKtU7TmiJDlDyIgcaDeBuMRHLJQWgIZbB0WyjYdJx2D iPWdGJGeeB9/xIecgwJFFf3sMDzffzjxDqoA0XgSvh3qO1kAhT491C3VZkAl2akVS8T+ NlDbSwJjVs+GfP4XcRywcoNBxnXwaOP2JC3QVSuxVqkABneewpGwr5+wwrEnx/1dEeMN GvVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; 
bh=sP+FgB7BKGXQp8fa1HxGM7u8B33V09CCnI8tayWyPN0=; b=UMH0dfo1LRcb4Iu8h5c6ihEaLE8Lzg85DoaXWk/LXE/yTAtu3t3dJAKJUIeNG8ckiJ +ayKMwaUPIxXwh4+blVV8Y1e0AlvF58WN8V7B1jzLAcGV5MCLpCh/ODht+pRuIu2e0X8 MdhcJL8OxJeCQmPGcBnOu+wxmQwwqR58F1wwYdDClfJUq/yEDZzFfQYMtDaQG9ef7z8T UdjWNaS6AHR2i0/NIYckZk6zt2I6/9hlzopEzwEbxSmuvcaPkhQDaVxRFkB1wBnBTTcl ejEAf+ucv3VufMJ5dLAeBIMQxbEDTI+ngCLi4zDL1cDfgTAkfXXW4UI5pz0EPDsLQnIT jCOw== X-Gm-Message-State: AOAM533gVQaBVGFfLGb/1mkPSS8GWyaPnZfpixH7ZQ4eFV9OFi4pIWtz gkM/iWuIE3e0M7mRK0LSu6CisQ5UDGfVANCwPlphPdHz728ZvLkFCMOv1rSLJJYI7vT32xzm015 p0xNF87ad5syYECUIP/GZfEz6LR0cY3HF8UDvUGTW7Ax2xqtLl6WS8Z68NgCQqTRztv+MVOE= X-Received: from alankelly0.zrh.corp.google.com ([2a00:79e0:61:301:b159:808d:943e:13ba]) (user=alankelly job=sendgmr) by 2002:a17:906:743:b0:6d0:7f19:d737 with SMTP id z3-20020a170906074300b006d07f19d737mr557864ejb.11.1645092263098; Thu, 17 Feb 2022 02:04:23 -0800 (PST) Date: Thu, 17 Feb 2022 11:04:20 +0100 Message-Id: <20220217100420.1113388-1-alankelly@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.35.1.265.g69c8d7142f-goog From: Alan Kelly To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] [PATCH v2 4/5] libswscale: Enable hscale_avx2 for all input sizes. X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Alan Kelly Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: thqdtjSaKPvp ff_shuffle_filter_coefficients shuffles the tail as required. 
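For reviewers, the coefficient layout this patch produces can be sketched in isolation. The standalone helper below is hypothetical (shuffle_coeffs is not an FFmpeg function); it copies only the index arithmetic from the utils.c hunk in this patch and leaves out the SwsContext, CPU-flag checks and the filterPos swap. It uses malloc/free in place of the av_* allocators and, as a simplification of this sketch, asserts that filterSize is a multiple of 4.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libswscale: reorders dstW * filterSize
 * coefficients in place. Returns 0 on success, -1 if allocation fails. */
static int shuffle_coeffs(int16_t *filter, int filterSize, int dstW)
{
    int i, j, k;
    int16_t *copy;

    assert(filterSize % 4 == 0); /* simplifying assumption of this sketch */
    if (filterSize <= 4)
        return 0;                /* nothing to reorder for 4-tap filters */

    copy = malloc((size_t)dstW * filterSize * sizeof(*copy));
    if (!copy)
        return -1;
    memcpy(copy, filter, (size_t)dstW * filterSize * sizeof(*copy));

    /* Main blocks: 16 pixels at a time, 4 coefficients per pixel per group. */
    for (i = 0; i + 16 <= dstW; i += 16) {
        for (k = 0; k + 4 <= filterSize; k += 4) {
            for (j = 0; j < 16; ++j) {
                int from = (i + j) * filterSize + k;
                int to   = i * filterSize + j * 4 + k * 16;
                memcpy(&filter[to], &copy[from], 4 * sizeof(*copy));
            }
        }
    }
    /* Tail: 4 pixels at a time, so each coefficient group spans 16 entries. */
    for (; i < dstW; i += 4) {
        int rem = dstW - i >= 4 ? 4 : dstW - i;
        for (k = 0; k + 4 <= filterSize; k += 4) {
            for (j = 0; j < rem; ++j) {
                int from = (i + j) * filterSize + k;
                int to   = i * filterSize + j * 4 + k * 4;
                memcpy(&filter[to], &copy[from], 4 * sizeof(*copy));
            }
        }
    }
    free(copy);
    return 0;
}
```

With dstW = 20 and filterSize = 8, for example, pixels 0..15 go through the 16-pixel layout and pixels 16..19 through the 4-pixel tail layout. Coefficients 4..7 of pixel 0 then start at index 64, while coefficients 4..7 of pixel 16 start at index 144 (16 * 8 + 16).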
--- libswscale/utils.c | 19 ++++++++++++++++--- libswscale/x86/swscale.c | 6 ++---- 2 files changed, 18 insertions(+), 7 deletions(-) diff --git a/libswscale/utils.c b/libswscale/utils.c index 7c8e1bbdde..d818c9ce55 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -285,8 +285,7 @@ int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, #if ARCH_X86_64 int i, j, k; int cpu_flags = av_get_cpu_flags(); - // avx2 hscale filter processes 16 pixel blocks. - if (!filter || dstW % 16 != 0) + if (!filter) return 0; if (EXTERNAL_AVX2_FAST(cpu_flags) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER)) { if ((c->srcBpc == 8) && (c->dstBpc <= 14)) { @@ -298,9 +297,11 @@ int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, } // Do not swap filterPos for pixels which won't be processed by // the main loop. - for (i = 0; i + 8 <= dstW; i += 8) { + for (i = 0; i + 16 <= dstW; i += 16) { FFSWAP(int, filterPos[i + 2], filterPos[i + 4]); FFSWAP(int, filterPos[i + 3], filterPos[i + 5]); + FFSWAP(int, filterPos[i + 10], filterPos[i + 12]); + FFSWAP(int, filterPos[i + 11], filterPos[i + 13]); } if (filterSize > 4) { // 16 pixels are processed at a time. @@ -314,6 +315,18 @@ int ff_shuffle_filter_coefficients(SwsContext *c, int *filterPos, } } } + // 4 pixels are processed at a time in the tail. + for (; i < dstW; i += 4) { + // 4 filter coeffs are processed at a time. + int rem = dstW - i >= 4 ? 
4 : dstW - i;
+            for (k = 0; k + 4 <= filterSize; k += 4) {
+                for (j = 0; j < rem; ++j) {
+                    int from = (i + j) * filterSize + k;
+                    int to = i * filterSize + j * 4 + k * 4;
+                    memcpy(&filter[to], &filterCopy[from], 4 * sizeof(int16_t));
+                }
+            }
+        }
     }
     av_free(filterCopy);
 }
diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c
index 73869355b8..76f5a70fc5 100644
--- a/libswscale/x86/swscale.c
+++ b/libswscale/x86/swscale.c
@@ -691,10 +691,8 @@ switch(c->dstBpc){ \

     if (EXTERNAL_AVX2_FAST(cpu_flags) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER)) {
         if ((c->srcBpc == 8) && (c->dstBpc <= 14)) {
-            if (c->chrDstW % 16 == 0)
-                ASSIGN_AVX2_SCALE_FUNC(c->hcScale, c->hChrFilterSize);
-            if (c->dstW % 16 == 0)
-                ASSIGN_AVX2_SCALE_FUNC(c->hyScale, c->hLumFilterSize);
+            ASSIGN_AVX2_SCALE_FUNC(c->hcScale, c->hChrFilterSize);
+            ASSIGN_AVX2_SCALE_FUNC(c->hyScale, c->hLumFilterSize);
         }
     }

From patchwork Thu Feb 17 10:04:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Kelly
X-Patchwork-Id: 34359
Date: Thu, 17 Feb 2022 11:04:33 +0100
Message-Id: <20220217100433.1114010-1-alankelly@google.com>
X-Mailer: git-send-email 2.35.1.265.g69c8d7142f-goog
From: Alan Kelly
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH v2 5/5] checkasm/sw_scale: hscale does not require cpuflag test.
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Alan Kelly

This is done in ff_shuffle_filter_coefficients.
---
 tests/checkasm/sw_scale.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c
index 3c0a083b42..4c57b6a372 100644
--- a/tests/checkasm/sw_scale.c
+++ b/tests/checkasm/sw_scale.c
@@ -168,8 +168,6 @@ static void check_hscale(void)
                       const uint8_t *src, const int16_t *filter,
                       const int32_t *filterPos, int filterSize);

-    int cpu_flags = av_get_cpu_flags();
-
     ctx = sws_alloc_context();
     if (sws_init_context(ctx, NULL, NULL) < 0)
         fail();
@@ -217,8 +215,7 @@ static void check_hscale(void)
             }
             ff_sws_init_scale(ctx);
             memcpy(filterAvx2, filter, sizeof(uint16_t) * (SRC_PIXELS * MAX_FILTER_WIDTH + MAX_FILTER_WIDTH));
-            if ((cpu_flags & AV_CPU_FLAG_AVX2) && !(cpu_flags & AV_CPU_FLAG_SLOW_GATHER))
-                ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, SRC_PIXELS);
+            ff_shuffle_filter_coefficients(ctx, filterPosAvx, width, filterAvx2, SRC_PIXELS);

             if (check_func(ctx->hcScale, "hscale_%d_to_%d_width%d", ctx->srcBpc, ctx->dstBpc + 1, width)) {
                 memset(dst0, 0, SRC_PIXELS * sizeof(dst0[0]));