From patchwork Tue Feb 23 13:40:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Kelly
X-Patchwork-Id: 25921
Date: Tue, 23 Feb 2021 14:40:45 +0100
Message-Id: <20210223134047.1834787-1-alankelly@google.com>
X-Mailer: git-send-email 2.30.0.617.g56c4b15f3c-goog
From: Alan Kelly
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH 1/3] libswscale/x86/yuv2yuvX: Removes unrolling for mmx and mmxext
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Alan Kelly
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

---
 This is so that tails of size 8 may safely be processed
 libswscale/x86/yuv2yuvX.asm | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/libswscale/x86/yuv2yuvX.asm b/libswscale/x86/yuv2yuvX.asm
index 521880dabe..b6294cb919 100644
--- a/libswscale/x86/yuv2yuvX.asm
+++ b/libswscale/x86/yuv2yuvX.asm
@@ -37,8 +37,10 @@ SECTION .text
 cglobal yuv2yuvX, 7, 7, 8, filter, filterSize, src, dest, dstW, dither, offset
 %if notcpuflag(sse3)
 %define movr mova
+%define unroll 1
 %else
 %define movr movdqu
+%define unroll 2
 %endif
     movsxdifnidn dstWq, dstWd
     movsxdifnidn offsetq, offsetd
@@ -70,8 +72,10 @@ cglobal yuv2yuvX, 7, 7, 8, filter, filterSize, src, dest, dstW, dither, offset
 .outerloop:
     mova m4, m7
     mova m3, m7
+%if cpuflag(sse3)
     mova m6, m7
     mova m1, m7
+%endif
 .loop:
 %if cpuflag(avx2)
     vpbroadcastq m0, [filterSizeq + 8]
@@ -84,28 +88,36 @@ cglobal yuv2yuvX, 7, 7, 8, filter, filterSize, src, dest, dstW, dither, offset
     pmulhw m5, m0, [srcq + offsetq * 2 + mmsize]
     paddw m3, m3, m2
     paddw m4, m4, m5
+%if cpuflag(sse3)
     pmulhw m2, m0, [srcq + offsetq * 2 + 2 * mmsize]
     pmulhw m5, m0, [srcq + offsetq * 2 + 3 * mmsize]
     paddw m6, m6, m2
     paddw m1, m1, m5
+%endif
     add filterSizeq, $10
     mov srcq, [filterSizeq]
     test srcq, srcq
     jnz .loop
     psraw m3, m3, 3
     psraw m4, m4, 3
+%if cpuflag(sse3)
     psraw m6, m6, 3
     psraw m1, m1, 3
+%endif
     packuswb m3, m3, m4
+%if cpuflag(sse3)
     packuswb m6, m6, m1
+%endif
     mov srcq, [filterq]
 %if cpuflag(avx2)
     vpermq m3, m3, 216
     vpermq m6, m6, 216
 %endif
     movr [destq + offsetq], m3
+%if cpuflag(sse3)
     movr [destq + offsetq + mmsize], m6
-    add offsetq, mmsize * 2
+%endif
+    add offsetq, mmsize * unroll
     mov filterSizeq, filterq
     cmp offsetq, dstWq
     jb .outerloop

From patchwork Tue Feb 23 13:40:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Kelly
X-Patchwork-Id: 25919
Date: Tue, 23 Feb 2021 14:40:46 +0100
In-Reply-To: <20210223134047.1834787-1-alankelly@google.com>
Message-Id: <20210223134047.1834787-2-alankelly@google.com>
References: <20210223134047.1834787-1-alankelly@google.com>
From: Alan Kelly
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH 2/3] libswscale/x86/swscale: Only call ff_yuv2yuvX functions if the input size is > 0
Cc: Alan Kelly

---
 libswscale/x86/swscale.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c
index 1e865914cb..71961a9ae0 100644
--- a/libswscale/x86/swscale.c
+++ b/libswscale/x86/swscale.c
@@ -206,7 +206,8 @@ static void yuv2yuvX_ ##opt(const int16_t *filter, int filterSize, \
                            const int16_t **src, uint8_t *dest, int dstW, \
                            const uint8_t *dither, int offset) \
 { \
-    ff_yuv2yuvX_ ##opt(filter, filterSize - 1, 0, dest - offset, dstW + offset, dither, offset); \
+    if(dstW > 0) \
+        ff_yuv2yuvX_ ##opt(filter, filterSize - 1, 0, dest - offset, dstW + offset, dither, offset); \
     return; \
 }
 
@@ -224,7 +225,8 @@ static void yuv2yuvX_ ##opt(const int16_t *filter, int filterSize, \
         yuv2yuvX_mmx(filter, filterSize, src, dest, dstW, dither, offset); \
         return; \
     } \
-    ff_yuv2yuvX_ ##opt(filter, filterSize - 1, 0, dest - offset, pixelsProcessed + offset, dither, offset); \
+    if(pixelsProcessed > 0) \
+        ff_yuv2yuvX_ ##opt(filter, filterSize - 1, 0, dest - offset, pixelsProcessed + offset, dither, offset); \
     if(remainder > 0){ \
       ff_yuv2yuvX_mmx(filter, filterSize - 1, pixelsProcessed, dest - offset, pixelsProcessed + remainder + offset, dither, offset); \
     } \

From patchwork Tue Feb 23 13:40:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Kelly
X-Patchwork-Id: 25920
Date: Tue, 23 Feb 2021 14:40:47 +0100
In-Reply-To: <20210223134047.1834787-1-alankelly@google.com>
Message-Id: <20210223134047.1834787-3-alankelly@google.com>
References: <20210223134047.1834787-1-alankelly@google.com>
From: Alan Kelly
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH 3/3] tests/checkasm/sw_scale: adds additional test sizes for yuv2yuvX
Cc: Alan Kelly

---
 tests/checkasm/sw_scale.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/checkasm/sw_scale.c b/tests/checkasm/sw_scale.c
index a10118704b..3ac0f9082f 100644
--- a/tests/checkasm/sw_scale.c
+++ b/tests/checkasm/sw_scale.c
@@ -68,8 +68,8 @@ static void check_yuv2yuvX(void)
 #define FILTER_SIZES 4
     static const int filter_sizes[FILTER_SIZES] = {1, 4, 8, 16};
 #define LARGEST_INPUT_SIZE 512
-#define INPUT_SIZES 4
-    static const int input_sizes[INPUT_SIZES] = {128, 144, 256, 512};
+#define INPUT_SIZES 6
+    static const int input_sizes[INPUT_SIZES] = {8, 24, 128, 144, 256, 512};
 
     declare_func_emms(AV_CPU_FLAG_MMX, void, const int16_t *filter,
                       int filterSize, const int16_t **src, uint8_t *dest,
@@ -107,7 +107,7 @@ static void check_yuv2yuvX(void)
                     for(j = 0; j < 4; ++j)
                         vFilterData[i].coeff[j + 4] = filter_coeff[i];
                 }
-                if (check_func(ctx->yuv2planeX, "yuv2yuvX_%d_%d", filter_sizes[fsi], osi)){
+                if (check_func(ctx->yuv2planeX, "yuv2yuvX_%d_%d_%d", filter_sizes[fsi], osi, dstW)){
                     memset(dst0, 0, LARGEST_INPUT_SIZE * sizeof(dst0[0]));
                     memset(dst1, 0, LARGEST_INPUT_SIZE * sizeof(dst1[0]));