From patchwork Wed Jul 13 20:47:15 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?=
X-Patchwork-Id: 36779
From: =?utf-8?q?Martin_Storsj=C3=B6?=
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 13 Jul 2022 23:47:15 +0300
Message-Id: <20220713204716.3114529-1-martin@martin.st>
X-Mailer: git-send-email 2.25.1
Subject: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx
Cc: Michael Niedermayer, Jonathan Swinney

The height is hardcoded in some of the me_cmp functions, but not in all of them.
But in all the other functions, it's hardcoded in the same place in the
SIMD functions as in the C reference functions, while this one function
differs from the behaviour of the C code.

(Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a couple
of other sad8_*_mmx functions with a similarly hardcoded height.)
---
 libavcodec/x86/me_cmp_init.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
index 61e9396b8f..dcc2621276 100644
--- a/libavcodec/x86/me_cmp_init.c
+++ b/libavcodec/x86/me_cmp_init.c
@@ -202,13 +202,12 @@ static inline int sum_mmx(void)
 static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2,               \
                             uint8_t *blk1, ptrdiff_t stride, int h)         \
 {                                                                           \
-    av_assert2(h == 8);                                                     \
     __asm__ volatile (                                                      \
         "pxor %%mm7, %%mm7 \n\t"                                            \
         "pxor %%mm6, %%mm6 \n\t"                                            \
         ::);                                                                \
                                                                             \
-    sad8_4_ ## suf(blk1, blk2, stride, 8);                                  \
+    sad8_4_ ## suf(blk1, blk2, stride, h);                                  \
                                                                             \
     return sum_ ## suf();                                                   \
 }                                                                           \

From patchwork Wed Jul 13 20:47:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?=
X-Patchwork-Id: 36780
From: =?utf-8?q?Martin_Storsj=C3=B6?=
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 13 Jul 2022 23:47:16 +0300
Message-Id: <20220713204716.3114529-2-martin@martin.st>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220713204716.3114529-1-martin@martin.st>
References: <20220713204716.3114529-1-martin@martin.st>
Subject: [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters
Cc: Michael Niedermayer, Jonathan Swinney

Previously, the checkasm test always passed h=8, so no other cases were tested.

Out of the me_cmp functions, in practice, some functions are hardcoded
to always assume an 8x8 block (ignoring the h parameter), while others
do use the parameter. For those with a hardcoded height, both the
reference C function and the assembly implementations ignore the
parameter in the same way.

The documentation for the functions indicates that heights between
w/2 and 2*w, within the range of 4 to 16, should be supported. This
patch just tests random heights in that range, without knowing what
width the current function actually uses.
---
I'm not sure whether it's good to have checkasm exercise cases that
don't occur in practice. In particular, the aarch64 functions have a
separate implementation for non-multiple-of-4 heights, which probably
never gets called in practice, while other SIMD implementations lack
that.

Alternatively, we could improve the documentation of the expectations
for these functions, make the test match that, and remove the unused
non-multiple-of-4 case in the aarch64 assembly.
---
 tests/checkasm/motion.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tests/checkasm/motion.c b/tests/checkasm/motion.c
index 0112822174..79e4358941 100644
--- a/tests/checkasm/motion.c
+++ b/tests/checkasm/motion.c
@@ -45,7 +45,7 @@ static void test_motion(const char *name, me_cmp_func test_func)
     /* motion estimation can look up to 17 bytes ahead */
     static const int look_ahead = 17;

-    int i, x, y, d1, d2;
+    int i, x, y, h, d1, d2;
     uint8_t *ptr;

     LOCAL_ALIGNED_16(uint8_t, img1, [WIDTH * HEIGHT]);
@@ -68,14 +68,16 @@ static void test_motion(const char *name, me_cmp_func test_func)
         for (i = 0; i < ITERATIONS; i++) {
             x = rnd() % (WIDTH - look_ahead);
             y = rnd() % (HEIGHT - look_ahead);
+            // Pick a random h between 4 and 16; pick an even value.
+            h = 4 + ((rnd() % (16 + 1 - 4)) & ~1);

             ptr = img2 + y * WIDTH + x;
-            d2 = call_ref(NULL, img1, ptr, WIDTH, 8);
-            d1 = call_new(NULL, img1, ptr, WIDTH, 8);
+            d2 = call_ref(NULL, img1, ptr, WIDTH, h);
+            d1 = call_new(NULL, img1, ptr, WIDTH, h);

             if (d1 != d2) {
                 fail();
-                printf("func: %s, x=%d y=%d, error: asm=%d c=%d\n", name, x, y, d1, d2);
+                printf("func: %s, x=%d y=%d h=%d, error: asm=%d c=%d\n", name, x, y, h, d1, d2);
                 break;
             }
         }
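
For reference, the height expression used in the test can be checked in
isolation. The sketch below is not part of the patch: pick_height() and
heights_are_valid() are hypothetical helpers, and the argument r stands in
for whatever rnd() returns, assuming only that it is an unsigned value.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the expression from the test;
 * r stands in for the value returned by rnd(). */
static int pick_height(uint32_t r)
{
    /* r % 13 lies in 0..12; clearing bit 0 leaves 0, 2, ..., 12,
     * so the result is an even value in 4..16. */
    return 4 + (int)((r % (16 + 1 - 4)) & ~1u);
}

/* Returns 1 if every possible residue yields an even height in [4, 16]. */
static int heights_are_valid(void)
{
    for (uint32_t r = 0; r < 13; r++) {
        int h = pick_height(r);
        if (h < 4 || h > 16 || (h & 1))
            return 0;
    }
    return 1;
}
```

Residues 0..11 map pairwise onto 4, 6, ..., 14, while only residue 12 maps
onto 16, so h=16 should come up roughly half as often as each of the other
heights. That is probably harmless for a randomized test, but it may be
worth keeping in mind if coverage of the tallest block matters.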