From patchwork Wed Jul 13 20:47:15 2022
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 36779
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Cc: Michael Niedermayer, Jonathan Swinney
Date: Wed, 13 Jul 2022 23:47:15 +0300
Message-Id: <20220713204716.3114529-1-martin@martin.st>
Subject: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx

The height is hardcoded in some of the me_cmp functions, but not in all of them.
But in the case of all other functions, it's hardcoded in the same place
in SIMD functions as in the C reference functions, while this one
function differs from the behaviour of the C code.

(Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a couple
other sad8_*_mmx functions with similar hardcoded height.)
---
 libavcodec/x86/me_cmp_init.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
index 61e9396b8f..dcc2621276 100644
--- a/libavcodec/x86/me_cmp_init.c
+++ b/libavcodec/x86/me_cmp_init.c
@@ -202,13 +202,12 @@ static inline int sum_mmx(void)
 static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2,              \
                             uint8_t *blk1, ptrdiff_t stride, int h)        \
 {                                                                           \
-    av_assert2(h == 8);                                                     \
     __asm__ volatile (                                                      \
         "pxor %%mm7, %%mm7     \n\t"                                        \
         "pxor %%mm6, %%mm6     \n\t"                                        \
         ::);                                                                \
                                                                             \
-    sad8_4_ ## suf(blk1, blk2, stride, 8);                                  \
+    sad8_4_ ## suf(blk1, blk2, stride, h);                                  \
                                                                             \
     return sum_ ## suf();                                                   \
 }                                                                           \
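
For illustration only, and not part of the patch: a plain-C sketch of why ignoring h matters. The names sad8_xy2_ref and sad8_xy2_fixed8 are hypothetical and do not exist in FFmpeg. The first is a simplified model of an 8-wide SAD against a half-pel (x+1/2, y+1/2) interpolated reference that honours h. The second models the old behaviour by always processing 8 rows. The rounding in avg4 is an assumption of this sketch and is not claimed to match the MMX code bit for bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounded average of four neighbouring pixels (assumed rounding mode). */
static int avg4(int a, int b, int c, int d)
{
    return (a + b + c + d + 2) >> 2;
}

/*
 * Hypothetical reference model: SAD of an 8-wide, h-high block "cur"
 * against "ref" interpolated at (x + 1/2, y + 1/2).
 * Reads 9 columns and h + 1 rows of ref.
 */
int sad8_xy2_ref(const uint8_t *cur, const uint8_t *ref,
                 ptrdiff_t stride, int h)
{
    int sum = 0;

    assert(h > 0);
    for (int y = 0; y < h; y++) {
        const uint8_t *r0 = ref + y * stride;   /* current reference row */
        const uint8_t *r1 = r0 + stride;        /* row below it */
        const uint8_t *c  = cur + y * stride;

        for (int x = 0; x < 8; x++)
            sum += abs(c[x] - avg4(r0[x], r0[x + 1], r1[x], r1[x + 1]));
    }
    return sum;
}

/* Model of the pre-patch behaviour: h is accepted but ignored. */
int sad8_xy2_fixed8(const uint8_t *cur, const uint8_t *ref,
                    ptrdiff_t stride, int h)
{
    (void)h;
    return sad8_xy2_ref(cur, ref, stride, 8);
}
```

With h == 8 both functions agree. For any other height, the fixed-height variant only sums the first 8 rows, so its result no longer matches a reference that loops over h.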