From patchwork Sun Aug 18 20:13:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ramiro Polla
X-Patchwork-Id: 51076
From: Ramiro Polla
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 18 Aug 2024 22:13:26 +0200
Message-Id: <20240818201326.100492-7-ramiro.polla@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20240818201326.100492-1-ramiro.polla@gmail.com>
References: <20240818201326.100492-1-ramiro.polla@gmail.com>
Subject: [FFmpeg-devel] [PATCH 7/7] avcodec/mpegvideoencdsp: speed up draw_edges_8_c by inlining it for all used edge widths

This commit also restricts w to 4, 8, or 16.
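
For illustration only, the idea can be sketched outside of FFmpeg as below. The names FORCE_INLINE, edges_lr, edges_lr_dispatch and edges_lr_selfcheck are hypothetical and not part of the patch. FORCE_INLINE is an assumed stand-in for av_always_inline, and plain assert() stands in for av_assert1(). The sketch assumes that a force-inlined helper called with a literal width lets the compiler treat the memset() size as a compile-time constant in each branch, which is presumably where the speedup comes from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-in for FFmpeg's av_always_inline (GCC/Clang attribute). */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

/* Replicate the first and last pixel of each row into a w-byte border. */
FORCE_INLINE void edges_lr(uint8_t *ptr, int wrap, int width, int height, int w)
{
    for (int i = 0; i < height; i++) {
        memset(ptr - w, ptr[0], w);
        memset(ptr + width, ptr[width - 1], w);
        ptr += wrap;
    }
}

/* Dispatch once per call so that each inlined copy sees a literal w. */
static void edges_lr_dispatch(uint8_t *buf, int wrap, int width, int height, int w)
{
    if (w == 16) {
        edges_lr(buf, wrap, width, height, 16);
    } else if (w == 8) {
        edges_lr(buf, wrap, width, height, 8);
    } else {
        assert(w == 4); /* plain assert() stands in for av_assert1() */
        edges_lr(buf, wrap, width, height, 4);
    }
}

/* Hypothetical self-check: pad a 2x2 image whose rows are 2*w + 2 bytes
 * apart and verify both borders of both rows. Returns 1 on success. */
static int edges_lr_selfcheck(int w)
{
    uint8_t plane[2 * (2 * 16 + 2)]; /* large enough for w up to 16 */
    int wrap = 2 * w + 2;
    uint8_t *img = plane + w;        /* first pixel of the first row */

    memset(plane, 0, sizeof(plane));
    img[0]    = 10; img[1]        = 20;
    img[wrap] = 30; img[wrap + 1] = 40;
    edges_lr_dispatch(img, wrap, 2, 2, w);

    for (int y = 0; y < 2; y++) {
        const uint8_t *row = plane + y * wrap;
        for (int x = 0; x < w; x++) {
            /* left border equals first pixel, right border equals last pixel */
            if (row[x] != row[w] || row[w + 2 + x] != row[w + 1])
                return 0;
        }
    }
    return 1;
}
```

Calling edges_lr_selfcheck() with 4, 8 and 16 exercises each of the three branches. Any other width trips the assertion, which mirrors the restriction on w stated above.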

Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz:
                            before    after
draw_edges_8_1724_4_c:     45074.5   7144.7 ( 6.31x)
draw_edges_8_1724_8_c:     41716.5   7216.0 ( 5.78x)
draw_edges_8_1724_16_c:    45282.7  16026.2 ( 2.83x)
draw_edges_128_407_4_c:    10863.2   4153.0 ( 2.62x)
draw_edges_128_407_8_c:    10273.0   4392.7 ( 2.34x)
draw_edges_128_407_16_c:   11606.0   4614.0 ( 2.52x)
draw_edges_1080_31_4_c:     1238.5    971.7 ( 1.27x)
draw_edges_1080_31_8_c:     1712.2   1035.2 ( 1.65x)
draw_edges_1080_31_16_c:    4281.5   3774.7 ( 1.13x)
draw_edges_1920_4_4_c:       920.5    731.0 ( 1.26x)
draw_edges_1920_4_8_c:      2861.0   2749.5 ( 1.04x)
draw_edges_1920_4_16_c:     6416.7   6334.5 ( 1.01x)
---
 libavcodec/mpegvideoencdsp.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/libavcodec/mpegvideoencdsp.c b/libavcodec/mpegvideoencdsp.c
index a96f0b6436..89fde7edf0 100644
--- a/libavcodec/mpegvideoencdsp.c
+++ b/libavcodec/mpegvideoencdsp.c
@@ -114,19 +114,31 @@ static int pix_norm1_c(const uint8_t *pix, int line_size)
     return s;
 }
 
+static av_always_inline void draw_edges_lr(uint8_t *ptr, int wrap, int width, int height, int w)
+{
+    for (int i = 0; i < height; i++) {
+        memset(ptr - w, ptr[0], w);
+        memset(ptr + width, ptr[width - 1], w);
+        ptr += wrap;
+    }
+}
+
 /* draw the edges of width 'w' of an image of size width, height */
 // FIXME: Check that this is OK for MPEG-4 interlaced.
 static void draw_edges_8_c(uint8_t *buf, int wrap, int width, int height,
                            int w, int h, int sides)
 {
-    uint8_t *ptr = buf, *last_line;
+    uint8_t *last_line;
     int i;
 
     /* left and right */
-    for (i = 0; i < height; i++) {
-        memset(ptr - w, ptr[0], w);
-        memset(ptr + width, ptr[width - 1], w);
-        ptr += wrap;
+    if (w == 16) {
+        draw_edges_lr(buf, wrap, width, height, 16);
+    } else if (w == 8) {
+        draw_edges_lr(buf, wrap, width, height, 8);
+    } else {
+        av_assert1(w == 4);
+        draw_edges_lr(buf, wrap, width, height, 4);
     }
 
     /* top and bottom + corners */