From patchwork Sun Jun 9 11:00:49 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 13475
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 9 Jun 2019 13:00:49 +0200
Message-Id: <20190609110053.4012-2-andreas.rheinhardt@gmail.com>
In-Reply-To: <20190609110053.4012-1-andreas.rheinhardt@gmail.com>
References: <20190604111632.GZ3118@michaelspb> <20190609110053.4012-1-andreas.rheinhardt@gmail.com>
Subject: [FFmpeg-devel] [PATCH 1/5] startcode: Use common macro and switch to pointer arithmetic
Cc: Andreas Rheinhardt

The reasons are cosmetics and preparation for future patches that will
have even more cases and whose performance improves when switching to
direct pointer arithmetic: Benchmarks have shown that using pointers
directly instead of indexing to access the array is about 5% faster
(33665 vs.
31806 for a 7.4 Mb/s H.264 file based on 10 iterations of 131072 runs
each; and 244356 vs. 233373 for a 30.2 Mb/s H.264 file based on 10
iterations of 8192 runs each).

Signed-off-by: Andreas Rheinhardt
---
 libavcodec/startcode.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/libavcodec/startcode.c b/libavcodec/startcode.c
index 9efdffe8c6..a55a8fafa6 100644
--- a/libavcodec/startcode.c
+++ b/libavcodec/startcode.c
@@ -27,31 +27,32 @@
 
 #include "startcode.h"
 #include "config.h"
+#include "libavutil/intreadwrite.h"
 
 int ff_startcode_find_candidate_c(const uint8_t *buf, int size)
 {
-    int i = 0;
+    const uint8_t *start = buf, *end = buf + size;
+
 #if HAVE_FAST_UNALIGNED
-    /* we check i < size instead of i + 3 / 7 because it is
-     * simpler and there must be AV_INPUT_BUFFER_PADDING_SIZE
-     * bytes at the end.
-     */
+#define READ(bitness) AV_RN ## bitness
+#define MAIN_LOOP(bitness, mask1, mask2) do {                              \
+        /* we check p < end instead of p + 3 / 7 because it is
+         * simpler and there must be AV_INPUT_BUFFER_PADDING_SIZE
+         * bytes at the end. */                                            \
+        for (; buf < end; buf += bitness / 8)                              \
+            if ((~READ(bitness)(buf) & (READ(bitness)(buf) - mask1))       \
+                                     & mask2)                              \
+                break;                                                     \
+    } while (0)
+
 #if HAVE_FAST_64BIT
-    while (i < size &&
-            !((~*(const uint64_t *)(buf + i) &
-                    (*(const uint64_t *)(buf + i) - 0x0101010101010101ULL)) &
-                    0x8080808080808080ULL))
-        i += 8;
+    MAIN_LOOP(64, 0x0101010101010101ULL, 0x8080808080808080ULL);
 #else
-    while (i < size &&
-            !((~*(const uint32_t *)(buf + i) &
-                    (*(const uint32_t *)(buf + i) - 0x01010101U)) &
-                    0x80808080U))
-        i += 4;
+    MAIN_LOOP(32, 0x01010101U, 0x80808080U);
 #endif
 #endif
-    for (; i < size; i++)
-        if (!buf[i])
+    for (; buf < end; buf++)
+        if (!*buf)
             break;
-    return i;
+    return buf - start;
 }
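
For readers unfamiliar with the mask test used in MAIN_LOOP, here is a minimal standalone sketch of it. It is not part of the patch. The names has_zero_byte64 and find_candidate_sketch are hypothetical, and memcpy stands in for AV_RN64 so that the snippet builds without FFmpeg headers. The expression (~x & (x - 0x0101...)) & 0x8080... is nonzero exactly when at least one byte of x is zero, which is why the word loop can skip eight bytes at a time until a zero byte might be present.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any of the 8 bytes of x is zero.
 * Same mask test as in the patch, isolated for illustration. */
static int has_zero_byte64(uint64_t x)
{
    return ((~x & (x - 0x0101010101010101ULL)) & 0x8080808080808080ULL) != 0;
}

/* Hypothetical sketch of the search using pointer arithmetic.
 * Assumption: the buffer has NO padding, so the word loop only runs
 * while at least 8 bytes remain (the patch relies on padding instead). */
static ptrdiff_t find_candidate_sketch(const uint8_t *buf, size_t size)
{
    const uint8_t *start = buf, *end = buf + size;

    /* Skip whole 8-byte words that contain no zero byte. */
    while ((size_t)(end - buf) >= 8) {
        uint64_t word;
        memcpy(&word, buf, sizeof(word)); /* unaligned-safe read */
        if (has_zero_byte64(word))
            break;
        buf += 8;
    }

    /* Finish byte by byte: locate the first zero byte, if any. */
    for (; buf < end; buf++)
        if (!*buf)
            break;

    return buf - start; /* offset of the first zero byte, or size */
}
```

The patch itself reads through AV_RN64/AV_RN32 and counts on AV_INPUT_BUFFER_PADDING_SIZE bytes after the buffer, so its loop condition can stay the simpler buf < end; the sketch checks the remaining length because it assumes no such padding.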