From patchwork Sat Aug 1 13:47:02 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 21438
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 1 Aug 2020 15:47:02 +0200
Message-Id: <20200801134704.3647-11-andreas.rheinhardt@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200731112241.8948-1-andreas.rheinhardt@gmail.com>
References: <20200731112241.8948-1-andreas.rheinhardt@gmail.com>
Subject: [FFmpeg-devel] [PATCH 19/21] avcodec/smacker: Avoid allocations for decoding Smacker
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Andreas Rheinhardt

by using buffers on the stack instead. The fact that the effective
lifetime of most of the allocated buffers doesn't overlap enables one to
limit the stack space used to a fairly modest size (about 1.5 KiB).

That all the buffers used in HuffContexts always have the same number of
elements (namely 256) makes it possible to include the buffers directly
in the HuffContext.
Doing so also makes the length field redundant; it has therefore been
removed.

This is beneficial for performance: For GCC 9 the time for one call to
smka_decode_frame() for the sample in ticket #2425 went down from
1794494 to 1709043 decicycles; for Clang 9 it decreased from 1449420 to
1355273 decicycles.

Signed-off-by: Andreas Rheinhardt
---
 libavcodec/smacker.c | 69 +++++++++++++++-----------------------------
 1 file changed, 23 insertions(+), 46 deletions(-)

diff --git a/libavcodec/smacker.c b/libavcodec/smacker.c
index 15c8856e69..e588b03820 100644
--- a/libavcodec/smacker.c
+++ b/libavcodec/smacker.c
@@ -66,11 +66,10 @@ typedef struct SmackVContext {
  * Context used for code reconstructing
  */
 typedef struct HuffContext {
-    int length;
     int current;
-    uint32_t *bits;
-    uint8_t *lengths;
-    uint8_t *values;
+    uint32_t bits[256];
+    uint8_t lengths[256];
+    uint8_t values[256];
 } HuffContext;
 
 /* common parameters used for decode_bigtree */
@@ -114,7 +113,7 @@ static int smacker_decode_tree(GetBitContext *gb, HuffContext *hc, uint32_t pref
     }
 
     if(!get_bits1(gb)){ //Leaf
-        if(hc->current >= hc->length){
+        if (hc->current >= 256) {
             av_log(NULL, AV_LOG_ERROR, "Tree size exceeded!\n");
             return AVERROR_INVALIDDATA;
         }
@@ -198,7 +197,6 @@ static int smacker_decode_bigtree(GetBitContext *gb, DBCtx *ctx, int length)
  */
 static int smacker_decode_header_tree(SmackVContext *smk, GetBitContext *gb, int **recodes, int *last, int size)
 {
-    HuffContext h[2] = { 0 };
     VLC vlc[2] = { { 0 } };
     int escapes[3];
     DBCtx ctx;
@@ -210,37 +208,30 @@ static int smacker_decode_header_tree(SmackVContext *smk, GetBitContext *gb, int
     }
 
     for (int i = 0; i < 2; i++) {
-        h[i].length = 256;
-        h[i].current = 0;
-        h[i].bits = av_malloc(256 * sizeof(h[i].bits[0]));
-        h[i].lengths = av_malloc(256 * sizeof(h[i].lengths[0]));
-        h[i].values = av_malloc(256 * sizeof(h[i].values[0]));
-        if (!h[i].bits || !h[i].lengths || !h[i].values) {
-            err = AVERROR(ENOMEM);
-            goto error;
-        }
+        HuffContext h;
+        h.current = 0;
         if (!get_bits1(gb)) {
             ctx.vals[i] = 0;
             av_log(smk->avctx, AV_LOG_ERROR, "Skipping %s bytes tree\n",
                    i ? "high" : "low");
             continue;
         }
-        err = smacker_decode_tree(gb, &h[i], 0, 0);
+        err = smacker_decode_tree(gb, &h, 0, 0);
         if (err < 0)
             goto error;
         skip_bits1(gb);
-        if (h[i].current > 1) {
-            err = ff_init_vlc_sparse(&vlc[i], SMKTREE_BITS, h[i].current,
-                                     INIT_VLC_DEFAULT_SIZES(h[i].lengths),
-                                     INIT_VLC_DEFAULT_SIZES(h[i].bits),
-                                     INIT_VLC_DEFAULT_SIZES(h[i].values),
-                                     INIT_VLC_LE);
+        if (h.current > 1) {
+            err = ff_init_vlc_sparse(&vlc[i], SMKTREE_BITS, h.current,
+                                     INIT_VLC_DEFAULT_SIZES(h.lengths),
+                                     INIT_VLC_DEFAULT_SIZES(h.bits),
+                                     INIT_VLC_DEFAULT_SIZES(h.values),
+                                     INIT_VLC_LE);
             if (err < 0) {
                 av_log(smk->avctx, AV_LOG_ERROR, "Cannot build VLC table\n");
                 goto error;
             }
         } else
-            ctx.vals[i] = h[i].values[0];
+            ctx.vals[i] = h.values[0];
     }
 
     escapes[0] = get_bits(gb, 16);
@@ -276,9 +267,6 @@ static int smacker_decode_header_tree(SmackVContext *smk, GetBitContext *gb, int
 error:
     for (int i = 0; i < 2; i++) {
         ff_free_vlc(&vlc[i]);
-        av_free(h[i].bits);
-        av_free(h[i].lengths);
-        av_free(h[i].values);
     }
 
     return err;
@@ -603,7 +591,6 @@ static int smka_decode_frame(AVCodecContext *avctx, void *data,
     const uint8_t *buf = avpkt->data;
     int buf_size = avpkt->size;
     GetBitContext gb;
-    HuffContext h[4] = { { 0 } };
     VLC vlc[4] = { { 0 } };
     int16_t *samples;
     uint8_t *samples8;
@@ -659,31 +646,24 @@ static int smka_decode_frame(AVCodecContext *avctx, void *data,
 
     // Initialize
     for(i = 0; i < (1 << (bits + stereo)); i++) {
-        h[i].length = 256;
-        h[i].current = 0;
-        h[i].bits = av_malloc(256 * sizeof(h[i].bits));
-        h[i].lengths = av_malloc(256 * sizeof(h[i].lengths));
-        h[i].values = av_malloc(256 * sizeof(h[i].values));
-        if (!h[i].bits || !h[i].lengths || !h[i].values) {
-            ret = AVERROR(ENOMEM);
-            goto error;
-        }
+        HuffContext h;
+        h.current = 0;
         skip_bits1(&gb);
-        if ((ret = smacker_decode_tree(&gb, &h[i], 0, 0)) < 0)
+        if ((ret = smacker_decode_tree(&gb, &h, 0, 0)) < 0)
             goto error;
         skip_bits1(&gb);
-        if(h[i].current > 1) {
-            ret = ff_init_vlc_sparse(&vlc[i], SMKTREE_BITS, h[i].current,
-                                     INIT_VLC_DEFAULT_SIZES(h[i].lengths),
-                                     INIT_VLC_DEFAULT_SIZES(h[i].bits),
-                                     INIT_VLC_DEFAULT_SIZES(h[i].values),
+        if (h.current > 1) {
+            ret = ff_init_vlc_sparse(&vlc[i], SMKTREE_BITS, h.current,
+                                     INIT_VLC_DEFAULT_SIZES(h.lengths),
+                                     INIT_VLC_DEFAULT_SIZES(h.bits),
+                                     INIT_VLC_DEFAULT_SIZES(h.values),
                                      INIT_VLC_LE);
             if (ret < 0) {
                 av_log(avctx, AV_LOG_ERROR, "Cannot build VLC table\n");
                 goto error;
             }
         } else
-            values[i] = h[i].values[0];
+            values[i] = h.values[0];
     }
     /* this codec relies on wraparound instead of clipping audio */
     if(bits) { //decode 16-bit data
@@ -758,9 +738,6 @@ static int smka_decode_frame(AVCodecContext *avctx, void *data,
 error:
     for(i = 0; i < 4; i++) {
         ff_free_vlc(&vlc[i]);
-        av_free(h[i].bits);
-        av_free(h[i].lengths);
-        av_free(h[i].values);
     }
 
     return ret;