From patchwork Fri Nov 20 07:18:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andreas Rheinhardt X-Patchwork-Id: 23778 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 3041344AB58 for ; Fri, 20 Nov 2020 09:29:26 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0EF7068B9E5; Fri, 20 Nov 2020 09:29:26 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lj1-f180.google.com (mail-lj1-f180.google.com [209.85.208.180]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 24F1268B9D9 for ; Fri, 20 Nov 2020 09:29:24 +0200 (EET) Received: by mail-lj1-f180.google.com with SMTP id l10so9036685lji.4 for ; Thu, 19 Nov 2020 23:29:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references:reply-to :mime-version:content-transfer-encoding; bh=sHi7fQIlhSpnYQKERpgUzGsPLFCPvF/LSq6qevCnhiI=; b=fEPP/shMphKFe850DmIurmyXvv5U1WIdhiNfvwv/kRykJkNiAcCEnBasFuotrcji1w wXaCO6xKqxVWgY02Nm11A5xJv73hwRBEO2kOpCDFcaiULIH209XqdtJLhU8wy0yMoM7s Pk/SyDJaQO/g9vor5393sil8ywJzthXC9jpOFGtG5sBjiAXhP3mJImoYA5H+eixmnO87 puKABJ1wTsW6fIhfhsnq8yBWLy9++mmV4AUYYkNugta5lf0gBoCGD5HX9WTdLsqR05zR pRvqQEpT3OZNQaEZRPehsnpsyc8b/L9L2ZRy5No9ubtkeRczo0k9pr700kia7i4Ue0uo k2Mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:reply-to:mime-version:content-transfer-encoding; bh=sHi7fQIlhSpnYQKERpgUzGsPLFCPvF/LSq6qevCnhiI=; b=BcnW6PdN1TPd/otiKQGLwwI2Wao3swvrrW030b6OKr+tCfBmw/TEegsadYsW0+QCJp 5zPQLPpTQEQxENF78NrPKB0aDy7RLSa2uEcnOdhFRxQxZp7WGBgBLIDYAopwgtqQzttJ 1Z5r70sg0/zTePVWMyPKu4WohvahalZMB5O54ROYW3OjKhAP6FeVQA5W8RTGrYLVONpK eO0VDaB2a5w6yeRpj07FJka/dzcrXN3K1qSsBZSMpSm8gVYcFWfMg2lLDUKsOJByz9JL xfnMFkDunGWnXvbKPhQZE9jTlV3//xJbkEaWsnmtnQMiumK8fq8H/QadZ3iDH3aV67Ff G81w== X-Gm-Message-State: AOAM532HFxjkkZmLX/9AFJ2AhY7yOHT+hvQ7SJIJgOUQWPronKidG8xy CNGbI+2gNKxAoK72qpegz1hWGRjN/tqnog== X-Google-Smtp-Source: ABdhPJw/1BmSipZvNDtwaPdgkzB2DfRFTGJb78Y9LFNEO0vW4I3HEV0BPKCm4sxX9j2YHY9iCYZr0w== X-Received: by 2002:a05:6402:100e:: with SMTP id c14mr32741447edu.243.1605857046134; Thu, 19 Nov 2020 23:24:06 -0800 (PST) Received: from sblaptop.fritz.box (ipbcc1aa4b.dynamic.kabel-deutschland.de. [188.193.170.75]) by smtp.gmail.com with ESMTPSA id lz27sm779419ejb.39.2020.11.19.23.24.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Nov 2020 23:24:05 -0800 (PST) From: Andreas Rheinhardt To: ffmpeg-devel@ffmpeg.org Date: Fri, 20 Nov 2020 08:18:34 +0100 Message-Id: <20201120072116.818090-2-andreas.rheinhardt@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201120072116.818090-1-andreas.rheinhardt@gmail.com> References: <20201120072116.818090-1-andreas.rheinhardt@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v2 001/162] avcodec/bitstream: Add second function to create VLCs X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Andreas Rheinhardt Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" When using ff_init_vlc_sparse() to create a VLC, three input tables are used: A table for lengths, one for codes and one for symbols; the latter one can be omitted, then a default one will be used. These input tables will be traversed twice, once to get the long codes (which will be put into subtables) and once for the small codes. The long codes are then sorted so that entries that should be in the same subtable are contiguous. This commit adds an alternative to ff_init_vlc_sparse(): ff_init_vlc_from_lengths(). It is based upon the observation that if lengths, codes and symbols tables are permuted (in the same way) so that the codes are ordered from left to right in the corresponding tree and if said tree is complete (i.e. every non-leaf node has two children), the codes can be easily computed from the lengths and are therefore redundant. This means that if one initializes such a VLC with explicitly coded lengths, codes and symbols, the codes can be avoided; and even if one has no explicitly coded symbols, it might still be beneficial to remove the codes even when one has to add a new symbol table, because codes are typically longer than symbols so that the latter often fit into a smaller type, saving space. Furthermore, given that the codes here are by definition ordered from left to right, it is unnecessary to sort them again; for the same reason, one does not have to traverse the input twice. This function proved to be faster than ff_init_vlc_sparse() whenever it has been benchmarked. This function is usable for static tables (they can simply be permuted once) as well as in scenarios where the tables are naturally ordered from left to right in the tree; the latter e.g. happens with Smacker, Theora and several other formats. In order to make it also usable for (static) tables with incomplete trees, negative lengths are used to indicate that there is an open end of a certain length. Finally, ff_init_vlc_from_lengths() has one downside compared to ff_init_vlc_sparse(): The latter uses tables that can be reused by encoders. Of course, one could calculate the needed table at runtime if one so wishes, but it is nevertheless an obstacle. Signed-off-by: Andreas Rheinhardt --- With logcontext as desired. libavcodec/bitstream.c | 138 ++++++++++++++++++++++++++++++----------- libavcodec/vlc.h | 48 ++++++++++++++ 2 files changed, 150 insertions(+), 36 deletions(-) diff --git a/libavcodec/bitstream.c b/libavcodec/bitstream.c index c7a87734e5..4ce49fd51c 100644 --- a/libavcodec/bitstream.c +++ b/libavcodec/bitstream.c @@ -132,6 +132,8 @@ static int alloc_table(VLC *vlc, int size, int use_static) return index; } +#define LOCALBUF_ELEMS 1500 // the maximum currently needed is 1296 by rv34 + typedef struct VLCcode { uint8_t bits; VLC_TYPE symbol; @@ -140,6 +142,31 @@ typedef struct VLCcode { uint32_t code; } VLCcode; +static int vlc_common_init(VLC *vlc_arg, int nb_bits, int nb_codes, + VLC **vlc, VLC *localvlc, VLCcode **buf, + int flags) +{ + *vlc = vlc_arg; + (*vlc)->bits = nb_bits; + if (flags & INIT_VLC_USE_NEW_STATIC) { + av_assert0(nb_codes <= LOCALBUF_ELEMS); + *localvlc = *vlc_arg; + *vlc = localvlc; + (*vlc)->table_size = 0; + } else { + (*vlc)->table = NULL; + (*vlc)->table_allocated = 0; + (*vlc)->table_size = 0; + } + if (nb_codes > LOCALBUF_ELEMS) { + *buf = av_malloc_array(nb_codes, sizeof(VLCcode)); + if (!*buf) + return AVERROR(ENOMEM); + } + + return 0; +} + static int compare_vlcspec(const void *a, const void *b) { const VLCcode *sa = a, *sb = b; @@ -248,6 +275,27 @@ static int build_table(VLC *vlc, int table_nb_bits, int nb_codes, return table_index; } +static int vlc_common_end(VLC *vlc, int nb_bits, int nb_codes, VLCcode *codes, + int flags, VLC *vlc_arg, VLCcode localbuf[LOCALBUF_ELEMS]) +{ + int ret = build_table(vlc, nb_bits, nb_codes, codes, flags); + + if (flags & INIT_VLC_USE_NEW_STATIC) { + if(vlc->table_size != vlc->table_allocated) + av_log(NULL, AV_LOG_ERROR, "needed %d had %d\n", vlc->table_size, vlc->table_allocated); + + av_assert0(ret >= 0); + *vlc_arg = *vlc; + } else { + if (codes != localbuf) + av_free(codes); + if (ret < 0) { + av_freep(&vlc->table); + return ret; + } + } + return 0; +} /* Build VLC decoding tables suitable for use with get_vlc(). @@ -278,30 +326,14 @@ int ff_init_vlc_sparse(VLC *vlc_arg, int nb_bits, int nb_codes, const void *symbols, int symbols_wrap, int symbols_size, int flags) { - VLCcode *buf; + VLCcode localbuf[LOCALBUF_ELEMS], *buf = localbuf; int i, j, ret; - VLCcode localbuf[1500]; // the maximum currently needed is 1296 by rv34 VLC localvlc, *vlc; - vlc = vlc_arg; - vlc->bits = nb_bits; - if (flags & INIT_VLC_USE_NEW_STATIC) { - av_assert0(nb_codes <= FF_ARRAY_ELEMS(localbuf)); - localvlc = *vlc_arg; - vlc = &localvlc; - vlc->table_size = 0; - } else { - vlc->table = NULL; - vlc->table_allocated = 0; - vlc->table_size = 0; - } - if (nb_codes > FF_ARRAY_ELEMS(localbuf)) { - buf = av_malloc_array(nb_codes, sizeof(VLCcode)); - if (!buf) - return AVERROR(ENOMEM); - } else - buf = localbuf; - + ret = vlc_common_init(vlc_arg, nb_bits, nb_codes, &vlc, &localvlc, + &buf, flags); + if (ret < 0) + return ret; av_assert0(symbols_size <= 2 || !symbols); j = 0; @@ -342,26 +374,60 @@ int ff_init_vlc_sparse(VLC *vlc_arg, int nb_bits, int nb_codes, COPY(len && len <= nb_bits); nb_codes = j; - ret = build_table(vlc, nb_bits, nb_codes, buf, flags); - - if (flags & INIT_VLC_USE_NEW_STATIC) { - if(vlc->table_size != vlc->table_allocated) - av_log(NULL, AV_LOG_ERROR, "needed %d had %d\n", vlc->table_size, vlc->table_allocated); + return vlc_common_end(vlc, nb_bits, nb_codes, buf, + flags, vlc_arg, localbuf); +} - av_assert0(ret >= 0); - *vlc_arg = *vlc; - } else { - if (buf != localbuf) - av_free(buf); - if (ret < 0) { - av_freep(&vlc->table); - return ret; +int ff_init_vlc_from_lengths(VLC *vlc_arg, int nb_bits, int nb_codes, + const int8_t *lens, int lens_wrap, + const void *symbols, int symbols_wrap, int symbols_size, + int offset, int flags, void *logctx) +{ + VLCcode localbuf[LOCALBUF_ELEMS], *buf = localbuf; + VLC localvlc, *vlc; + uint64_t code; + int ret, j, len_max = FFMIN(32, 3 * nb_bits); + + ret = vlc_common_init(vlc_arg, nb_bits, nb_codes, &vlc, &localvlc, + &buf, flags); + if (ret < 0) + return ret; + + j = code = 0; + for (int i = 0; i < nb_codes; i++, lens += lens_wrap) { + int len = *lens; + if (len > 0) { + unsigned sym; + + buf[j].bits = len; + if (symbols) + GET_DATA(sym, symbols, i, symbols_wrap, symbols_size) + else + sym = i; + buf[j].symbol = sym + offset; + buf[j++].code = code; + } else if (len < 0) { + len = -len; + } else + continue; + if (len > len_max || code & ((1U << (32 - len)) - 1)) { + av_log(logctx, AV_LOG_ERROR, "Invalid VLC (length %u)\n", len); + goto fail; + } + code += 1U << (32 - len); + if (code > UINT32_MAX + 1ULL) { + av_log(logctx, AV_LOG_ERROR, "Overdetermined VLC tree\n"); + goto fail; } } - return 0; + return vlc_common_end(vlc, nb_bits, j, buf, + flags, vlc_arg, localbuf); +fail: + if (buf != localbuf) + av_free(buf); + return AVERROR_INVALIDDATA; } - void ff_free_vlc(VLC *vlc) { av_freep(&vlc->table); diff --git a/libavcodec/vlc.h b/libavcodec/vlc.h index 22d3e33485..50a1834b84 100644 --- a/libavcodec/vlc.h +++ b/libavcodec/vlc.h @@ -49,6 +49,41 @@ int ff_init_vlc_sparse(VLC *vlc, int nb_bits, int nb_codes, const void *codes, int codes_wrap, int codes_size, const void *symbols, int symbols_wrap, int symbols_size, int flags); + +/** + * Build VLC decoding tables suitable for use with get_vlc2() + * + * This function takes lengths and symbols and calculates the codes from them. + * For this the input lengths and symbols have to be sorted according to "left + * nodes in the corresponding tree first". + * + * @param[in,out] vlc The VLC to be initialized; table and table_allocated + * must have been set when initializing a static VLC, + * otherwise this will be treated as uninitialized. + * @param[in] nb_bits The number of bits to use for the VLC table; + * higher values take up more memory and cache, but + * allow to read codes with fewer reads. + * @param[in] nb_codes The number of provided length and (if supplied) symbol + * entries. + * @param[in] lens The lengths of the codes. Entries > 0 correspond to + * valid codes; entries == 0 will be skipped and entries + * with len < 0 indicate that the tree is incomplete and + * has an open end of length -len at this position. + * @param[in] lens_wrap Stride (in bytes) of the lengths. + * @param[in] symbols The symbols, i.e. what is returned from get_vlc2() + * when the corresponding code is encountered. + * May be NULL, then 0, 1, 2, 3, 4,... will be used. + * @param[in] symbols_wrap Stride (in bytes) of the symbols. + * @param[in] symbols_size Size of the symbols. 1 and 2 are supported. + * @param[in] offset An offset to apply to all the valid symbols. + * @param[in] flags A combination of the INIT_VLC_* flags; notice that + * INIT_VLC_INPUT_LE is pointless and ignored. + */ +int ff_init_vlc_from_lengths(VLC *vlc, int nb_bits, int nb_codes, + const int8_t *lens, int lens_wrap, + const void *symbols, int symbols_wrap, int symbols_size, + int offset, int flags, void *logctx); + void ff_free_vlc(VLC *vlc); /* If INIT_VLC_INPUT_LE is set, the LSB bit of the codes used to @@ -87,4 +122,17 @@ void ff_free_vlc(VLC *vlc); #define INIT_LE_VLC_STATIC(vlc, bits, a, b, c, d, e, f, g, static_size) \ INIT_LE_VLC_SPARSE_STATIC(vlc, bits, a, b, c, d, e, f, g, NULL, 0, 0, static_size) +#define INIT_VLC_STATIC_FROM_LENGTHS(vlc, bits, nb_codes, lens, len_wrap, \ + symbols, symbols_wrap, symbols_size, \ + offset, flags, static_size) \ + do { \ + static VLC_TYPE table[static_size][2]; \ + (vlc)->table = table; \ + (vlc)->table_allocated = static_size; \ + ff_init_vlc_from_lengths(vlc, bits, nb_codes, lens, len_wrap, \ + symbols, symbols_wrap, symbols_size, \ + offset, flags | INIT_VLC_USE_NEW_STATIC, \ + NULL); \ + } while (0) + #endif /* AVCODEC_VLC_H */