From patchwork Fri Nov 20 07:19:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andreas Rheinhardt X-Patchwork-Id: 23783 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id B8B6E44AD2C for ; Fri, 20 Nov 2020 09:30:09 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 04F4468BA80; Fri, 20 Nov 2020 09:24:54 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-ed1-f68.google.com (mail-ed1-f68.google.com [209.85.208.68]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id ED0B468BA3B for ; Fri, 20 Nov 2020 09:24:43 +0200 (EET) Received: by mail-ed1-f68.google.com with SMTP id cf17so4914430edb.2 for ; Thu, 19 Nov 2020 23:24:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references:reply-to :mime-version:content-transfer-encoding; bh=Os4bjlAqnM/5vs2UMVsa7g9kpzZ0SeOqgaQh+xNNzpM=; b=uP2sUSx0Ec2PYI2y5IeUNEDcKcnxoKu9VsQ2+0N7jmb+CYx6MAXcBaQpvgPq9d/2UG Qg+w/r7w1kRk8ILngMQPyCs+tyS5SJaWFLe0IzsHU9MVwG1qNOg1N4k+BU5+vPh8I+nP Rm3tSCMFgiw2Ye2p5jP79eP/j3F0DAb4Rr6TGp6OIpt7BfJZv9LJsyRVTadCmeSpQufE PMhzdDk4s9/kVexZwkfMkHqe0hvV3BU5HvDyecRayk4K3xXbJkgbnsLfn59ORX/21dSk YVt/o98YB6JXkYVTWy40b4DS+tqji/hxfnY32pbhykrkI+Cp9yp6xqR7Q5GiJjpI/0ci 7vpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:reply-to:mime-version:content-transfer-encoding; bh=Os4bjlAqnM/5vs2UMVsa7g9kpzZ0SeOqgaQh+xNNzpM=; b=DUmFG6oYRt7MN5nOWqBQCw63Cs6lRkON4sAn4TRCh1mmKjWPPzWtGFor7ccBiCMtXB B753ZLouvc8C/7g/2cFFPwHY/XczqwhRInrif7C3GLBz6PcmGCs60pVxGdd4uSP7xks4 Jv5CYrVPlq0ORG0WBBTPCabaG2hV1xhaUklvRo2AjoW+2Pq4zESZiZ4byH2h5fHv7W61 Ne5XheluNeY4o0WsZtleyvA9j0LCVSEFSgHdYRfLmtNBt8+rgA9a5MUzEA5KawZpfFfn 9zHKOR1RKMxi9XWK2U9Vco6x0JqdamR5CLjmhEbnEAHBhju5Y7QJznom1HGXAYSJJ0oJ ysNQ== X-Gm-Message-State: AOAM532Nx5WftDj8yh6xfVaFyCJB6nO82fCHgjgIrdFrM4PAyGcmJ4ih oqE+JvLKTtJhpEHsr4C42i4TPHNYVrDeWg== X-Google-Smtp-Source: ABdhPJyhwxpL10pfC50i8fqUMtkwHn4SV2Jfsy28PMYgHZnuRKZJynYh6inn+YPqEZZEZY9hIWDg7g== X-Received: by 2002:a05:6402:1358:: with SMTP id y24mr23333962edw.291.1605857082852; Thu, 19 Nov 2020 23:24:42 -0800 (PST) Received: from sblaptop.fritz.box (ipbcc1aa4b.dynamic.kabel-deutschland.de. [188.193.170.75]) by smtp.gmail.com with ESMTPSA id lz27sm779419ejb.39.2020.11.19.23.24.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Nov 2020 23:24:42 -0800 (PST) From: Andreas Rheinhardt To: ffmpeg-devel@ffmpeg.org Date: Fri, 20 Nov 2020 08:19:05 +0100 Message-Id: <20201120072116.818090-33-andreas.rheinhardt@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201120072116.818090-1-andreas.rheinhardt@gmail.com> References: <20201120072116.818090-1-andreas.rheinhardt@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v2 032/162] avcodec/rv10: Reduce number of exceptions when reading VLC value X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Andreas Rheinhardt Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" RealVideo 1.0 uses an insane way to encode DC coefficients: There are several symbols that (for no good reason whatsoever) have multiple encodings, leading to longer codes than necessary. More specifically, the tree for the 256 luma symbols contains 255 codes belonging to 255 different symbols on the left; going further right, the tree consists of two blocks of 128 codes each of length 14 encoding consecutive numbers (including two encodings for the symbol missing among the 255 codes on the left); this is followed by two blocks of codes of length 16 each containing 256 elements with consecutive symbols (i.e. each of the blocks allows to encode all symbols). The rest of the tree consists of 2^11 codes that all encode the same symbol. The tree for the 256 chroma symbols is similar, but is missing the blocks of length 256 and there are only 2^9 consecutive codes that encode the same symbol; furthermore, the chroma tree is incomplete: The right-most node has no right child. All of this caused problems when parsing these codes; the reason is that the code for this predates commit b613bacca9c256f1483c46847f713e47a0e9a5f6 which added support for explicit symbol tables and thereby removed the requirement that different codes have different symbols. In order to address this, the trees used for parsing were incomplete: They contained the 255 codes on the left and one code for the remaining symbol. Whenever a code not in these trees was encountered, it was dealt with in special cases (one for each of the blocks mentioned above). This commit reduces the number of special cases: Using a symbols table allows to treat the blocks of consecutive symbols like ordinary codes; only the blocks encoding a single symbol are still treated specially (in order not to waste memory on tables for them). In order to not increment the size of the tables used to initialize the VLCs both the symbols as well as the lengths are now run-length encoded. Signed-off-by: Andreas Rheinhardt --- libavcodec/rv10.c | 137 ++++++++++++++++++---------------------------- 1 file changed, 53 insertions(+), 84 deletions(-) diff --git a/libavcodec/rv10.c b/libavcodec/rv10.c index d1a02d9578..eaabb99e69 100644 --- a/libavcodec/rv10.c +++ b/libavcodec/rv10.c @@ -46,6 +46,7 @@ #define RV_GET_MINOR_VER(x) (((x) >> 20) & 0xFF) #define RV_GET_MICRO_VER(x) (((x) >> 12) & 0xFF) +#define MAX_VLC_ENTRIES 1023 // Note: Does not include the skip entries. #define DC_VLC_BITS 14 // FIXME find a better solution typedef struct RVDecContext { @@ -54,63 +55,25 @@ typedef struct RVDecContext { int orig_width, orig_height; } RVDecContext; -/* The elements with negative length in the bits table correspond to - * open ends in the respective Huffman tree. */ -static const uint8_t rv_sym[] = { - 128, 127, 129, 125, 126, 130, 131, 121, 122, 123, 124, 132, 133, 134, 135, - 113, 114, 115, 116, 117, 118, 119, 120, 136, 137, 138, 139, 140, 141, 142, - 143, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, - 111, 112, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, - 157, 158, 159, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, - 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, - 92, 93, 94, 95, 96, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, - 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, - 185, 186, 187, 188, 189, 190, 191, 1, 2, 3, 4, 5, 6, 7, 8, - 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, - 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, - 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, - 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 192, 193, 194, 195, - 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, - 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, - 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, - 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, - 0, 0, 0, 0, 0, 0, 0, 0, +/* (run, length) encoded value for the symbols table. The actual symbols + * are run..run + length (mod 256). + * The last two entries in the following table apply to luma only. + * The skip values are not included in this list. */ +static const uint8_t rv_sym_run_len[][2] = { + { 128, 0 }, { 127, 0 }, { 129, 0 }, { 125, 1 }, { 130, 1 }, + { 121, 3 }, { 132, 3 }, { 113, 7 }, { 136, 7 }, { 97, 15 }, + { 144, 15 }, { 65, 31 }, { 160, 31 }, { 1, 63 }, { 192, 63 }, + { 129, 127 }, { 0, 127 }, { 1, 255 }, { 0, 255 }, }; -static const uint8_t rv_lum_len[] = { - 2, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, - 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, - 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, - 8, 8, 8, 8, 8, 8, 8, 8, 8, 10, 10, 10, 10, 10, 10, 10, 10, 10, - 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, - 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, - 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, - 10, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, -8, -9,-10,-11,-12,-13,-14, 14, +/* entry[i] of the following tables gives + * the number of VLC codes of length i + 2. */ +static const uint16_t rv_lum_len_count[15] = { + 1, 0, 2, 4, 8, 16, 32, 0, 64, 0, 128, 0, 256, 0, 512, }; -static const uint8_t rv_chrom_len[] = { - 2, 3, 3, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 6, 6, 8, 8, 8, - 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 10, 10, 10, 10, 10, - 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, - 10, 10, 10, 10, 10, 10, 10, 10, 10, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, - 12, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, - 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, - 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, - 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, - 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, - 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, - 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, - 14, 14, 14,-10,-11,-12,-13,-14,-15,-16, 16, +static const uint16_t rv_chrom_len_count[15] = { + 1, 2, 4, 0, 8, 0, 16, 0, 32, 0, 64, 0, 128, 0, 256, }; static VLC rv_dc_lum, rv_dc_chrom; @@ -122,37 +85,18 @@ int ff_rv_decode_dc(MpegEncContext *s, int n) if (n < 4) { code = get_vlc2(&s->gb, rv_dc_lum.table, DC_VLC_BITS, 2); if (code < 0) { - /* XXX: I don't understand why they use LONGER codes than - * necessary. The following code would be completely useless - * if they had thought about it !!! */ - code = get_bits(&s->gb, 7); - if (code == 0x7c) { - code = (int8_t) (get_bits(&s->gb, 7) + 1); - } else if (code == 0x7d) { - code = -128 + get_bits(&s->gb, 7); - } else if (code == 0x7e) { - if (get_bits1(&s->gb) == 0) - code = (int8_t) (get_bits(&s->gb, 8) + 1); - else - code = (int8_t) (get_bits(&s->gb, 8)); - } else if (code == 0x7f) { - skip_bits(&s->gb, 11); - code = 1; - } + /* Skip entry - no error. */ + skip_bits(&s->gb, 18); + code = 1; } else { code -= 128; } } else { code = get_vlc2(&s->gb, rv_dc_chrom.table, DC_VLC_BITS, 2); - /* same remark */ if (code < 0) { - code = get_bits(&s->gb, 9); - if (code == 0x1fc) { - code = (int8_t) (get_bits(&s->gb, 7) + 1); - } else if (code == 0x1fd) { - code = -128 + get_bits(&s->gb, 7); - } else if (code == 0x1fe) { - skip_bits(&s->gb, 9); + if (show_bits(&s->gb, 9) == 0x1FE) { + /* Skip entry - no error. */ + skip_bits(&s->gb, 18); code = 1; } else { av_log(s->avctx, AV_LOG_ERROR, "chroma dc error\n"); @@ -383,14 +327,39 @@ static int rv20_decode_picture_header(RVDecContext *rv) return s->mb_width * s->mb_height - mb_pos; } +static av_cold void rv10_build_vlc(VLC *vlc, const uint16_t len_count[15], + const uint8_t sym_rl[][2], int sym_rl_elems) +{ + uint16_t syms[MAX_VLC_ENTRIES]; + uint8_t lens[MAX_VLC_ENTRIES]; + unsigned nb_syms = 0, nb_lens = 0; + + for (unsigned i = 0; i < sym_rl_elems; i++) { + unsigned cur_sym = sym_rl[i][0]; + for (unsigned tmp = nb_syms + sym_rl[i][1]; nb_syms <= tmp; nb_syms++) + syms[nb_syms] = 0xFF & cur_sym++; + } + + for (unsigned i = 0; i < 15; i++) + for (unsigned tmp = nb_lens + len_count[i]; nb_lens < tmp; nb_lens++) + lens[nb_lens] = i + 2; + av_assert1(nb_lens == nb_syms); + ff_init_vlc_from_lengths(vlc, DC_VLC_BITS, nb_lens, lens, 1, + syms, 2, 2, 0, INIT_VLC_STATIC_OVERLONG, NULL); +} + static av_cold void rv10_init_static(void) { - INIT_VLC_STATIC_FROM_LENGTHS(&rv_dc_lum, DC_VLC_BITS, 263, - rv_lum_len, 1, - rv_sym, 1, 1, 0, 0, 16384); - INIT_VLC_STATIC_FROM_LENGTHS(&rv_dc_chrom, DC_VLC_BITS, 263, - rv_chrom_len, 1, - rv_sym, 1, 1, 0, 0, 16388); + static VLC_TYPE table[16896 + 16640][2]; + + rv_dc_lum.table = table; + rv_dc_lum.table_allocated = 16896; + rv10_build_vlc(&rv_dc_lum, rv_lum_len_count, + rv_sym_run_len, FF_ARRAY_ELEMS(rv_sym_run_len)); + rv_dc_chrom.table = &table[16896]; + rv_dc_chrom.table_allocated = 16640; + rv10_build_vlc(&rv_dc_chrom, rv_chrom_len_count, + rv_sym_run_len, FF_ARRAY_ELEMS(rv_sym_run_len) - 2); ff_h263_decode_init_vlc(); }