From patchwork Wed Apr 10 13:31:17 2024
X-Patchwork-Submitter: Anton Khirnov
X-Patchwork-Id: 47996
From: Anton Khirnov
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 10 Apr 2024 15:31:17 +0200
Message-ID: <20240410133118.28144-9-anton@khirnov.net>
In-Reply-To: <20240410133118.28144-1-anton@khirnov.net>
References: <20240410133118.28144-1-anton@khirnov.net>
Subject: [FFmpeg-devel] [PATCH 09/10] lavc/hevc_ps: reduce the size of ShortTermRPS.used

It is currently an array of 32 uint8_t, each storing a single flag. A
single uint32_t is sufficient.

Reduces sizeof(HEVCSPS) by 1792 bytes.
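For readers skimming the series, the representation change described above can be sketched in isolation. This is a minimal standalone illustration, not code from the patch: the helper names pack_flags() and flag_is_set() are hypothetical, and the sketch shifts an unsigned value so that bit 31 can be set without signed overflow. Each set shrinks from 32 bytes to 4, which matches the 1792-byte figure if HEVCSPS holds 64 such sets (an assumption, the patch does not state the count).

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): pack up to 32 one-byte flags
 * into a single bitmask, with flag i stored in bit i. */
static uint32_t pack_flags(const uint8_t *flags, unsigned nb_flags)
{
    uint32_t mask = 0;

    for (unsigned i = 0; i < nb_flags && i < 32; i++)
        mask |= (uint32_t)(flags[i] != 0) << i;

    return mask;
}

/* Hypothetical helper: test flag i, the bitmask counterpart of the old
 * array lookup used[i]. Out-of-range indices read as 0. */
static int flag_is_set(uint32_t mask, unsigned i)
{
    return i < 32 && ((mask >> i) & 1U);
}
```

With flags 0, 3 and 31 set in the byte array, pack_flags() returns 0x80000009, and flag_is_set() answers the same questions the array lookup did.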
---
 libavcodec/hevc_ps.c     | 33 +++++++++++++++++++--------------
 libavcodec/hevc_ps.h     |  2 +-
 libavcodec/hevc_refs.c   |  6 +++---
 libavcodec/vulkan_hevc.c |  8 ++++----
 4 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/libavcodec/hevc_ps.c b/libavcodec/hevc_ps.c
index a6b0021bc3..76fe507e7b 100644
--- a/libavcodec/hevc_ps.c
+++ b/libavcodec/hevc_ps.c
@@ -107,6 +107,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
     int k = 0;
     int i;
 
+    rps->used = 0;
     rps->rps_predict = 0;
 
     if (rps != sps->st_rps && sps->nb_st_rps)
@@ -114,6 +115,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
 
     if (rps->rps_predict) {
         const ShortTermRPS *rps_ridx;
+        uint8_t used[32] = { 0 };
         int delta_rps;
 
         if (is_slice_header) {
@@ -139,13 +141,13 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
         }
         delta_rps = (1 - (rps->delta_rps_sign << 1)) * rps->abs_delta_rps;
         for (i = 0; i <= rps_ridx->num_delta_pocs; i++) {
-            int used = rps->used[k] = get_bits1(gb);
+            used[k] = get_bits1(gb);
 
             rps->use_delta_flag = 0;
-            if (!used)
+            if (!used[k])
                 rps->use_delta_flag = get_bits1(gb);
 
-            if (used || rps->use_delta_flag) {
+            if (used[k] || rps->use_delta_flag) {
                 if (i < rps_ridx->num_delta_pocs)
                     delta_poc = delta_rps + rps_ridx->delta_poc[i];
                 else
@@ -157,7 +159,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
             }
         }
 
-        if (k >= FF_ARRAY_ELEMS(rps->used)) {
+        if (k >= FF_ARRAY_ELEMS(used)) {
             av_log(avctx, AV_LOG_ERROR,
                    "Invalid num_delta_pocs: %d\n", k);
             return AVERROR_INVALIDDATA;
@@ -167,35 +169,38 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
         rps->num_negative_pics = k0;
         // sort in increasing order (smallest first)
         if (rps->num_delta_pocs != 0) {
-            int used, tmp;
+            int u, tmp;
             for (i = 1; i < rps->num_delta_pocs; i++) {
                 delta_poc = rps->delta_poc[i];
-                used = rps->used[i];
+                u = used[i];
                 for (k = i - 1; k >= 0; k--) {
                     tmp = rps->delta_poc[k];
                     if (delta_poc < tmp) {
                         rps->delta_poc[k + 1] = tmp;
-                        rps->used[k + 1] = rps->used[k];
+                        used[k + 1] = used[k];
                         rps->delta_poc[k] = delta_poc;
-                        rps->used[k] = used;
+                        used[k] = u;
                     }
                 }
             }
         }
         if ((rps->num_negative_pics >> 1) != 0) {
-            int used;
+            int u;
             k = rps->num_negative_pics - 1;
             // flip the negative values to largest first
             for (i = 0; i < rps->num_negative_pics >> 1; i++) {
                 delta_poc = rps->delta_poc[i];
-                used = rps->used[i];
+                u = used[i];
                 rps->delta_poc[i] = rps->delta_poc[k];
-                rps->used[i] = rps->used[k];
+                used[i] = used[k];
                 rps->delta_poc[k] = delta_poc;
-                rps->used[k] = used;
+                used[k] = u;
                 k--;
             }
         }
+
+        for (unsigned i = 0; i < FF_ARRAY_ELEMS(used); i++)
+            rps->used |= used[i] * (1 << i);
     } else {
         unsigned int nb_positive_pics;
 
@@ -222,7 +227,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
                 }
                 prev -= delta_poc;
                 rps->delta_poc[i] = prev;
-                rps->used[i] = get_bits1(gb);
+                rps->used |= get_bits1(gb) * (1 << i);
             }
             prev = 0;
             for (i = 0; i < nb_positive_pics; i++) {
@@ -235,7 +240,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
                 }
                 prev += delta_poc;
                 rps->delta_poc[rps->num_negative_pics + i] = prev;
-                rps->used[rps->num_negative_pics + i] = get_bits1(gb);
+                rps->used |= get_bits1(gb) * (1 << (rps->num_negative_pics + i));
             }
         }
     }
diff --git a/libavcodec/hevc_ps.h b/libavcodec/hevc_ps.h
index 6ef29a8ea7..92b85115f7 100644
--- a/libavcodec/hevc_ps.h
+++ b/libavcodec/hevc_ps.h
@@ -79,7 +79,7 @@ typedef struct ShortTermRPS {
     int num_delta_pocs;
     int rps_idx_num_delta_pocs;
     int32_t delta_poc[32];
-    uint8_t used[32];
+    uint32_t used;
 } ShortTermRPS;
 
 typedef struct HEVCWindow {
diff --git a/libavcodec/hevc_refs.c b/libavcodec/hevc_refs.c
index aed649933d..19f3fa81da 100644
--- a/libavcodec/hevc_refs.c
+++ b/libavcodec/hevc_refs.c
@@ -501,7 +501,7 @@ int ff_hevc_frame_rps(HEVCContext *s)
         int poc = s->poc + short_rps->delta_poc[i];
         int list;
 
-        if (!short_rps->used[i])
+        if (!(short_rps->used & (1 << i)))
             list = ST_FOLL;
         else if (i < short_rps->num_negative_pics)
             list = ST_CURR_BEF;
@@ -540,9 +540,9 @@ int ff_hevc_frame_nb_refs(const HEVCContext *s)
 
     if (rps) {
         for (i = 0; i < rps->num_negative_pics; i++)
-            ret += !!rps->used[i];
+            ret += !!(rps->used & (1 << i));
         for (; i < rps->num_delta_pocs; i++)
-            ret += !!rps->used[i];
+            ret += !!(rps->used & (1 << i));
     }
 
     if (long_rps) {
diff --git a/libavcodec/vulkan_hevc.c b/libavcodec/vulkan_hevc.c
index 5d7c6b1b64..c2b65fc201 100644
--- a/libavcodec/vulkan_hevc.c
+++ b/libavcodec/vulkan_hevc.c
@@ -374,17 +374,17 @@ static void set_sps(const HEVCSPS *sps, int sps_idx,
         /* NOTE: This is the predicted, and *reordered* version.
          * Probably incorrect, but the spec doesn't say which version to use. */
         for (int j = 0; j < sps->st_rps[i].num_delta_pocs; j++)
-            str[i].used_by_curr_pic_flag |= sps->st_rps[i].used[j] << j;
+            str[i].used_by_curr_pic_flag |= st_rps->used;
 
         for (int j = 0; j < str[i].num_negative_pics; j++) {
-            str[i].delta_poc_s0_minus1[j] = st_rps->delta_poc[j] - (j ? st_rps->delta_poc[j - 1] : 0) - 1;
-            str[i].used_by_curr_pic_s0_flag |= sps->st_rps[i].used[j] << j;
+            str[i].delta_poc_s0_minus1[j]    = st_rps->delta_poc[j] - (j ? st_rps->delta_poc[j - 1] : 0) - 1;
+            str[i].used_by_curr_pic_s0_flag |= st_rps->used & ((1 << str[i].num_negative_pics) - 1);
         }
 
         for (int j = 0; j < str[i].num_positive_pics; j++) {
             str[i].delta_poc_s0_minus1[j] = st_rps->delta_poc[st_rps->num_negative_pics + j] -
                                             (j ? st_rps->delta_poc[st_rps->num_negative_pics + j - 1] : 0) - 1;
-            str[i].used_by_curr_pic_s0_flag |= sps->st_rps[i].used[str[i].num_negative_pics + j] << j;
+            str[i].used_by_curr_pic_s0_flag |= st_rps->used >> str[i].num_negative_pics;
         }
     }
 
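
The vulkan_hevc.c hunk derives per-direction flag masks from the packed field with a mask for the negative pictures and a right shift for the positive ones. The following is a minimal standalone sketch of that split, not code from the patch. It assumes bits [0, nb_negative) hold the negative-picture flags and the bits above them hold the positive-picture flags. The helper names are hypothetical. Unlike the expressions in the hunk, it shifts an unsigned constant and guards counts of 32, where a full-width shift would be undefined in C.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): keep only the low nb_negative
 * bits of the packed mask, i.e. the flags of the negative pictures. */
static uint32_t negative_flags(uint32_t used, unsigned nb_negative)
{
    if (nb_negative >= 32)
        return used; /* shifting by 32 would be undefined behaviour */

    return used & ((UINT32_C(1) << nb_negative) - 1);
}

/* Hypothetical helper: move the positive-picture flags down to bit 0
 * and drop anything above nb_positive bits. */
static uint32_t positive_flags(uint32_t used, unsigned nb_negative,
                               unsigned nb_positive)
{
    if (nb_negative >= 32)
        return 0; /* no room left for positive flags */

    used >>= nb_negative;
    if (nb_positive >= 32)
        return used;

    return used & ((UINT32_C(1) << nb_positive) - 1);
}
```

For a packed value of 0x1A with two negative and three positive pictures, negative_flags() yields 0x2 and positive_flags() yields 0x6.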