From patchwork Wed May 29 08:05:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anton Khirnov
X-Patchwork-Id: 49337
From: Anton Khirnov
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 29 May 2024 10:05:37 +0200
Message-ID: <20240529080547.16477-1-anton@khirnov.net>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <07a4ce06-7959-4316-b84f-b7f159aec098@gmail.com>
References: <07a4ce06-7959-4316-b84f-b7f159aec098@gmail.com>
Subject: [FFmpeg-devel] [PATCH v2 09-10] lavc/hevc_ps: reduce the size of ShortTermRPS.used
List-Id: FFmpeg development discussions and patches

It is currently an array of 32 uint8_t, each storing a single flag. A
single uint32_t is sufficient.

Reduces sizeof(HEVCSPS) by 1792 bytes.
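As an illustration of the packing described above (not code from this
patch), the following minimal C sketch stores flag i in bit i of a
uint32_t. The names RPSOld, RPSNew, pack_flags and flag_is_set are
hypothetical, and the structs are reduced to the two relevant fields. The
sketch shifts an unsigned value so that bit 31 stays well-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FLAGS 32 /* assumption: one flag per delta_poc[] entry */

/* Hypothetical "before" layout: one byte per flag. */
typedef struct RPSOld {
    int32_t delta_poc[MAX_FLAGS];
    uint8_t used[MAX_FLAGS];
} RPSOld;

/* Hypothetical "after" layout: one bit per flag. */
typedef struct RPSNew {
    int32_t  delta_poc[MAX_FLAGS];
    uint32_t used;
} RPSNew;

/* Pack n one-byte flags into a bitmask, flag i going to bit i. */
static uint32_t pack_flags(const uint8_t *flags, size_t n)
{
    uint32_t mask = 0;

    assert(n <= MAX_FLAGS);
    for (size_t i = 0; i < n; i++)
        mask |= (uint32_t)(flags[i] != 0) << i;

    return mask;
}

/* Return 1 if flag i is set in the mask, 0 otherwise. */
static int flag_is_set(uint32_t mask, unsigned i)
{
    return i < MAX_FLAGS && ((mask >> i) & 1);
}
```

With these layouts each struct shrinks by 28 bytes (32 flag bytes become
one 4-byte word). That matches the quoted 1792-byte saving if an SPS holds
64 such sets (64 * 28 = 1792); the count of 64 is an assumption here, it is
not stated in this message.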
---
 libavcodec/hevc_ps.c     | 33 +++++++++++++++++++--------------
 libavcodec/hevc_ps.h     |  2 +-
 libavcodec/hevc_refs.c   |  6 +++---
 libavcodec/vulkan_hevc.c | 13 +++++--------
 4 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/libavcodec/hevc_ps.c b/libavcodec/hevc_ps.c
index a6b0021bc3..76fe507e7b 100644
--- a/libavcodec/hevc_ps.c
+++ b/libavcodec/hevc_ps.c
@@ -107,6 +107,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
     int k = 0;
     int i;
 
+    rps->used = 0;
     rps->rps_predict = 0;
 
     if (rps != sps->st_rps && sps->nb_st_rps)
@@ -114,6 +115,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
 
     if (rps->rps_predict) {
         const ShortTermRPS *rps_ridx;
+        uint8_t used[32] = { 0 };
         int delta_rps;
 
         if (is_slice_header) {
@@ -139,13 +141,13 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
         }
         delta_rps = (1 - (rps->delta_rps_sign << 1)) * rps->abs_delta_rps;
         for (i = 0; i <= rps_ridx->num_delta_pocs; i++) {
-            int used = rps->used[k] = get_bits1(gb);
+            used[k] = get_bits1(gb);
 
             rps->use_delta_flag = 0;
-            if (!used)
+            if (!used[k])
                 rps->use_delta_flag = get_bits1(gb);
 
-            if (used || rps->use_delta_flag) {
+            if (used[k] || rps->use_delta_flag) {
                 if (i < rps_ridx->num_delta_pocs)
                     delta_poc = delta_rps + rps_ridx->delta_poc[i];
                 else
@@ -157,7 +159,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
             }
         }
 
-        if (k >= FF_ARRAY_ELEMS(rps->used)) {
+        if (k >= FF_ARRAY_ELEMS(used)) {
             av_log(avctx, AV_LOG_ERROR,
                    "Invalid num_delta_pocs: %d\n", k);
             return AVERROR_INVALIDDATA;
@@ -167,35 +169,38 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
         rps->num_negative_pics = k0;
         // sort in increasing order (smallest first)
         if (rps->num_delta_pocs != 0) {
-            int used, tmp;
+            int u, tmp;
             for (i = 1; i < rps->num_delta_pocs; i++) {
                 delta_poc = rps->delta_poc[i];
-                used = rps->used[i];
+                u = used[i];
                 for (k = i - 1; k >= 0; k--) {
                     tmp = rps->delta_poc[k];
                     if (delta_poc < tmp) {
                         rps->delta_poc[k + 1] = tmp;
-                        rps->used[k + 1] = rps->used[k];
+                        used[k + 1] = used[k];
                         rps->delta_poc[k] = delta_poc;
-                        rps->used[k] = used;
+                        used[k] = u;
                     }
                 }
             }
         }
         if ((rps->num_negative_pics >> 1) != 0) {
-            int used;
+            int u;
             k = rps->num_negative_pics - 1;
             // flip the negative values to largest first
             for (i = 0; i < rps->num_negative_pics >> 1; i++) {
                 delta_poc = rps->delta_poc[i];
-                used = rps->used[i];
+                u = used[i];
                 rps->delta_poc[i] = rps->delta_poc[k];
-                rps->used[i] = rps->used[k];
+                used[i] = used[k];
                 rps->delta_poc[k] = delta_poc;
-                rps->used[k] = used;
+                used[k] = u;
                 k--;
             }
         }
+
+        for (unsigned i = 0; i < FF_ARRAY_ELEMS(used); i++)
+            rps->used |= used[i] * (1 << i);
     } else {
         unsigned int nb_positive_pics;
 
@@ -222,7 +227,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
                 }
                 prev -= delta_poc;
                 rps->delta_poc[i] = prev;
-                rps->used[i] = get_bits1(gb);
+                rps->used |= get_bits1(gb) * (1 << i);
             }
             prev = 0;
             for (i = 0; i < nb_positive_pics; i++) {
@@ -235,7 +240,7 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
                 }
                 prev += delta_poc;
                 rps->delta_poc[rps->num_negative_pics + i] = prev;
-                rps->used[rps->num_negative_pics + i] = get_bits1(gb);
+                rps->used |= get_bits1(gb) * (1 << (rps->num_negative_pics + i));
             }
         }
     }
diff --git a/libavcodec/hevc_ps.h b/libavcodec/hevc_ps.h
index 1d3bdca4c6..ed6372c747 100644
--- a/libavcodec/hevc_ps.h
+++ b/libavcodec/hevc_ps.h
@@ -79,7 +79,7 @@ typedef struct ShortTermRPS {
     int num_delta_pocs;
     int rps_idx_num_delta_pocs;
     int32_t delta_poc[32];
-    uint8_t used[32];
+    uint32_t used;
 } ShortTermRPS;
 
 typedef struct HEVCWindow {
diff --git a/libavcodec/hevc_refs.c b/libavcodec/hevc_refs.c
index 8da9ec982a..d6dc2f9e0a 100644
--- a/libavcodec/hevc_refs.c
+++ b/libavcodec/hevc_refs.c
@@ -497,7 +497,7 @@ int ff_hevc_frame_rps(HEVCContext *s)
         int poc = s->poc + short_rps->delta_poc[i];
         int list;
 
-        if (!short_rps->used[i])
+        if (!(short_rps->used & (1 << i)))
             list = ST_FOLL;
         else if (i < short_rps->num_negative_pics)
             list = ST_CURR_BEF;
@@ -536,9 +536,9 @@ int ff_hevc_frame_nb_refs(const HEVCContext *s)
 
     if (rps) {
         for (i = 0; i < rps->num_negative_pics; i++)
-            ret += !!rps->used[i];
+            ret += !!(rps->used & (1 << i));
         for (; i < rps->num_delta_pocs; i++)
-            ret += !!rps->used[i];
+            ret += !!(rps->used & (1 << i));
     }
 
     if (long_rps) {
diff --git a/libavcodec/vulkan_hevc.c b/libavcodec/vulkan_hevc.c
index 21cf49c0ec..a35f3d992d 100644
--- a/libavcodec/vulkan_hevc.c
+++ b/libavcodec/vulkan_hevc.c
@@ -373,19 +373,16 @@ static void set_sps(const HEVCSPS *sps, int sps_idx,
 
         /* NOTE: This is the predicted, and *reordered* version.
          * Probably incorrect, but the spec doesn't say which version to use. */
-        for (int j = 0; j < sps->st_rps[i].num_delta_pocs; j++)
-            str[i].used_by_curr_pic_flag |= sps->st_rps[i].used[j] << j;
+        str[i].used_by_curr_pic_flag = st_rps->used;
+        str[i].used_by_curr_pic_s0_flag = av_mod_uintp2(st_rps->used, str[i].num_negative_pics);
+        str[i].used_by_curr_pic_s1_flag = st_rps->used >> str[i].num_negative_pics;
 
-        for (int j = 0; j < str[i].num_negative_pics; j++) {
+        for (int j = 0; j < str[i].num_negative_pics; j++)
             str[i].delta_poc_s0_minus1[j] = st_rps->delta_poc[j] - (j ? st_rps->delta_poc[j - 1] : 0) - 1;
-            str[i].used_by_curr_pic_s0_flag |= sps->st_rps[i].used[j] << j;
-        }
 
-        for (int j = 0; j < str[i].num_positive_pics; j++) {
+        for (int j = 0; j < str[i].num_positive_pics; j++)
             str[i].delta_poc_s1_minus1[j] = st_rps->delta_poc[st_rps->num_negative_pics + j] -
                                             (j ? st_rps->delta_poc[st_rps->num_negative_pics + j - 1] : 0) - 1;
-            str[i].used_by_curr_pic_s1_flag |= sps->st_rps[i].used[str[i].num_negative_pics + j] << j;
-        }
     }
 
     *ltr = (StdVideoH265LongTermRefPicsSps) {