From patchwork Sat Sep 3 20:35:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 37640
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Cc: Andreas Rheinhardt
Date: Sat, 3 Sep 2022 22:35:59 +0200
X-Mailer: git-send-email 2.34.1
X-Microsoft-Original-Message-ID: <20220903203559.1961353-4-andreas.rheinhardt@outlook.com>
Subject: [FFmpeg-devel] [PATCH 5/5] avcodec/cfhddata: Reduce stack usage

Creating CFHD RL VLC tables works by first extending the codes by the
sign, then creating a VLC, and finally deriving the RL VLC from this
VLC (which is then discarded). Extending the codes uses stack arrays.

The tables used to initialize the VLC are already sorted from left to
right in the tree. This means that the corresponding VLC entries are
generally also ascending, but not always: entries from subtables always
follow the corresponding main table, although it is possible for the
right-most node to fit into the main table.

This suggests that one can try to use the final destination buffer as
a scratch buffer for the tables with sign included.
Unfortunately, this works for neither table if one uses the right-most
part of the RL VLC buffer as the scratch buffer. Using the left-most
part of the RL VLC buffer as the scratch buffer might work if one
traverses the VLC entries from end to start, but it works only for the
smaller RL VLC (table 9), not for table 18.

Therefore this patch uses the RL VLC buffer for table 9 as the scratch
buffer for creating the bigger table 18. Afterwards the left part of
the buffer for table 9 is used as the scratch buffer to create table 9.

This fixes the cfhd part of ticket #9399 (if it is not already fixed).
Note that I do not consider the previous stack usage excessive.

Signed-off-by: Andreas Rheinhardt
---
I actually regard #9399 as a toolchain issue and not as a reason
to pessimize the code for all the other arches/toolchains where it works.

 libavcodec/cfhddata.c | 47 +++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/libavcodec/cfhddata.c b/libavcodec/cfhddata.c
index efe932dc3b..fd5cc8174e 100644
--- a/libavcodec/cfhddata.c
+++ b/libavcodec/cfhddata.c
@@ -127,11 +127,8 @@ static const CFHD_RL_ELEM table_18_vlc[NB_VLC_TABLE_18] = {
 
 static av_cold int cfhd_init_vlc(CFHD_RL_VLC_ELEM out[], unsigned out_size,
                                  const CFHD_RL_ELEM table_vlc[], unsigned table_size,
-                                 void *logctx)
+                                 CFHD_RL_VLC_ELEM tmp[], void *logctx)
 {
-    uint8_t new_cfhd_vlc_len[NB_VLC_TABLE_18 * 2];
-    uint16_t new_cfhd_vlc_run[NB_VLC_TABLE_18 * 2];
-    int16_t new_cfhd_vlc_level[NB_VLC_TABLE_18 * 2];
     VLC vlc;
     unsigned j;
     int ret;
@@ -139,27 +136,28 @@ static av_cold int cfhd_init_vlc(CFHD_RL_VLC_ELEM out[], unsigned out_size,
 
     /** Similar to dv.c, generate signed VLC tables **/
     for (unsigned i = j = 0; i < table_size; i++, j++) {
-        new_cfhd_vlc_len[j] = table_vlc[i].len;
-        new_cfhd_vlc_run[j] = table_vlc[i].run;
-        new_cfhd_vlc_level[j] = table_vlc[i].level;
+        tmp[j].len = table_vlc[i].len;
+        tmp[j].run = table_vlc[i].run;
+        tmp[j].level = table_vlc[i].level;
 
         /* Don't include the zero level nor escape bits */
         if (table_vlc[i].level && table_vlc[i].run) {
-            new_cfhd_vlc_len[j]++;
+            tmp[j].len++;
             j++;
-            new_cfhd_vlc_len[j] = table_vlc[i].len + 1;
-            new_cfhd_vlc_run[j] = table_vlc[i].run;
-            new_cfhd_vlc_level[j] = -table_vlc[i].level;
+            tmp[j].len = table_vlc[i].len + 1;
+            tmp[j].run = table_vlc[i].run;
+            tmp[j].level = -table_vlc[i].level;
         }
     }
 
-    ret = ff_init_vlc_from_lengths(&vlc, VLC_BITS, j, new_cfhd_vlc_len,
-                                   1, NULL, 0, 0, 0, 0, logctx);
+    ret = ff_init_vlc_from_lengths(&vlc, VLC_BITS, j,
+                                   &tmp[0].len, sizeof(tmp[0]),
+                                   NULL, 0, 0, 0, 0, logctx);
     if (ret < 0)
         return ret;
     av_assert0(vlc.table_size == out_size);
 
-    for (unsigned i = 0; i < out_size; i++) {
+    for (unsigned i = out_size; i-- > 0;) {
         int code = vlc.table[i].sym;
         int len = vlc.table[i].len;
         int level, run;
@@ -168,8 +166,8 @@ static av_cold int cfhd_init_vlc(CFHD_RL_VLC_ELEM out[], unsigned out_size,
             run = 0;
             level = code;
         } else {
-            run = new_cfhd_vlc_run[code];
-            level = new_cfhd_vlc_level[code];
+            run = tmp[code].run;
+            level = tmp[code].level;
         }
         out[i].len = len;
         out[i].level = level;
@@ -184,16 +182,17 @@ av_cold int ff_cfhd_init_vlcs(CFHDContext *s)
 {
     int ret;
 
-    /* Table 9 */
-    ret = cfhd_init_vlc(s->table_9_rl_vlc, FF_ARRAY_ELEMS(s->table_9_rl_vlc),
-                        table_9_vlc, FF_ARRAY_ELEMS(table_9_vlc),
-                        s->avctx);
-    if (ret < 0)
-        return ret;
-    /* Table 18 */
+    /* Table 18 - we reuse the unused table_9_rl_vlc as scratch buffer here */
     ret = cfhd_init_vlc(s->table_18_rl_vlc, FF_ARRAY_ELEMS(s->table_18_rl_vlc),
                         table_18_vlc, FF_ARRAY_ELEMS(table_18_vlc),
-                        s->avctx);
+                        s->table_9_rl_vlc, s->avctx);
+    if (ret < 0)
+        return ret;
+    /* Table 9 - table_9_rl_vlc itself is used as scratch buffer; it works
+     * because we are counting down in the final loop */
+    ret = cfhd_init_vlc(s->table_9_rl_vlc, FF_ARRAY_ELEMS(s->table_9_rl_vlc),
+                        table_9_vlc, FF_ARRAY_ELEMS(table_9_vlc),
+                        s->table_9_rl_vlc, s->avctx);
     if (ret < 0)
         return ret;
     return 0;
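
The counting-down loop is what allows table 9 to be built inside its own
destination buffer. The following is a minimal, self-contained sketch of
that idea and not the FFmpeg code: the Elem type, the sym array and both
function names are hypothetical stand-ins. It assumes that every slot of
the final table refers to a scratch entry at or below its own index,
which is presumably the property that holds for table 9 but not for
table 18.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical element type; it only mirrors the three fields the patch uses. */
typedef struct Elem {
    int16_t  level;
    int8_t   len;
    uint16_t run;
} Elem;

/* Reference version: scratch and destination are separate buffers. */
static void expand_separate(Elem *out, const Elem *scratch,
                            const unsigned *sym, unsigned n_out)
{
    for (unsigned i = 0; i < n_out; i++)
        out[i] = scratch[sym[i]];
}

/*
 * In-place version: the scratch entries live at the start of buf and the
 * final table of n_out slots is written into buf itself.
 * The loop counts down, so when slot i is written only slots above i have
 * been overwritten; buf[sym[i]] is therefore still a scratch entry as long
 * as sym[i] <= i (an assumption of this sketch, checked by the assert).
 */
static void expand_in_place(Elem *buf, const unsigned *sym, unsigned n_out)
{
    for (unsigned i = n_out; i-- > 0;) {
        Elem e;

        assert(sym[i] <= i);
        e      = buf[sym[i]]; /* read the scratch entry first */
        buf[i] = e;           /* then overwrite slot i */
    }
}
```

With three scratch entries and sym = { 0, 0, 1, 1, 2, 2, 2, 2 }, both
functions yield the same eight-slot table. A loop counting up would
overwrite buf[1] while writing slot 1 and then read the wrong entry for
slot 2.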