From patchwork Sun Feb 18 19:31:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 46355
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 18 Feb 2024 20:31:04 +0100
Message-ID:
X-Mailer: git-send-email 2.34.1
In-Reply-To:
References:
X-Microsoft-Original-Message-ID: <20240218193106.346214-1-andreas.rheinhardt@outlook.com>
Subject: [FFmpeg-devel] [PATCH 2/4] avcodec/vvc/vvc_ps: Use union for luts to avoid unaligned accesses
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Cc: Andreas Rheinhardt
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: y0c8aACsaYME

These arrays are currently accessed via uint16_t* pointers
although nothing guarantees their alignment. Furthermore,
this is problematic wrt the effective-type rules.
Fix this by using a union of arrays of uint8_t and uint16_t.

Signed-off-by: Andreas Rheinhardt
---
 libavcodec/vvc/vvc_filter.c          | 2 +-
 libavcodec/vvc/vvc_filter_template.c | 4 ++--
 libavcodec/vvc/vvc_inter.c           | 4 ++--
 libavcodec/vvc/vvc_ps.c              | 8 ++++----
 libavcodec/vvc/vvc_ps.h              | 7 ++++---
 libavcodec/vvc/vvcdsp.h              | 2 +-
 6 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/libavcodec/vvc/vvc_filter.c b/libavcodec/vvc/vvc_filter.c
index df77e443f6..5fa711c9e0 100644
--- a/libavcodec/vvc/vvc_filter.c
+++ b/libavcodec/vvc/vvc_filter.c
@@ -1328,5 +1328,5 @@ void ff_vvc_lmcs_filter(const VVCLocalContext *lc, const int x, const int y)
     const int height = FFMIN(fc->ps.pps->height - y, ctb_size);
     uint8_t *data = fc->frame->data[LUMA] + y * fc->frame->linesize[LUMA] + (x << fc->ps.sps->pixel_shift);
     if (sc->sh.r->sh_lmcs_used_flag)
-        fc->vvcdsp.lmcs.filter(data, fc->frame->linesize[LUMA], width, height, fc->ps.lmcs.inv_lut);
+        fc->vvcdsp.lmcs.filter(data, fc->frame->linesize[LUMA], width, height, &fc->ps.lmcs.inv_lut);
 }
diff --git a/libavcodec/vvc/vvc_filter_template.c b/libavcodec/vvc/vvc_filter_template.c
index b7eaef5125..9b3a0e46f7 100644
--- a/libavcodec/vvc/vvc_filter_template.c
+++ b/libavcodec/vvc/vvc_filter_template.c
@@ -22,9 +22,9 @@
 
 #include "libavcodec/h26x/h2656_sao_template.c"
 
-static void FUNC(lmcs_filter_luma)(uint8_t *_dst, ptrdiff_t dst_stride, const int width, const int height, const uint8_t *_lut)
+static void FUNC(lmcs_filter_luma)(uint8_t *_dst, ptrdiff_t dst_stride, const int width, const int height, const void *_lut)
 {
-    const pixel *lut = (const pixel *)_lut;
+    const pixel *lut = _lut;
     pixel *dst = (pixel*)_dst;
     dst_stride /= sizeof(pixel);
 
diff --git a/libavcodec/vvc/vvc_inter.c b/libavcodec/vvc/vvc_inter.c
index e05f3db93e..6c9c8a7165 100644
--- a/libavcodec/vvc/vvc_inter.c
+++ b/libavcodec/vvc/vvc_inter.c
@@ -571,7 +571,7 @@ static void pred_regular_luma(VVCLocalContext *lc, const int hf_idx, const int v
         const int intra_weight = ciip_derive_intra_weight(lc, x0, y0, sbw, sbh);
         fc->vvcdsp.intra.intra_pred(lc, x0, y0, sbw, sbh, 0);
         if (sc->sh.r->sh_lmcs_used_flag)
-            fc->vvcdsp.lmcs.filter(inter, inter_stride, sbw, sbh, fc->ps.lmcs.fwd_lut);
+            fc->vvcdsp.lmcs.filter(inter, inter_stride, sbw, sbh, &fc->ps.lmcs.fwd_lut);
         fc->vvcdsp.inter.put_ciip(dst, dst_stride, sbw, sbh, inter, inter_stride, intra_weight);
 
     }
@@ -887,7 +887,7 @@ static void predict_inter(VVCLocalContext *lc)
 
     if (lc->sc->sh.r->sh_lmcs_used_flag && !cu->ciip_flag) {
         uint8_t* dst0 = POS(0, cu->x0, cu->y0);
-        fc->vvcdsp.lmcs.filter(dst0, fc->frame->linesize[LUMA], cu->cb_width, cu->cb_height, fc->ps.lmcs.fwd_lut);
+        fc->vvcdsp.lmcs.filter(dst0, fc->frame->linesize[LUMA], cu->cb_width, cu->cb_height, &fc->ps.lmcs.fwd_lut);
     }
 }
 
diff --git a/libavcodec/vvc/vvc_ps.c b/libavcodec/vvc/vvc_ps.c
index 376027ed81..e6e46d2039 100644
--- a/libavcodec/vvc/vvc_ps.c
+++ b/libavcodec/vvc/vvc_ps.c
@@ -642,9 +642,9 @@ static int lmcs_derive_lut(VVCLMCS *lmcs, const H266RawAPS *rlmcs, const H266Raw
         const uint16_t fwd_sample = lmcs_derive_lut_sample(sample, lmcs->pivot,
             input_pivot, scale_coeff, idx_y, max);
         if (bit_depth > 8)
-            ((uint16_t *)lmcs->fwd_lut)[sample] = fwd_sample;
+            lmcs->fwd_lut.u16[sample] = fwd_sample;
         else
-            lmcs->fwd_lut[sample] = fwd_sample;
+            lmcs->fwd_lut.u8 [sample] = fwd_sample;
 
     }
 
@@ -659,9 +659,9 @@ static int lmcs_derive_lut(VVCLMCS *lmcs, const H266RawAPS *rlmcs, const H266Raw
             inv_scale_coeff, i, max);
 
         if (bit_depth > 8)
-            ((uint16_t *)lmcs->inv_lut)[sample] = inv_sample;
+            lmcs->inv_lut.u16[sample] = inv_sample;
         else
-            lmcs->inv_lut[sample] = inv_sample;
+            lmcs->inv_lut.u8 [sample] = inv_sample;
     }
 
     return 0;
diff --git a/libavcodec/vvc/vvc_ps.h b/libavcodec/vvc/vvc_ps.h
index 3d3aa061f5..5adf3f3453 100644
--- a/libavcodec/vvc/vvc_ps.h
+++ b/libavcodec/vvc/vvc_ps.h
@@ -191,9 +191,10 @@ typedef struct VVCLMCS {
     uint8_t min_bin_idx;
     uint8_t max_bin_idx;
 
-    //*2 for high depth
-    uint8_t fwd_lut[LMCS_MAX_LUT_SIZE * 2];
-    uint8_t inv_lut[LMCS_MAX_LUT_SIZE * 2];
+    union {
+        uint8_t u8[LMCS_MAX_LUT_SIZE];
+        uint16_t u16[LMCS_MAX_LUT_SIZE]; ///< for high bit-depth
+    } fwd_lut, inv_lut;
     uint16_t pivot[LMCS_MAX_BIN_SIZE + 1];
     uint16_t chroma_scale_coeff[LMCS_MAX_BIN_SIZE];
 
diff --git a/libavcodec/vvc/vvcdsp.h b/libavcodec/vvc/vvcdsp.h
index 6f59e73654..f4fb3cb7d7 100644
--- a/libavcodec/vvc/vvcdsp.h
+++ b/libavcodec/vvc/vvcdsp.h
@@ -119,7 +119,7 @@ typedef struct VVCItxDSPContext {
 } VVCItxDSPContext;
 
 typedef struct VVCLMCSDSPContext {
-    void (*filter)(uint8_t *dst, ptrdiff_t dst_stride, int width, int height, const uint8_t *lut);
+    void (*filter)(uint8_t *dst, ptrdiff_t dst_stride, int width, int height, const void *lut);
 } VVCLMCSDSPContext;
 
 typedef struct VVCLFDSPContext {
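
Illustration (not part of the patch): the commit message argues that a
uint8_t array gives no alignment guarantee for uint16_t accesses and that
such accesses conflict with the effective-type rules. A minimal standalone
sketch of the union pattern may make this easier to follow. All names here
(SampleLut, LUT_SIZE, lut_build_invert, lut_filter_8, lut_filter_16) are
hypothetical, and the toy "invert" mapping merely stands in for the real
LMCS derivation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LUT_SIZE 1024 /* hypothetical; stands in for LMCS_MAX_LUT_SIZE */

/* One storage area with two typed views. The union is aligned suitably
 * for uint16_t, and each view is accessed through its own member type,
 * so no cast from uint8_t * to uint16_t * is needed. */
typedef union SampleLut {
    uint8_t  u8[LUT_SIZE];
    uint16_t u16[LUT_SIZE]; /* used for bit depths above 8 */
} SampleLut;

/* Fill the LUT with a toy mapping: sample -> (max - 1 - sample). */
static void lut_build_invert(SampleLut *lut, int bit_depth)
{
    const int max = 1 << bit_depth;

    assert(bit_depth >= 1 && max <= LUT_SIZE);
    for (int sample = 0; sample < max; sample++) {
        const uint16_t mapped = (uint16_t)(max - 1 - sample);
        if (bit_depth > 8)
            lut->u16[sample] = mapped;
        else
            lut->u8[sample] = (uint8_t)mapped;
    }
}

/* The filters take const void *, mirroring the changed callback
 * signature; a pointer to the union converts to a pointer to either
 * member without a cast. */
static void lut_filter_8(uint8_t *dst, size_t n, const void *lut_)
{
    const uint8_t *lut = lut_;

    for (size_t i = 0; i < n; i++)
        dst[i] = lut[dst[i]];
}

/* The caller must ensure every input sample is below LUT_SIZE. */
static void lut_filter_16(uint16_t *dst, size_t n, const void *lut_)
{
    const uint16_t *lut = lut_;

    for (size_t i = 0; i < n; i++)
        dst[i] = lut[dst[i]];
}
```

A caller would build the table once for the stream's bit depth and then
pass the address of the union (for example &lut) to whichever filter
matches that depth, in the same way the patch passes &fc->ps.lmcs.fwd_lut.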