From: Anton Khirnov
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 28 May 2024 15:54:28 +0200
Message-ID: <20240528135437.24854-1-anton@khirnov.net>
X-Patchwork-Id: 49312
Subject: [FFmpeg-devel] [PATCH v2 03/10] lavc/hevcdec: allocate local_ctx as array of structs rather than pointers

A single contiguous array of HEVCLocalContext is more efficient and
easier to manage than an array of pointers to individually allocated
contexts: growing it takes one allocation instead of one per thread,
and freeing it is a single av_freep(). Since the array elements hold
per-thread state, the struct is padded to avoid false sharing.
---
 libavcodec/hevcdec.c | 57 +++++++++++++++++++++++++-------------------
 libavcodec/hevcdec.h |  6 ++++-
 2 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c
index e84f45e3f8..88a481c043 100644
--- a/libavcodec/hevcdec.c
+++ b/libavcodec/hevcdec.c
@@ -2598,7 +2598,7 @@ static int hls_slice_data(HEVCContext *s)
 static int hls_decode_entry_wpp(AVCodecContext *avctxt, void *hevc_lclist,
                                 int job, int self_id)
 {
-    HEVCLocalContext *lc = ((HEVCLocalContext**)hevc_lclist)[self_id];
+    HEVCLocalContext *lc = &((HEVCLocalContext*)hevc_lclist)[self_id];
     const HEVCContext *const s = lc->parent;
     int ctb_size = 1 << s->ps.sps->log2_ctb_size;
     int more_data = 1;
@@ -2682,7 +2682,7 @@ static int hls_slice_data_wpp(HEVCContext *s, const H2645NAL *nal)
 {
     const uint8_t *data = nal->data;
     int length = nal->size;
-    HEVCLocalContext *lc = s->HEVClc;
+    HEVCLocalContext *lc;
     int *ret;
     int64_t offset;
     int64_t startheader, cmpt = 0;
@@ -2696,19 +2696,31 @@ static int hls_slice_data_wpp(HEVCContext *s, const H2645NAL *nal)
         return AVERROR_INVALIDDATA;
     }

-    for (i = 1; i < s->threads_number; i++) {
-        if (i < s->nb_local_ctx)
-            continue;
-        s->local_ctx[i] = av_mallocz(sizeof(HEVCLocalContext));
-        if (!s->local_ctx[i])
-            return AVERROR(ENOMEM);
-        s->nb_local_ctx++;
+    if (s->threads_number > s->nb_local_ctx) {
+        HEVCLocalContext *tmp = av_malloc_array(s->threads_number, sizeof(*s->local_ctx));

-        s->local_ctx[i]->logctx = s->avctx;
-        s->local_ctx[i]->parent = s;
-        s->local_ctx[i]->common_cabac_state = &s->cabac;
+        if (!tmp)
+            return AVERROR(ENOMEM);
+
+        memcpy(tmp, s->local_ctx, sizeof(*s->local_ctx) * s->nb_local_ctx);
+        av_free(s->local_ctx);
+        s->local_ctx = tmp;
+        s->HEVClc = &s->local_ctx[0];
+
+        for (unsigned i = s->nb_local_ctx; i < s->threads_number; i++) {
+            tmp = &s->local_ctx[i];
+
+            memset(tmp, 0, sizeof(*tmp));
+
+            tmp->logctx = s->avctx;
+            tmp->parent = s;
+            tmp->common_cabac_state = &s->cabac;
+        }
+
+        s->nb_local_ctx = s->threads_number;
     }

+    lc = &s->local_ctx[0];
     offset = (lc->gb.index >> 3);

     for (j = 0, cmpt = 0, startheader = offset + s->sh.entry_point_offset[0]; j < nal->skipped_bytes; j++) {
@@ -2744,8 +2756,8 @@ static int hls_slice_data_wpp(HEVCContext *s, const H2645NAL *nal)
     s->data = data;

     for (i = 1; i < s->threads_number; i++) {
-        s->local_ctx[i]->first_qp_group = 1;
-        s->local_ctx[i]->qp_y = s->HEVClc->qp_y;
+        s->local_ctx[i].first_qp_group = 1;
+        s->local_ctx[i].qp_y = s->HEVClc->qp_y;
     }

     atomic_store(&s->wpp_err, 0);
@@ -3474,12 +3486,6 @@ static av_cold int hevc_decode_free(AVCodecContext *avctx)
     av_freep(&s->sh.offset);
     av_freep(&s->sh.size);

-    if (s->local_ctx) {
-        for (i = 1; i < s->nb_local_ctx; i++) {
-            av_freep(&s->local_ctx[i]);
-        }
-    }
-    av_freep(&s->HEVClc);
     av_freep(&s->local_ctx);

     ff_h2645_packet_uninit(&s->pkt);
@@ -3496,15 +3502,16 @@ static av_cold int hevc_init_context(AVCodecContext *avctx)

     s->avctx = avctx;

-    s->HEVClc = av_mallocz(sizeof(HEVCLocalContext));
-    s->local_ctx = av_mallocz(sizeof(HEVCLocalContext*) * s->threads_number);
-    if (!s->HEVClc || !s->local_ctx)
+    s->local_ctx = av_mallocz(sizeof(*s->local_ctx));
+    if (!s->local_ctx)
         return AVERROR(ENOMEM);
+    s->nb_local_ctx = 1;
+
+    s->HEVClc = &s->local_ctx[0];
+
     s->HEVClc->parent = s;
     s->HEVClc->logctx = avctx;
     s->HEVClc->common_cabac_state = &s->cabac;
-    s->local_ctx[0] = s->HEVClc;
-    s->nb_local_ctx = 1;

     s->output_frame = av_frame_alloc();
     if (!s->output_frame)
diff --git a/libavcodec/hevcdec.h b/libavcodec/hevcdec.h
index ca68fb54a7..5aa3d40450 100644
--- a/libavcodec/hevcdec.h
+++ b/libavcodec/hevcdec.h
@@ -439,13 +439,17 @@ typedef struct HEVCLocalContext {
     /* properties of the boundary of the current CTB for the purposes
      * of the deblocking filter */
     int boundary_flags;
+
+    // an array of these structs is used for per-thread state - pad its size
+    // to avoid false sharing
+    char padding[128];
 } HEVCLocalContext;

 typedef struct HEVCContext {
     const AVClass *c;  // needed by private avoptions
     AVCodecContext *avctx;

-    HEVCLocalContext **local_ctx;
+    HEVCLocalContext *local_ctx;
     unsigned nb_local_ctx;

     HEVCLocalContext *HEVClc;
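
For readers who want to try the allocation pattern outside of lavc, here is a minimal standalone sketch of the same idea: a contiguous array of padded per-thread structs that grows on demand and is initialised only for the new elements. It is not the FFmpeg code. LocalCtx, Decoder, decoder_grow() and decoder_free() are hypothetical names made up for illustration, and plain malloc()/free() stand in for av_malloc_array()/av_free().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for HEVCLocalContext: per-thread state plus padding. */
typedef struct LocalCtx {
    void *parent;        /* back-pointer to the owning decoder */
    int   qp_y;          /* example of per-thread mutable state */
    char  padding[128];  /* pads the element size, as in the patch */
} LocalCtx;

/* Hypothetical stand-in for HEVCContext. */
typedef struct Decoder {
    LocalCtx *local_ctx;     /* one contiguous array, not an array of pointers */
    unsigned  nb_local_ctx;  /* number of initialised elements */
} Decoder;

/* Grow the array to nb_threads elements, keeping the existing contexts.
 * Returns 0 on success, -1 on failure (the old array is left untouched). */
static int decoder_grow(Decoder *d, unsigned nb_threads)
{
    LocalCtx *tmp;

    assert(d);
    if (nb_threads <= d->nb_local_ctx)
        return 0;
    if (nb_threads > SIZE_MAX / sizeof(*tmp))  /* multiplication overflow */
        return -1;

    tmp = malloc(nb_threads * sizeof(*tmp));
    if (!tmp)
        return -1;

    /* Copy the contexts that already exist, then release the old array. */
    if (d->nb_local_ctx)
        memcpy(tmp, d->local_ctx, d->nb_local_ctx * sizeof(*tmp));
    free(d->local_ctx);
    d->local_ctx = tmp;

    /* Zero and initialise only the newly added elements. */
    for (unsigned i = d->nb_local_ctx; i < nb_threads; i++) {
        memset(&tmp[i], 0, sizeof(tmp[i]));
        tmp[i].parent = d;
    }
    d->nb_local_ctx = nb_threads;
    return 0;
}

/* A single free() releases every context, with no per-element loop. */
static void decoder_free(Decoder *d)
{
    free(d->local_ctx);
    d->local_ctx    = NULL;
    d->nb_local_ctx = 0;
}
```

Note that growing the array moves it, so any pointer to an element taken before the call is stale afterwards. This is why the patch re-derives s->HEVClc and lc from s->local_ctx after the reallocation.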