From patchwork Fri Oct 4 14:31:12 2024
X-Patchwork-Submitter: Nuo Mi
X-Patchwork-Id: 52046
From: Nuo Mi
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 4 Oct 2024 22:31:12 +0800
Message-Id: <20241004143115.382070-1-nuomi2021@gmail.com>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 1/4] avcodec/vvcdec: refact out deblock boundary strength stage
Reply-To: FFmpeg development discussions and patches
Cc: Nuo Mi

The deblock boundary strength stage utilizes ~5% of CPU resources for 8K clips.
It's worth considering it as a standalone stage.

This stage has been relocated to follow the parser process, allowing us to reuse CUs and TUs before releasing them.
---
 libavcodec/vvc/filter.c | 27 +++++++++++++++------------
 libavcodec/vvc/filter.h |  9 +++++++++
 libavcodec/vvc/thread.c | 24 +++++++++++++++++++++---
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/libavcodec/vvc/filter.c b/libavcodec/vvc/filter.c
index 25bef45eed..707fc24203 100644
--- a/libavcodec/vvc/filter.c
+++ b/libavcodec/vvc/filter.c
@@ -678,12 +678,14 @@ static void vvc_deblock_bs_chroma(const VVCLocalContext *lc,
 typedef void (*deblock_bs_fn)(const VVCLocalContext *lc, const int x0, const int y0,
     const int width, const int height, const int rs, const int vertical);

-static void vvc_deblock_bs(const VVCLocalContext *lc, const int x0, const int y0, const int rs, const int vertical)
+void ff_vvc_deblock_bs(VVCLocalContext *lc, const int rx, const int ry, const int rs)
 {
     const VVCFrameContext *fc = lc->fc;
     const VVCSPS *sps = fc->ps.sps;
     const VVCPPS *pps = fc->ps.pps;
     const int ctb_size = sps->ctb_size_y;
+    const int x0 = rx << sps->ctb_log2_size_y;
+    const int y0 = ry << sps->ctb_log2_size_y;
     const int x_end = FFMIN(x0 + ctb_size, pps->width) >> MIN_TU_LOG2;
     const int y_end = FFMIN(y0 + ctb_size, pps->height) >> MIN_TU_LOG2;
     const int has_chroma = !!sps->r->sps_chroma_format_idc;
@@ -691,15 +693,18 @@ static void vvc_deblock_bs(const VVCLocalContext *lc, const int x0, const int y0
         vvc_deblock_bs_luma, vvc_deblock_bs_chroma
     };

-    for (int is_chroma = 0; is_chroma <= has_chroma; is_chroma++) {
-        const int hs = sps->hshift[is_chroma];
-        const int vs = sps->vshift[is_chroma];
-        for (int y = y0 >> MIN_TU_LOG2; y < y_end; y++) {
-            for (int x = x0 >> MIN_TU_LOG2; x < x_end; x++) {
-                const int off = y * fc->ps.pps->min_tu_width + x;
-                if ((fc->tab.tb_pos_x0[is_chroma][off] >> MIN_TU_LOG2) == x && (fc->tab.tb_pos_y0[is_chroma][off] >> MIN_TU_LOG2) == y) {
-                    deblock_bs[is_chroma](lc, x << MIN_TU_LOG2, y << MIN_TU_LOG2,
-                        fc->tab.tb_width[is_chroma][off] << hs, fc->tab.tb_height[is_chroma][off] << vs, rs, vertical);
+    ff_vvc_decode_neighbour(lc, x0, y0, rx, ry, rs);
+    for (int vertical = 0; vertical <= 1; vertical++) {
+        for (int is_chroma = 0; is_chroma <= has_chroma; is_chroma++) {
+            const int hs = sps->hshift[is_chroma];
+            const int vs = sps->vshift[is_chroma];
+            for (int y = y0 >> MIN_TU_LOG2; y < y_end; y++) {
+                for (int x = x0 >> MIN_TU_LOG2; x < x_end; x++) {
+                    const int off = y * fc->ps.pps->min_tu_width + x;
+                    if ((fc->tab.tb_pos_x0[is_chroma][off] >> MIN_TU_LOG2) == x && (fc->tab.tb_pos_y0[is_chroma][off] >> MIN_TU_LOG2) == y) {
+                        deblock_bs[is_chroma](lc, x << MIN_TU_LOG2, y << MIN_TU_LOG2,
+                            fc->tab.tb_width[is_chroma][off] << hs, fc->tab.tb_height[is_chroma][off] << vs, rs, vertical);
+                    }
                 }
             }
         }
@@ -795,8 +800,6 @@ static void vvc_deblock(const VVCLocalContext *lc, int x0, int y0, const int rs,
     const uint8_t no_p[4] = { 0 };
     const uint8_t no_q[4] = { 0 } ;

-    vvc_deblock_bs(lc, x0, y0, rs, vertical);
-
     if (!vertical) {
         FFSWAP(int, x_end, y_end);
         FFSWAP(int, x0, y0);
diff --git a/libavcodec/vvc/filter.h b/libavcodec/vvc/filter.h
index 03cc74e071..29abbd98ce 100644
--- a/libavcodec/vvc/filter.h
+++ b/libavcodec/vvc/filter.h
@@ -33,6 +33,15 @@
  */
 void ff_vvc_lmcs_filter(const VVCLocalContext *lc, const int x0, const int y0);

+/**
+ * derive boundary strength for the CTU
+ * @param lc local context for CTU
+ * @param rx raster x position for the CTU
+ * @param ry raster y position for the CTU
+ * @param rs raster position for the CTU
+ */
+void ff_vvc_deblock_bs(VVCLocalContext *lc, const int rx, const int ry, const int rs);
+
 /**
  * vertical deblock filter for the CTU
  * @param lc local context for CTU
diff --git a/libavcodec/vvc/thread.c b/libavcodec/vvc/thread.c
index d75784e242..82c00dd4c9 100644
--- a/libavcodec/vvc/thread.c
+++ b/libavcodec/vvc/thread.c
@@ -42,6 +42,7 @@ typedef struct ProgressListener {
 typedef enum VVCTaskStage {
     VVC_TASK_STAGE_INIT,                    // for CTU(0, 0) only
     VVC_TASK_STAGE_PARSE,
+    VVC_TASK_STAGE_DEBLOCK_BS,
     VVC_TASK_STAGE_INTER,
     VVC_TASK_STAGE_RECON,
     VVC_TASK_STAGE_LMCS,
@@ -111,6 +112,7 @@ static void add_task(VVCContext *s, VVCTask *t)
     const int priorities[] = {
         0,                  // VVC_TASK_STAGE_INIT,
         0,                  // VVC_TASK_STAGE_PARSE,
+        1,                  // VVC_TASK_STAGE_DEBLOCK_BS
         // For an 8K clip, a CTU line completed in the reference frame may trigger 64 and more inter tasks.
         // We assign these tasks the lowest priority to avoid being overwhelmed with inter tasks.
         PRIORITY_LOWEST,    // VVC_TASK_STAGE_INTER
@@ -181,6 +183,8 @@ static int task_has_target_score(VVCTask *t, const VVCTaskStage stage, const uin
     // l:left, r:right, t: top, b: bottom
     static const uint8_t target_score[] =
     {
+        2,          //VVC_TASK_STAGE_DEBLOCK_BS,need l + t parse
+        0,          //VVC_TASK_STAGE_INTER, not used
         2,          //VVC_TASK_STAGE_RECON, need l + rt recon
         3,          //VVC_TASK_STAGE_LMCS, need r + b + rb recon
         1,          //VVC_TASK_STAGE_DEBLOCK_V, need l deblock v
@@ -202,7 +206,7 @@ static int task_has_target_score(VVCTask *t, const VVCTaskStage stage, const uin
     } else if (stage == VVC_TASK_STAGE_INTER) {
         target = atomic_load(&t->target_inter_score);
     } else {
-        target = target_score[stage - VVC_TASK_STAGE_RECON];
+        target = target_score[stage - VVC_TASK_STAGE_DEBLOCK_BS];
     }

     //+1 for previous stage
@@ -348,6 +352,10 @@ static void task_stage_done(const VVCTask *t, VVCContext *s)

     //this is a reserve map of ready_score, ordered by zigzag
     if (stage == VVC_TASK_STAGE_PARSE) {
+        ADD( 0, 1, VVC_TASK_STAGE_DEBLOCK_BS);
+        ADD( 1, 0, VVC_TASK_STAGE_DEBLOCK_BS);
+        if (t->rx < 0 || t->rx >= ft->ctu_width || t->ry < 0 || t->ry >= ft->ctu_height)
+            return;
         parse_task_done(s, fc, t->rx, t->ry);
     } else if (stage == VVC_TASK_STAGE_RECON) {
         ADD(-1, 1, VVC_TASK_STAGE_RECON);
@@ -481,6 +489,14 @@ static int run_parse(VVCContext *s, VVCLocalContext *lc, VVCTask *t)
     return 0;
 }

+static int run_deblock_bs(VVCContext *s, VVCLocalContext *lc, VVCTask *t)
+{
+    if (!lc->sc->sh.r->sh_deblocking_filter_disabled_flag)
+        ff_vvc_deblock_bs(lc, t->rx, t->ry, t->rs);
+
+    return 0;
+}
+
 static int run_inter(VVCContext *s, VVCLocalContext *lc, VVCTask *t)
 {
     VVCFrameContext *fc = lc->fc;
@@ -590,6 +606,7 @@ static int run_alf(VVCContext *s, VVCLocalContext *lc, VVCTask *t)
 const static char* task_name[] = {
     "INIT",
     "P",
+    "B",
     "I",
     "R",
     "L",
@@ -611,6 +628,7 @@ static void task_run_stage(VVCTask *t, VVCContext *s, VVCLocalContext *lc)
     static const run_func run[] = {
         run_init,
         run_parse,
+        run_deblock_bs,
         run_inter,
         run_recon,
         run_lmcs,
@@ -701,9 +719,9 @@ static void frame_thread_init_score(VVCFrameContext *fc)
     const VVCFrameThread *ft = fc->ft;
     VVCTask task;

-    task_init(&task, VVC_TASK_STAGE_RECON, fc, 0, 0);
+    task_init(&task, VVC_TASK_STAGE_PARSE, fc, 0, 0);

-    for (int i = VVC_TASK_STAGE_RECON; i < VVC_TASK_STAGE_LAST; i++) {
+    for (int i = VVC_TASK_STAGE_PARSE; i < VVC_TASK_STAGE_LAST; i++) {
         task.stage = i;

         for (task.rx = -1; task.rx <= ft->ctu_width; task.rx++) {

From patchwork Fri Oct 4 14:31:13 2024
X-Patchwork-Submitter: Nuo Mi
X-Patchwork-Id: 52045
From: Nuo Mi
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 4 Oct 2024 22:31:13 +0800
Message-Id: <20241004143115.382070-2-nuomi2021@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241004143115.382070-1-nuomi2021@gmail.com>
References: <20241004143115.382070-1-nuomi2021@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 2/4] avcodec/vvcdec: misc, move pcmf from min_tu_tl_init to min_cb_nz_tl_init
Reply-To: FFmpeg development discussions and patches
Cc: Nuo Mi

pcmf are cu level flags
---
 libavcodec/vvc/ctu.c    | 8 +++++---
 libavcodec/vvc/dec.c    | 4 +---
 libavcodec/vvc/filter.c | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index b33ad576cf..8210ab520f 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -1240,16 +1240,18 @@ static void set_cu_tabs(const VVCLocalContext *lc, const CodingUnit *cu)

     set_cb_tab(lc, fc->tab.mmi, pu->mi.motion_model_idc);
     set_cb_tab(lc, fc->tab.msf, pu->merge_subblock_flag);
-    if (cu->tree_type != DUAL_TREE_CHROMA)
+    if (cu->tree_type != DUAL_TREE_CHROMA) {
         set_cb_tab(lc, fc->tab.skip, cu->skip_flag);
+        set_cb_tab(lc, fc->tab.pcmf[LUMA], cu->bdpcm_flag[LUMA]);
+    }
+    if (cu->tree_type != DUAL_TREE_LUMA)
+        set_cb_tab(lc, fc->tab.pcmf[CHROMA], cu->bdpcm_flag[CHROMA]);

     while (tu) {
         for (int j = 0; j < tu->nb_tbs; j++) {
             const TransformBlock *tb = tu->tbs + j;
             if (tb->c_idx != LUMA)
                 set_qp_c_tab(lc, tu, tb);
-            if (tb->c_idx != CR && cu->bdpcm_flag[tb->c_idx])
-                set_tb_tab(fc->tab.pcmf[tb->c_idx], 1, fc, tb);
         }
         tu = tu->next;
     }
diff --git a/libavcodec/vvc/dec.c b/libavcodec/vvc/dec.c
index edf2607f50..13ca752eec 100644
--- a/libavcodec/vvc/dec.c
+++ b/libavcodec/vvc/dec.c
@@ -150,6 +150,7 @@ static void min_cb_nz_tl_init(TabList *l, VVCFrameContext *fc)
         TL_ADD(cb_height[i], pic_size_in_min_cb);
         TL_ADD(cp_mv[i], pic_size_in_min_cb * MAX_CONTROL_POINTS);
         TL_ADD(cpm[i], pic_size_in_min_cb);
+        TL_ADD(pcmf[i], pic_size_in_min_cb);
     }
     // For luma, qp can only change at the CU level, so the qp tab size is related to the CU.
     TL_ADD(qp[LUMA], pic_size_in_min_cb);
@@ -189,9 +190,6 @@ static void min_tu_tl_init(TabList *l, VVCFrameContext *fc)

     TL_ADD(tu_joint_cbcr_residual_flag, pic_size_in_min_tu);

-    for (int i = LUMA; i <= CHROMA; i++)
-        TL_ADD(pcmf[i], pic_size_in_min_tu);
-
     for (int i = 0; i < VVC_MAX_SAMPLE_ARRAYS; i++) {
         TL_ADD(tu_coded_flag[i], pic_size_in_min_tu);

diff --git a/libavcodec/vvc/filter.c b/libavcodec/vvc/filter.c
index 707fc24203..9a45a735e0 100644
--- a/libavcodec/vvc/filter.c
+++ b/libavcodec/vvc/filter.c
@@ -543,9 +543,9 @@ static av_always_inline int deblock_bs(const VVCLocalContext *lc,
     const uint8_t chroma = !!c_idx;
     const int tu_p = (y_p >> log2_min_tu_size) * min_tu_width + (x_p >> log2_min_tu_size);
     const int tu_q = (y_q >> log2_min_tu_size) * min_tu_width + (x_q >> log2_min_tu_size);
-    const uint8_t pcmf = fc->tab.pcmf[chroma][tu_p] && fc->tab.pcmf[chroma][tu_q];
     const int cb_p = (y_p >> log2_min_cb_size) * min_cb_width + (x_p >> log2_min_cb_size);
     const int cb_q = (y_q >> log2_min_cb_size) * min_cb_width + (x_q >> log2_min_cb_size);
+    const uint8_t pcmf = fc->tab.pcmf[chroma][cb_p] && fc->tab.pcmf[chroma][cb_q];
     const uint8_t intra = fc->tab.cpm[chroma][cb_p] == MODE_INTRA || fc->tab.cpm[chroma][cb_q] == MODE_INTRA;
     const uint8_t same_mode = fc->tab.cpm[chroma][cb_p] == fc->tab.cpm[chroma][cb_q];


From patchwork Fri Oct 4 14:31:14 2024
X-Patchwork-Submitter: Nuo Mi
X-Patchwork-Id: 52051
From: Nuo Mi
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 4 Oct 2024 22:31:14 +0800
Message-Id: <20241004143115.382070-3-nuomi2021@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241004143115.382070-1-nuomi2021@gmail.com>
References: <20241004143115.382070-1-nuomi2021@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 3/4] avcodec/vvdec: refact, ff_vvc_deblock_bs use CodingUnit/TransformUnit instead of fc->tabs
Reply-To: FFmpeg development discussions and patches
Cc: Nuo Mi

perf result for: "perf record -F 99 ./ffmpeg_g -i Tango2_3840x2160_60_10_420_27_LD.266 -f null -"

before: 5.24%
    1.87%  ffmpeg_g  [.] vvc_deblock_bs_chroma
    1.72%  ffmpeg_g  [.] ff_vvc_deblock_bs
    1.65%  ffmpeg_g  [.] vvc_deblock_bs_luma

after: 3.48%
    1.84%  ffmpeg_g  [.] vvc_deblock_bs_chroma
    1.64%  ffmpeg_g  [.] ff_vvc_deblock_bs + vvc_deblock_bs_luma(inlined)
---
 libavcodec/vvc/ctu.c    |  2 +
 libavcodec/vvc/ctu.h    |  3 ++
 libavcodec/vvc/filter.c | 90 +++++++++++++++++------------------------
 3 files changed, 42 insertions(+), 53 deletions(-)

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index 8210ab520f..e49976c66b 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -241,6 +241,7 @@ static TransformUnit* add_tu(VVCFrameContext *fc, CodingUnit *cu, const int x0,
     tu->height = tu_height;
     tu->joint_cbcr_residual_flag = 0;
     memset(tu->coded_flag, 0, sizeof(tu->coded_flag));
+    tu->avail[LUMA] = tu->avail[CHROMA] = 0;
     tu->nb_tbs = 0;

     return tu;
@@ -267,6 +268,7 @@ static TransformBlock* add_tb(TransformUnit *tu, VVCLocalContext *lc,
     tb->ts = 0;
     tb->coeffs = lc->coeffs;
     lc->coeffs += tb_width * tb_height;
+    tu->avail[!!c_idx] = true;
     return tb;
 }

diff --git a/libavcodec/vvc/ctu.h b/libavcodec/vvc/ctu.h
index eab4612561..c5533c1ad0 100644
--- a/libavcodec/vvc/ctu.h
+++ b/libavcodec/vvc/ctu.h
@@ -23,6 +23,8 @@
 #ifndef AVCODEC_VVC_CTU_H
 #define AVCODEC_VVC_CTU_H

+#include
+
 #include "libavcodec/cabac.h"
 #include "libavutil/mem_internal.h"

@@ -172,6 +174,7 @@ typedef struct TransformUnit {
     int y0;
     int width;
     int height;
+    bool avail[CHROMA + 1];                     // contains luma/chroma block
     uint8_t joint_cbcr_residual_flag;           ///< tu_joint_cbcr_residual_flag

diff --git a/libavcodec/vvc/filter.c b/libavcodec/vvc/filter.c
index 9a45a735e0..a7f102bc64 100644
--- a/libavcodec/vvc/filter.c
+++ b/libavcodec/vvc/filter.c
@@ -451,15 +451,15 @@ static int boundary_strength(const VVCLocalContext *lc, const MvField *curr, con

 //part of 8.8.3.3 Derivation process of transform block boundary
 static void derive_max_filter_length_luma(const VVCFrameContext *fc, const int qx, const int qy,
-    const int is_intra, const int has_subblock, const int vertical, uint8_t *max_len_p, uint8_t *max_len_q)
+    const int size_q, const int has_subblock, const int vertical, uint8_t *max_len_p, uint8_t *max_len_q)
 {
     const int px = vertical ? qx - 1 : qx;
     const int py = !vertical ? qy - 1 : qy;
     const uint8_t *tb_size = vertical ? fc->tab.tb_width[LUMA] : fc->tab.tb_height[LUMA];
     const int size_p = tb_size[(py >> MIN_TU_LOG2) * fc->ps.pps->min_tu_width + (px >> MIN_TU_LOG2)];
-    const int size_q = tb_size[(qy >> MIN_TU_LOG2) * fc->ps.pps->min_tu_width + (qx >> MIN_TU_LOG2)];
     const int min_cb_log2 = fc->ps.sps->min_cb_log2_size_y;
     const int off_p = (py >> min_cb_log2) * fc->ps.pps->min_cb_width + (px >> min_cb_log2);
+
     if (size_p <= 4 || size_q <= 4) {
         *max_len_p = *max_len_q = 1;
     } else {
@@ -525,7 +525,7 @@ static void vvc_deblock_subblock_bs(const VVCLocalContext *lc,
 }

 static av_always_inline int deblock_bs(const VVCLocalContext *lc,
-    const int x_p, const int y_p, const int x_q, const int y_q,
+    const int x_p, const int y_p, const int x_q, const int y_q, const CodingUnit *cu, const TransformUnit *tu,
     const RefPicList *rpl_p, const int c_idx, const int off_to_cb, const uint8_t has_sub_block)
 {
     const VVCFrameContext *fc = lc->fc;
@@ -542,12 +542,10 @@ static av_always_inline int deblock_bs(const VVCLocalContext *lc,
     const MvField *mvf_q = &tab_mvf[pu_q];
     const uint8_t chroma = !!c_idx;
     const int tu_p = (y_p >> log2_min_tu_size) * min_tu_width + (x_p >> log2_min_tu_size);
-    const int tu_q = (y_q >> log2_min_tu_size) * min_tu_width + (x_q >> log2_min_tu_size);
     const int cb_p = (y_p >> log2_min_cb_size) * min_cb_width + (x_p >> log2_min_cb_size);
-    const int cb_q = (y_q >> log2_min_cb_size) * min_cb_width + (x_q >> log2_min_cb_size);
-    const uint8_t pcmf = fc->tab.pcmf[chroma][cb_p] && fc->tab.pcmf[chroma][cb_q];
-    const uint8_t intra = fc->tab.cpm[chroma][cb_p] == MODE_INTRA || fc->tab.cpm[chroma][cb_q] == MODE_INTRA;
-    const uint8_t same_mode = fc->tab.cpm[chroma][cb_p] == fc->tab.cpm[chroma][cb_q];
+    const uint8_t pcmf = fc->tab.pcmf[chroma][cb_p] && cu->bdpcm_flag[chroma];
+    const uint8_t intra = fc->tab.cpm[chroma][cb_p] == MODE_INTRA || cu->pred_mode == MODE_INTRA;
+    const uint8_t same_mode = fc->tab.cpm[chroma][cb_p] == cu->pred_mode;

     if (pcmf)
         return 0;
@@ -557,12 +555,12 @@ static av_always_inline int deblock_bs(const VVCLocalContext *lc,

     if (chroma) {
         return fc->tab.tu_coded_flag[c_idx][tu_p] ||
-            fc->tab.tu_coded_flag[c_idx][tu_q] ||
             fc->tab.tu_joint_cbcr_residual_flag[tu_p] ||
-            fc->tab.tu_joint_cbcr_residual_flag[tu_q];
+            tu->coded_flag[c_idx] ||
+            tu->joint_cbcr_residual_flag;
     }

-    if (fc->tab.tu_coded_flag[LUMA][tu_p] || fc->tab.tu_coded_flag[LUMA][tu_q])
+    if (fc->tab.tu_coded_flag[LUMA][tu_p] || tu->coded_flag[LUMA])
         return 1;

     if ((off_to_cb && ((off_to_cb % 8) || !has_sub_block)))
@@ -606,27 +604,23 @@ static int deblock_is_boundary(const VVCLocalContext *lc, const int boundary,
 }

 static void vvc_deblock_bs_luma(const VVCLocalContext *lc,
-    const int x0, const int y0, const int width, const int height, const int rs, const int vertical)
+    const int x0, const int y0, const int width, const int height,
+    const CodingUnit *cu, const TransformUnit *tu, int rs, const int vertical)
 {
-    const VVCFrameContext *fc = lc->fc;
-    const MvField *tab_mvf = fc->tab.mvf;
-    const int mask = LUMA_GRID - 1;
-    const int log2_min_pu_size = MIN_PU_LOG2;
-    const int min_pu_width = fc->ps.pps->min_pu_width;
-    const int min_cb_log2 = fc->ps.sps->min_cb_log2_size_y;
-    const int min_cb_width = fc->ps.pps->min_cb_width;
-    const int pos = vertical ? x0 : y0;
-    const int off_q = (y0 >> min_cb_log2) * min_cb_width + (x0 >> min_cb_log2);
-    const int cb = (vertical ? fc->tab.cb_pos_x : fc->tab.cb_pos_y )[LUMA][off_q];
-    const int is_intra = tab_mvf[(y0 >> log2_min_pu_size) * min_pu_width +
-        (x0 >> log2_min_pu_size)].pred_flag == PF_INTRA;
+    const VVCFrameContext *fc = lc->fc;
+    const PredictionUnit *pu = &cu->pu;
+    const int mask = LUMA_GRID - 1;
+    const int pos = vertical ? x0 : y0;
+    const int cb = vertical ? cu->x0 : cu->y0;
+    const int is_intra = cu->pred_mode == MODE_INTRA;
+    const int cb_size = vertical ? cu->cb_width : cu->cb_height;
+    const int has_sb = !is_intra && (pu->merge_subblock_flag || pu->inter_affine_flag) && cb_size > 8;

     if (deblock_is_boundary(lc, pos > 0 && !(pos & mask), pos, rs, vertical)) {
         const int is_vb = is_virtual_boundary(fc, pos, vertical);
         const int size = vertical ? height : width;
+        const int size_q = vertical ? width : height;
         const int off = cb - pos;
-        const int cb_size = (vertical ? fc->tab.cb_width : fc->tab.cb_height)[LUMA][off_q];
-        const int has_sb = !is_intra && (fc->tab.msf[off_q] || fc->tab.iaf[off_q]) && cb_size > 8;
         const int flag = vertical ? BOUNDARY_LEFT_SLICE : BOUNDARY_UPPER_SLICE;
         const RefPicList *rpl_p = (lc->boundary_flags & flag) ?
             ff_vvc_get_ref_list(fc, fc->ref, x0 - vertical, y0 - !vertical) : lc->sc->rpl;
@@ -635,24 +629,23 @@ static void vvc_deblock_bs_luma(const VVCLocalContext *lc,
             const int x = x0 + i * !vertical;
             const int y = y0 + i * vertical;
             uint8_t max_len_p, max_len_q;
-            const int bs = is_vb ? 0 : deblock_bs(lc, x - vertical, y - !vertical, x, y, rpl_p, LUMA, off, has_sb);
+            const int bs = is_vb ?
0 : deblock_bs(lc, x - vertical, y - !vertical, x, y, cu, tu, rpl_p, LUMA, off, has_sb); TAB_BS(fc->tab.bs[vertical][LUMA], x, y) = bs; - derive_max_filter_length_luma(fc, x, y, is_intra, has_sb, vertical, &max_len_p, &max_len_q); + derive_max_filter_length_luma(fc, x, y, size_q, has_sb, vertical, &max_len_p, &max_len_q); TAB_MAX_LEN(fc->tab.max_len_p[vertical], x, y) = max_len_p; TAB_MAX_LEN(fc->tab.max_len_q[vertical], x, y) = max_len_q; } } - if (!is_intra) { - if (fc->tab.msf[off_q] || fc->tab.iaf[off_q]) - vvc_deblock_subblock_bs(lc, cb, x0, y0, width, height, vertical); - } + if (has_sb) + vvc_deblock_subblock_bs(lc, cb, x0, y0, width, height, vertical); } static void vvc_deblock_bs_chroma(const VVCLocalContext *lc, - const int x0, const int y0, const int width, const int height, const int rs, const int vertical) + const int x0, const int y0, const int width, const int height, + const CodingUnit *cu, const TransformUnit *tu, const int rs, const int vertical) { const VVCFrameContext *fc = lc->fc; const int shift = (vertical ? fc->ps.sps->hshift : fc->ps.sps->vshift)[CHROMA]; @@ -667,7 +660,7 @@ static void vvc_deblock_bs_chroma(const VVCLocalContext *lc, for (int i = 0; i < size; i += 2) { const int x = x0 + i * !vertical; const int y = y0 + i * vertical; - const int bs = is_vb ? 0 : deblock_bs(lc, x - vertical, y - !vertical, x, y, NULL, c_idx, 0, 0); + const int bs = is_vb ? 
0 : deblock_bs(lc, x - vertical, y - !vertical, x, y, cu, tu, NULL, c_idx, 0, 0); TAB_BS(fc->tab.bs[vertical][c_idx], x, y) = bs; } @@ -682,29 +675,20 @@ void ff_vvc_deblock_bs(VVCLocalContext *lc, const int rx, const int ry, const in { const VVCFrameContext *fc = lc->fc; const VVCSPS *sps = fc->ps.sps; - const VVCPPS *pps = fc->ps.pps; - const int ctb_size = sps->ctb_size_y; const int x0 = rx << sps->ctb_log2_size_y; const int y0 = ry << sps->ctb_log2_size_y; - const int x_end = FFMIN(x0 + ctb_size, pps->width) >> MIN_TU_LOG2; - const int y_end = FFMIN(y0 + ctb_size, pps->height) >> MIN_TU_LOG2; - const int has_chroma = !!sps->r->sps_chroma_format_idc; - deblock_bs_fn deblock_bs[] = { - vvc_deblock_bs_luma, vvc_deblock_bs_chroma - }; ff_vvc_decode_neighbour(lc, x0, y0, rx, ry, rs); - for (int vertical = 0; vertical <= 1; vertical++) { - for (int is_chroma = 0; is_chroma <= has_chroma; is_chroma++) { - const int hs = sps->hshift[is_chroma]; - const int vs = sps->vshift[is_chroma]; - for (int y = y0 >> MIN_TU_LOG2; y < y_end; y++) { - for (int x = x0 >> MIN_TU_LOG2; x < x_end; x++) { - const int off = y * fc->ps.pps->min_tu_width + x; - if ((fc->tab.tb_pos_x0[is_chroma][off] >> MIN_TU_LOG2) == x && (fc->tab.tb_pos_y0[is_chroma][off] >> MIN_TU_LOG2) == y) { - deblock_bs[is_chroma](lc, x << MIN_TU_LOG2, y << MIN_TU_LOG2, - fc->tab.tb_width[is_chroma][off] << hs, fc->tab.tb_height[is_chroma][off] << vs, rs, vertical); - } + for (const CodingUnit *cu = fc->tab.cus[rs]; cu; cu = cu->next) { + for (const TransformUnit *tu = cu->tus.head; tu; tu = tu->next) { + for (int vertical = 0; vertical <= 1; vertical++) { + if (tu->avail[LUMA]) + vvc_deblock_bs_luma(lc, tu->x0, tu->y0, tu->width, tu->height, cu, tu, rs, vertical); + if (tu->avail[CHROMA]) { + if (cu->isp_split_type != ISP_NO_SPLIT && cu->tree_type == SINGLE_TREE) + vvc_deblock_bs_chroma(lc, cu->x0, cu->y0, cu->cb_width, cu->cb_height, cu, tu, rs, vertical); + else + vvc_deblock_bs_chroma(lc, tu->x0, tu->y0, 
tu->width, tu->height, cu, tu, rs, vertical); } } } From patchwork Fri Oct 4 14:31:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nuo Mi X-Patchwork-Id: 52052 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:938f:0:b0:48e:c0f8:d0de with SMTP id z15csp654018vqg; Fri, 4 Oct 2024 13:29:05 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCUy6g7kp6j9wqEkDrRh1rf0kKmyNDLR4boDFQjmP8bwHah7aSXk8L5fbelYlfKDEimjw9zYb2fe3wcFfBBZA0W8@gmail.com X-Google-Smtp-Source: AGHT+IHb4n1niSQLxKAYtYTqv43Q1zYmBvxjZYSbYz1oAaDWHgNhPCaVj7Nge5rbimZWqq9vFX7D X-Received: by 2002:a2e:6111:0:b0:2fa:c528:1ce5 with SMTP id 38308e7fff4ca-2faf3d706aamr27618611fa.35.1728073745029; Fri, 04 Oct 2024 13:29:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1728073744; cv=none; d=google.com; s=arc-20240605; b=ff0ljBSMZ1YOHmLfIYEKPQlxmnfg7R39sQIxPPs9LkGyUfxfdGuycvaLyYoRCDI3zL nMyT3pACXibneeMSQ6D+y8xkiqADZqPg4+6tz6lBc6XsOfsbuWc5Lm3/rK5M2wAyj6Cm 0JwOLyRxOzVKwhin1btWiVJH2ob3z7Iavjjitlz6TTepsSvWaHoOz56ZpU6sTf5pxNOU ZjyLODF0ESKfPj9e5TBuzUJvK4aVkcbVw+7Rlo4NBtctzXEmRhtDkQ4NBhJc+3FbZqzp 9f0kycodzh60JlcCULHh4/sbb0MgJA3WRftylP7bCQ0SLN+EU0R0UefTHSLn5zdxU2lh mFZg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=c9qN0For+IWUpJsBmjRmsG1npBdWwQudk31Dx5dwrLY=; fh=mZk9AfRmPBMGW9h158yccPeJgZmEjzU2tMQtLZcF184=; b=joPDl4kThYUpqyUEAauUlvGG5qK2bd5pa60hklYfI+OR726Nxr/IzEjBig7sb3gcw6 XpXAqZUH9ql2nUFegGBlcRkRjxeHlLfUX2n5d9TO4+lAgkGR1FfQc6PRYyeiGmBJmO6j mz0ecm9XOwFChD/YR7AdNpq2UY9j3dwolDi9V+u4Fp1Il1/tj2jZhc5QPBO1ZnfQg/9U BS+T7QONXfVVrOkpYO7z14xcxo4CTjXJWaP/23YSW5KVvfiqn4tI676v2hB0OQdNiMiP B/Oj/dhN0O1C0Lpl2IPM63oYoFoBvWsFlMV3klLklcAa7lRJsYeCmYSceSPyoWcDTJzd bomw==; 
dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=YGeFD9Lu; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 4fb4d7f45d1cf-5c8e05eff56si304759a12.350.2024.10.04.13.29.03; Fri, 04 Oct 2024 13:29:04 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=YGeFD9Lu; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 309C268DAEC; Fri, 4 Oct 2024 17:31:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pj1-f49.google.com (mail-pj1-f49.google.com [209.85.216.49]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 2483468DAF0 for ; Fri, 4 Oct 2024 17:31:42 +0300 (EEST) Received: by mail-pj1-f49.google.com with SMTP id 98e67ed59e1d1-2e053f42932so1638326a91.0 for ; Fri, 04 Oct 2024 07:31:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728052300; x=1728657100; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; 
bh=yIOCY6GzLu4GsW/KM2zcCkt02keBpevgorLHbZPHBfk=; b=YGeFD9Lu1P9vFHG7qtXEseTLTypfqEWfxsdY3dxRMJicYVJVNYkRanUA5QyxQYHSir gO/FFwFELlI+2YmCs9Mo2p42BRXVKkXiDMWVbT8zTPiaRlCmNjbOfqEk5Bl68IvbqiH1 nQ604UaaDJOO2ru9ZIuQrL7idFMB2aVW3DVOOAcQi0ZRxSyuR0LidiTJs3wbrxMDgT+j Po0JFGq77PJa+zfNWMrwHmh4miZoKSOBraETfgmuGkODsGdtYRSSjek9coXAESEmMvpm cLKMabfqAQ7HyqWnWS0IVkLT1uJ4pDRDI8S0AptyjpsFHfcNQGQn+WTs2y51XU4n87KR 6OlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728052300; x=1728657100; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yIOCY6GzLu4GsW/KM2zcCkt02keBpevgorLHbZPHBfk=; b=QOaHi+VcYkE+uLN1ufXE1ya/mo+nhYyqLtr+cxp9Sok4AEUfb7dQXi7Ltijd/J8CMK qtyjnDM/109CXRxKQAMdIU4DwR7WX6lJsXimmnQQiRlJDYvyzarAYaZI10YfDJXqGi6y QiOwIF/L3UJxJQ00QVU3VyVrT9MR9vE8k9+p+MjzIcQHnOb7d06nZ4eZ1njTKSgWYkrV yyx8AOjRNFAwzP2Q1J8otftI6hJN954ESUhp3I7dyU7QfMYly1ZRm8s2d+go1CbqRLeO 99Wbcuc54ogzldEM+HPYwMHwZJr6dV+BDQsjLSyO8k9VfktxrwVEN+wNm3qrVWHyTnK7 DgPA== X-Gm-Message-State: AOJu0Yw07bqakr2V2DZ0uxZssQxShn1ZPuhj6Bjh8iCLxwD6X6H3taJn K8jhvqXIs91uQSQlOXVWHV2O974i+oygKS0QlBq+97vNGtzWSKombzc3P5+D X-Received: by 2002:a17:90b:1185:b0:2d3:d066:f58b with SMTP id 98e67ed59e1d1-2e1e62259c0mr3560032a91.12.1728052300066; Fri, 04 Oct 2024 07:31:40 -0700 (PDT) Received: from localhost ([112.64.8.17]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e1e85da490sm1670313a91.32.2024.10.04.07.31.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Oct 2024 07:31:39 -0700 (PDT) From: Nuo Mi To: ffmpeg-devel@ffmpeg.org Date: Fri, 4 Oct 2024 22:31:15 +0800 Message-Id: <20241004143115.382070-4-nuomi2021@gmail.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241004143115.382070-1-nuomi2021@gmail.com> References: <20241004143115.382070-1-nuomi2021@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 4/4] avcodec/vvcdec: remove unused tb_pos_x0 and 
tb_pos_y0 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Nuo Mi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: sJAylkZMMciv This change will save approximately 531 MB for an 8K clip when processed with 16 threads. The calculation is as follows: 7680 * 4320 * sizeof(int) * 2 * 2 * 16 / (4 * 4). --- libavcodec/vvc/ctu.c | 10 +++------- libavcodec/vvc/dec.c | 2 -- libavcodec/vvc/dec.h | 2 -- 3 files changed, 3 insertions(+), 11 deletions(-) diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c index e49976c66b..1e06119cfd 100644 --- a/libavcodec/vvc/ctu.c +++ b/libavcodec/vvc/ctu.c @@ -38,7 +38,7 @@ typedef enum VVCModeType { MODE_TYPE_INTRA, } VVCModeType; -static void set_tb_pos(const VVCFrameContext *fc, const TransformBlock *tb) +static void set_tb_size(const VVCFrameContext *fc, const TransformBlock *tb) { const int x_tb = tb->x0 >> MIN_TU_LOG2; const int y_tb = tb->y0 >> MIN_TU_LOG2; @@ -50,10 +50,6 @@ static void set_tb_pos(const VVCFrameContext *fc, const TransformBlock *tb) for (int y = y_tb; y < end; y++) { const int off = y * fc->ps.pps->min_tu_width + x_tb; - for (int i = 0; i < width; i++) { - fc->tab.tb_pos_x0[is_chroma][off + i] = tb->x0; - fc->tab.tb_pos_y0[is_chroma][off + i] = tb->y0; - } memset(fc->tab.tb_width [is_chroma] + off, tb->tb_width, width); memset(fc->tab.tb_height[is_chroma] + off, tb->tb_height, width); } @@ -397,7 +393,7 @@ static int hls_transform_unit(VVCLocalContext *lc, int x0, int y0,int tu_width, set_tb_tab(fc->tab.tu_coded_flag[tb->c_idx], tu->coded_flag[tb->c_idx], fc, tb); } if (tb->c_idx != CR) - set_tb_pos(fc, tb); + set_tb_size(fc, tb); if (tb->c_idx == CB) set_tb_tab(fc->tab.tu_joint_cbcr_residual_flag, tu->joint_cbcr_residual_flag, fc, tb); } @@ -514,7 +510,7 @@ static int 
skipped_transform_tree(VVCLocalContext *lc, int x0, int y0,int tu_wid for (int i = c_start; i < c_end; i++) { TransformBlock *tb = add_tb(tu, lc, x0, y0, tu_width >> sps->hshift[i], tu_height >> sps->vshift[i], i); if (i != CR) - set_tb_pos(fc, tb); + set_tb_size(fc, tb); } } diff --git a/libavcodec/vvc/dec.c b/libavcodec/vvc/dec.c index 13ca752eec..9b7afe4c38 100644 --- a/libavcodec/vvc/dec.c +++ b/libavcodec/vvc/dec.c @@ -207,8 +207,6 @@ static void min_tu_nz_tl_init(TabList *l, VVCFrameContext *fc) tl_init(l, 0, changed); for (int i = LUMA; i <= CHROMA; i++) { - TL_ADD(tb_pos_x0[i], pic_size_in_min_tu); - TL_ADD(tb_pos_y0[i], pic_size_in_min_tu); TL_ADD(tb_width[i], pic_size_in_min_tu); TL_ADD(tb_height[i], pic_size_in_min_tu); } diff --git a/libavcodec/vvc/dec.h b/libavcodec/vvc/dec.h index 159c60942b..7254b515fd 100644 --- a/libavcodec/vvc/dec.h +++ b/libavcodec/vvc/dec.h @@ -172,8 +172,6 @@ typedef struct VVCFrameContext { uint8_t *tu_coded_flag[VVC_MAX_SAMPLE_ARRAYS]; ///< tu_y_coded_flag[][], tu_cb_coded_flag[][], tu_cr_coded_flag[][] uint8_t *tu_joint_cbcr_residual_flag; ///< tu_joint_cbcr_residual_flag[][] - int *tb_pos_x0[2]; - int *tb_pos_y0[2]; uint8_t *tb_width[2]; uint8_t *tb_height[2]; uint8_t *pcmf[2];
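Note on the estimate in the commit message of patch 4/4: the figure of roughly 531 MB follows from 7680 * 4320 * sizeof(int) * 2 * 2 * 16 / (4 * 4). The sketch below reproduces that arithmetic. It is not part of the patch. The helper name tb_pos_table_bytes is hypothetical, and the reading of each factor is an assumption drawn from the diff: two tables (tb_pos_x0 and tb_pos_y0), two entries per table (the [2] array dimension, LUMA and CHROMA), one copy per frame context with 16 contexts assumed for 16 threads, and one entry per 4x4 minimum transform unit. It also assumes sizeof(int) is 4, as on common platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one table entry per 4x4 minimum transform unit,
 * matching the "/ (4 * 4)" term in the commit message. */
#define MIN_TU_SIZE_ASSUMED 4

/* Hypothetical helper, not FFmpeg code: bytes occupied by the removed
 * tb_pos_x0[2] and tb_pos_y0[2] tables across all frame contexts. */
static uint64_t tb_pos_table_bytes(int width, int height, int frame_contexts)
{
    assert(width > 0 && height > 0 && frame_contexts > 0);

    /* Number of minimum-TU entries per table, as in the commit message. */
    const uint64_t entries = (uint64_t)width * (uint64_t)height /
                             (MIN_TU_SIZE_ASSUMED * MIN_TU_SIZE_ASSUMED);
    const uint64_t tables     = 2; /* tb_pos_x0 and tb_pos_y0 */
    const uint64_t components = 2; /* the [2] dimension: LUMA and CHROMA */

    return entries * sizeof(int) * tables * components * (uint64_t)frame_contexts;
}
```

With these assumptions, tb_pos_table_bytes(7680, 4320, 16) evaluates to 530841600 bytes, which is the approximately 531 MB (decimal megabytes) quoted in the commit message. A single frame context accounts for 33177600 bytes of that total.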