From patchwork Mon May 13 15:42:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Tomas_H=C3=A4rdin?=
X-Patchwork-Id: 48852
From: Tomas =?iso-8859-1?q?H=E4rdin?=
To: FFmpeg development discussions and patches
Date: Mon, 13 May 2024 17:42:54 +0200
Subject: [FFmpeg-devel] [PATCH 1/2] lavc/speedhqdec: Add AV_CODEC_CAP_SLICE_THREADS

On a 36 core Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz

command: /usr/bin/time ./ffmpeg -t 30 -thread_type slice -threads $THREADS -i $INPUT.mov -vcodec rawvideo -f null -

before:
frame= 1500 fps=160 q=-0.0 Lsize=N/A time=00:00:30.00 bitrate=N/A speed= 3.2x
10.54user 0.37system 0:09.40elapsed 116%CPU (0avgtext+0avgdata 175300maxresident)k

-thread_type slice -threads 1
frame= 1500 fps=161 q=-0.0 Lsize=N/A time=00:00:30.00 bitrate=N/A speed=3.22x
10.57user 0.29system 0:09.34elapsed 116%CPU (0avgtext+0avgdata 175580maxresident)k

-thread_type slice -threads 2
frame= 1500 fps=318 q=-0.0 Lsize=N/A time=00:00:30.00 bitrate=N/A speed=6.36x
10.53user 0.39system 0:04.73elapsed 230%CPU (0avgtext+0avgdata 175632maxresident)k

-thread_type slice -threads 4
frame= 1500 fps=615 q=-0.0 Lsize=N/A time=00:00:30.00 bitrate=N/A speed=12.3x
10.58user 0.34system 0:02.46elapsed 444%CPU (0avgtext+0avgdata 175452maxresident)k

-thread_type slice -threads 8
frame= 1500 fps=613 q=-0.0 Lsize=N/A time=00:00:30.00 bitrate=N/A speed=12.3x
10.60user 0.33system 0:02.46elapsed 443%CPU (0avgtext+0avgdata 180532maxresident)k
^ same as -threads 4 as we'd expect for progressive essence

I don't have any interlaced samples at the moment, and speedhqenc can't
make any. I also noticed speedhqenc produces broken output when
width % 16 == 8. Will file a ticket on that tomorrow.

/Tomas

From 29a0380a1537ba205ec91399512f676301d5e930 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?=
Date: Mon, 13 May 2024 16:36:31 +0200
Subject: [PATCH 1/2] lavc/speedhqdec: Add AV_CODEC_CAP_SLICE_THREADS

Each field slice is assigned to one thread.
Serial performance is unaffected.
---
 libavcodec/speedhqdec.c | 59 ++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/libavcodec/speedhqdec.c b/libavcodec/speedhqdec.c
index d6b1fff7a5..77a159f7e5 100644
--- a/libavcodec/speedhqdec.c
+++ b/libavcodec/speedhqdec.c
@@ -58,6 +58,8 @@ typedef struct SHQContext {
     enum { SHQ_SUBSAMPLING_420, SHQ_SUBSAMPLING_422, SHQ_SUBSAMPLING_444 }
         subsampling;
     enum { SHQ_NO_ALPHA, SHQ_RLE_ALPHA, SHQ_DCT_ALPHA } alpha_type;
+    AVPacket *avpkt;
+    uint32_t second_field_offset;
 } SHQContext;

 /* NOTE: The first element is always 16, unscaled. */
@@ -266,9 +268,10 @@ static int decode_speedhq_border(const SHQContext *s, GetBitContext *gb, AVFrame
     return 0;
 }

-static int decode_speedhq_field(const SHQContext *s, const uint8_t *buf, int buf_size, AVFrame *frame, int field_number, int start, int end, int line_stride)
+static int decode_speedhq_field(const SHQContext *s, const uint8_t *buf, int buf_size, AVFrame *frame, int field_number, int start, int end, int line_stride, int slice_number)
 {
-    int ret, slice_number, slice_offsets[5];
+    int ret, x, y, slice_offsets[5];
+    uint32_t slice_begin, slice_end;
     int linesize_y = frame->linesize[0] * line_stride;
     int linesize_cb = frame->linesize[1] * line_stride;
     int linesize_cr = frame->linesize[2] * line_stride;
@@ -283,21 +286,17 @@ static int decode_speedhq_field(const SHQContext *s, const uint8_t *buf, int buf

     slice_offsets[0] = start;
     slice_offsets[4] = end;
-    for (slice_number = 1; slice_number < 4; slice_number++) {
+    for (x = 1; x < 4; x++) {
         uint32_t last_offset, slice_len;

-        last_offset = slice_offsets[slice_number - 1];
+        last_offset = slice_offsets[x - 1];
         slice_len = AV_RL24(buf + last_offset);
-        slice_offsets[slice_number] = last_offset + slice_len;
+        slice_offsets[x] = last_offset + slice_len;

-        if (slice_len < 3 || slice_offsets[slice_number] > end - 3)
+        if (slice_len < 3 || slice_offsets[x] > end - 3)
             return AVERROR_INVALIDDATA;
     }

-    for (slice_number = 0; slice_number < 4; slice_number++) {
-        uint32_t slice_begin, slice_end;
-        int x, y;
-
         slice_begin = slice_offsets[slice_number];
         slice_end = slice_offsets[slice_number + 1];

@@ -390,14 +389,34 @@ static int decode_speedhq_field(const SHQContext *s, const uint8_t *buf, int buf
                 }
             }
         }
-    }

-    if (s->subsampling != SHQ_SUBSAMPLING_444 && (frame->width & 15))
+    if (s->subsampling != SHQ_SUBSAMPLING_444 && (frame->width & 15) && slice_number == 3)
         return decode_speedhq_border(s, &gb, frame, field_number, line_stride);

     return 0;
 }

+static int decode_slice_progressive(AVCodecContext *avctx, void *arg, int jobnr, int threadnr)
+{
+    SHQContext *s = avctx->priv_data;
+    (void)threadnr;
+
+    return decode_speedhq_field(avctx->priv_data, s->avpkt->data, s->avpkt->size, arg, 0, 4, s->avpkt->size, 1, jobnr);
+}
+
+static int decode_slice_interlaced(AVCodecContext *avctx, void *arg, int jobnr, int threadnr)
+{
+    SHQContext *s = avctx->priv_data;
+    int field_number = jobnr / 4;
+    int slice_number = jobnr % 4;
+    (void)threadnr;
+
+    if (field_number == 0)
+        return decode_speedhq_field(avctx->priv_data, s->avpkt->data, s->avpkt->size, arg, 0, 4, s->second_field_offset, 2, slice_number);
+    else
+        return decode_speedhq_field(avctx->priv_data, s->avpkt->data, s->avpkt->size, arg, 1, s->second_field_offset, s->avpkt->size, 2, slice_number);
+}
+
 static void compute_quant_matrix(int *output, int qscale)
 {
     int i;
@@ -411,7 +430,6 @@ static int speedhq_decode_frame(AVCodecContext *avctx, AVFrame *frame,
     const uint8_t *buf = avpkt->data;
     int buf_size = avpkt->size;
     uint8_t quality;
-    uint32_t second_field_offset;
     int ret;

     if (buf_size < 4 || avctx->width < 8 || avctx->width % 8 != 0)
@@ -429,8 +447,8 @@ static int speedhq_decode_frame(AVCodecContext *avctx, AVFrame *frame,

     compute_quant_matrix(s->quant_matrix, 100 - quality);

-    second_field_offset = AV_RL24(buf + 1);
-    if (second_field_offset >= buf_size - 3) {
+    s->second_field_offset = AV_RL24(buf + 1);
+    if (s->second_field_offset >= buf_size - 3) {
         return AVERROR_INVALIDDATA;
     }

@@ -441,8 +459,9 @@ static int speedhq_decode_frame(AVCodecContext *avctx, AVFrame *frame,
         return ret;
     }
     frame->flags |= AV_FRAME_FLAG_KEY;
+    s->avpkt = avpkt;

-    if (second_field_offset == 4 || second_field_offset == (buf_size-4)) {
+    if (s->second_field_offset == 4 || s->second_field_offset == (buf_size-4)) {
         /*
          * Overlapping first and second fields is used to signal
          * encoding only a single field. In this case, "height"
@@ -452,12 +471,10 @@ static int speedhq_decode_frame(AVCodecContext *avctx, AVFrame *frame,
          * but this matches the convention used in NDI, which is
          * the primary user of this trick.
          */
-        if ((ret = decode_speedhq_field(s, buf, buf_size, frame, 0, 4, buf_size, 1)) < 0)
+        if ((ret = avctx->execute2(avctx, decode_slice_progressive, frame, NULL, 4)) < 0)
             return ret;
     } else {
-        if ((ret = decode_speedhq_field(s, buf, buf_size, frame, 0, 4, second_field_offset, 2)) < 0)
-            return ret;
-        if ((ret = decode_speedhq_field(s, buf, buf_size, frame, 1, second_field_offset, buf_size, 2)) < 0)
+        if ((ret = avctx->execute2(avctx, decode_slice_interlaced, frame, NULL, 8)) < 0)
             return ret;
     }

@@ -653,5 +670,5 @@ const FFCodec ff_speedhq_decoder = {
     .priv_data_size = sizeof(SHQContext),
     .init = speedhq_decode_init,
     FF_CODEC_DECODE_CB(speedhq_decode_frame),
-    .p.capabilities = AV_CODEC_CAP_DR1 | AV_CODEC_CAP_FRAME_THREADS,
+    .p.capabilities = AV_CODEC_CAP_DR1 | AV_CODEC_CAP_FRAME_THREADS | AV_CODEC_CAP_SLICE_THREADS,
 };
-- 
2.39.2

From patchwork Mon May 13 15:43:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Tomas_H=C3=A4rdin?=
X-Patchwork-Id: 48853
From: Tomas =?iso-8859-1?q?H=E4rdin?=
To: FFmpeg development discussions and patches
Date: Mon, 13 May 2024 17:43:28 +0200
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/speedhqdec: Reindent

From 17aceef1c1a1bb25d651610cd52bc94dbdf20e0d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?=
Date: Mon, 13 May 2024 17:01:28
 +0200
Subject: [PATCH 2/2] lavc/speedhqdec: Reindent

---
 libavcodec/speedhqdec.c | 152 ++++++++++++++++++++--------------------
 1 file changed, 76 insertions(+), 76 deletions(-)

diff --git a/libavcodec/speedhqdec.c b/libavcodec/speedhqdec.c
index 77a159f7e5..06ae0a7a85 100644
--- a/libavcodec/speedhqdec.c
+++ b/libavcodec/speedhqdec.c
@@ -297,98 +297,98 @@ static int decode_speedhq_field(const SHQContext *s, const uint8_t *buf, int buf
             return AVERROR_INVALIDDATA;
     }

-        slice_begin = slice_offsets[slice_number];
-        slice_end = slice_offsets[slice_number + 1];
+    slice_begin = slice_offsets[slice_number];
+    slice_end = slice_offsets[slice_number + 1];

-        if ((ret = init_get_bits8(&gb, buf + slice_begin + 3, slice_end - slice_begin - 3)) < 0)
-            return ret;
+    if ((ret = init_get_bits8(&gb, buf + slice_begin + 3, slice_end - slice_begin - 3)) < 0)
+        return ret;

-        for (y = slice_number * 16 * line_stride; y < frame->height; y += line_stride * 64) {
-            uint8_t *dest_y, *dest_cb, *dest_cr, *dest_a;
-            int last_dc[4] = { 1024, 1024, 1024, 1024 };
-            uint8_t last_alpha[16];
+    for (y = slice_number * 16 * line_stride; y < frame->height; y += line_stride * 64) {
+        uint8_t *dest_y, *dest_cb, *dest_cr, *dest_a;
+        int last_dc[4] = { 1024, 1024, 1024, 1024 };
+        uint8_t last_alpha[16];

-            memset(last_alpha, 255, sizeof(last_alpha));
+        memset(last_alpha, 255, sizeof(last_alpha));

-            dest_y = frame->data[0] + frame->linesize[0] * (y + field_number);
-            if (s->subsampling == SHQ_SUBSAMPLING_420) {
-                dest_cb = frame->data[1] + frame->linesize[1] * (y/2 + field_number);
-                dest_cr = frame->data[2] + frame->linesize[2] * (y/2 + field_number);
-            } else {
-                dest_cb = frame->data[1] + frame->linesize[1] * (y + field_number);
-                dest_cr = frame->data[2] + frame->linesize[2] * (y + field_number);
-            }
-            if (s->alpha_type != SHQ_NO_ALPHA) {
-                dest_a = frame->data[3] + frame->linesize[3] * (y + field_number);
-            }
+        dest_y = frame->data[0] + frame->linesize[0] * (y + field_number);
+        if (s->subsampling == SHQ_SUBSAMPLING_420) {
+            dest_cb = frame->data[1] + frame->linesize[1] * (y/2 + field_number);
+            dest_cr = frame->data[2] + frame->linesize[2] * (y/2 + field_number);
+        } else {
+            dest_cb = frame->data[1] + frame->linesize[1] * (y + field_number);
+            dest_cr = frame->data[2] + frame->linesize[2] * (y + field_number);
+        }
+        if (s->alpha_type != SHQ_NO_ALPHA) {
+            dest_a = frame->data[3] + frame->linesize[3] * (y + field_number);
+        }

-            for (x = 0; x < frame->width - 8 * (s->subsampling != SHQ_SUBSAMPLING_444); x += 16) {
-                /* Decode the four luma blocks. */
-                if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y, linesize_y)) < 0)
-                    return ret;
-                if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y + 8, linesize_y)) < 0)
-                    return ret;
-                if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y + 8 * linesize_y, linesize_y)) < 0)
-                    return ret;
-                if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y + 8 * linesize_y + 8, linesize_y)) < 0)
-                    return ret;
+        for (x = 0; x < frame->width - 8 * (s->subsampling != SHQ_SUBSAMPLING_444); x += 16) {
+            /* Decode the four luma blocks. */
+            if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y, linesize_y)) < 0)
+                return ret;
+            if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y + 8, linesize_y)) < 0)
+                return ret;
+            if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y + 8 * linesize_y, linesize_y)) < 0)
+                return ret;
+            if ((ret = decode_dct_block(s, &gb, last_dc, 0, dest_y + 8 * linesize_y + 8, linesize_y)) < 0)
+                return ret;
+
+            /*
+             * Decode the first chroma block. For 4:2:0, this is the only one;
+             * for 4:2:2, it's the top block; for 4:4:4, it's the top-left block.
+             */
+            if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb, linesize_cb)) < 0)
+                return ret;
+            if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr, linesize_cr)) < 0)
+                return ret;

-                /*
-                 * Decode the first chroma block. For 4:2:0, this is the only one;
-                 * for 4:2:2, it's the top block; for 4:4:4, it's the top-left block.
-                 */
-                if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb, linesize_cb)) < 0)
+            if (s->subsampling != SHQ_SUBSAMPLING_420) {
+                /* For 4:2:2, this is the bottom block; for 4:4:4, it's the bottom-left block. */
+                if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb + 8 * linesize_cb, linesize_cb)) < 0)
                     return ret;
-                if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr, linesize_cr)) < 0)
+                if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr + 8 * linesize_cr, linesize_cr)) < 0)
                     return ret;

-                if (s->subsampling != SHQ_SUBSAMPLING_420) {
-                    /* For 4:2:2, this is the bottom block; for 4:4:4, it's the bottom-left block. */
-                    if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb + 8 * linesize_cb, linesize_cb)) < 0)
-                        return ret;
-                    if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr + 8 * linesize_cr, linesize_cr)) < 0)
-                        return ret;
-
-                    if (s->subsampling == SHQ_SUBSAMPLING_444) {
-                        /* Top-right and bottom-right blocks. */
-                        if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb + 8, linesize_cb)) < 0)
-                            return ret;
-                        if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr + 8, linesize_cr)) < 0)
-                            return ret;
-                        if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb + 8 * linesize_cb + 8, linesize_cb)) < 0)
-                            return ret;
-                        if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr + 8 * linesize_cr + 8, linesize_cr)) < 0)
-                            return ret;
-
-                        dest_cb += 8;
-                        dest_cr += 8;
-                    }
-                }
-                dest_y += 16;
-                dest_cb += 8;
-                dest_cr += 8;
-
-                if (s->alpha_type == SHQ_RLE_ALPHA) {
-                    /* Alpha coded using 16x8 RLE blocks. */
-                    if ((ret = decode_alpha_block(s, &gb, last_alpha, dest_a, linesize_a)) < 0)
-                        return ret;
-                    if ((ret = decode_alpha_block(s, &gb, last_alpha, dest_a + 8 * linesize_a, linesize_a)) < 0)
-                        return ret;
-                    dest_a += 16;
-                } else if (s->alpha_type == SHQ_DCT_ALPHA) {
-                    /* Alpha encoded exactly like luma. */
-                    if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a, linesize_a)) < 0)
+                if (s->subsampling == SHQ_SUBSAMPLING_444) {
+                    /* Top-right and bottom-right blocks. */
+                    if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb + 8, linesize_cb)) < 0)
                         return ret;
-                    if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a + 8, linesize_a)) < 0)
+                    if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr + 8, linesize_cr)) < 0)
                         return ret;
-                    if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a + 8 * linesize_a, linesize_a)) < 0)
+                    if ((ret = decode_dct_block(s, &gb, last_dc, 1, dest_cb + 8 * linesize_cb + 8, linesize_cb)) < 0)
                         return ret;
-                    if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a + 8 * linesize_a + 8, linesize_a)) < 0)
+                    if ((ret = decode_dct_block(s, &gb, last_dc, 2, dest_cr + 8 * linesize_cr + 8, linesize_cr)) < 0)
                         return ret;
-                    dest_a += 16;
+
+                    dest_cb += 8;
+                    dest_cr += 8;
                 }
             }
+            dest_y += 16;
+            dest_cb += 8;
+            dest_cr += 8;
+
+            if (s->alpha_type == SHQ_RLE_ALPHA) {
+                /* Alpha coded using 16x8 RLE blocks. */
+                if ((ret = decode_alpha_block(s, &gb, last_alpha, dest_a, linesize_a)) < 0)
+                    return ret;
+                if ((ret = decode_alpha_block(s, &gb, last_alpha, dest_a + 8 * linesize_a, linesize_a)) < 0)
+                    return ret;
+                dest_a += 16;
+            } else if (s->alpha_type == SHQ_DCT_ALPHA) {
+                /* Alpha encoded exactly like luma. */
+                if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a, linesize_a)) < 0)
+                    return ret;
+                if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a + 8, linesize_a)) < 0)
+                    return ret;
+                if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a + 8 * linesize_a, linesize_a)) < 0)
+                    return ret;
+                if ((ret = decode_dct_block(s, &gb, last_dc, 3, dest_a + 8 * linesize_a + 8, linesize_a)) < 0)
+                    return ret;
+                dest_a += 16;
+            }
         }
+    }

     if (s->subsampling != SHQ_SUBSAMPLING_444 && (frame->width & 15) && slice_number == 3)
         return decode_speedhq_border(s, &gb, frame, field_number, line_stride);
-- 
2.39.2
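
For readers following the threading change in patch 1/2, here is a minimal standalone sketch of the job arithmetic. It is not part of either patch and does not use any FFmpeg API. The helper names job_to_field_slice and slice_rows are hypothetical and exist only for illustration. The arithmetic itself (jobnr / 4, jobnr % 4, rows starting at slice_number * 16 * line_stride and stepping by line_stride * 64) is copied from the diff above.

```c
#include <assert.h>

/*
 * Hypothetical helper (illustration only): split a job number handed to the
 * execute2() callback into a field and a slice, the way
 * decode_slice_interlaced() does in patch 1/2. Each field has 4 slices,
 * so jobs 0..3 belong to field 0 and jobs 4..7 to field 1.
 */
static inline void job_to_field_slice(int jobnr, int *field_number, int *slice_number)
{
    assert(jobnr >= 0);
    *field_number = jobnr / 4;
    *slice_number = jobnr % 4;
}

/*
 * Hypothetical helper (illustration only): count how many 16-line macroblock
 * rows one slice visits. The loop bounds mirror the y loop in
 * decode_speedhq_field(): slice k starts at row k * 16 * line_stride and
 * then skips ahead by 64 * line_stride, so the four slices interleave.
 */
static inline int slice_rows(int slice_number, int height, int line_stride)
{
    int rows = 0;

    assert(line_stride > 0);
    for (int y = slice_number * 16 * line_stride; y < height; y += line_stride * 64)
        rows++;
    return rows;
}
```

Under these assumptions, a progressive 1080-line frame (line_stride 1) gives each of the four slices 17 rows. Since the work is split four ways per field, this may be why the posted numbers stop improving between -threads 4 and -threads 8 for progressive input.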