From patchwork Mon Apr 23 18:58:31 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jerome Borsboom
X-Patchwork-Id: 8607
Delivered-To: ffmpegpatchwork@gmail.com
X-Original-To: ffmpeg-devel@ffmpeg.org
To: ffmpeg-devel@ffmpeg.org
From: Jerome Borsboom
Message-ID:
 <1f0e6dc4-af87-dee9-b708-fddb2e552e5a@carpalis.nl>
Date: Mon, 23 Apr 2018 20:58:31 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
MIME-Version: 1.0
Content-Language: nl
Subject: [FFmpeg-devel] [PATCH 01/14] avcodec/vc1: re-implement and expand VC-1 overlap smoothing
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

The existing implementation did overlap smoothing for progressive frames only.
This rewritten version implements overlap smoothing for all applicable frame
types for both progressive and frame/field-interlace.

Signed-off-by: Jerome Borsboom
---
This patch-set improves the VC-1 software decoder to the point where the fate
checksums are equal to checksums from the Intel hardware decoded image on Haswell.
 libavcodec/vc1.h            |  2 +
 libavcodec/vc1_loopfilter.c | 94 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 96 insertions(+)

diff --git a/libavcodec/vc1.h b/libavcodec/vc1.h
index 8fc0729cb8..85504c2f9f 100644
--- a/libavcodec/vc1.h
+++ b/libavcodec/vc1.h
@@ -425,6 +425,8 @@ void ff_vc1_decode_blocks(VC1Context *v);
 void ff_vc1_loop_filter_iblk(VC1Context *v, int pq);
 void ff_vc1_loop_filter_iblk_delayed(VC1Context *v, int pq);
 void ff_vc1_smooth_overlap_filter_iblk(VC1Context *v);
+void ff_vc1_i_overlap_filter(VC1Context *v);
+void ff_vc1_p_overlap_filter(VC1Context *v);
 void ff_vc1_apply_p_loop_filter(VC1Context *v);
 
 void ff_vc1_mc_1mv(VC1Context *v, int dir);
diff --git a/libavcodec/vc1_loopfilter.c b/libavcodec/vc1_loopfilter.c
index 025776bac9..3122b1a258 100644
--- a/libavcodec/vc1_loopfilter.c
+++ b/libavcodec/vc1_loopfilter.c
@@ -208,6 +208,100 @@ void ff_vc1_smooth_overlap_filter_iblk(VC1Context *v)
     }
 }
 
+static av_always_inline void vc1_h_overlap_filter(VC1Context *v, int16_t (*left_block)[64],
+                                                  int16_t (*right_block)[64], int block_num)
+{
+    if (left_block != right_block || (block_num & 5) == 1) {
+        if (block_num > 3)
+            v->vc1dsp.vc1_h_s_overlap(left_block[block_num], right_block[block_num]);
+        else if (block_num & 1)
+            v->vc1dsp.vc1_h_s_overlap(right_block[block_num - 1], right_block[block_num]);
+        else
+            v->vc1dsp.vc1_h_s_overlap(left_block[block_num + 1], right_block[block_num]);
+    }
+}
+
+static av_always_inline void vc1_v_overlap_filter(VC1Context *v, int16_t (*top_block)[64],
+                                                  int16_t (*bottom_block)[64], int block_num)
+{
+    if (top_block != bottom_block || block_num & 2) {
+        if (block_num > 3)
+            v->vc1dsp.vc1_v_s_overlap(top_block[block_num], bottom_block[block_num]);
+        else if (block_num & 2)
+            v->vc1dsp.vc1_v_s_overlap(bottom_block[block_num - 2], bottom_block[block_num]);
+        else
+            v->vc1dsp.vc1_v_s_overlap(top_block[block_num + 2], bottom_block[block_num]);
+    }
+}
+
+void ff_vc1_i_overlap_filter(VC1Context *v)
+{
+    MpegEncContext *s = &v->s;
+    int16_t (*topleft_blk)[64], (*top_blk)[64], (*left_blk)[64], (*cur_blk)[64];
+    int block_count = CONFIG_GRAY && (s->avctx->flags & AV_CODEC_FLAG_GRAY) ? 4 : 6;
+    int mb_pos = s->mb_x + s->mb_y * s->mb_stride;
+    int i;
+
+    topleft_blk = v->block[v->topleft_blk_idx];
+    top_blk     = v->block[v->top_blk_idx];
+    left_blk    = v->block[v->left_blk_idx];
+    cur_blk     = v->block[v->cur_blk_idx];
+
+    /* Within a MB, the horizontal overlap always runs before the vertical.
+     * To accomplish that, we run the H on the left and internal vertical
+     * borders of the currently decoded MB. Then, we wait for the next overlap
+     * iteration to do H overlap on the right edge of this MB, before moving
+     * over and running the V overlap on the top and internal horizontal
+     * borders. Therefore, the H overlap trails by one MB col and the
+     * V overlap trails by one MB row. This is reflected in the time at which
+     * we run the put_pixels loop, i.e. delayed by one row and one column. */
+    for (i = 0; i < block_count; i++)
+        if (v->pq >= 9 || v->condover == CONDOVER_ALL ||
+            (v->over_flags_plane[mb_pos] && ((i & 5) == 1 || v->over_flags_plane[mb_pos - 1])))
+            vc1_h_overlap_filter(v, s->mb_x ? left_blk : cur_blk, cur_blk, i);
+
+    if (v->fcm != ILACE_FRAME)
+        for (i = 0; i < block_count; i++) {
+            if (s->mb_x && (v->pq >= 9 || v->condover == CONDOVER_ALL ||
+                            (v->over_flags_plane[mb_pos - 1] &&
+                             ((i & 2) || v->over_flags_plane[mb_pos - 1 - s->mb_stride]))))
+                vc1_v_overlap_filter(v, s->first_slice_line ? left_blk : topleft_blk, left_blk, i);
+            if (s->mb_x == s->mb_width - 1)
+                if (v->pq >= 9 || v->condover == CONDOVER_ALL ||
+                    (v->over_flags_plane[mb_pos] &&
+                     ((i & 2) || v->over_flags_plane[mb_pos - s->mb_stride])))
+                    vc1_v_overlap_filter(v, s->first_slice_line ? cur_blk : top_blk, cur_blk, i);
+        }
+}
+
+void ff_vc1_p_overlap_filter(VC1Context *v)
+{
+    MpegEncContext *s = &v->s;
+    int16_t (*topleft_blk)[64], (*top_blk)[64], (*left_blk)[64], (*cur_blk)[64];
+    int block_count = CONFIG_GRAY && (s->avctx->flags & AV_CODEC_FLAG_GRAY) ? 4 : 6;
+    int i;
+
+    topleft_blk = v->block[v->topleft_blk_idx];
+    top_blk     = v->block[v->top_blk_idx];
+    left_blk    = v->block[v->left_blk_idx];
+    cur_blk     = v->block[v->cur_blk_idx];
+
+    for (i = 0; i < block_count; i++)
+        if (v->mb_type[0][s->block_index[i]] && (s->mb_x == 0 || v->mb_type[0][s->block_index[i] - 1]))
+            vc1_h_overlap_filter(v, s->mb_x ? left_blk : cur_blk, cur_blk, i);
+
+    if (v->fcm != ILACE_FRAME)
+        for (i = 0; i < block_count; i++) {
+            if (s->mb_x && v->mb_type[0][s->block_index[i] - 1] &&
+                (s->first_slice_line || v->mb_type[0][s->block_index[i] - s->block_wrap[i] - 1]))
+                vc1_v_overlap_filter(v, s->first_slice_line ? left_blk : topleft_blk, left_blk, i);
+            if (s->mb_x == s->mb_width - 1)
+                if (v->mb_type[0][s->block_index[i]] &&
+                    (s->first_slice_line || v->mb_type[0][s->block_index[i] - s->block_wrap[i]]))
+                    vc1_v_overlap_filter(v, s->first_slice_line ? cur_blk : top_blk, cur_blk, i);
+        }
+}
+
 static av_always_inline void vc1_apply_p_v_loop_filter(VC1Context *v, int block_num)
 {
     MpegEncContext *s = &v->s;
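For readers following the block indexing in vc1_h_overlap_filter() and vc1_v_overlap_filter(), here is a minimal standalone sketch of which pair of 8x8 blocks each call would smooth. It assumes the usual macroblock layout (0 to 3 are luma in raster order, 4 and 5 are chroma). The names overlap_pair, overlap_pair_for, print_overlap_pairs, STEP_H and STEP_V are hypothetical helpers for illustration and are not part of FFmpeg. The have_neighbour argument stands in for the "left_block != right_block" and "top_block != bottom_block" pointer comparisons in the patch.

```c
#include <stdio.h>

/* Hypothetical illustration only; none of these names exist in FFmpeg. */
enum { STEP_H = 1, STEP_V = 2 };          /* block index distance across the edge */
enum mb_side { MB_NEIGHBOUR, MB_CURRENT };

struct overlap_pair {
    int filtered;           /* 0 when no edge is smoothed for this block */
    enum mb_side first_mb;  /* MB that holds the left (or top) block     */
    int first_blk;          /* index of the left (or top) block          */
    int second_blk;         /* index of the block in the current MB      */
};

/* Paraphrases the index selection of the two static helpers in the patch.
 * step is STEP_H for the horizontal pass and STEP_V for the vertical pass.
 * have_neighbour is 0 when the patch passes the same pointer twice. */
static struct overlap_pair overlap_pair_for(int step, int have_neighbour,
                                            int block_num)
{
    struct overlap_pair p = { 0, MB_CURRENT, -1, block_num };
    /* Luma blocks 1,3 (H) or 2,3 (V) have their edge inside the MB. */
    int internal = block_num < 4 && (block_num & step);

    if (!have_neighbour && !internal)
        return p;                         /* picture or slice border: skip */

    p.filtered = 1;
    if (block_num > 3) {                  /* chroma: same index in both MBs */
        p.first_mb  = MB_NEIGHBOUR;
        p.first_blk = block_num;
    } else if (internal) {                /* luma edge inside the current MB */
        p.first_mb  = MB_CURRENT;
        p.first_blk = block_num - step;
    } else {                              /* luma edge shared with neighbour */
        p.first_mb  = MB_NEIGHBOUR;
        p.first_blk = block_num + step;
    }
    return p;
}

/* Prints one line per block for the given pass. */
void print_overlap_pairs(int step, int have_neighbour)
{
    for (int i = 0; i < 6; i++) {
        struct overlap_pair p = overlap_pair_for(step, have_neighbour, i);
        if (!p.filtered)
            printf("block %d: no edge filtered\n", i);
        else
            printf("block %d: %s[%d] <-> current[%d]\n", i,
                   p.first_mb == MB_NEIGHBOUR ? "neighbour" : "current",
                   p.first_blk, p.second_blk);
    }
}
```

Calling print_overlap_pairs(STEP_H, 0) models the first macroblock column, where only the internal luma edges (blocks 1 and 3) are smoothed. Calling print_overlap_pairs(STEP_V, 1) models a macroblock that has a row above it.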