From patchwork Tue Aug 14 07:44:57 2018
X-Patchwork-Submitter: Gagandeep Singh
X-Patchwork-Id: 9991
From: Gagandeep Singh
Date: Tue, 14 Aug 2018 13:14:57 +0530
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: [FFmpeg-devel] [GSOC][PATCH 3/3] lavc/cfhd: frame threading support for 3D transform progressive and interlaced samples

This last patch adds frame-threading support for IP samples, in both the progressive and interlaced versions.
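For context, a minimal sketch of the frame-threading progress handshake this patch builds on, assuming FFmpeg's internal thread.h API; the producer_side/consumer_side helpers are illustrative names only and are not code from the patch:

/* Illustrative sketch only (not part of the patch): one frame thread reports
 * progress on its ThreadFrame as shared data becomes available, and the
 * thread decoding the dependent (IP) frame waits on that progress before
 * reading the shared buffers. */
#include <limits.h>
#include "avcodec.h"
#include "thread.h"

static void producer_side(AVCodecContext *avctx, ThreadFrame *picture)
{
    /* once dimensions and buffers are fixed, let the next frame thread start */
    ff_thread_finish_setup(avctx);

    /* ...decode a subband into the shared coefficient buffers... */
    ff_thread_report_progress(picture, 1, 0);       /* step 1 is ready */

    /* ...finish decoding... */
    ff_thread_report_progress(picture, INT_MAX, 0); /* frame fully decoded */
}

static void consumer_side(ThreadFrame *connection)
{
    /* block until the other frame thread has produced step 1 */
    ff_thread_await_progress(connection, 1, 0);
    /* ...now it is safe to read the shared buffers for that step... */
}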
Gagandeep Singh From fa23549c61a6d8413cdc79c570376c53795a6ff1 Mon Sep 17 00:00:00 2001 From: Gagandeep Singh Date: Tue, 14 Aug 2018 12:43:20 +0530 Subject: [GSOC][FFmpeg-devel][PATCH 3/3] lavc/cfhd:frame threading support for 3d transform progressive and interlaced samples --- libavcodec/cfhd.c | 378 ++++++++++++++++++++++++++++------------------ libavcodec/cfhd.h | 8 +- 2 files changed, 242 insertions(+), 144 deletions(-) diff --git a/libavcodec/cfhd.c b/libavcodec/cfhd.c index 2c538f0bbd..7c298056ca 100644 --- a/libavcodec/cfhd.c +++ b/libavcodec/cfhd.c @@ -63,13 +63,23 @@ enum CFHDParam { static av_cold int cfhd_init(AVCodecContext *avctx) { + int ret; + CFHDContext *s = avctx->priv_data; + if (!avctx->internal->is_copy) { + avctx->internal->allocate_progress = 1; + ret = ff_cfhd_init_vlcs(s); + } else + ret = 0; avctx->bits_per_raw_sample = 10; s->avctx = avctx; s->progressive = 0; + s->i_frame.f = av_frame_alloc(); + s->p_frame.f = av_frame_alloc(); - return ff_cfhd_init_vlcs(s); + + return ret; } static void init_plane_defaults(CFHDContext *s) @@ -268,15 +278,18 @@ static void free_buffers(CFHDContext *s) for (i = 0; i < FF_ARRAY_ELEMS(s->plane); i++) { av_freep(&s->plane[i].idwt_buf); av_freep(&s->plane[i].idwt_tmp); - if (s->transform_type == 0) + if (s->transform_type == 0) { for (j = 0; j < 9; j++) s->plane[i].subband[j] = NULL; - else + for (j = 0; j < 8; j++) + s->plane[i].l_h[j] = NULL; + } + else { for (j = 0; j < 17; j++) s->plane[i].subband[j] = NULL; - - for (j = 0; j < 8; j++) - s->plane[i].l_h[j] = NULL; + for (j = 0; j < 12; j++) + s->plane[i].l_h[j] = NULL; + } } s->a_height = 0; s->a_width = 0; @@ -394,8 +407,10 @@ static int alloc_buffers(AVCodecContext *avctx) s->plane[i].l_h[7] = s->plane[i].idwt_tmp + 2 * w2 * h2; if (s->transform_type == 2) { frame2 = s->plane[i].idwt_tmp + 4 * w2 * h2; - s->plane[i].l_h[8] = frame2; - s->plane[i].l_h[9] = frame2 + 2 * w2 * h2; + s->plane[i].l_h[8] = frame2; + s->plane[i].l_h[9] = frame2 + 2 * w4 * h4; + s->plane[i].l_h[10] = frame2; + s->plane[i].l_h[11] = frame2 + 2 * w2 * h2; } } @@ -406,14 +421,28 @@ static int alloc_buffers(AVCodecContext *avctx) return 0; } +static int update_thread_context(AVCodecContext *dst, const AVCodecContext *src) +{ + CFHDContext *csrc = src->priv_data; + CFHDContext *cdst = dst->priv_data; + cdst->transform_type = csrc->transform_type; + if (csrc->sample_type != 1 && csrc->transform_type != 0) { + cdst->progressive = csrc->progressive; + cdst->picture = &csrc->p_frame; + cdst->connection = &csrc->i_frame; + cdst->buffers = csrc->plane; + } + + return 0; +} + static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, AVPacket *avpkt) { CFHDContext *s = avctx->priv_data; GetByteContext gb; ThreadFrame frame = { .f = data }; - AVFrame *pic = data; - int ret = 0, i, j, planes, plane, got_buffer = 0; + int ret = 0, i, j, planes, plane, got_buffer = 0, progress1 = 1, progress2 = 1; int16_t *coeff_data; s->coded_format = AV_PIX_FMT_YUV422P10; @@ -537,7 +566,9 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, } else if (tag == 1) { s->sample_type = data; if (data == 2) - s->pframe = 1; + s->pframe = 1; + else if (data == 1) + s->transform_type = 2; av_log(avctx, AV_LOG_DEBUG, "Sample type? 
%"PRIu16"\n", data); } else if (tag == 10) { s->transform_type = data; @@ -657,21 +688,54 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, return ret; } } - ret = ff_set_dimensions(avctx, s->coded_width, s->coded_height); - if (ret < 0) - return ret; - if (s->cropped_height) - avctx->height = s->cropped_height; - frame.f->width = - frame.f->height = 0; - - if ((ret = ff_thread_get_buffer(avctx, &frame, 0)) < 0) - return ret; - + if (s->transform_type == 2) { + if (s->sample_type != 1) { + s->picture = &s->i_frame; + s->connection = &s->p_frame; + s->buffers = s->plane; + } + ret = ff_set_dimensions(avctx, s->coded_width, s->coded_height); + if (ret < 0) + return ret; + if (s->sample_type != 1) { + if (s->i_frame.f->data[0]) + ff_thread_release_buffer(avctx, &s->i_frame); + if (s->p_frame.f->data[0]) + ff_thread_release_buffer(avctx, &s->p_frame); + av_frame_copy_props(s->i_frame.f, frame.f); + av_frame_copy_props(s->p_frame.f, frame.f); + if (s->cropped_height) + avctx->height = s->cropped_height; + s->picture->f->width = + s->picture->f->height = 0; + s->connection->f->width = + s->connection->f->height = 0; + if ((ret = ff_thread_get_buffer(avctx, s->picture, 0)) < 0) + return ret; + if ((ret = ff_thread_get_buffer(avctx, s->connection, 0)) < 0) + return ret; + } + } else { + s->picture = &s->i_frame; + s->buffers = s->plane; + if (s->picture->f->data[0]) + ff_thread_release_buffer(avctx, s->picture); + av_frame_copy_props(s->i_frame.f, frame.f); + ret = ff_set_dimensions(avctx, s->coded_width, s->coded_height); + if (ret < 0) + return ret; + if (s->cropped_height) + avctx->height = s->cropped_height; + s->picture->f->width = + s->picture->f->height = 0; + if ((ret = ff_thread_get_buffer(avctx, s->picture, 0)) < 0) + return ret; + } s->coded_width = 0; s->coded_height = 0; s->coded_format = AV_PIX_FMT_NONE; got_buffer = 1; + ff_thread_finish_setup(avctx); } coeff_data = s->plane[s->channel_num].subband[s->subband_num_actual]; @@ -835,6 +899,8 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, &coeff_data[(highpass_height - 1) * highpass_stride], highpass_stride * sizeof(*coeff_data)); } + if (s->transform_type == 2 && s->subband_num_actual == 10) + ff_thread_report_progress(s->picture, progress1 += 1, 0); } } //disabled to run mountain sample file @@ -975,7 +1041,6 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, ret = AVERROR(EINVAL); goto end; } - av_log(avctx, AV_LOG_DEBUG, "Level 3 plane %i %i %i %i\n", plane, lowpass_height, lowpass_width, highpass_stride); if (s->progressive) { low = s->plane[plane].subband[0]; @@ -998,18 +1063,18 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, output++; } - dst = (int16_t *)pic->data[act_plane]; + dst = (int16_t *)s->picture->f->data[act_plane]; low = s->plane[plane].l_h[6]; high = s->plane[plane].l_h[7]; for (i = 0; i < lowpass_height * 2; i++) { horiz_filter_clip(dst, low, high, lowpass_width, s->bpc); low += lowpass_width; high += lowpass_width; - dst += pic->linesize[act_plane] / 2; + dst += s->picture->f->linesize[act_plane] / 2; } } else { - av_log(avctx, AV_LOG_DEBUG, "interlaced frame ? %d", pic->interlaced_frame); - pic->interlaced_frame = 1; + av_log(avctx, AV_LOG_DEBUG, "interlaced frame ? 
%d", s->picture->f->interlaced_frame); + s->picture->f->interlaced_frame = 1; low = s->plane[plane].subband[0]; high = s->plane[plane].subband[7]; output = s->plane[plane].l_h[6]; @@ -1030,23 +1095,23 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, output += lowpass_width * 2; } - dst = (int16_t *)pic->data[act_plane]; + dst = (int16_t *)s->picture->f->data[act_plane]; low = s->plane[plane].l_h[6]; high = s->plane[plane].l_h[7]; for (i = 0; i < lowpass_height; i++) { - inverse_temporal_filter(dst, low, high, lowpass_width * 2, pic->linesize[act_plane]/2, 0); + inverse_temporal_filter(dst, low, high, lowpass_width * 2, s->picture->f->linesize[act_plane]/2, 0); low += lowpass_width * 2; high += lowpass_width * 2; - dst += pic->linesize[act_plane]; + dst += s->picture->f->linesize[act_plane]; } } } - //this is the serial version on ip sample decoding so buffers allocated using alloc_buffers() are not freed, - //so the stored decoded coefficients data is used for generating the second frame once empty packet is passed in sample_type = 1 + av_frame_ref(frame.f, s->picture->f); + ff_thread_report_progress(s->picture, INT_MAX, 0); } else if (s->transform_type == 2 && s->sample_type != 1) { for (plane = 0; plane < planes && !ret; plane++) { - int lowpass_height = s->plane[plane].band[0][0].height; - int lowpass_width = s->plane[plane].band[0][0].width; + int lowpass_height = s->plane[plane].band[0][1].height; + int lowpass_width = s->plane[plane].band[0][1].width; int highpass_stride = s->plane[plane].band[0][1].stride; int act_plane = plane == 1 ? 2 : plane == 2 ? 1 : plane; int16_t *low, *high, *output, *dst; @@ -1058,8 +1123,6 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, goto end; } - av_log(avctx, AV_LOG_DEBUG, "Decoding level 1 plane %i %i %i %i\n", plane, lowpass_height, lowpass_width, highpass_stride); - low = s->plane[plane].subband[0]; high = s->plane[plane].subband[2]; output = s->plane[plane].l_h[0]; @@ -1110,8 +1173,6 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, goto end; } - av_log(avctx, AV_LOG_DEBUG, "Level 2 lowpass plane %i %i %i %i\n", plane, lowpass_height, lowpass_width, highpass_stride); - low = s->plane[plane].subband[0]; high = s->plane[plane].subband[5]; output = s->plane[plane].l_h[3]; @@ -1149,40 +1210,9 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, output += lowpass_width * 2; } - low = s->plane[plane].subband[7]; - high = s->plane[plane].subband[9]; - output = s->plane[plane].l_h[3]; - for (i = 0; i < lowpass_width; i++) { - vert_filter(output, lowpass_width, low, lowpass_width, high, highpass_stride, lowpass_height); - low++; - high++; - output++; - } - - low = s->plane[plane].subband[8]; - high = s->plane[plane].subband[10]; - output = s->plane[plane].l_h[4]; - for (i = 0; i < lowpass_width; i++) { - vert_filter(output, lowpass_width, low, highpass_stride, high, highpass_stride, lowpass_height); - low++; - high++; - output++; - } - - low = s->plane[plane].l_h[3]; - high = s->plane[plane].l_h[4]; - output = s->plane[plane].subband[7]; - for (i = 0; i < lowpass_height * 2; i++) { - horiz_filter(output, low, high, lowpass_width); - low += lowpass_width; - high += lowpass_width; - output += lowpass_width * 2; - } - lowpass_height = s->plane[plane].band[4][1].height; lowpass_width = s->plane[plane].band[4][1].width; highpass_stride = s->plane[plane].band[4][1].stride; - av_log(avctx, AV_LOG_DEBUG, "temporal level %i %i %i %i\n", plane, lowpass_height, 
lowpass_width, highpass_stride); if (lowpass_height > s->plane[plane].band[4][1].a_height || lowpass_width > s->plane[plane].band[4][1].a_width || !highpass_stride || s->plane[plane].band[4][1].width > s->plane[plane].band[4][1].a_width) { @@ -1190,7 +1220,7 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, ret = AVERROR(EINVAL); goto end; } - + ff_thread_await_progress(s->connection, progress2 += 1, 0); low = s->plane[plane].subband[0]; high = s->plane[plane].subband[7]; output = s->plane[plane].subband[0]; @@ -1199,6 +1229,7 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, low += lowpass_width; high += lowpass_width; } + ff_thread_report_progress(s->picture, progress1 += 1, 0); if (s->progressive) { low = s->plane[plane].subband[0]; high = s->plane[plane].subband[15]; @@ -1220,37 +1251,17 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, output++; } - low = s->plane[plane].subband[7]; - high = s->plane[plane].subband[12]; - output = s->plane[plane].l_h[8]; - for (i = 0; i < lowpass_width; i++) { - vert_filter(output, lowpass_width, low, lowpass_width, high, highpass_stride, lowpass_height); - low++; - high++; - output++; - } - - low = s->plane[plane].subband[11]; - high = s->plane[plane].subband[13]; - output = s->plane[plane].l_h[9]; - for (i = 0; i < lowpass_width; i++) { - vert_filter(output, lowpass_width, low, highpass_stride, high, highpass_stride, lowpass_height); - low++; - high++; - output++; - } - - dst = (int16_t *)pic->data[act_plane]; + dst = (int16_t *)s->picture->f->data[act_plane]; low = s->plane[plane].l_h[6]; high = s->plane[plane].l_h[7]; for (i = 0; i < lowpass_height * 2; i++) { - horiz_filter(dst, low, high, lowpass_width); + horiz_filter_clip(dst, low, high, lowpass_width, s->bpc); low += lowpass_width; high += lowpass_width; - dst += pic->linesize[act_plane] / 2; + dst += s->picture->f->linesize[act_plane] / 2; } } else { - pic->interlaced_frame = 1; + s->picture->f->interlaced_frame = 1; low = s->plane[plane].subband[0]; high = s->plane[plane].subband[14]; output = s->plane[plane].l_h[6]; @@ -1271,67 +1282,137 @@ static int cfhd_decode(AVCodecContext *avctx, void *data, int *got_frame, output += lowpass_width * 2; } - low = s->plane[plane].subband[7]; - high = s->plane[plane].subband[11]; - output = s->plane[plane].l_h[8]; - for (i = 0; i < lowpass_height; i++) { - horiz_filter(output, low, high, lowpass_width); - low += lowpass_width; - high += lowpass_width; - output += lowpass_width * 2; - } - - low = s->plane[plane].subband[12]; - high = s->plane[plane].subband[13]; - output = s->plane[plane].l_h[9]; - for (i = 0; i < lowpass_height; i++) { - horiz_filter(output, low, high, lowpass_width); - low += lowpass_width; - high += lowpass_width; - output += lowpass_width * 2; - } - - - dst = (int16_t *)pic->data[act_plane]; + dst = (int16_t *)s->picture->f->data[act_plane]; low = s->plane[plane].l_h[6]; high = s->plane[plane].l_h[7]; for (i = 0; i < lowpass_height; i++) { - inverse_temporal_filter(dst, low, high, lowpass_width * 2, pic->linesize[act_plane]/2, 0); + inverse_temporal_filter(dst, low, high, lowpass_width * 2, s->picture->f->linesize[act_plane]/2, 0); low += lowpass_width * 2; high += lowpass_width * 2; - dst += pic->linesize[act_plane]; + dst += s->picture->f->linesize[act_plane]; } } } + ff_thread_report_progress(s->picture, INT_MAX, 0); + ff_thread_await_progress(s->connection, INT_MAX, 0); + av_frame_ref(frame.f, s->picture->f); } else if (s->sample_type == 1) { - 
int16_t *low, *high, *dst; - int lowpass_height, lowpass_width; + int16_t *low, *high, *dst, *output; + int lowpass_height, lowpass_width, highpass_stride, act_plane; + progress1 = 1, progress2 = 1; for (plane = 0; plane < planes && !ret; plane++) { - int act_plane = plane == 1 ? 2 : plane == 2 ? 1 : plane; - lowpass_height = s->plane[plane].band[4][1].height; - lowpass_width = s->plane[plane].band[4][1].width; + ff_thread_await_progress(s->connection, progress1 += 1, 0); + // highpass inverse for temporal + lowpass_height = s->buffers[plane].band[1][1].a_height; + lowpass_width = s->buffers[plane].band[1][1].a_width; + highpass_stride = s->buffers[plane].band[1][1].a_width; + + low = s->buffers[plane].subband[7]; + high = s->buffers[plane].subband[9]; + output = s->buffers[plane].l_h[8]; + for (i = 0; i < lowpass_width; i++) { + vert_filter(output, lowpass_width, low, lowpass_width, high, highpass_stride, lowpass_height); + low++; + high++; + output++; + } + + low = s->buffers[plane].subband[8]; + high = s->buffers[plane].subband[10]; + output = s->buffers[plane].l_h[9]; + for (i = 0; i < lowpass_width; i++) { + vert_filter(output, lowpass_width, low, highpass_stride, high, highpass_stride, lowpass_height); + low++; + high++; + output++; + } + + low = s->buffers[plane].l_h[8]; + high = s->buffers[plane].l_h[9]; + output = s->buffers[plane].subband[7]; + for (i = 0; i < lowpass_height * 2; i++) { + horiz_filter(output, low, high, lowpass_width); + low += lowpass_width; + high += lowpass_width; + output += lowpass_width * 2; + } + ff_thread_report_progress(s->picture, progress2 += 1, 0); + } + for (plane = 0; plane < planes && !ret; plane++) { + ff_thread_await_progress(s->connection, progress1 += 1, 0); + + act_plane = plane == 1 ? 2 : plane == 2 ? 1 : plane; + lowpass_height = s->buffers[plane].band[4][1].a_height; + lowpass_width = s->buffers[plane].band[4][1].a_width; + highpass_stride = s->buffers[plane].band[4][1].a_width; + if (s->progressive) { - dst = (int16_t *)pic->data[act_plane]; - low = s->plane[plane].l_h[8]; - high = s->plane[plane].l_h[9]; + low = s->buffers[plane].subband[7]; + high = s->buffers[plane].subband[12]; + output = s->buffers[plane].l_h[10]; + for (i = 0; i < lowpass_width; i++) { + vert_filter(output, lowpass_width, low, lowpass_width, high, highpass_stride, lowpass_height); + low++; + high++; + output++; + } + + low = s->buffers[plane].subband[11]; + high = s->buffers[plane].subband[13]; + output = s->buffers[plane].l_h[11]; + for (i = 0; i < lowpass_width; i++) { + vert_filter(output, lowpass_width, low, highpass_stride, high, highpass_stride, lowpass_height); + low++; + high++; + output++; + } + + dst = (int16_t *)s->picture->f->data[act_plane]; + low = s->buffers[plane].l_h[10]; + high = s->buffers[plane].l_h[11]; for (i = 0; i < lowpass_height * 2; i++) { - horiz_filter(dst, low, high, lowpass_width); + horiz_filter_clip(dst, low, high, lowpass_width, s->bpc); low += lowpass_width; high += lowpass_width; - dst += pic->linesize[act_plane] / 2; + dst += s->picture->f->linesize[act_plane] / 2; } } else { - dst = (int16_t *)pic->data[act_plane]; - low = s->plane[plane].l_h[8]; - high = s->plane[plane].l_h[9]; + av_log(avctx, AV_LOG_DEBUG, "interlaced frame ? 
%d", s->picture->f->interlaced_frame); + s->picture->f->interlaced_frame = 1; + low = s->buffers[plane].subband[7]; + high = s->buffers[plane].subband[11]; + output = s->buffers[plane].l_h[10]; + for (i = 0; i < lowpass_height; i++) { + horiz_filter(output, low, high, lowpass_width); + low += lowpass_width; + high += lowpass_width; + output += lowpass_width * 2; + } + + low = s->buffers[plane].subband[12]; + high = s->buffers[plane].subband[13]; + output = s->buffers[plane].l_h[11]; for (i = 0; i < lowpass_height; i++) { - inverse_temporal_filter(dst, low, high, lowpass_width * 2, pic->linesize[act_plane]/2, 0); + horiz_filter(output, low, high, lowpass_width); + low += lowpass_width; + high += lowpass_width; + output += lowpass_width * 2; + } + + dst = (int16_t *)s->picture->f->data[act_plane]; + low = s->buffers[plane].l_h[10]; + high = s->buffers[plane].l_h[11]; + for (i = 0; i < lowpass_height; i++) { + inverse_temporal_filter(dst, low, high, lowpass_width * 2, s->picture->f->linesize[act_plane]/2, 0); low += lowpass_width * 2; high += lowpass_width * 2; - dst += pic->linesize[act_plane]; + dst += s->picture->f->linesize[act_plane]; } } } + ff_thread_report_progress(s->picture, INT_MAX, 0); + ff_thread_await_progress(s->connection, INT_MAX, 0); + av_frame_ref(frame.f, s->picture->f); } end: @@ -1352,19 +1433,30 @@ static av_cold int cfhd_close(AVCodecContext *avctx) ff_free_vlc(&s->vlc_9); ff_free_vlc(&s->vlc_18); } + if (s->i_frame.f && s->i_frame.f->data[0]) + ff_thread_release_buffer(avctx, &s->i_frame); + if (s->p_frame.f && s->p_frame.f->data[0]) + ff_thread_release_buffer(avctx, &s->p_frame); + + if (s->i_frame.f) + av_frame_free(&s->i_frame.f); + if (s->p_frame.f) + av_frame_free(&s->p_frame.f); return 0; } AVCodec ff_cfhd_decoder = { - .name = "cfhd", - .long_name = NULL_IF_CONFIG_SMALL("Cineform HD"), - .type = AVMEDIA_TYPE_VIDEO, - .id = AV_CODEC_ID_CFHD, - .priv_data_size = sizeof(CFHDContext), - .init = cfhd_init, - .close = cfhd_close, - .decode = cfhd_decode, - .capabilities = AV_CODEC_CAP_DR1, - .caps_internal = FF_CODEC_CAP_INIT_CLEANUP, + .name = "cfhd", + .long_name = NULL_IF_CONFIG_SMALL("Cineform HD"), + .type = AVMEDIA_TYPE_VIDEO, + .id = AV_CODEC_ID_CFHD, + .priv_data_size = sizeof(CFHDContext), + .init = cfhd_init, + .close = cfhd_close, + .decode = cfhd_decode, + .init_thread_copy = ONLY_IF_THREADS_ENABLED(cfhd_init), + .update_thread_context = ONLY_IF_THREADS_ENABLED(update_thread_context), + .capabilities = AV_CODEC_CAP_DR1 | AV_CODEC_CAP_FRAME_THREADS, + .caps_internal = FF_CODEC_CAP_INIT_THREADSAFE | FF_CODEC_CAP_INIT_CLEANUP, }; diff --git a/libavcodec/cfhd.h b/libavcodec/cfhd.h index 047c0f2028..d7a2ffe0a7 100644 --- a/libavcodec/cfhd.h +++ b/libavcodec/cfhd.h @@ -29,6 +29,7 @@ #include "bytestream.h" #include "get_bits.h" #include "vlc.h" +#include "thread.h" #define VLC_BITS 9 #define SUBBAND_COUNT 17 @@ -63,7 +64,7 @@ typedef struct Plane { /* TODO: merge this into SubBand structure */ int16_t *subband[SUBBAND_COUNT]; - int16_t *l_h[10]; + int16_t *l_h[12]; SubBand band[DWT_LEVELS][4]; } Plane; @@ -76,6 +77,10 @@ typedef struct Peak { typedef struct CFHDContext { AVCodecContext *avctx; + ThreadFrame i_frame; + ThreadFrame p_frame; + ThreadFrame *connection; + ThreadFrame *picture; CFHD_RL_VLC_ELEM table_9_rl_vlc[2088]; VLC vlc_9; @@ -116,6 +121,7 @@ typedef struct CFHDContext { uint8_t prescale_shift[3]; Plane plane[4]; + Plane *buffers; Peak peak; } CFHDContext; -- 2.17.1