From patchwork Mon Aug 6 21:20:21 2018
X-Patchwork-Submitter: Sergey Lavrushkin
X-Patchwork-Id: 9921
From: Sergey Lavrushkin
Date: Tue, 7 Aug 2018 00:20:21 +0300
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
In-Reply-To: <20180802185248.18168-7-dualfal@gmail.com>
References: <20180802185248.18168-1-dualfal@gmail.com>
 <20180802185248.18168-7-dualfal@gmail.com>
Subject: Re: [FFmpeg-devel] [PATCH 6/7] libavfilter/vf_sr.c: Removes uint8 ->
 float and float -> uint8 conversions.

Updated patch.
2018-08-02 21:52 GMT+03:00 Sergey Lavrushkin:

> This patch removes the conversions declared inside the sr filter and
> instead uses libswscale inside the filter to perform them for the Y channel
> of the input only. The sr filter still takes uint formats as input, since
> the models do not use the chroma channels and those channels are upscaled
> with libswscale; float input formats would only cause unnecessary
> conversions during scaling for those channels.
>
> ---
>  libavfilter/vf_sr.c | 134 +++++++++++++++++++---------------------------------
>  1 file changed, 48 insertions(+), 86 deletions(-)
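For anyone skimming the diff below: the per-pixel work that the removed
uint8_to_float()/float_to_uint8() helpers used to do is now an ordinary
libswscale pixel-format conversion between AV_PIX_FMT_GRAY8 and
AV_PIX_FMT_GRAYF32 on the Y plane. A minimal standalone sketch of one
direction of that conversion follows; the helper and its buffer names are
made up for illustration (this is not code from the patch), and it assumes a
libswscale that supports AV_PIX_FMT_GRAYF32:

#include <stdint.h>
#include <libavutil/error.h>
#include <libavutil/pixfmt.h>
#include <libswscale/swscale.h>

/* Convert a w x h GRAY8 plane into packed 32-bit float samples in [0, 1].
 * This mirrors what sws_contexts[1] does for the Y plane; the opposite
 * direction (sws_contexts[2]) just swaps the two pixel formats. */
static int gray8_to_grayf32(const uint8_t *src, int src_linesize,
                            float *dst, int w, int h)
{
    const uint8_t *const src_planes[1] = { src };
    uint8_t *const dst_planes[1]       = { (uint8_t *)dst };
    const int src_stride[1]            = { src_linesize };
    const int dst_stride[1]            = { w * (int)sizeof(float) }; /* width << 2, as in the patch */
    struct SwsContext *sws;

    sws = sws_getContext(w, h, AV_PIX_FMT_GRAY8,
                         w, h, AV_PIX_FMT_GRAYF32,
                         0, NULL, NULL, NULL);
    if (!sws)
        return AVERROR(ENOMEM);

    sws_scale(sws, src_planes, src_stride, 0, h, dst_planes, dst_stride);
    sws_freeContext(sws);

    return 0;
}

The filter itself keeps the two conversion contexts in SRContext instead of
recreating them per frame, which is what the sws_contexts[1]/sws_contexts[2]
setup in config_props() amounts to.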
From 572fee3362a00113aaddb49c3385f9386388b3fb Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin
Date: Tue, 31 Jul 2018 18:58:28 +0300
Subject: [PATCH 7/9] libavfilter/vf_sr.c: Removes uint8 -> float and float ->
 uint8 conversions.

---
 libavfilter/vf_sr.c | 134 +++++++++++++++++++---------------------------------
 1 file changed, 48 insertions(+), 86 deletions(-)

diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
index 944a0e28e7..5ad1baa4c0 100644
--- a/libavfilter/vf_sr.c
+++ b/libavfilter/vf_sr.c
@@ -45,8 +45,8 @@ typedef struct SRContext {
     DNNModel *model;
     DNNData input, output;
     int scale_factor;
-    struct SwsContext *sws_context;
-    int sws_slice_h;
+    struct SwsContext *sws_contexts[3];
+    int sws_slice_h, sws_input_linesize, sws_output_linesize;
 } SRContext;
 
 #define OFFSET(x) offsetof(SRContext, x)
@@ -95,6 +95,10 @@ static av_cold int init(AVFilterContext *context)
         return AVERROR(EIO);
     }
 
+    sr_context->sws_contexts[0] = NULL;
+    sr_context->sws_contexts[1] = NULL;
+    sr_context->sws_contexts[2] = NULL;
+
     return 0;
 }
 
@@ -110,6 +114,7 @@ static int query_formats(AVFilterContext *context)
         av_log(context, AV_LOG_ERROR, "could not create formats list\n");
         return AVERROR(ENOMEM);
     }
+
     return ff_set_common_formats(context, formats_list);
 }
 
@@ -140,21 +145,31 @@ static int config_props(AVFilterLink *inlink)
     else{
         outlink->h = sr_context->output.height;
         outlink->w = sr_context->output.width;
+        sr_context->sws_contexts[1] = sws_getContext(sr_context->input.width, sr_context->input.height, AV_PIX_FMT_GRAY8,
+                                                     sr_context->input.width, sr_context->input.height, AV_PIX_FMT_GRAYF32,
+                                                     0, NULL, NULL, NULL);
+        sr_context->sws_input_linesize = sr_context->input.width << 2;
+        sr_context->sws_contexts[2] = sws_getContext(sr_context->output.width, sr_context->output.height, AV_PIX_FMT_GRAYF32,
+                                                     sr_context->output.width, sr_context->output.height, AV_PIX_FMT_GRAY8,
+                                                     0, NULL, NULL, NULL);
+        sr_context->sws_output_linesize = sr_context->output.width << 2;
+        if (!sr_context->sws_contexts[1] || !sr_context->sws_contexts[2]){
+            av_log(context, AV_LOG_ERROR, "could not create SwsContext for conversions\n");
+            return AVERROR(ENOMEM);
+        }
         switch (sr_context->model_type){
        case SRCNN:
-            sr_context->sws_context = sws_getContext(inlink->w, inlink->h, inlink->format,
-                                                     outlink->w, outlink->h, outlink->format, SWS_BICUBIC, NULL, NULL, NULL);
-            if (!sr_context->sws_context){
-                av_log(context, AV_LOG_ERROR, "could not create SwsContext\n");
+            sr_context->sws_contexts[0] = sws_getContext(inlink->w, inlink->h, inlink->format,
+                                                         outlink->w, outlink->h, outlink->format,
+                                                         SWS_BICUBIC, NULL, NULL, NULL);
+            if (!sr_context->sws_contexts[0]){
+                av_log(context, AV_LOG_ERROR, "could not create SwsContext for scaling\n");
                 return AVERROR(ENOMEM);
             }
             sr_context->sws_slice_h = inlink->h;
             break;
         case ESPCN:
-            if (inlink->format == AV_PIX_FMT_GRAY8){
-                sr_context->sws_context = NULL;
-            }
-            else{
+            if (inlink->format != AV_PIX_FMT_GRAY8){
                 sws_src_h = sr_context->input.height;
                 sws_src_w = sr_context->input.width;
                 sws_dst_h = sr_context->output.height;
@@ -184,13 +199,14 @@ static int config_props(AVFilterLink *inlink)
                     sws_dst_w = AV_CEIL_RSHIFT(sws_dst_w, 2);
                     break;
                 default:
-                    av_log(context, AV_LOG_ERROR, "could not create SwsContext for input pixel format");
+                    av_log(context, AV_LOG_ERROR, "could not create SwsContext for scaling for given input pixel format");
                     return AVERROR(EIO);
                 }
-                sr_context->sws_context = sws_getContext(sws_src_w, sws_src_h, AV_PIX_FMT_GRAY8,
-                                                         sws_dst_w, sws_dst_h, AV_PIX_FMT_GRAY8, SWS_BICUBIC, NULL, NULL, NULL);
-                if (!sr_context->sws_context){
-                    av_log(context, AV_LOG_ERROR, "could not create SwsContext\n");
+                sr_context->sws_contexts[0] = sws_getContext(sws_src_w, sws_src_h, AV_PIX_FMT_GRAY8,
+                                                             sws_dst_w, sws_dst_h, AV_PIX_FMT_GRAY8,
+                                                             SWS_BICUBIC, NULL, NULL, NULL);
+                if (!sr_context->sws_contexts[0]){
+                    av_log(context, AV_LOG_ERROR, "could not create SwsContext for scaling\n");
                     return AVERROR(ENOMEM);
                 }
                 sr_context->sws_slice_h = sws_src_h;
@@ -201,61 +217,12 @@ static int config_props(AVFilterLink *inlink)
     }
 }
 
-typedef struct ThreadData{
-    uint8_t *data;
-    int data_linesize, height, width;
-} ThreadData;
-
-static int uint8_to_float(AVFilterContext *context, void *arg, int jobnr, int nb_jobs)
-{
-    SRContext *sr_context = context->priv;
-    const ThreadData *td = arg;
-    const int slice_start = (td->height *  jobnr     ) / nb_jobs;
-    const int slice_end   = (td->height * (jobnr + 1)) / nb_jobs;
-    const uint8_t *src = td->data + slice_start * td->data_linesize;
-    float *dst = sr_context->input.data + slice_start * td->width;
-    int y, x;
-
-    for (y = slice_start; y < slice_end; ++y){
-        for (x = 0; x < td->width; ++x){
-            dst[x] = (float)src[x] / 255.0f;
-        }
-        src += td->data_linesize;
-        dst += td->width;
-    }
-
-    return 0;
-}
-
-static int float_to_uint8(AVFilterContext *context, void *arg, int jobnr, int nb_jobs)
-{
-    SRContext *sr_context = context->priv;
-    const ThreadData *td = arg;
-    const int slice_start = (td->height *  jobnr     ) / nb_jobs;
-    const int slice_end   = (td->height * (jobnr + 1)) / nb_jobs;
-    const float *src = sr_context->output.data + slice_start * td->width;
-    uint8_t *dst = td->data + slice_start * td->data_linesize;
-    int y, x;
-
-    for (y = slice_start; y < slice_end; ++y){
-        for (x = 0; x < td->width; ++x){
-            dst[x] = (uint8_t)(255.0f * FFMIN(src[x], 1.0f));
-        }
-        src += td->width;
-        dst += td->data_linesize;
-    }
-
-    return 0;
-}
-
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
     AVFilterContext *context = inlink->dst;
     SRContext *sr_context = context->priv;
     AVFilterLink *outlink = context->outputs[0];
     AVFrame *out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
-    ThreadData td;
-    int nb_threads;
     DNNReturnType dnn_result;
 
     if (!out){
@@ -268,28 +235,23 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     out->width = sr_context->output.width;
     switch (sr_context->model_type){
     case SRCNN:
-        sws_scale(sr_context->sws_context, (const uint8_t **)in->data, in->linesize,
+        sws_scale(sr_context->sws_contexts[0], (const uint8_t **)in->data, in->linesize,
                   0, sr_context->sws_slice_h, out->data, out->linesize);
-        td.data = out->data[0];
-        td.data_linesize = out->linesize[0];
-        td.height = out->height;
-        td.width = out->width;
+
+        sws_scale(sr_context->sws_contexts[1], (const uint8_t **)out->data, out->linesize,
+                  0, out->height, (uint8_t * const*)(&sr_context->input.data), &sr_context->sws_input_linesize);
         break;
     case ESPCN:
-        if (sr_context->sws_context){
-            sws_scale(sr_context->sws_context, (const uint8_t **)(in->data + 1), in->linesize + 1,
+        if (sr_context->sws_contexts[0]){
+            sws_scale(sr_context->sws_contexts[0], (const uint8_t **)(in->data + 1), in->linesize + 1,
                       0, sr_context->sws_slice_h, out->data + 1, out->linesize + 1);
-            sws_scale(sr_context->sws_context, (const uint8_t **)(in->data + 2), in->linesize + 2,
+            sws_scale(sr_context->sws_contexts[0], (const uint8_t **)(in->data + 2), in->linesize + 2,
                       0, sr_context->sws_slice_h, out->data + 2, out->linesize + 2);
         }
-        td.data = in->data[0];
-        td.data_linesize = in->linesize[0];
-        td.height = in->height;
-        td.width = in->width;
-    }
 
-    nb_threads = ff_filter_get_nb_threads(context);
-    context->internal->execute(context, uint8_to_float, &td, NULL, FFMIN(td.height, nb_threads));
+        sws_scale(sr_context->sws_contexts[1], (const uint8_t **)in->data, in->linesize,
+                  0, in->height, (uint8_t * const*)(&sr_context->input.data), &sr_context->sws_input_linesize);
+    }
     av_frame_free(&in);
 
     dnn_result = (sr_context->dnn_module->execute_model)(sr_context->model);
@@ -298,17 +260,15 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
         return AVERROR(EIO);
     }
 
-    td.data = out->data[0];
-    td.data_linesize = out->linesize[0];
-    td.height = out->height;
-    td.width = out->width;
-    context->internal->execute(context, float_to_uint8, &td, NULL, FFMIN(td.height, nb_threads));
+    sws_scale(sr_context->sws_contexts[2], (const uint8_t **)(&sr_context->output.data), &sr_context->sws_output_linesize,
+              0, out->height, (uint8_t * const*)out->data, out->linesize);
 
     return ff_filter_frame(outlink, out);
 }
 
 static av_cold void uninit(AVFilterContext *context)
 {
+    int i;
     SRContext *sr_context = context->priv;
 
     if (sr_context->dnn_module){
@@ -316,8 +276,10 @@ static av_cold void uninit(AVFilterContext *context)
         av_freep(&sr_context->dnn_module);
     }
 
-    if (sr_context->sws_context){
-        sws_freeContext(sr_context->sws_context);
+    for (i = 0; i < 3; ++i){
+        if (sr_context->sws_contexts[i]){
+            sws_freeContext(sr_context->sws_contexts[i]);
+        }
     }
 }
 
-- 
2.14.1
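
A closing note on behaviour: the value mapping that the removed helpers
implemented by hand is the one sketched below, and swscale's
GRAY8 <-> GRAYF32 path is expected to produce equivalent results for the
Y plane up to rounding details, which is why the helpers (and the slice
threading around them) could be dropped. Documentation-only sketch, not
code from the patch:

#include <stdint.h>
#include <libavutil/common.h> /* FFMIN */

/* What uint8_to_float() computed per sample. */
static inline float y8_to_f32(uint8_t x)
{
    return x / 255.0f;
}

/* What float_to_uint8() computed per sample (clamped from above only). */
static inline uint8_t f32_to_y8(float y)
{
    return (uint8_t)(255.0f * FFMIN(y, 1.0f));
}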