From patchwork Tue May 14 02:19:24 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Langdale <philipl@overt.org>
X-Patchwork-Id: 13097
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id 9227C44752A
	for <patchwork@ffaux-bg.ffmpeg.org>;
	Tue, 14 May 2019 05:19:34 +0300 (EEST)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 693FF68972B;
	Tue, 14 May 2019 05:19:34 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mail.overt.org (mail.overt.org [157.230.92.47])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id A87D06882FF
	for <ffmpeg-devel@ffmpeg.org>; Tue, 14 May 2019 05:19:28 +0300 (EEST)
Received: from authenticated-user (mail.overt.org [157.230.92.47])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by mail.overt.org (Postfix) with ESMTPSA id 12A15401FC
	for <ffmpeg-devel@ffmpeg.org>; Mon, 13 May 2019 21:19:26 -0500 (CDT)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=overt.org; s=mail;
	t=1557800367; bh=dnzoGefgIxXBj52VD2B/yt/YzNO/rjA0f8KMw3Iu3Lw=;
	h=Date:From:To:Subject:In-Reply-To:References:From;
	b=WzBgegRuw+gzQQorrYDdGwt+z3syzXViramIri/jxKPvwjmf/FgdcoWWfKrElIwAN
	V+9yBFixgdRmRrPZY4pf6vuIXF4jTuhQgWAji/OCT0QOQBcLJi/FNKPjsgGGf5i2Sw
	9ENSg8MT4blcOxePltU2GA2BnHSzqvz8oUPkHXXQArtLXlwWD3F6md0GMnldM5ABNT
	+FF1ovm3B+if9XTUR27DAhbnqGTx1gnYFAGDEnn+x91kwJs9hp2ri2IAMDWgS1gCeW
	vKerS0wERUYP5ZlQUKyjFwRcEZH1bzmdLWEsOf9ahVRTqjCjSToKgzuDSqBzWY6Xmh
	61nRd0av89FpA==
Date: Mon, 13 May 2019 19:19:24 -0700
From: Philip Langdale <philipl@overt.org>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Message-ID: <20190513191924.581be6ad@fido6>
In-Reply-To: <1557746289-636-1-git-send-email-svechnikov66@gmail.com>
References: <1557746289-636-1-git-send-email-svechnikov66@gmail.com>
Subject: Re: [FFmpeg-devel] [PATCH] libavfilter/vf_scale_cuda: fix src_pitch
	for 10bit videos
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <http://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <http://ffmpeg.org/pipermail/ffmpeg-devel/>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <http://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

On Mon, 13 May 2019 11:18:09 +0000
Sergey Svechnikov <svechnikov66@gmail.com> wrote:

> When scaling a 10bit video using scale_cuda filter (which uses pixel
> format AV_PIX_FMT_P010LE), the output video gets distorted. I think
> it has something to do with the differences in processing between
> cuda_sdk and ffnvcodec with cuda_nvcc (the problem appears after this
> commit
> https://github.com/FFmpeg/FFmpeg/commit/2544c7ea67ca9521c5de36396bc9ac7058223742).
> To solve the problem we should not divide the input frame planes'
> linesizes by 2 and leave them as they are. More info, samples and
> reproduction steps are here
> https://github.com/Svechnikov/ffmpeg-scale-cuda-10bit-problem
> ---
>  libavfilter/vf_scale_cuda.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
> index c97a802..7fc33ee 100644
> --- a/libavfilter/vf_scale_cuda.c
> +++ b/libavfilter/vf_scale_cuda.c
> @@ -423,11 +423,11 @@ static int scalecuda_resize(AVFilterContext *ctx,
>          break;
>      case AV_PIX_FMT_P010LE:
>          call_resize_kernel(ctx, s->cu_func_ushort, 1,
> -                           in->data[0], in->width, in->height, in->linesize[0]/2,
> +                           in->data[0], in->width, in->height, in->linesize[0],
>                             out->data[0], out->width, out->height, out->linesize[0]/2,
>                             2);
>          call_resize_kernel(ctx, s->cu_func_ushort2, 2,
> -                           in->data[1], in->width / 2, in->height / 2, in->linesize[1]/2,
> +                           in->data[1], in->width / 2, in->height / 2, in->linesize[1],
>                             out->data[0] + out->linesize[0] * ((out->height + 31) & ~0x1f), out->width / 2, out->height / 2, out->linesize[1] / 4,
>                             2);
>          break;

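Thanks for reporting the problem. I took a look and identified the
precise mistake I made: I dropped the `pixel_size` scaling factor
when setting `pitchInBytes`, so a source pitch that the callers pass
in units of `pixel_size`-byte samples was being used as a byte count.
Here is the fix I intend to apply.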

--phil

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index c97a802ddc..ecfd6a1c92 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -357,7 +357,7 @@ static int call_resize_kernel(AVFilterContext *ctx, CUfunction func, int channel
         .res.pitch2D.numChannels = channels,
         .res.pitch2D.width = src_width,
         .res.pitch2D.height = src_height,
-        .res.pitch2D.pitchInBytes = src_pitch,
+        .res.pitch2D.pitchInBytes = src_pitch * pixel_size,
         .res.pitch2D.devPtr = (CUdeviceptr)src_dptr,
     };
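
A minimal sketch of the unit conversion behind this fix, assuming (from
the call sites quoted above) that `src_pitch` reaches
`call_resize_kernel` as `linesize / 2` for P010LE, i.e. counted in
16-bit samples, while `pitchInBytes` expects bytes. The helper names
`pitch_bytes_to_elems` and `pitch_elems_to_bytes` are hypothetical and
exist only for illustration; they are not FFmpeg or CUDA functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an FFmpeg API): convert a row stride given in
 * bytes, such as an AVFrame linesize, into a stride counted in samples of
 * pixel_size bytes each. This mirrors the "linesize / 2" at the call site. */
static size_t pitch_bytes_to_elems(size_t linesize_bytes, size_t pixel_size)
{
    /* The stride must be a whole number of samples. */
    assert(pixel_size > 0 && linesize_bytes % pixel_size == 0);
    return linesize_bytes / pixel_size;
}

/* Hypothetical helper: the inverse conversion, mirroring the
 * "src_pitch * pixel_size" that the fix uses for pitchInBytes. */
static size_t pitch_elems_to_bytes(size_t pitch_elems, size_t pixel_size)
{
    return pitch_elems * pixel_size;
}
```

As an illustrative case (the numbers are made up), a 4096-byte luma
linesize is passed down as 2048 samples; multiplying by a `pixel_size`
of 2 restores the 4096-byte pitch, whereas the unscaled value describes
rows half as long as they really are. For 8-bit formats `pixel_size` is
1, so both conversions leave the value unchanged.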