From patchwork Tue May 14 03:12:22 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Langdale <philipl@overt.org>
X-Patchwork-Id: 13099
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id 4F93A445B87
	for <patchwork@ffaux-bg.ffmpeg.org>;
	Tue, 14 May 2019 06:12:44 +0300 (EEST)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3D908689957;
	Tue, 14 May 2019 06:12:44 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mail.overt.org (mail.overt.org [157.230.92.47])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id EEDBD6899A3
	for <ffmpeg-devel@ffmpeg.org>; Tue, 14 May 2019 06:12:35 +0300 (EEST)
Received: from authenticated-user (mail.overt.org [157.230.92.47])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128
	bits)) (No client certificate requested)
	by mail.overt.org (Postfix) with ESMTPSA id 8E643409BA;
	Mon, 13 May 2019 22:12:34 -0500 (CDT)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=overt.org; s=mail;
	t=1557803554; bh=6NYdxczvyfi5Sp6eXi5lhbM7Cj1IE+wcWKSI8tpNURs=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=umD+vjw62FKDSqpNnG7EkAdx9t/Xp6oJu9aSHyvymPGXX8PUseIbS3kaplPd4L2rZ
	7cZptL1H32jJTqwlhfhIJXZFACGEPcTB9y/OxWwJJhbiQSyCPmbTpcnCSCffRN3oi+
	VFi8O3kDhbKPC8nzKYu+pHRxDDrlxIJR3QtJ22KDwfDjFSfRJqiy8HBPaiUX4w/KOn
	//CJns4JDaPD+PMp5p/Oo63i2wPyb5MAKRPWH7Iqz7f16X2/dRZyppgcM9PAcCu1QK
	JkfFAS9GVN4pZ3EtbT3ffzmUd7OPWdpzLTFYSduU4lhbeXRnlKcSr7NWGgALT9Z7n2
	ITJp00JkUeiVA==
From: Philip Langdale <philipl@overt.org>
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 13 May 2019 20:12:22 -0700
Message-Id: <20190514031222.9760-4-philipl@overt.org>
In-Reply-To: <20190514031222.9760-1-philipl@overt.org>
References: <20190514031222.9760-1-philipl@overt.org>
Subject: [FFmpeg-devel] [PATCH 3/3] avfilter/vf_scale_cuda: Simplify output
	plane addressing
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <http://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <http://ffmpeg.org/pipermail/ffmpeg-devel/>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <http://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Cc: Yogender Gupta <ygupta@nvidia.com>, Philip Langdale <philipl@overt.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

I'm not sure why this was originally written the way it was.
hwcontext_cuda already initialises the plane addresses correctly,
so there is no reason to recalculate the plane offsets by hand in
this code. Use the data[] pointers directly instead.
---
 libavfilter/vf_scale_cuda.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)
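
Note for reviewers: the removed lines each assume a different padding
rule between planes (none at all, ALIGN_UP with s->tex_alignment, or
the height rounded up to a multiple of 32). The sketch below is a
standalone illustration of why such guesses are fragile. It does not
use FFmpeg: DemoFrame, DEMO_ALIGN_UP, demo_layout_yuv420p and
demo_guess_u_plane are hypothetical names, and the aligned layout is
an assumed allocator policy, not what hwcontext_cuda actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AVFrame fields touched by the patch. */
typedef struct DemoFrame {
    uint8_t *data[3];
    int      linesize[3];
    int      width, height;
} DemoFrame;

#define DEMO_ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))

/*
 * Lay out a YUV420P frame inside one caller-provided buffer.
 * Assumed policy: each plane starts on an `align`-byte boundary.
 * The allocator records the resulting addresses in data[].
 */
static void demo_layout_yuv420p(DemoFrame *f, uint8_t *base,
                                int width, int height,
                                int pitch, size_t align)
{
    assert(align > 0 && pitch >= width);

    size_t y_size = (size_t)pitch * height;
    size_t c_size = (size_t)(pitch / 2) * (height / 2);
    size_t u_off  = DEMO_ALIGN_UP(y_size, align);
    size_t v_off  = DEMO_ALIGN_UP(u_off + c_size, align);

    f->width  = width;
    f->height = height;
    f->data[0] = base;
    f->data[1] = base + u_off;
    f->data[2] = base + v_off;
    f->linesize[0] = pitch;
    f->linesize[1] = pitch / 2;
    f->linesize[2] = pitch / 2;
}

/* The old approach: guess that the U plane directly follows the Y plane. */
static uint8_t *demo_guess_u_plane(const DemoFrame *f)
{
    return f->data[0] + (size_t)f->linesize[0] * f->height;
}
```

With a pitch of 32, a height of 10 and 256-byte plane alignment, the
guess lands 320 bytes into the buffer while data[1] sits at offset 512.
The two only agree when the allocator adds no padding, which is why
reading data[1] and data[2] is the safer choice.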

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index a833dcd1a4..b7cdb81081 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -390,12 +390,12 @@ static int scalecuda_resize(AVFilterContext *ctx,
                            out->data[0], out->width, out->height, out->linesize[0],
                            1);
         call_resize_kernel(ctx, s->cu_func_uchar, 1,
-                           in->data[0]+in->linesize[0]*in->height, in->width/2, in->height/2, in->linesize[0]/2,
-                           out->data[0]+out->linesize[0]*out->height, out->width/2, out->height/2, out->linesize[0]/2,
+                           in->data[1], in->width/2, in->height/2, in->linesize[0]/2,
+                           out->data[1], out->width/2, out->height/2, out->linesize[0]/2,
                            1);
         call_resize_kernel(ctx, s->cu_func_uchar, 1,
-                           in->data[0]+ ALIGN_UP((in->linesize[0]*in->height*5)/4, s->tex_alignment), in->width/2, in->height/2, in->linesize[0]/2,
-                           out->data[0]+(out->linesize[0]*out->height*5)/4, out->width/2, out->height/2, out->linesize[0]/2,
+                           in->data[2], in->width/2, in->height/2, in->linesize[0]/2,
+                           out->data[2], out->width/2, out->height/2, out->linesize[0]/2,
                            1);
         break;
     case AV_PIX_FMT_YUV444P:
@@ -404,12 +404,12 @@ static int scalecuda_resize(AVFilterContext *ctx,
                            out->data[0], out->width, out->height, out->linesize[0],
                            1);
         call_resize_kernel(ctx, s->cu_func_uchar, 1,
-                           in->data[0]+in->linesize[0]*in->height, in->width, in->height, in->linesize[0],
-                           out->data[0]+out->linesize[0]*out->height, out->width, out->height, out->linesize[0],
+                           in->data[1], in->width, in->height, in->linesize[0],
+                           out->data[1], out->width, out->height, out->linesize[0],
                            1);
         call_resize_kernel(ctx, s->cu_func_uchar, 1,
-                           in->data[0]+in->linesize[0]*in->height*2, in->width, in->height, in->linesize[0],
-                           out->data[0]+out->linesize[0]*out->height*2, out->width, out->height, out->linesize[0],
+                           in->data[2], in->width, in->height, in->linesize[0],
+                           out->data[2], out->width, out->height, out->linesize[0],
                            1);
         break;
     case AV_PIX_FMT_YUV444P16:
@@ -433,7 +433,7 @@ static int scalecuda_resize(AVFilterContext *ctx,
                            1);
         call_resize_kernel(ctx, s->cu_func_uchar2, 2,
                            in->data[1], in->width/2, in->height/2, in->linesize[1],
-                           out->data[0] + out->linesize[0] * ((out->height + 31) & ~0x1f), out->width/2, out->height/2, out->linesize[1]/2,
+                           out->data[1], out->width/2, out->height/2, out->linesize[1]/2,
                            1);
         break;
     case AV_PIX_FMT_P010LE:
@@ -443,7 +443,7 @@ static int scalecuda_resize(AVFilterContext *ctx,
                            2);
         call_resize_kernel(ctx, s->cu_func_ushort2, 2,
                            in->data[1], in->width / 2, in->height / 2, in->linesize[1]/2,
-                           out->data[0] + out->linesize[0] * ((out->height + 31) & ~0x1f), out->width / 2, out->height / 2, out->linesize[1] / 4,
+                           out->data[1], out->width / 2, out->height / 2, out->linesize[1] / 4,
                            2);
         break;
     case AV_PIX_FMT_P016LE:
@@ -453,7 +453,7 @@ static int scalecuda_resize(AVFilterContext *ctx,
                            2);
         call_resize_kernel(ctx, s->cu_func_ushort2, 2,
                            in->data[1], in->width / 2, in->height / 2, in->linesize[1] / 2,
-                           out->data[0] + out->linesize[0] * ((out->height + 31) & ~0x1f), out->width / 2, out->height / 2, out->linesize[1] / 4,
+                           out->data[1], out->width / 2, out->height / 2, out->linesize[1] / 4,
                            2);
         break;
     default: