From patchwork Mon Sep 30 23:01:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Koushik Dutta X-Patchwork-Id: 51948 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:a303:0:b0:48e:c0f8:d0de with SMTP id v3csp36619vqm; Mon, 30 Sep 2024 16:01:23 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCVz8l528wGchVTXfkMJbmOSzz930QdmpPmzbCJZRqmCx3rYgaY7Axaog0d+1WgzBBdVVV+mjn5gpK0toskPyUo5@gmail.com X-Google-Smtp-Source: AGHT+IHPdOPLn3UUAV3Eh96+j/L6iQF5WSv94zzP9N3WoqrWRB/SWQ4kHQfh3cKbGiJy13EK/l/6 X-Received: by 2002:a05:6512:3c8e:b0:536:54db:ddd0 with SMTP id 2adb3069b0e04-5389fbc6660mr6826078e87.0.1727737282787; Mon, 30 Sep 2024 16:01:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1727737282; cv=none; d=google.com; s=arc-20240605; b=h3cIaJo+Z5077FStc2Hxi/lEWK4sPlQTbL9Ga6OT2iNfwNMq94SnjtRxZ7lWweqjHZ 4sj1yXFMSpxDOw+du4xKUIywV+jPiWPXbzjQ8frryZGq4gFMk3ycoyQNvCOakMeaqGwJ 56+Dvyh1S0nxmWkzfIbYyQVfS0nUyyjBdjJM2G1T3r80CeweaWB1x4Lh8Aifl3jTTv4k /HUAJyOoHM0LUe8oh00TDpYP/2CWW0nAIzHCzfpiqHjJLIjWQd/Z/Ri/8yRplIP7diZ4 UdJZZzmhbtynP0UkXPOFaikFSf0Ed+yd3NUeeT6+GnFHVDTyRDMAfrAl8lgnUZjL5ygR x/3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:message-id:date:to:from :dkim-signature:delivered-to; bh=2dK2r4EX/rOq6iY1ytHwZeV8y0sC7OfYvN39sb8QqJM=; fh=vf1rgm8KpEMxcRKiIfYPC5TR7MMWdHSIxg51vjgwMpE=; b=RWl40PimbAEiDE/Z5v4sbQmEwHJXsZereyOXC9AtyIXnrKEZA7CuCbHpXECUdMTJFy QszfplUsxMXSyOFc/1Oj10H7fPM+chtlz2NPj0bJyfaVi36J9O3fONQcJ3MEQBbK9C3E RDhRT/R7M+daYpaxyRhRZWuazG0u1w4Wrp12eZgV11VO5SPMWRfluPe9eRhnpeK1C3mx 4D9gqICWiU+BGCWe9OsCTbyKW9BOOOlUqMV3AXolFKZYFrEksJvctVOV6PkX2SU2DYC1 qufNBYM60djc5vcej0+iDaS0KnOebOOYeUHSXvtn2t+4B3dbZR5tGxVGrDAD6r9DiLio W4IA==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=MafSoRRJ; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 2adb3069b0e04-538a043b107si3032058e87.355.2024.09.30.16.01.22; Mon, 30 Sep 2024 16:01:22 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=MafSoRRJ; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BA1E868D6C1; Tue, 1 Oct 2024 02:01:17 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f180.google.com (mail-pf1-f180.google.com [209.85.210.180]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 348F268D5A7 for ; Tue, 1 Oct 2024 02:01:10 +0300 (EEST) Received: by mail-pf1-f180.google.com with SMTP id d2e1a72fcca58-71979bf5e7aso3582719b3a.1 for ; Mon, 30 Sep 2024 16:01:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1727737268; x=1728342068; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=j3TvSPHFp9D4SWdFwMma6Aht+7B3okNtFTXJ3nMU05w=; b=MafSoRRJ2SVkxi+v296Noe2bPs3bpadq2FQyjj84OoxBhq9433lftxIKczxA3TyuI1 TcyQwkPOQRzFwfQv2duFBrz1+EOrDbp9i5U2uLrbClP6YvrDYPVTBkSESoYH5P3TUoUU vDGQeiCe4Zdxl3t3K64sJD5nRs5zKshG71atVmhgTpJyAT6dONcWFUxE9kw0WQdc1fBX bhArMPjkzSZ92BOYNmhTyz8EQHeGQfb1xwZdx55HtSDxKOtmKY/HTzf5RqUZJXDibKwe 6xhSGm0iKaHRhfKtpq1C1Zw502bYxTtG370W7Ns6SliWOu801WSgwZ7zQRcf6cY6mm3k vrIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727737268; x=1728342068; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=j3TvSPHFp9D4SWdFwMma6Aht+7B3okNtFTXJ3nMU05w=; b=hB+512mmiWRLvsGbYTMpHZCCTLCUBIR8qw3vRNS7ixAt0mT92tZ7fMqdNA5Uf9UxfC 6uY78b19RgVcmmM6pCX35XOsf9ySZ9Ym6KEx1pV8qofMawr9/eTpPkZ4vO/F7sgwZ6gD MpOfXV/EzJwSZRYHLJlOMc4SpJuQsHSIJG9QR5MRRH1amnB53RioEt9/uUUyv3/CuYgI YmqVlnVJ5UwuH355SHkL2g+sqxvbAnHH90NHtFeTix7LwZxHJexPVQTcKY1N6OJNDG7y vm0D3XSAn+SouxPC+3wz2pclBvm6d6BEg/vX96Hk+iiH3UwfPyNVR8dm0MgW7c5J+GRI MNyw== X-Gm-Message-State: AOJu0YygGiqoUMOhbbo3u4U62yUtLfweaswbv8tTlSako291bpdzxREc IXKbp0M7Mk59MCqYggqONWMpP2x10hJ2L5pTprp7BLzIZTuiDeT+LBUE/Q== X-Received: by 2002:a05:6a20:c70b:b0:1d3:294e:6c8d with SMTP id adf61e73a8af0-1d4fa6f9995mr20021389637.21.1727737267651; Mon, 30 Sep 2024 16:01:07 -0700 (PDT) Received: from Koushik-MacStudio.tail05204.ts.net ([2001:559:76c:0:fcbe:ac04:10b3:5a65]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71b264bec57sm6819479b3a.69.2024.09.30.16.01.06 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Mon, 30 Sep 2024 16:01:07 -0700 (PDT) From: Koushik Dutta To: ffmpeg-devel@ffmpeg.org Date: Mon, 30 Sep 2024 16:01:04 -0700 Message-Id: <20240930230104.43578-1-koushd@gmail.com> X-Mailer: git-send-email 2.39.5 (Apple Git-154) MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH] program_opencl: implement planar and format options X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Koushik Dutta Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: VaXfUgpjVjNS OpenCL kernels currently run in planar mode. The kernel is run once per plane. This change adds a new planar option which is enabled by default to preserve existing default behavior. Disabling the new planar option on program_opencl provides all image planes to a single invocation of the kernel. The plane index is omitted in this mode. The new format option allows setting the output format of the filter rather than assuming it is the same as the source. These two options allow implementing more complex kernels which can perform colorspace conversion as part of the kernel. Filter setup for nv12 to rgba: program_opencl=kernel=nv12torgba:format=rgba:planar=0:source=... Kernel that supports processing all planes on the input image: __kernel void nv12torgba(__write_only image2d_t output_image, __read_only image2d_t y_image, __read_only image2d_t uv_image) Signed-off-by: Koushik Dutta --- libavfilter/vf_program_opencl.c | 115 +++++++++++++++++++++++++------- 1 file changed, 90 insertions(+), 25 deletions(-) diff --git a/libavfilter/vf_program_opencl.c b/libavfilter/vf_program_opencl.c index f032400fbe..7490057c63 100644 --- a/libavfilter/vf_program_opencl.c +++ b/libavfilter/vf_program_opencl.c @@ -47,6 +47,8 @@ typedef struct ProgramOpenCLContext { int width, height; enum AVPixelFormat source_format; AVRational source_rate; + + int planar; } ProgramOpenCLContext; static int program_opencl_loaded(AVFilterContext *avctx) { @@ -106,6 +108,7 @@ static int program_opencl_run(AVFilterContext *avctx) size_t global_work[2]; cl_mem src, dst; int err, input, plane; + int planar_offset = 0; if (!ctx->loaded) { err = program_opencl_load(avctx); @@ -119,22 +122,73 @@ static int program_opencl_run(AVFilterContext *avctx) goto fail; } - for (plane = 0; plane < FF_ARRAY_ELEMS(output->data); plane++) { - dst = (cl_mem)output->data[plane]; - if (!dst) - break; + if (ctx->planar) { + for (plane = 0; plane < FF_ARRAY_ELEMS(output->data); plane++) { + dst = (cl_mem)output->data[plane]; + if (!dst) + break; - cle = clSetKernelArg(ctx->kernel, 0, sizeof(cl_mem), &dst); - if (cle != CL_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to set kernel " - "destination image argument: %d.\n", cle); - err = AVERROR_UNKNOWN; - goto fail; + cle = clSetKernelArg(ctx->kernel, 0, sizeof(cl_mem), &dst); + if (cle != CL_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to set kernel " + "destination image argument: %d.\n", cle); + err = AVERROR_UNKNOWN; + goto fail; + } + cle = clSetKernelArg(ctx->kernel, 1, sizeof(cl_uint), &ctx->index); + if (cle != CL_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to set kernel " + "index argument: %d.\n", cle); + err = AVERROR_UNKNOWN; + goto fail; + } + + for (input = 0; input < ctx->nb_inputs; input++) { + av_assert0(ctx->frames[input]); + + src = (cl_mem)ctx->frames[input]->data[plane]; + av_assert0(src); + + cle = clSetKernelArg(ctx->kernel, 2 + input, sizeof(cl_mem), &src); + if (cle != CL_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to set kernel " + "source image argument %d: %d.\n", input, cle); + err = AVERROR_UNKNOWN; + goto fail; + } + } + + err = ff_opencl_filter_work_size_from_image(avctx, global_work, + output, plane, 0); + if (err < 0) + goto fail; + + av_log(avctx, AV_LOG_DEBUG, "Run kernel on plane %d " + "(%"SIZE_SPECIFIER"x%"SIZE_SPECIFIER").\n", + plane, global_work[0], global_work[1]); + + cle = clEnqueueNDRangeKernel(ctx->command_queue, ctx->kernel, 2, NULL, + global_work, NULL, 0, NULL, NULL); + CL_FAIL_ON_ERROR(AVERROR(EIO), "Failed to enqueue kernel: %d.\n", cle); } - cle = clSetKernelArg(ctx->kernel, 1, sizeof(cl_uint), &ctx->index); + } + else { + for (plane = 0; plane < FF_ARRAY_ELEMS(output->data); plane++) { + dst = (cl_mem)output->data[plane]; + if (!dst) + break; + if (plane) { + av_log(avctx, AV_LOG_ERROR, "Kernel requires multiplanar output, " + "but planar option is unset.\n"); + return AVERROR(EINVAL); + } + } + + dst = (cl_mem)output->data[0]; + cle = clSetKernelArg(ctx->kernel, 0, sizeof(cl_mem), &dst); if (cle != CL_SUCCESS) { av_log(avctx, AV_LOG_ERROR, "Failed to set kernel " - "index argument: %d.\n", cle); + "destination image argument: %d.\n", cle); err = AVERROR_UNKNOWN; goto fail; } @@ -142,26 +196,29 @@ static int program_opencl_run(AVFilterContext *avctx) for (input = 0; input < ctx->nb_inputs; input++) { av_assert0(ctx->frames[input]); - src = (cl_mem)ctx->frames[input]->data[plane]; - av_assert0(src); - - cle = clSetKernelArg(ctx->kernel, 2 + input, sizeof(cl_mem), &src); - if (cle != CL_SUCCESS) { - av_log(avctx, AV_LOG_ERROR, "Failed to set kernel " - "source image argument %d: %d.\n", input, cle); - err = AVERROR_UNKNOWN; - goto fail; + for (plane = 0; plane < FF_ARRAY_ELEMS(output->data); plane++, planar_offset++) { + src = (cl_mem)ctx->frames[input]->data[plane]; + if (!src) + break; + + cle = clSetKernelArg(ctx->kernel, 1 + planar_offset, sizeof(cl_mem), &src); + if (cle != CL_SUCCESS) { + av_log(avctx, AV_LOG_ERROR, "Failed to set kernel " + "source image argument %d plane %d: %d.\n", input, plane, cle); + err = AVERROR_UNKNOWN; + goto fail; + } } } err = ff_opencl_filter_work_size_from_image(avctx, global_work, - output, plane, 0); + output, 0, 0); if (err < 0) goto fail; - av_log(avctx, AV_LOG_DEBUG, "Run kernel on plane %d " - "(%"SIZE_SPECIFIER"x%"SIZE_SPECIFIER").\n", - plane, global_work[0], global_work[1]); + av_log(avctx, AV_LOG_DEBUG, "Run kernel on all planes " + "(%"SIZE_SPECIFIER"x%"SIZE_SPECIFIER").\n", + global_work[0], global_work[1]); cle = clEnqueueNDRangeKernel(ctx->command_queue, ctx->kernel, 2, NULL, global_work, NULL, 0, NULL, NULL); @@ -306,6 +363,8 @@ static av_cold int program_opencl_init(AVFilterContext *avctx) if (err < 0) return err; } + + ctx->ocf.output_format = ctx->source_format; } return 0; @@ -374,6 +433,12 @@ static const AVOption program_opencl_options[] = { { "s", "Video size", OFFSET(width), AV_OPT_TYPE_IMAGE_SIZE, { .str = NULL }, 0, 0, FLAGS }, + { "format", "Pixel format for output framebuffer", + OFFSET(source_format), AV_OPT_TYPE_PIXEL_FMT, + { .i64 = AV_PIX_FMT_NONE }, -1, INT32_MAX, FLAGS }, + + {"planar", "Kernel will run once per plane or receive all planes as multiple inputs", OFFSET(planar), AV_OPT_TYPE_BOOL, {.i64=1}, 0, 1 }, + { NULL }, };