From patchwork Wed Apr 10 03:37:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jarek Samic
X-Patchwork-Id: 12681
From: Jarek Samic
To: ffmpeg-devel@ffmpeg.org
Cc: Jarek Samic
Date: Tue, 9 Apr 2019 23:37:20 -0400
Message-Id: <20190410033720.21172-1-cldfire3@gmail.com>
X-Mailer: git-send-email 2.21.0
Subject: [FFmpeg-devel] [PATCH] lavfi: add colorkey_opencl filter

This is a direct port of the CPU filter.

Signed-off-by: Jarek Samic
---
This is my submission for the GSoC OpenCL video filters project
qualification task.

Command you can use to try it out:

./ffmpeg -i some_video -i some_img -init_hw_device opencl=gpu -filter_hw_device gpu -filter_complex "[0:v]format=rgba, hwupload, colorkey_opencl=yellow:0.4:0.2, hwdownload, format=rgba[over];[1:v][over]overlay" output

Based on simple observation of that command running vs.
an equivalent one with the CPU colorkey filter, the OpenCL version appears
to be ~10-20% faster, including the overhead of uploading to and downloading
from the GPU (at least with my test input and hardware).

You will notice that I am using overlay rather than overlay_opencl above. I
am not sure what's going on, but the overlay_opencl filter is not working for
me: every time I try to use it, it either stops after the first couple of
frames or spits out duplicate frames forever. Even this command, which is
basically a copy of the example command to overlay an image logo on the
top-left corner of an input video:

./ffmpeg -i ../video.mp4 -i ../img.png -init_hw_device opencl=gpu -filter_hw_device gpu -filter_complex "[0:v]hwupload[a], [1:v]format=yuv420p, hwupload[b], [a][b]overlay_opencl, hwdownload, format=yuv420p" ../vid_test.mp4

results in an infinite number of duplicate frames. (The format of the input
video is yuv420p, to be clear.)

Before I take the time to dig in and investigate what's going on, is anyone
else aware of what could be causing this (or of any existing known issues)?
 configure                        |   1 +
 doc/filters.texi                 |  33 +++++
 libavfilter/Makefile             |   2 +
 libavfilter/allfilters.c         |   1 +
 libavfilter/opencl/colorkey.cl   |  45 ++++++
 libavfilter/opencl_source.h      |   1 +
 libavfilter/vf_colorkey_opencl.c | 234 +++++++++++++++++++++++++++++++
 7 files changed, 317 insertions(+)
 create mode 100644 libavfilter/opencl/colorkey.cl
 create mode 100644 libavfilter/vf_colorkey_opencl.c

diff --git a/configure b/configure
index f6123f53e5..a4dd9ee167 100755
--- a/configure
+++ b/configure
@@ -3410,6 +3410,7 @@ boxblur_filter_deps="gpl"
 boxblur_opencl_filter_deps="opencl gpl"
 bs2b_filter_deps="libbs2b"
 colormatrix_filter_deps="gpl"
+colorkey_opencl_filter_deps="opencl"
 convolution_opencl_filter_deps="opencl"
 convolve_filter_deps="avcodec"
 convolve_filter_select="fft"
diff --git a/doc/filters.texi b/doc/filters.texi
index 867607d870..390c8b97cf 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -19030,6 +19030,39 @@ Apply erosion filter with threshold0 set to 30, threshold1 set 40, threshold2 se
 @end example
 @end itemize
 
+@section colorkey_opencl
+RGB colorspace color keying.
+
+The filter accepts the following options:
+
+@table @option
+@item color
+The color which will be replaced with transparency.
+
+@item similarity
+Similarity percentage with the key color.
+
+0.01 matches only the exact key color, while 1.0 matches everything.
+
+@item blend
+Blend percentage.
+
+0.0 makes pixels either fully transparent, or not transparent at all.
+
+Higher values result in semi-transparent pixels, with a higher transparency
+the more similar the pixel's color is to the key color.
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Make every semi-green pixel in the input transparent with some slight blending:
+@example
+-i INPUT -vf "hwupload, colorkey_opencl=green:0.3:0.1, hwdownload" OUTPUT
+@end example
+@end itemize
+
 @section overlay_opencl
 
 Overlay one video on top of another.
 
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index fef6ec5c55..9589dd8747 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -176,6 +176,8 @@ OBJS-$(CONFIG_CODECVIEW_FILTER)              += vf_codecview.o
 OBJS-$(CONFIG_COLORBALANCE_FILTER)           += vf_colorbalance.o
 OBJS-$(CONFIG_COLORCHANNELMIXER_FILTER)      += vf_colorchannelmixer.o
 OBJS-$(CONFIG_COLORKEY_FILTER)               += vf_colorkey.o
+OBJS-$(CONFIG_COLORKEY_OPENCL_FILTER)        += vf_colorkey_opencl.o opencl.o \
+                                                opencl/colorkey.o
 OBJS-$(CONFIG_COLORLEVELS_FILTER)            += vf_colorlevels.o
 OBJS-$(CONFIG_COLORMATRIX_FILTER)            += vf_colormatrix.o
 OBJS-$(CONFIG_COLORSPACE_FILTER)             += vf_colorspace.o colorspace.o colorspacedsp.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index c51ae0f3c7..ff4eb5bf6b 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -165,6 +165,7 @@ extern AVFilter ff_vf_codecview;
 extern AVFilter ff_vf_colorbalance;
 extern AVFilter ff_vf_colorchannelmixer;
 extern AVFilter ff_vf_colorkey;
+extern AVFilter ff_vf_colorkey_opencl;
 extern AVFilter ff_vf_colorlevels;
 extern AVFilter ff_vf_colormatrix;
 extern AVFilter ff_vf_colorspace;
diff --git a/libavfilter/opencl/colorkey.cl b/libavfilter/opencl/colorkey.cl
new file mode 100644
index 0000000000..5d8a0bb8df
--- /dev/null
+++ b/libavfilter/opencl/colorkey.cl
@@ -0,0 +1,45 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+__kernel void colorkey(
+    __read_only  image2d_t src,
+    __write_only image2d_t dst,
+    uchar4 colorkey_rgba,
+    float similarity,
+    float blend
+) {
+    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE |
+                              CLK_FILTER_NEAREST;
+    int2 loc = (int2)(get_global_id(0), get_global_id(1));
+    float4 pixel = read_imagef(src, sampler, loc);
+
+    float dr = pixel.s0 - (float)colorkey_rgba.s0 / 255.0f;
+    float dg = pixel.s1 - (float)colorkey_rgba.s1 / 255.0f;
+    float db = pixel.s2 - (float)colorkey_rgba.s2 / 255.0f;
+
+    float diff = sqrt(dr * dr + dg * dg + db * db);
+
+    if (blend > 0.0001f) {
+        pixel.s3 = clamp((diff - similarity) / blend, 0.0f, 1.0f);
+    } else {
+        pixel.s3 = (diff > similarity) ? 1.0f : 0.0f;
+    }
+
+    write_imagef(dst, loc, pixel);
+}
+
diff --git a/libavfilter/opencl_source.h b/libavfilter/opencl_source.h
index 4118138c30..51f7178cf2 100644
--- a/libavfilter/opencl_source.h
+++ b/libavfilter/opencl_source.h
@@ -20,6 +20,7 @@
 #define AVFILTER_OPENCL_SOURCE_H
 
 extern const char *ff_opencl_source_avgblur;
+extern const char *ff_opencl_source_colorkey;
 extern const char *ff_opencl_source_colorspace_common;
 extern const char *ff_opencl_source_convolution;
 extern const char *ff_opencl_source_neighbor;
diff --git a/libavfilter/vf_colorkey_opencl.c b/libavfilter/vf_colorkey_opencl.c
new file mode 100644
index 0000000000..4769c529b0
--- /dev/null
+++ b/libavfilter/vf_colorkey_opencl.c
@@ -0,0 +1,234 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/opt.h"
+#include "libavutil/imgutils.h"
+#include "avfilter.h"
+#include "formats.h"
+#include "internal.h"
+#include "opencl.h"
+#include "opencl_source.h"
+#include "video.h"
+
+typedef struct ColorkeyOpenCLContext {
+    OpenCLFilterContext ocf;
+    // Whether or not the above `OpenCLFilterContext` has been initialized
+    int initialized;
+
+    cl_command_queue command_queue;
+    cl_kernel kernel_colorkey;
+
+    // The color we are supposed to replace with transparency
+    uint8_t colorkey_rgba[4];
+    // Similarity percentage compared to `colorkey_rgba`, ranging from `0.01` to `1.0`
+    // where `0.01` matches only the key color and `1.0` matches all colors
+    float similarity;
+    // Blending percentage where `0.0` makes pixels either fully transparent or
+    // fully opaque, and higher values make the transparency vary with how
+    // similar the pixel is to the key color
+    float blend;
+} ColorkeyOpenCLContext;
+
+static int colorkey_opencl_init(AVFilterContext* avctx)
+{
+    ColorkeyOpenCLContext *ctx = avctx->priv;
+    cl_int cle;
+    int err;
+
+    err = ff_opencl_filter_load_program(avctx, &ff_opencl_source_colorkey, 1);
+    if (err < 0)
+        goto fail;
+
+    ctx->command_queue = clCreateCommandQueue(
+        ctx->ocf.hwctx->context,
+        ctx->ocf.hwctx->device_id,
+        0, &cle
+    );
+
+    CL_FAIL_ON_ERROR(AVERROR(EIO), "Failed to create OpenCL "
+                     "command queue %d.\n", cle);
+
+    ctx->kernel_colorkey = clCreateKernel(ctx->ocf.program, "colorkey", &cle);
+    CL_FAIL_ON_ERROR(AVERROR(EIO), "Failed to create colorkey "
+                     "kernel %d.\n", cle);
+
+    ctx->initialized = 1;
+    return 0;
+
+fail:
+    if (ctx->command_queue)
+        clReleaseCommandQueue(ctx->command_queue);
+    if (ctx->kernel_colorkey)
+        clReleaseKernel(ctx->kernel_colorkey);
+    return err;
+}
+
+static int filter_frame(AVFilterLink* link, AVFrame* input_frame)
+{
+    AVFilterContext* avctx = link->dst;
+    AVFilterLink* outlink = avctx->outputs[0];
+    ColorkeyOpenCLContext* colorkey_ctx = avctx->priv;
+    AVFrame* output_frame = NULL;
+    int err;
+    cl_int cle;
+    size_t global_work[2];
+    cl_mem src, dst;
+
+    if (!input_frame->hw_frames_ctx)
+        return AVERROR(EINVAL);
+
+    if (!colorkey_ctx->initialized) {
+        AVHWFramesContext *input_frames_ctx =
+            (AVHWFramesContext*)input_frame->hw_frames_ctx->data;
+        int fmt = input_frames_ctx->sw_format;
+
+        // Make sure the input is a format we support
+        if (fmt != AV_PIX_FMT_ARGB &&
+            fmt != AV_PIX_FMT_RGBA &&
+            fmt != AV_PIX_FMT_ABGR &&
+            fmt != AV_PIX_FMT_BGRA &&
+            fmt != AV_PIX_FMT_NONE
+        ) {
+            av_log(avctx, AV_LOG_ERROR, "unsupported (non-RGB) format in colorkey_opencl.\n");
+            err = AVERROR(ENOSYS);
+            goto fail;
+        }
+
+        err = colorkey_opencl_init(avctx);
+        if (err < 0)
+            goto fail;
+    }
+
+    // This filter only operates on RGB data and we know that will be on the first plane
+    src = (cl_mem)input_frame->data[0];
+    output_frame = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+    if (!output_frame) {
+        err = AVERROR(ENOMEM);
+        goto fail;
+    }
+    dst = (cl_mem)output_frame->data[0];
+
+    CL_SET_KERNEL_ARG(colorkey_ctx->kernel_colorkey, 0, cl_mem, &src);
+    CL_SET_KERNEL_ARG(colorkey_ctx->kernel_colorkey, 1, cl_mem, &dst);
+    CL_SET_KERNEL_ARG(colorkey_ctx->kernel_colorkey, 2, cl_uchar4, &colorkey_ctx->colorkey_rgba);
+    CL_SET_KERNEL_ARG(colorkey_ctx->kernel_colorkey, 3, float, &colorkey_ctx->similarity);
+    CL_SET_KERNEL_ARG(colorkey_ctx->kernel_colorkey, 4, float, &colorkey_ctx->blend);
+
+    err = ff_opencl_filter_work_size_from_image(avctx, global_work, input_frame, 0, 0);
+    if (err < 0)
+        goto fail;
+
+    cle = clEnqueueNDRangeKernel(
+        colorkey_ctx->command_queue,
+        colorkey_ctx->kernel_colorkey,
+        2,
+        NULL,
+        global_work,
+        NULL,
+        0,
+        NULL,
+        NULL
+    );
+
+    CL_FAIL_ON_ERROR(AVERROR(EIO), "Failed to enqueue colorkey kernel: %d.\n", cle);
+
+    // Run queued kernel
+    cle = clFinish(colorkey_ctx->command_queue);
+    CL_FAIL_ON_ERROR(AVERROR(EIO), "Failed to finish command queue: %d.\n", cle);
+
+    err = av_frame_copy_props(output_frame, input_frame);
+    if (err < 0)
+        goto fail;
+
+    av_frame_free(&input_frame);
+
+    return ff_filter_frame(outlink, output_frame);
+
+fail:
+    clFinish(colorkey_ctx->command_queue);
+    av_frame_free(&input_frame);
+    av_frame_free(&output_frame);
+    return err;
+}
+
+static av_cold void colorkey_opencl_uninit(AVFilterContext* avctx)
+{
+    ColorkeyOpenCLContext *ctx = avctx->priv;
+    cl_int cle;
+
+    if (ctx->kernel_colorkey) {
+        cle = clReleaseKernel(ctx->kernel_colorkey);
+        if (cle != CL_SUCCESS)
+            av_log(avctx, AV_LOG_ERROR, "Failed to release "
+                   "kernel: %d.\n", cle);
+    }
+
+    if (ctx->command_queue) {
+        cle = clReleaseCommandQueue(ctx->command_queue);
+        if (cle != CL_SUCCESS)
+            av_log(avctx, AV_LOG_ERROR, "Failed to release "
+                   "command queue: %d.\n", cle);
+    }
+
+    ff_opencl_filter_uninit(avctx);
+}
+
+static const AVFilterPad colorkey_opencl_inputs[] = {
+    {
+        .name = "default",
+        .type = AVMEDIA_TYPE_VIDEO,
+        .filter_frame = filter_frame,
+        .config_props = &ff_opencl_filter_config_input,
+    },
+    { NULL }
+};
+
+static const AVFilterPad colorkey_opencl_outputs[] = {
+    {
+        .name = "default",
+        .type = AVMEDIA_TYPE_VIDEO,
+        .config_props = &ff_opencl_filter_config_output,
+    },
+    { NULL }
+};
+
+#define OFFSET(x) offsetof(ColorkeyOpenCLContext, x)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
+
+static const AVOption colorkey_opencl_options[] = {
+    { "color", "set the colorkey key color", OFFSET(colorkey_rgba), AV_OPT_TYPE_COLOR, { .str = "black" }, CHAR_MIN, CHAR_MAX, FLAGS },
+    { "similarity", "set the colorkey similarity value", OFFSET(similarity), AV_OPT_TYPE_FLOAT, { .dbl = 0.01 }, 0.01, 1.0, FLAGS },
+    { "blend", "set the colorkey key blend value", OFFSET(blend), AV_OPT_TYPE_FLOAT, { .dbl = 0.0 }, 0.0, 1.0, FLAGS },
+    { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(colorkey_opencl);
+
+AVFilter ff_vf_colorkey_opencl = {
+    .name = "colorkey_opencl",
+    .description = NULL_IF_CONFIG_SMALL("Turns a certain color into transparency. Operates on RGB colors."),
+    .priv_size = sizeof(ColorkeyOpenCLContext),
+    .priv_class = &colorkey_opencl_class,
+    .init = &ff_opencl_filter_init,
+    .uninit = &colorkey_opencl_uninit,
+    .query_formats = &ff_opencl_filter_query_formats,
+    .inputs = colorkey_opencl_inputs,
+    .outputs = colorkey_opencl_outputs,
+    .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE
+};
+