Message ID | 20240227163502.28268-1-gnattuoc@me.com |
---|---|
State | New |
Series | [FFmpeg-devel,v4] avfilter: add vf_overlay_videotoolbox |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
A ping for this, as it has already been a week.

> On Feb 28, 2024, at 00:35, gnattu via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> wrote:
>
> Overlay filter for VideoToolbox hwframes. Unlike most hardware
> overlay filters, this filter does not require the two inputs to
> have the same pixel format; instead, it will perform format
> conversion automatically with hardware accelerated methods.
>
> Signed-off-by: Gnattu OC <gnattuoc@me.com>
> ---
> Changes from v3:
>
> - Fixed an issue where 8-bit BGRA overlay frames were not correctly converted to 16-bit
> - Added a constraint on input pixel formats, as VideoToolbox cannot convert all of its hardware frames
>
>  Changelog                                    |   1 +
>  configure                                    |   1 +
>  doc/filters.texi                             |  52 ++
>  libavfilter/Makefile                         |   3 +
>  libavfilter/allfilters.c                     |   1 +
>  libavfilter/metal/utils.m                    |   7 +-
>  .../metal/vf_overlay_videotoolbox.metal      |  58 ++
>  libavfilter/vf_overlay_videotoolbox.m        | 609 ++++++++++++++++++
>  8 files changed, 730 insertions(+), 2 deletions(-)
>  create mode 100644 libavfilter/metal/vf_overlay_videotoolbox.metal
>  create mode 100644 libavfilter/vf_overlay_videotoolbox.m
>
> diff --git a/Changelog b/Changelog
> index 610ee61dd6..3ecfdab81b 100644
> --- a/Changelog
> +++ b/Changelog
> @@ -27,6 +27,7 @@ version <next>:
>  - a C11-compliant compiler is now required; note that this requirement
>    will be bumped to C17 in the near future, so consider updating your
>    build environment if it lacks C17 support
> +- VideoToolbox overlay filter
>
>  version 6.1:
>  - libaribcaption decoder
> diff --git a/configure b/configure
> index 23066efa32..a7c349d126 100755
> --- a/configure
> +++ b/configure
> @@ -3807,6 +3807,7 @@ overlay_qsv_filter_deps="libmfx"
>  overlay_qsv_filter_select="qsvvpp"
>  overlay_vaapi_filter_deps="vaapi VAProcPipelineCaps_blend_flags"
>  overlay_vulkan_filter_deps="vulkan spirv_compiler"
> +overlay_videotoolbox_filter_deps="metal corevideo coreimage videotoolbox"
>  owdenoise_filter_deps="gpl"
>  pad_opencl_filter_deps="opencl"
>  pan_filter_deps="swresample"
> diff --git a/doc/filters.texi b/doc/filters.texi
> index e0436a5755..bfb77562cb 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -19033,6 +19033,58 @@ See @ref{framesync}.
>
>  This filter also supports the @ref{framesync} options.
>
> +@section overlay_videotoolbox
> +
> +Overlay one video on top of another.
> +
> +This is the VideoToolbox variant of the @ref{overlay} filter.
> +It takes two inputs and has one output. The first input is the "main" video on which the second input is overlaid.
> +It only accepts VideoToolbox frames. The underlying input pixel formats do not have to match.
> +Different input pixel formats and color spaces will be automatically converted using hardware accelerated methods.
> +The final output will have the same pixel format and color space as the "main" input.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +
> +@item x
> +Set the x coordinate of the overlaid video on the main video.
> +Default value is @code{0}.
> +
> +@item y
> +Set the y coordinate of the overlaid video on the main video.
> +Default value is @code{0}.
> +
> +@item eof_action
> +See @ref{framesync}.
> +
> +@item shortest
> +See @ref{framesync}.
> +
> +@item repeatlast
> +See @ref{framesync}.
> +
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Overlay an image LOGO at the top-left corner of the INPUT video.
> +The INPUT video is in nv12 format and the LOGO image is in rgba format.
> +@example
> +-hwaccel videotoolbox -i INPUT -i LOGO -codec:v:0 h264_videotoolbox -filter_complex "[0:v]format=nv12,hwupload[a], [1:v]format=rgba,hwupload[b], [a][b]overlay_videotoolbox" OUTPUT
> +@end example
> +@item
> +Overlay an SDR video OVERLAY at the top-left corner of the HDR video MAIN.
> +The MAIN video is in p010 format and the OVERLAY video is in nv12 format.
> +The OUTPUT video will also be an HDR video with OVERLAY mapped to HDR.
> +@example
> +-hwaccel videotoolbox -i MAIN -i OVERLAY -codec:v:0 hevc_videotoolbox -tag:v hvc1 -filter_complex "[0:v]format=p010,hwupload[a], [1:v]format=nv12,hwupload[b], [a][b]overlay_videotoolbox" OUTPUT
> +@end example
> +
> +@end itemize
> +
>  @section owdenoise
>
>  Apply Overcomplete Wavelet denoiser.
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index f6c1d641d6..ea1389ab57 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -413,6 +413,9 @@ OBJS-$(CONFIG_OVERLAY_OPENCL_FILTER) += vf_overlay_opencl.o opencl.o \
>  OBJS-$(CONFIG_OVERLAY_QSV_FILTER) += vf_overlay_qsv.o framesync.o
>  OBJS-$(CONFIG_OVERLAY_VAAPI_FILTER) += vf_overlay_vaapi.o framesync.o vaapi_vpp.o
>  OBJS-$(CONFIG_OVERLAY_VULKAN_FILTER) += vf_overlay_vulkan.o vulkan.o vulkan_filter.o
> +OBJS-$(CONFIG_OVERLAY_VIDEOTOOLBOX_FILTER) += vf_overlay_videotoolbox.o framesync.o \
> +                                              metal/vf_overlay_videotoolbox.metallib.o \
> +                                              metal/utils.o
>  OBJS-$(CONFIG_OWDENOISE_FILTER) += vf_owdenoise.o
>  OBJS-$(CONFIG_PAD_FILTER) += vf_pad.o
>  OBJS-$(CONFIG_PAD_OPENCL_FILTER) += vf_pad_opencl.o opencl.o opencl/pad.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 149bf50997..ec9d975ecb 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -389,6 +389,7 @@ extern const AVFilter ff_vf_overlay_qsv;
>  extern const AVFilter ff_vf_overlay_vaapi;
>  extern const AVFilter ff_vf_overlay_vulkan;
>  extern const AVFilter ff_vf_overlay_cuda;
> +extern const AVFilter ff_vf_overlay_videotoolbox;
>  extern const AVFilter ff_vf_owdenoise;
>  extern const AVFilter ff_vf_pad;
>  extern const AVFilter ff_vf_pad_opencl;
> diff --git a/libavfilter/metal/utils.m b/libavfilter/metal/utils.m
> index f365d3ceea..db5c5f6f10 100644
> --- a/libavfilter/metal/utils.m
> +++ b/libavfilter/metal/utils.m
> @@ -55,6 +55,9 @@ CVMetalTextureRef ff_metal_texture_from_pixbuf(void *ctx,
>  {
>      CVMetalTextureRef tex = NULL;
>      CVReturn ret;
> +    bool is_planar = CVPixelBufferIsPlanar(pixbuf);
> +    size_t width = is_planar ? CVPixelBufferGetWidthOfPlane(pixbuf, plane) : CVPixelBufferGetWidth(pixbuf);
> +    size_t height = is_planar ? CVPixelBufferGetHeightOfPlane(pixbuf, plane) : CVPixelBufferGetHeight(pixbuf);
>
>      ret = CVMetalTextureCacheCreateTextureFromImage(
>          NULL,
> @@ -62,8 +65,8 @@ CVMetalTextureRef ff_metal_texture_from_pixbuf(void *ctx,
>          pixbuf,
>          NULL,
>          format,
> -        CVPixelBufferGetWidthOfPlane(pixbuf, plane),
> -        CVPixelBufferGetHeightOfPlane(pixbuf, plane),
> +        width,
> +        height,
>          plane,
>          &tex
>      );
> diff --git a/libavfilter/metal/vf_overlay_videotoolbox.metal b/libavfilter/metal/vf_overlay_videotoolbox.metal
> new file mode 100644
> index 0000000000..936e57e03e
> --- /dev/null
> +++ b/libavfilter/metal/vf_overlay_videotoolbox.metal
> @@ -0,0 +1,58 @@
> +/*
> + * Copyright (C) 2024 Gnattu OC <gnattuoc@me.com>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <metal_stdlib>
> +#include <metal_integer>
> +#include <metal_texture>
> +
> +using namespace metal;
> +
> +struct mtlBlendParams {
> +    uint x_position;
> +    uint y_position;
> +};
> +
> +/*
> + * Blend shader for premultiplied alpha textures
> + */
> +kernel void blend_shader(
> +    texture2d<float, access::read> source [[ texture(0) ]],
> +    texture2d<float, access::read> mask [[ texture(1) ]],
> +    texture2d<float, access::write> dest [[ texture(2) ]],
> +    constant mtlBlendParams& params [[ buffer(3) ]],
> +    uint2 gid [[ thread_position_in_grid ]])
> +{
> +    const auto mask_size = uint2(mask.get_width(),
> +                                 mask.get_height());
> +    const auto loc_overlay = uint2(params.x_position, params.y_position);
> +    if (gid.x < loc_overlay.x ||
> +        gid.y < loc_overlay.y ||
> +        gid.x >= mask_size.x + loc_overlay.x ||
> +        gid.y >= mask_size.y + loc_overlay.y)
> +    {
> +        float4 source_color = source.read(gid);
> +        dest.write(source_color, gid);
> +    } else {
> +        float4 source_color = source.read(gid);
> +        float4 mask_color = mask.read((gid - loc_overlay));
> +        float4 result_color = source_color * (1.0f - mask_color.w) + (mask_color * mask_color.w);
> +        dest.write(result_color, gid);
> +    }
> +}
> diff --git a/libavfilter/vf_overlay_videotoolbox.m b/libavfilter/vf_overlay_videotoolbox.m
> new file mode 100644
> index 0000000000..a6614c1f55
> --- /dev/null
> +++ b/libavfilter/vf_overlay_videotoolbox.m
> @@ -0,0 +1,609 @@
> +/*
> + * Copyright (C) 2024 Gnattu OC <gnattuoc@me.com>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <CoreImage/CoreImage.h>
> +#include <VideoToolbox/VideoToolbox.h>
> +#include "internal.h"
> +#include "metal/utils.h"
> +#include "framesync.h"
> +#include "libavutil/hwcontext.h"
> +#include "libavutil/pixdesc.h"
> +#include "libavutil/opt.h"
> +#include "libavutil/objc.h"
> +#include "video.h"
> +
> +#include <assert.h>
> +
> +extern char ff_vf_overlay_videotoolbox_metallib_data[];
> +extern unsigned int ff_vf_overlay_videotoolbox_metallib_len;
> +
> +// Although iOS 8.0 introduced basic Metal support, its feature set is not complete and it does not have CoreImage compatibility.
> +// We have to set the minimum iOS version to 9.0.
> +typedef struct API_AVAILABLE(macos(10.11), ios(9.0)) OverlayVideoToolboxContext {
> +    AVBufferRef *device_ref;
> +    FFFrameSync fs;
> +
> +    CVMetalTextureCacheRef texture_cache;
> +    CVPixelBufferRef input_main_pixel_buffer_cache;
> +    CVPixelBufferRef input_overlay_pixel_buffer_cache;
> +    CVPixelBufferRef output_pixel_buffer_cache;
> +    CIContext *ci_ctx;
> +    VTPixelTransferSessionRef vt_session;
> +
> +    id<MTLDevice> mtl_device;
> +    id<MTLLibrary> mtl_library;
> +    id<MTLCommandQueue> mtl_queue;
> +    id<MTLComputePipelineState> mtl_pipeline;
> +    id<MTLFunction> mtl_function;
> +    id<MTLBuffer> mtl_params_buffer;
> +
> +    uint x_position;
> +    uint y_position;
> +    uint hwframe_ctx_allocated;
> +} OverlayVideoToolboxContext API_AVAILABLE(macos(10.11), ios(9.0));
> +
> +typedef struct MtlBlendParams {
> +    uint x_position;
> +    uint y_position;
> +} MtlBlendParams;
> +
> +// Using sizeof(OverlayVideoToolboxContext) without an availability check will error
> +// if we're targeting an older OS version, so we need to calculate the size ourselves
> +// (we'll statically verify it's correct in overlay_videotoolbox_init behind a check)
> +#define OVERLAY_VT_CTX_SIZE (sizeof(FFFrameSync) + sizeof(uint) * 3 + sizeof(void*) * 13 + 4)
> +
> +// Neither VideoToolbox nor CoreImage can convert YUV420P frames into 16-bit depth color formats.
> +// Additionally, the only hardware formats that support an Alpha channel are AYUV64 and BGRA.
> +// However, neither can be directly manipulated with YUV420P frames.
> +// In such cases, the user will have to use NV12 instead.
> +static const enum AVPixelFormat supported_main_formats[] = {
> +    AV_PIX_FMT_NV12,
> +    AV_PIX_FMT_P010,
> +    AV_PIX_FMT_NONE,
> +};
> +
> +static const enum AVPixelFormat supported_overlay_formats[] = {
> +    AV_PIX_FMT_NV12,
> +    AV_PIX_FMT_P010,
> +    AV_PIX_FMT_AYUV64,
> +    AV_PIX_FMT_BGRA,
> +    AV_PIX_FMT_NONE,
> +};
> +
> +/**
> + * Helper to find out if provided format is supported by filter
> + */
> +static int format_is_supported(const enum AVPixelFormat formats[], enum AVPixelFormat fmt)
> +{
> +    for (int i = 0; formats[i] != AV_PIX_FMT_NONE; i++)
> +        if (formats[i] == fmt)
> +            return 1;
> +    return 0;
> +}
> +
> +static void call_kernel(AVFilterContext *avctx,
> +                        id<MTLTexture> dst,
> +                        id<MTLTexture> main,
> +                        id<MTLTexture> overlay,
> +                        uint x_position,
> +                        uint y_position) API_AVAILABLE(macos(10.11), ios(9.0))
> +{
> +    OverlayVideoToolboxContext *ctx = avctx->priv;
> +    // Both the command buffer and encoder are auto-released by objc by default.
> +    // Use CFBridgingRetain to get a more C-like behavior.
> +    id<MTLCommandBuffer> buffer = CFBridgingRetain(ctx->mtl_queue.commandBuffer);
> +    id<MTLComputeCommandEncoder> encoder = CFBridgingRetain((__bridge id<MTLCommandBuffer>)buffer.computeCommandEncoder);
> +
> +    MtlBlendParams *params = (MtlBlendParams *)ctx->mtl_params_buffer.contents;
> +    *params = (MtlBlendParams){
> +        .x_position = x_position,
> +        .y_position = y_position,
> +    };
> +
> +    [(__bridge id<MTLComputeCommandEncoder>)encoder setTexture: main atIndex: 0];
> +    [(__bridge id<MTLComputeCommandEncoder>)encoder setTexture: overlay atIndex: 1];
> +    [(__bridge id<MTLComputeCommandEncoder>)encoder setTexture: dst atIndex: 2];
> +    [(__bridge id<MTLComputeCommandEncoder>)encoder setBuffer: ctx->mtl_params_buffer offset: 0 atIndex: 3];
> +    ff_metal_compute_encoder_dispatch(ctx->mtl_device, ctx->mtl_pipeline, (__bridge id<MTLComputeCommandEncoder>)encoder, dst.width, dst.height);
> +    [(__bridge id<MTLComputeCommandEncoder>)encoder endEncoding];
> +
> +    [(__bridge id<MTLCommandBuffer>)buffer commit];
> +    [(__bridge id<MTLCommandBuffer>)buffer waitUntilCompleted];
> +
> +    ff_objc_release(&encoder);
> +    ff_objc_release(&buffer);
> +}
> +
> +// Copies and/or converts one pixel buffer to another.
> +// This transparently handles pixel format and color spaces, and will do a conversion if needed.
> +static int transfer_pixel_buffer(OverlayVideoToolboxContext *ctx, CVPixelBufferRef source, CVPixelBufferRef destination)
> +{
> +    if (@available(macOS 10.8, iOS 16.0, *)) {
> +        int ret = 0;
> +        ret = VTPixelTransferSessionTransferImage(ctx->vt_session, source, destination);
> +        if (ret < 0)
> +            return ret;
> +    } else {
> +        CIImage *temp_image = NULL;
> +        temp_image = CFBridgingRetain([CIImage imageWithCVPixelBuffer: source]);
> +        [(__bridge CIContext*)ctx->ci_ctx render: (__bridge CIImage*)temp_image toCVPixelBuffer: destination];
> +        CFRelease(temp_image);
> +        CVBufferPropagateAttachments(source, destination);
> +    }
> +    return 0;
> +}
> +
> +static int overlay_vt_blend(FFFrameSync *fs) API_AVAILABLE(macos(10.11), ios(9.0))
> +{
> +    AVFilterContext *avctx = fs->parent;
> +    OverlayVideoToolboxContext *ctx = avctx->priv;
> +    AVFilterLink *outlink = avctx->outputs[0];
> +    AVFilterLink *inlink_main = avctx->inputs[0];
> +    AVFilterLink *inlink_overlay = avctx->inputs[1];
> +    AVFrame *input_main, *input_overlay;
> +    AVFrame *output;
> +    AVHWFramesContext *frames_ctx = (AVHWFramesContext*)inlink_main->hw_frames_ctx->data;
> +    AVHWFramesContext *frames_ctx_overlay = (AVHWFramesContext*)inlink_overlay->hw_frames_ctx->data;
> +    const AVPixFmtDescriptor *in_overlay_desc;
> +
> +    CVMetalTextureRef main, dst, overlay;
> +    id<MTLTexture> tex_main, tex_overlay, tex_dst;
> +
> +    MTLPixelFormat mtl_format;
> +    OSType cv_format;
> +    int ret;
> +    int i, overlay_planes = 0;
> +
> +    in_overlay_desc = av_pix_fmt_desc_get(frames_ctx_overlay->sw_format);
> +    if (@available(macOS 11.3, iOS 14.2, *)) {
> +        mtl_format = MTLPixelFormatRGBA16Unorm;
> +        cv_format = kCVPixelFormatType_64RGBALE;
> +    } else {
> +        // On older OS versions, 64-bit RGBA with 16-bit little-endian full-range samples is not supported.
> +        // To handle inputs with color depth greater than 8, convert colors to float type during filtering on these versions.
> +        mtl_format = MTLPixelFormatRGBA16Float;
> +        cv_format = kCVPixelFormatType_64RGBAHalf;
> +    }
> +
> +    // read main and overlay frames from inputs
> +    ret = ff_framesync_get_frame(fs, 0, &input_main, 0);
> +    if (ret < 0)
> +        return ret;
> +    ret = ff_framesync_get_frame(fs, 1, &input_overlay, 0);
> +    if (ret < 0)
> +        return ret;
> +
> +    if (!input_main)
> +        return AVERROR_BUG;
> +
> +    output = ff_get_video_buffer(outlink, outlink->w, outlink->h);
> +    if (!output)
> +        return AVERROR(ENOMEM);
> +
> +    ret = av_frame_copy_props(output, input_main);
> +    if (ret < 0)
> +        return ret;
> +
> +    if (!input_overlay) {
> +        ret = transfer_pixel_buffer(ctx, (CVPixelBufferRef)input_main->data[3], (CVPixelBufferRef)output->data[3]);
> +        if (ret < 0)
> +            return ret;
> +        return ff_filter_frame(outlink, output);
> +    }
> +
> +    for (i = 0; i < in_overlay_desc->nb_components; i++)
> +        overlay_planes = FFMAX(overlay_planes,
> +                               in_overlay_desc->comp[i].plane + 1);
> +
> +    if (!ctx->input_overlay_pixel_buffer_cache) {
> +        ret = CVPixelBufferCreate(kCFAllocatorDefault,
> +                                  CVPixelBufferGetWidthOfPlane((CVPixelBufferRef)input_overlay->data[3], 0),
> +                                  CVPixelBufferGetHeightOfPlane((CVPixelBufferRef)input_overlay->data[3], 0),
> +                                  cv_format,
> +                                  (__bridge CFDictionaryRef)@{
> +                                      (NSString *)kCVPixelBufferCGImageCompatibilityKey: @(YES),
> +                                      (NSString *)kCVPixelBufferMetalCompatibilityKey: @(YES)
> +                                  },
> +                                  &ctx->input_overlay_pixel_buffer_cache);
> +        if (ret < 0)
> +            return ret;
> +    }
> +
> +    if (!ctx->input_main_pixel_buffer_cache) {
> +        ret = CVPixelBufferCreate(kCFAllocatorDefault,
> +                                  CVPixelBufferGetWidthOfPlane((CVPixelBufferRef)input_main->data[3], 0),
> +                                  CVPixelBufferGetHeightOfPlane((CVPixelBufferRef)input_main->data[3], 0),
> +                                  cv_format,
> +                                  (__bridge CFDictionaryRef)@{
> +                                      (NSString *)kCVPixelBufferCGImageCompatibilityKey: @(YES),
> +                                      (NSString *)kCVPixelBufferMetalCompatibilityKey: @(YES)
> +                                  },
> +                                  &ctx->input_main_pixel_buffer_cache);
> +        if (ret < 0)
> +            return ret;
> +    }
> +
> +    if (!ctx->output_pixel_buffer_cache) {
> +        ret = CVPixelBufferCreate(kCFAllocatorDefault,
> +                                  CVPixelBufferGetWidthOfPlane((CVPixelBufferRef)input_main->data[3], 0),
> +                                  CVPixelBufferGetHeightOfPlane((CVPixelBufferRef)input_main->data[3], 0),
> +                                  cv_format,
> +                                  (__bridge CFDictionaryRef)@{
> +                                      (NSString *)kCVPixelBufferCGImageCompatibilityKey: @(YES),
> +                                      (NSString *)kCVPixelBufferMetalCompatibilityKey: @(YES)
> +                                  },
> +                                  &ctx->output_pixel_buffer_cache);
> +        if (ret < 0)
> +            return ret;
> +    }
> +
> +    ret = transfer_pixel_buffer(ctx, (CVPixelBufferRef)input_main->data[3], ctx->input_main_pixel_buffer_cache);
> +    if (ret < 0)
> +        return ret;
> +
> +    ret = transfer_pixel_buffer(ctx, (CVPixelBufferRef)input_overlay->data[3], ctx->input_overlay_pixel_buffer_cache);
> +    if (ret < 0)
> +        return ret;
> +
> +    overlay = ff_metal_texture_from_pixbuf(avctx, ctx->texture_cache, ctx->input_overlay_pixel_buffer_cache, 0, mtl_format);
> +    main = ff_metal_texture_from_pixbuf(avctx, ctx->texture_cache, ctx->input_main_pixel_buffer_cache, 0, mtl_format);
> +    dst = ff_metal_texture_from_pixbuf(avctx, ctx->texture_cache, ctx->output_pixel_buffer_cache, 0, mtl_format);
> +
> +    if (!overlay || !main || !dst) {
> +        return AVERROR(ENOSYS);
> +    }
> +
> +    tex_main = CVMetalTextureGetTexture(main);
> +    tex_overlay = CVMetalTextureGetTexture(overlay);
> +    tex_dst = CVMetalTextureGetTexture(dst);
> +
> +    call_kernel(avctx, tex_dst, tex_main, tex_overlay, ctx->x_position, ctx->y_position);
> +
> +    ret = transfer_pixel_buffer(ctx, ctx->output_pixel_buffer_cache, (CVPixelBufferRef)output->data[3]);
> +    if (ret < 0) {
> +        CFRelease(main);
> +        CFRelease(overlay);
> +        CFRelease(dst);
> +        return ret;
> +    }
> +
> +    CFRelease(main);
> +    CFRelease(overlay);
> +    CFRelease(dst);
> +
> +    return ff_filter_frame(outlink, output);
> +}
> +
> +static av_cold void do_uninit(AVFilterContext *avctx) API_AVAILABLE(macos(10.11), ios(9.0))
> +{
> +    OverlayVideoToolboxContext *ctx = avctx->priv;
> +
> +    if (ctx->hwframe_ctx_allocated) {
> +        av_buffer_unref(&ctx->device_ref);
> +        ctx->hwframe_ctx_allocated = 0;
> +    }
> +    ff_framesync_uninit(&ctx->fs);
> +
> +    if (ctx->ci_ctx) {
> +        CFRelease(ctx->ci_ctx);
> +        ctx->ci_ctx = NULL;
> +    }
> +
> +    ff_objc_release(&ctx->mtl_params_buffer);
> +    ff_objc_release(&ctx->mtl_function);
> +    ff_objc_release(&ctx->mtl_pipeline);
> +    ff_objc_release(&ctx->mtl_queue);
> +    ff_objc_release(&ctx->mtl_library);
> +    ff_objc_release(&ctx->mtl_device);
> +
> +    if (ctx->texture_cache) {
> +        CFRelease(ctx->texture_cache);
> +        ctx->texture_cache = NULL;
> +    }
> +    if (ctx->input_main_pixel_buffer_cache) {
> +        CFRelease(ctx->input_main_pixel_buffer_cache);
> +        ctx->input_main_pixel_buffer_cache = NULL;
> +    }
> +    if (ctx->input_overlay_pixel_buffer_cache) {
> +        CFRelease(ctx->input_overlay_pixel_buffer_cache);
> +        ctx->input_overlay_pixel_buffer_cache = NULL;
> +    }
> +    if (ctx->output_pixel_buffer_cache) {
> +        CFRelease(ctx->output_pixel_buffer_cache);
> +        ctx->output_pixel_buffer_cache = NULL;
> +    }
> +    if (ctx->vt_session) {
> +        VTPixelTransferSessionInvalidate(ctx->vt_session);
> +        CFRelease(ctx->vt_session);
> +        ctx->vt_session = NULL;
> +    }
> +}
> +
> +static av_cold void overlay_videotoolbox_uninit(AVFilterContext *ctx)
> +{
> +    if (@available(macOS 10.11, iOS 9.0, *)) {
> +        do_uninit(ctx);
> +    }
> +}
> +
> +static av_cold int do_init(AVFilterContext *avctx) API_AVAILABLE(macos(10.11), ios(9.0))
> +{
> +    OverlayVideoToolboxContext *ctx = avctx->priv;
> +    NSError *err = nil;
> +    CVReturn ret;
> +    dispatch_data_t libData;
> +
> +    ctx->mtl_device = MTLCreateSystemDefaultDevice();
> +    if (!ctx->mtl_device) {
> +        av_log(avctx, AV_LOG_ERROR, "Unable to find Metal device\n");
> +        goto fail;
> +    }
> +
> +    av_log(ctx, AV_LOG_INFO, "Using Metal device: %s\n", ctx->mtl_device.name.UTF8String);
> +
> +    libData = dispatch_data_create(
> +        ff_vf_overlay_videotoolbox_metallib_data,
> +        ff_vf_overlay_videotoolbox_metallib_len,
> +        nil,
> +        nil);
> +
> +    ctx->mtl_library = [ctx->mtl_device newLibraryWithData: libData error: &err];
> +    dispatch_release(libData);
> +    libData = nil;
> +    ctx->mtl_function = [ctx->mtl_library newFunctionWithName: @"blend_shader"];
> +    if (!ctx->mtl_function) {
> +        av_log(avctx, AV_LOG_ERROR, "Failed to create Metal function!\n");
> +        goto fail;
> +    }
> +
> +    ctx->mtl_queue = ctx->mtl_device.newCommandQueue;
> +    if (!ctx->mtl_queue) {
> +        av_log(avctx, AV_LOG_ERROR, "Failed to create Metal command queue!\n");
> +        goto fail;
> +    }
> +
> +    ctx->mtl_pipeline = [ctx->mtl_device
> +        newComputePipelineStateWithFunction: ctx->mtl_function
> +        error: &err];
> +    if (err) {
> +        av_log(ctx, AV_LOG_ERROR, "Failed to create Metal compute pipeline: %s\n", err.description.UTF8String);
> +        goto fail;
> +    }
> +
> +    ctx->mtl_params_buffer = [ctx->mtl_device
> +        newBufferWithLength: sizeof(MtlBlendParams)
> +        options: MTLResourceStorageModeShared];
> +    if (!ctx->mtl_params_buffer) {
> +        av_log(avctx, AV_LOG_ERROR, "Failed to create Metal buffer for parameters\n");
> +        goto fail;
> +    }
> +
> +    ret = CVMetalTextureCacheCreate(
> +        NULL,
> +        NULL,
> +        ctx->mtl_device,
> +        NULL,
> +        &ctx->texture_cache
> +    );
> +    if (ret != kCVReturnSuccess) {
> +        av_log(avctx, AV_LOG_ERROR, "Failed to create CVMetalTextureCache: %d\n", ret);
> +        goto fail;
> +    }
> +
> +    if (@available(macOS 10.8, iOS 16.0, *)) {
> +        ret = VTPixelTransferSessionCreate(NULL, &ctx->vt_session);
> +        if (ret != kCVReturnSuccess) {
> +            av_log(avctx, AV_LOG_ERROR, "Failed to create VTPixelTransferSession: %d\n", ret);
> +            goto fail;
> +        }
> +    } else {
> +        // Use CoreImage as a fallback on old OS versions.
> +        // CoreImage has comparable performance to VTPixelTransferSession, but it supports fewer pixel formats than VTPixelTransferSession.
> +        // Warn the user about possibly incorrect results.
> +        av_log(avctx, AV_LOG_WARNING, "VTPixelTransferSessionTransferImage is not available on this OS version, falling back to CoreImage\n");
> +        av_log(avctx, AV_LOG_WARNING, "Try an overlay with BGRA format if you see no overlay\n");
> +        if (@available(macOS 10.15, iOS 13.0, *)) {
> +            ctx->ci_ctx = CFBridgingRetain([CIContext contextWithMTLCommandQueue: ctx->mtl_queue]);
> +        } else {
> +            ctx->ci_ctx = CFBridgingRetain([CIContext contextWithMTLDevice: ctx->mtl_device]);
> +        }
> +    }
> +
> +    ctx->fs.on_event = &overlay_vt_blend;
> +
> +    return 0;
> +fail:
> +    overlay_videotoolbox_uninit(avctx);
> +    return AVERROR_EXTERNAL;
> +}
> +
> +static av_cold int overlay_videotoolbox_init(AVFilterContext *ctx)
> +{
> +    if (@available(macOS 10.11, iOS 9.0, *)) {
> +        // Ensure we calculated OVERLAY_VT_CTX_SIZE correctly
> +        static_assert(OVERLAY_VT_CTX_SIZE == sizeof(OverlayVideoToolboxContext), "Incorrect OVERLAY_VT_CTX_SIZE value!");
> +        return do_init(ctx);
> +    } else {
> +        av_log(ctx, AV_LOG_ERROR, "Metal is not available on this OS version\n");
> +        return AVERROR(ENOSYS);
> +    }
> +}
> +
> +static int do_config_output(AVFilterLink *link) API_AVAILABLE(macos(10.11), ios(9.0))
> +{
> +    AVFilterContext *avctx = link->src;
> +    AVFilterLink *inlink_main = avctx->inputs[0];
> +    AVFilterLink *inlink_overlay = avctx->inputs[1];
> +    OverlayVideoToolboxContext *ctx = avctx->priv;
> +    AVHWFramesContext *main_frames, *output_frames, *overlay_frames;
> +    AVBufferRef *input_ref, *overlay_ref;
> +    int ret = 0;
> +
> +    if (!inlink_main->hw_frames_ctx ||
> +        !inlink_overlay->hw_frames_ctx) {
> +        av_log(avctx, AV_LOG_ERROR, "An input HW frames reference is "
> +               "required to associate the processing device.\n");
> +        return AVERROR(EINVAL);
> +    }
> +
> +    input_ref = inlink_main->hw_frames_ctx;
> +    overlay_ref = inlink_overlay->hw_frames_ctx;
> +    main_frames = (AVHWFramesContext*)input_ref->data;
> +    overlay_frames = (AVHWFramesContext*)overlay_ref->data;
> +    av_assert0(main_frames);
> +    av_assert0(overlay_frames);
> +
> +    if (!format_is_supported(supported_main_formats, main_frames->sw_format)) {
> +        av_log(ctx, AV_LOG_ERROR, "Unsupported main input format: %s.\n",
> +               av_get_pix_fmt_name(main_frames->sw_format));
> +        if (main_frames->sw_format == AV_PIX_FMT_YUV420P) {
> +            av_log(ctx, AV_LOG_WARNING, "Hint: Use %s instead of %s.\n",
> +                   av_get_pix_fmt_name(AV_PIX_FMT_NV12),
> +                   av_get_pix_fmt_name(AV_PIX_FMT_YUV420P));
> +        }
> +        return AVERROR(ENOSYS);
> +    }
> +
> +    if (!format_is_supported(supported_overlay_formats, overlay_frames->sw_format)) {
> +        av_log(ctx, AV_LOG_ERROR, "Unsupported overlay input format: %s.\n",
> +               av_get_pix_fmt_name(overlay_frames->sw_format));
> +        if (overlay_frames->sw_format == AV_PIX_FMT_YUV420P) {
> +            av_log(ctx, AV_LOG_WARNING, "Hint: Use %s instead of %s.\n",
> +                   av_get_pix_fmt_name(AV_PIX_FMT_NV12),
> +                   av_get_pix_fmt_name(AV_PIX_FMT_YUV420P));
> +        }
> +        return AVERROR(ENOSYS);
> +    }
> +
> +    ctx->device_ref = av_buffer_ref(main_frames->device_ref);
> +    if (!ctx->device_ref) {
> +        av_log(ctx, AV_LOG_ERROR, "Failed to create a device reference.\n");
> +        return AVERROR(ENOMEM);
> +    }
> +
> +    link->hw_frames_ctx = av_hwframe_ctx_alloc(ctx->device_ref);
> +    if (!link->hw_frames_ctx) {
> +        av_log(avctx, AV_LOG_ERROR, "Failed to create HW frame context "
> +               "for output.\n");
> +        ret = AVERROR(ENOMEM);
> +        return ret;
> +    }
> +    ctx->hwframe_ctx_allocated = 1;
> +
> +    output_frames = (AVHWFramesContext*)link->hw_frames_ctx->data;
> +
> +    output_frames->format = AV_PIX_FMT_VIDEOTOOLBOX;
> +    output_frames->sw_format = main_frames->sw_format;
> +    output_frames->width = inlink_main->w;
> +    output_frames->height = inlink_main->h;
> +
> +    ret = ff_filter_init_hw_frames(avctx, link, 1);
> +    if (ret < 0)
> +        return ret;
> +
> +    ret = av_hwframe_ctx_init(link->hw_frames_ctx);
> +    if (ret < 0) {
> +        av_log(avctx, AV_LOG_ERROR, "Failed to initialise VideoToolbox frame "
> +               "context for output: %d\n", ret);
> +        return ret;
> +    }
> +
> +    link->time_base = inlink_main->time_base;
> +    ctx->fs.time_base = link->time_base;
> +
> +    ret = ff_framesync_init_dualinput(&ctx->fs, avctx);
> +    if (ret < 0)
> +        return ret;
> +
> +    ret = ff_framesync_configure(&ctx->fs);
> +    return ret;
> +}
> +
> +static int config_output(AVFilterLink *link)
> +{
> +    AVFilterContext *ctx = link->src;
> +    if (@available(macOS 10.11, iOS 9.0, *)) {
> +        return do_config_output(link);
> +    } else {
> +        av_log(ctx, AV_LOG_ERROR, "Metal is not available on this OS version\n");
> +        return AVERROR(ENOSYS);
> +    }
> +}
> +
> +static int overlay_videotoolbox_activate(AVFilterContext *avctx)
> +{
> +    OverlayVideoToolboxContext *ctx = avctx->priv;
> +    return ff_framesync_activate(&ctx->fs);
> +}
> +
> +#define FLAGS (AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
> +#define OFFSET(x) offsetof(OverlayVideoToolboxContext, x)
> +
> +static const AVOption overlay_videotoolbox_options[] = {
> +    { "x", "Overlay x position",
> +      OFFSET(x_position), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, .flags = FLAGS },
> +    { "y", "Overlay y position",
> +      OFFSET(y_position), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, .flags = FLAGS },
> +    { "eof_action", "Action to take when encountering EOF from secondary input ",
> +      OFFSET(fs.opt_eof_action), AV_OPT_TYPE_INT, { .i64 = EOF_ACTION_REPEAT },
> +      EOF_ACTION_REPEAT, EOF_ACTION_PASS, .flags = FLAGS, .unit = "eof_action" },
> +    { "repeat", "Repeat the previous frame.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_REPEAT }, .flags = FLAGS, .unit = "eof_action" },
> +    { "endall", "End both streams.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_ENDALL }, .flags = FLAGS, .unit = "eof_action" },
> +    { "pass", "Pass through the main input.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_PASS }, .flags = FLAGS, .unit = "eof_action" },
> +    { "shortest", "force termination when the shortest input terminates", OFFSET(fs.opt_shortest), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS },
> +    { "repeatlast", "repeat overlay of the last overlay frame", OFFSET(fs.opt_repeatlast), AV_OPT_TYPE_BOOL, { .i64 = 1 }, 0, 1, FLAGS },
> +    { NULL },
> +};
> +
> +FRAMESYNC_DEFINE_CLASS(overlay_videotoolbox, OverlayVideoToolboxContext, fs);
> +
> +static const AVFilterPad overlay_videotoolbox_inputs[] = {
> +    {
> +        .name = "main",
> +        .type = AVMEDIA_TYPE_VIDEO,
> +    },
> +    {
> +        .name = "overlay",
> +        .type = AVMEDIA_TYPE_VIDEO,
> +    },
> +};
> +
> +static const AVFilterPad overlay_videotoolbox_outputs[] = {
> +    {
> +        .name = "default",
> +        .type = AVMEDIA_TYPE_VIDEO,
> +        .config_props = config_output,
> +    },
> +};
> +
> +const AVFilter ff_vf_overlay_videotoolbox = {
> +    .name = "overlay_videotoolbox",
> +    .description = NULL_IF_CONFIG_SMALL("Overlay filter for VideoToolbox frames using Metal compute"),
> +    .priv_size = OVERLAY_VT_CTX_SIZE,
> +    .priv_class = &overlay_videotoolbox_class,
> +    .init = overlay_videotoolbox_init,
> +    .uninit = overlay_videotoolbox_uninit,
> +    .activate = overlay_videotoolbox_activate,
> +    .preinit = overlay_videotoolbox_framesync_preinit,
> +    FILTER_SINGLE_PIXFMT(AV_PIX_FMT_VIDEOTOOLBOX),
> +    FILTER_INPUTS(overlay_videotoolbox_inputs),
> +    FILTER_OUTPUTS(overlay_videotoolbox_outputs),
> +    .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE,
> +};
> --
> 2.39.3 (Apple Git-145)
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff --git a/Changelog b/Changelog
index 610ee61dd6..3ecfdab81b 100644
--- a/Changelog
+++ b/Changelog
@@ -27,6 +27,7 @@ version <next>:
 - a C11-compliant compiler is now required; note that this requirement
   will be bumped to C17 in the near future, so consider updating your
   build environment if it lacks C17 support
+- VideoToolbox overlay filter
 
 version 6.1:
 - libaribcaption decoder
diff --git a/configure b/configure
index 23066efa32..a7c349d126 100755
--- a/configure
+++ b/configure
@@ -3807,6 +3807,7 @@ overlay_qsv_filter_deps="libmfx"
 overlay_qsv_filter_select="qsvvpp"
 overlay_vaapi_filter_deps="vaapi VAProcPipelineCaps_blend_flags"
 overlay_vulkan_filter_deps="vulkan spirv_compiler"
+overlay_videotoolbox_filter_deps="metal corevideo coreimage videotoolbox"
 owdenoise_filter_deps="gpl"
 pad_opencl_filter_deps="opencl"
 pan_filter_deps="swresample"
diff --git a/doc/filters.texi b/doc/filters.texi
index e0436a5755..bfb77562cb 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -19033,6 +19033,58 @@ See @ref{framesync}.
 
 This filter also supports the @ref{framesync} options.
 
+@section overlay_videotoolbox
+
+Overlay one video on top of another.
+
+This is the VideoToolbox variant of the @ref{overlay} filter.
+It takes two inputs and has one output. The first input is the "main" video on which the second input is overlaid.
+It only accepts VideoToolbox frames. The underlying input pixel formats do not have to match.
+Different input pixel formats and color spaces will be automatically converted using hardware accelerated methods.
+The final output will have the same pixel format and color space as the "main" input.
+
+The filter accepts the following options:
+
+@table @option
+
+@item x
+Set the x coordinate of the overlaid video on the main video.
+Default value is @code{0}.
+
+@item y
+Set the y coordinate of the overlaid video on the main video.
+Default value is @code{0}.
+
+@item eof_action
+See @ref{framesync}.
+
+@item shortest
+See @ref{framesync}.
+
+@item repeatlast
+See @ref{framesync}.
+
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Overlay an image LOGO at the top-left corner of the INPUT video.
+The INPUT video is in nv12 format and the LOGO image is in rgba format.
+@example
+-hwaccel videotoolbox -i INPUT -i LOGO -codec:v:0 h264_videotoolbox -filter_complex "[0:v]format=nv12,hwupload[a], [1:v]format=rgba,hwupload[b], [a][b]overlay_videotoolbox" OUTPUT
+@end example
+@item
+Overlay an SDR video OVERLAY at the top-left corner of the HDR video MAIN.
+The MAIN video is in p010 format and the OVERLAY video is in nv12 format.
+The OUTPUT video will also be an HDR video with OVERLAY mapped to HDR.
+@example
+-hwaccel videotoolbox -i MAIN -i OVERLAY -codec:v:0 hevc_videotoolbox -tag:v hvc1 -filter_complex "[0:v]format=p010,hwupload[a], [1:v]format=nv12,hwupload[b], [a][b]overlay_videotoolbox" OUTPUT
+@end example
+
+@end itemize
+
 @section owdenoise
 
 Apply Overcomplete Wavelet denoiser.
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index f6c1d641d6..ea1389ab57 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -413,6 +413,9 @@ OBJS-$(CONFIG_OVERLAY_OPENCL_FILTER)         += vf_overlay_opencl.o opencl.o \
 OBJS-$(CONFIG_OVERLAY_QSV_FILTER)            += vf_overlay_qsv.o framesync.o
 OBJS-$(CONFIG_OVERLAY_VAAPI_FILTER)          += vf_overlay_vaapi.o framesync.o vaapi_vpp.o
 OBJS-$(CONFIG_OVERLAY_VULKAN_FILTER)         += vf_overlay_vulkan.o vulkan.o vulkan_filter.o
+OBJS-$(CONFIG_OVERLAY_VIDEOTOOLBOX_FILTER)   += vf_overlay_videotoolbox.o framesync.o \
+                                                metal/vf_overlay_videotoolbox.metallib.o \
+                                                metal/utils.o
 OBJS-$(CONFIG_OWDENOISE_FILTER)              += vf_owdenoise.o
 OBJS-$(CONFIG_PAD_FILTER)                    += vf_pad.o
 OBJS-$(CONFIG_PAD_OPENCL_FILTER)             += vf_pad_opencl.o opencl.o opencl/pad.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 149bf50997..ec9d975ecb 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -389,6 +389,7 @@ extern const AVFilter ff_vf_overlay_qsv;
 extern const AVFilter ff_vf_overlay_vaapi;
 extern const AVFilter ff_vf_overlay_vulkan;
 extern const AVFilter ff_vf_overlay_cuda;
+extern const AVFilter ff_vf_overlay_videotoolbox;
 extern const AVFilter ff_vf_owdenoise;
 extern const AVFilter ff_vf_pad;
 extern const AVFilter ff_vf_pad_opencl;
diff --git a/libavfilter/metal/utils.m b/libavfilter/metal/utils.m
index f365d3ceea..db5c5f6f10 100644
--- a/libavfilter/metal/utils.m
+++ b/libavfilter/metal/utils.m
@@ -55,6 +55,9 @@ CVMetalTextureRef ff_metal_texture_from_pixbuf(void *ctx,
 {
     CVMetalTextureRef tex = NULL;
     CVReturn ret;
+    bool is_planar = CVPixelBufferIsPlanar(pixbuf);
+    size_t width  = is_planar ? CVPixelBufferGetWidthOfPlane(pixbuf, plane)  : CVPixelBufferGetWidth(pixbuf);
+    size_t height = is_planar ? CVPixelBufferGetHeightOfPlane(pixbuf, plane) : CVPixelBufferGetHeight(pixbuf);
 
     ret = CVMetalTextureCacheCreateTextureFromImage(
         NULL,
@@ -62,8 +65,8 @@ CVMetalTextureRef ff_metal_texture_from_pixbuf(void *ctx,
         pixbuf,
         NULL,
         format,
-        CVPixelBufferGetWidthOfPlane(pixbuf, plane),
-        CVPixelBufferGetHeightOfPlane(pixbuf, plane),
+        width,
+        height,
         plane,
         &tex
     );
diff --git a/libavfilter/metal/vf_overlay_videotoolbox.metal b/libavfilter/metal/vf_overlay_videotoolbox.metal
new file mode 100644
index 0000000000..936e57e03e
--- /dev/null
+++ b/libavfilter/metal/vf_overlay_videotoolbox.metal
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2024 Gnattu OC <gnattuoc@me.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <metal_stdlib>
+#include <metal_integer>
+#include <metal_texture>
+
+using namespace metal;
+
+struct mtlBlendParams {
+    uint x_position;
+    uint y_position;
+};
+
+/*
+ * Blend shader for premultiplied alpha textures
+ */
+kernel void blend_shader(
+    texture2d<float, access::read>  source [[ texture(0) ]],
+    texture2d<float, access::read>  mask   [[ texture(1) ]],
+    texture2d<float, access::write> dest   [[ texture(2) ]],
+    constant mtlBlendParams& params        [[ buffer(3) ]],
+    uint2 gid [[ thread_position_in_grid ]])
+{
+    const auto mask_size = uint2(mask.get_width(),
+                                 mask.get_height());
+    const auto loc_overlay = uint2(params.x_position, params.y_position);
+    if (gid.x < loc_overlay.x ||
+        gid.y < loc_overlay.y ||
+        gid.x >= mask_size.x + loc_overlay.x ||
+        gid.y >= mask_size.y + loc_overlay.y)
+    {
+        float4 source_color = source.read(gid);
+        dest.write(source_color, gid);
+    } else {
+        float4 source_color = source.read(gid);
+        float4 mask_color = mask.read((gid - loc_overlay));
+        float4 result_color = source_color * (1.0f - mask_color.w) + (mask_color * mask_color.w);
+        dest.write(result_color, gid);
+    }
+}
diff --git a/libavfilter/vf_overlay_videotoolbox.m b/libavfilter/vf_overlay_videotoolbox.m
new file mode 100644
index 0000000000..a6614c1f55
--- /dev/null
+++ b/libavfilter/vf_overlay_videotoolbox.m
@@ -0,0 +1,609 @@
+/*
+ * Copyright (C) 2024 Gnattu OC <gnattuoc@me.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <CoreImage/CoreImage.h>
+#include <VideoToolbox/VideoToolbox.h>
+#include "internal.h"
+#include "metal/utils.h"
+#include "framesync.h"
+#include "libavutil/hwcontext.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/opt.h"
+#include "libavutil/objc.h"
+#include "video.h"
+
+#include <assert.h>
+
+extern char ff_vf_overlay_videotoolbox_metallib_data[];
+extern unsigned int ff_vf_overlay_videotoolbox_metallib_len;
+
+// Although iOS 8.0 introduced basic Metal support, its feature set is not complete and does not have CoreImage compatibility.
+// We have to set the minimum iOS version to 9.0.
+typedef struct API_AVAILABLE(macos(10.11), ios(9.0)) OverlayVideoToolboxContext {
+    AVBufferRef *device_ref;
+    FFFrameSync fs;
+
+    CVMetalTextureCacheRef texture_cache;
+    CVPixelBufferRef input_main_pixel_buffer_cache;
+    CVPixelBufferRef input_overlay_pixel_buffer_cache;
+    CVPixelBufferRef output_pixel_buffer_cache;
+    CIContext *ci_ctx;
+    VTPixelTransferSessionRef vt_session;
+
+    id<MTLDevice> mtl_device;
+    id<MTLLibrary> mtl_library;
+    id<MTLCommandQueue> mtl_queue;
+    id<MTLComputePipelineState> mtl_pipeline;
+    id<MTLFunction> mtl_function;
+    id<MTLBuffer> mtl_params_buffer;
+
+    uint x_position;
+    uint y_position;
+    uint hwframe_ctx_allocated;
+} OverlayVideoToolboxContext API_AVAILABLE(macos(10.11), ios(9.0));
+
+typedef struct MtlBlendParams {
+    uint x_position;
+    uint y_position;
+} MtlBlendParams;
+
+// Using sizeof(OverlayVideoToolboxContext) without an availability check will error
+// if we're targeting an older OS version, so we need to calculate the size ourselves
+// (we'll statically verify it's correct in overlay_videotoolbox_init behind a check)
+#define OVERLAY_VT_CTX_SIZE (sizeof(FFFrameSync) + sizeof(uint) * 3 + sizeof(void*) * 13 + 4)
+
+// Neither VideoToolbox nor CoreImage can convert YUV420P frames into 16-bit depth color formats.
+// Additionally, the only hardware formats that support an Alpha channel are AYUV64 and BGRA.
+// However, neither can be directly manipulated with YUV420P frames.
+// In such cases, the user will have to use NV12 instead.
+static const enum AVPixelFormat supported_main_formats[] = {
+    AV_PIX_FMT_NV12,
+    AV_PIX_FMT_P010,
+    AV_PIX_FMT_NONE,
+};
+
+static const enum AVPixelFormat supported_overlay_formats[] = {
+    AV_PIX_FMT_NV12,
+    AV_PIX_FMT_P010,
+    AV_PIX_FMT_AYUV64,
+    AV_PIX_FMT_BGRA,
+    AV_PIX_FMT_NONE,
+};
+
+/**
+ * Helper to find out if the provided format is supported by the filter
+ */
+static int format_is_supported(const enum AVPixelFormat formats[], enum AVPixelFormat fmt)
+{
+    for (int i = 0; formats[i] != AV_PIX_FMT_NONE; i++)
+        if (formats[i] == fmt)
+            return 1;
+    return 0;
+}
+
+static void call_kernel(AVFilterContext *avctx,
+                        id<MTLTexture> dst,
+                        id<MTLTexture> main,
+                        id<MTLTexture> overlay,
+                        uint x_position,
+                        uint y_position) API_AVAILABLE(macos(10.11), ios(9.0))
+{
+    OverlayVideoToolboxContext *ctx = avctx->priv;
+    // Both the command buffer and encoder are autoreleased by objc by default.
+    // Use CFBridgingRetain to get a more C-like behavior.
+    id<MTLCommandBuffer> buffer = CFBridgingRetain(ctx->mtl_queue.commandBuffer);
+    id<MTLComputeCommandEncoder> encoder = CFBridgingRetain((__bridge id<MTLCommandBuffer>)buffer.computeCommandEncoder);
+
+    MtlBlendParams *params = (MtlBlendParams *)ctx->mtl_params_buffer.contents;
+    *params = (MtlBlendParams){
+        .x_position = x_position,
+        .y_position = y_position,
+    };
+
+    [(__bridge id<MTLComputeCommandEncoder>)encoder setTexture: main atIndex: 0];
+    [(__bridge id<MTLComputeCommandEncoder>)encoder setTexture: overlay atIndex: 1];
+    [(__bridge id<MTLComputeCommandEncoder>)encoder setTexture: dst atIndex: 2];
+    [(__bridge id<MTLComputeCommandEncoder>)encoder setBuffer: ctx->mtl_params_buffer offset: 0 atIndex: 3];
+    ff_metal_compute_encoder_dispatch(ctx->mtl_device, ctx->mtl_pipeline, (__bridge id<MTLComputeCommandEncoder>)encoder, dst.width, dst.height);
+    [(__bridge id<MTLComputeCommandEncoder>)encoder endEncoding];
+
+    [(__bridge id<MTLCommandBuffer>)buffer commit];
+    [(__bridge id<MTLCommandBuffer>)buffer waitUntilCompleted];
+
+    ff_objc_release(&encoder);
+    ff_objc_release(&buffer);
+}
+
+// Copies and/or converts one pixel buffer to another.
+// This transparently handles pixel formats and color spaces, and will do a conversion if needed.
+static int transfer_pixel_buffer(OverlayVideoToolboxContext *ctx, CVPixelBufferRef source, CVPixelBufferRef destination)
+{
+    if (@available(macOS 10.8, iOS 16.0, *)) {
+        int ret = 0;
+        ret = VTPixelTransferSessionTransferImage(ctx->vt_session, source, destination);
+        if (ret < 0)
+            return ret;
+    } else {
+        CIImage *temp_image = NULL;
+        temp_image = CFBridgingRetain([CIImage imageWithCVPixelBuffer: source]);
+        [(__bridge CIContext*)ctx->ci_ctx render: (__bridge CIImage*)temp_image toCVPixelBuffer: destination];
+        CFRelease(temp_image);
+        CVBufferPropagateAttachments(source, destination);
+    }
+    return 0;
+}
+
+static int overlay_vt_blend(FFFrameSync *fs) API_AVAILABLE(macos(10.11), ios(9.0))
+{
+    AVFilterContext *avctx = fs->parent;
+    OverlayVideoToolboxContext *ctx = avctx->priv;
+    AVFilterLink *outlink = avctx->outputs[0];
+    AVFilterLink *inlink_main = avctx->inputs[0];
+    AVFilterLink *inlink_overlay = avctx->inputs[1];
+    AVFrame *input_main, *input_overlay;
+    AVFrame *output;
+    AVHWFramesContext *frames_ctx = (AVHWFramesContext*)inlink_main->hw_frames_ctx->data;
+    AVHWFramesContext *frames_ctx_overlay = (AVHWFramesContext*)inlink_overlay->hw_frames_ctx->data;
+    const AVPixFmtDescriptor *in_overlay_desc;
+
+    CVMetalTextureRef main, dst, overlay;
+    id<MTLTexture> tex_main, tex_overlay, tex_dst;
+
+    MTLPixelFormat mtl_format;
+    OSType cv_format;
+    int ret;
+    int i, overlay_planes = 0;
+
+    in_overlay_desc = av_pix_fmt_desc_get(frames_ctx_overlay->sw_format);
+    if (@available(macOS 11.3, iOS 14.2, *)) {
+        mtl_format = MTLPixelFormatRGBA16Unorm;
+        cv_format = kCVPixelFormatType_64RGBALE;
+    } else {
+        // On older OS versions, 64-bit RGBA with 16-bit little-endian full-range samples is not supported.
+        // To handle inputs with color depth greater than 8, convert colors to float type during filtering on these versions.
+        mtl_format = MTLPixelFormatRGBA16Float;
+        cv_format = kCVPixelFormatType_64RGBAHalf;
+    }
+
+    // read main and overlay frames from inputs
+    ret = ff_framesync_get_frame(fs, 0, &input_main, 0);
+    if (ret < 0)
+        return ret;
+    ret = ff_framesync_get_frame(fs, 1, &input_overlay, 0);
+    if (ret < 0)
+        return ret;
+
+    if (!input_main)
+        return AVERROR_BUG;
+
+    output = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+    if (!output)
+        return AVERROR(ENOMEM);
+
+    ret = av_frame_copy_props(output, input_main);
+    if (ret < 0)
+        return ret;
+
+    if (!input_overlay) {
+        ret = transfer_pixel_buffer(ctx, (CVPixelBufferRef)input_main->data[3], (CVPixelBufferRef)output->data[3]);
+        if (ret < 0)
+            return ret;
+        return ff_filter_frame(outlink, output);
+    }
+
+    for (i = 0; i < in_overlay_desc->nb_components; i++)
+        overlay_planes = FFMAX(overlay_planes,
+                               in_overlay_desc->comp[i].plane + 1);
+
+    if (!ctx->input_overlay_pixel_buffer_cache) {
+        ret = CVPixelBufferCreate(kCFAllocatorDefault,
+                                  CVPixelBufferGetWidthOfPlane((CVPixelBufferRef)input_overlay->data[3], 0),
+                                  CVPixelBufferGetHeightOfPlane((CVPixelBufferRef)input_overlay->data[3], 0),
+                                  cv_format,
+                                  (__bridge CFDictionaryRef)@{
+                                      (NSString *)kCVPixelBufferCGImageCompatibilityKey: @(YES),
+                                      (NSString *)kCVPixelBufferMetalCompatibilityKey: @(YES)
+                                  },
+                                  &ctx->input_overlay_pixel_buffer_cache);
+        if (ret < 0)
+            return ret;
+    }
+
+    if (!ctx->input_main_pixel_buffer_cache) {
+        ret = CVPixelBufferCreate(kCFAllocatorDefault,
+                                  CVPixelBufferGetWidthOfPlane((CVPixelBufferRef)input_main->data[3], 0),
+                                  CVPixelBufferGetHeightOfPlane((CVPixelBufferRef)input_main->data[3], 0),
+                                  cv_format,
+                                  (__bridge CFDictionaryRef)@{
+                                      (NSString *)kCVPixelBufferCGImageCompatibilityKey: @(YES),
+                                      (NSString *)kCVPixelBufferMetalCompatibilityKey: @(YES)
+                                  },
+                                  &ctx->input_main_pixel_buffer_cache);
+        if (ret < 0)
+            return ret;
+    }
+
+    if (!ctx->output_pixel_buffer_cache) {
+        ret = CVPixelBufferCreate(kCFAllocatorDefault,
+                                  CVPixelBufferGetWidthOfPlane((CVPixelBufferRef)input_main->data[3], 0),
+                                  CVPixelBufferGetHeightOfPlane((CVPixelBufferRef)input_main->data[3], 0),
+                                  cv_format,
+                                  (__bridge CFDictionaryRef)@{
+                                      (NSString *)kCVPixelBufferCGImageCompatibilityKey: @(YES),
+                                      (NSString *)kCVPixelBufferMetalCompatibilityKey: @(YES)
+                                  },
+                                  &ctx->output_pixel_buffer_cache);
+        if (ret < 0)
+            return ret;
+    }
+
+    ret = transfer_pixel_buffer(ctx, (CVPixelBufferRef)input_main->data[3], ctx->input_main_pixel_buffer_cache);
+    if (ret < 0)
+        return ret;
+
+    ret = transfer_pixel_buffer(ctx, (CVPixelBufferRef)input_overlay->data[3], ctx->input_overlay_pixel_buffer_cache);
+    if (ret < 0)
+        return ret;
+
+    overlay = ff_metal_texture_from_pixbuf(avctx, ctx->texture_cache, ctx->input_overlay_pixel_buffer_cache, 0, mtl_format);
+    main = ff_metal_texture_from_pixbuf(avctx, ctx->texture_cache, ctx->input_main_pixel_buffer_cache, 0, mtl_format);
+    dst = ff_metal_texture_from_pixbuf(avctx, ctx->texture_cache, ctx->output_pixel_buffer_cache, 0, mtl_format);
+
+    if (!overlay || !main || !dst) {
+        return AVERROR(ENOSYS);
+    }
+
+    tex_main = CVMetalTextureGetTexture(main);
+    tex_overlay = CVMetalTextureGetTexture(overlay);
+    tex_dst = CVMetalTextureGetTexture(dst);
+
+    call_kernel(avctx, tex_dst, tex_main, tex_overlay, ctx->x_position, ctx->y_position);
+
+    ret = transfer_pixel_buffer(ctx, ctx->output_pixel_buffer_cache, (CVPixelBufferRef)output->data[3]);
+    if (ret < 0) {
+        CFRelease(main);
+        CFRelease(overlay);
+        CFRelease(dst);
+        return ret;
+    }
+
+    CFRelease(main);
+    CFRelease(overlay);
+    CFRelease(dst);
+
+    return ff_filter_frame(outlink, output);
+}
+
+static av_cold void do_uninit(AVFilterContext *avctx) API_AVAILABLE(macos(10.11), ios(9.0))
+{
+    OverlayVideoToolboxContext *ctx = avctx->priv;
+
+    if (ctx->hwframe_ctx_allocated) {
+        av_buffer_unref(&ctx->device_ref);
+        ctx->hwframe_ctx_allocated = 0;
+    }
+
+    ff_framesync_uninit(&ctx->fs);
+
+    if (ctx->ci_ctx) {
+        CFRelease(ctx->ci_ctx);
+        ctx->ci_ctx = NULL;
+    }
+
+    ff_objc_release(&ctx->mtl_params_buffer);
+    ff_objc_release(&ctx->mtl_function);
+    ff_objc_release(&ctx->mtl_pipeline);
+    ff_objc_release(&ctx->mtl_queue);
+    ff_objc_release(&ctx->mtl_library);
+    ff_objc_release(&ctx->mtl_device);
+
+    if (ctx->texture_cache) {
+        CFRelease(ctx->texture_cache);
+        ctx->texture_cache = NULL;
+    }
+    if (ctx->input_main_pixel_buffer_cache) {
+        CFRelease(ctx->input_main_pixel_buffer_cache);
+        ctx->input_main_pixel_buffer_cache = NULL;
+    }
+    if (ctx->input_overlay_pixel_buffer_cache) {
+        CFRelease(ctx->input_overlay_pixel_buffer_cache);
+        ctx->input_overlay_pixel_buffer_cache = NULL;
+    }
+    if (ctx->output_pixel_buffer_cache) {
+        CFRelease(ctx->output_pixel_buffer_cache);
+        ctx->output_pixel_buffer_cache = NULL;
+    }
+    if (ctx->vt_session) {
+        VTPixelTransferSessionInvalidate(ctx->vt_session);
+        CFRelease(ctx->vt_session);
+        ctx->vt_session = NULL;
+    }
+}
+
+static av_cold void overlay_videotoolbox_uninit(AVFilterContext *ctx)
+{
+    if (@available(macOS 10.11, iOS 9.0, *)) {
+        do_uninit(ctx);
+    }
+}
+
+static av_cold int do_init(AVFilterContext *avctx) API_AVAILABLE(macos(10.11), ios(9.0))
+{
+    OverlayVideoToolboxContext *ctx = avctx->priv;
+    NSError *err = nil;
+    CVReturn ret;
+    dispatch_data_t libData;
+
+    ctx->mtl_device = MTLCreateSystemDefaultDevice();
+    if (!ctx->mtl_device) {
+        av_log(avctx, AV_LOG_ERROR, "Unable to find Metal device\n");
+        goto fail;
+    }
+
+    av_log(ctx, AV_LOG_INFO, "Using Metal device: %s\n", ctx->mtl_device.name.UTF8String);
+
+    libData = dispatch_data_create(
+        ff_vf_overlay_videotoolbox_metallib_data,
+        ff_vf_overlay_videotoolbox_metallib_len,
+        nil,
+        nil);
+
+    ctx->mtl_library = [ctx->mtl_device newLibraryWithData: libData error: &err];
+    dispatch_release(libData);
+    libData = nil;
+    ctx->mtl_function = [ctx->mtl_library newFunctionWithName: @"blend_shader"];
+    if (!ctx->mtl_function) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to create Metal function!\n");
+        goto fail;
+    }
+
+    ctx->mtl_queue = ctx->mtl_device.newCommandQueue;
+    if (!ctx->mtl_queue) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to create Metal command queue!\n");
+        goto fail;
+    }
+
+    ctx->mtl_pipeline = [ctx->mtl_device
+                         newComputePipelineStateWithFunction: ctx->mtl_function
+                         error: &err];
+    if (err) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to create Metal compute pipeline: %s\n", err.description.UTF8String);
+        goto fail;
+    }
+
+    ctx->mtl_params_buffer = [ctx->mtl_device
+                              newBufferWithLength: sizeof(MtlBlendParams)
+                              options: MTLResourceStorageModeShared];
+    if (!ctx->mtl_params_buffer) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to create Metal buffer for parameters\n");
+        goto fail;
+    }
+
+    ret = CVMetalTextureCacheCreate(
+        NULL,
+        NULL,
+        ctx->mtl_device,
+        NULL,
+        &ctx->texture_cache
+    );
+    if (ret != kCVReturnSuccess) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to create CVMetalTextureCache: %d\n", ret);
+        goto fail;
+    }
+
+    if (@available(macOS 10.8, iOS 16.0, *)) {
+        ret = VTPixelTransferSessionCreate(NULL, &ctx->vt_session);
+        if (ret != kCVReturnSuccess) {
+            av_log(avctx, AV_LOG_ERROR, "Failed to create VTPixelTransferSession: %d\n", ret);
+            goto fail;
+        }
+    } else {
+        // Use CoreImage as a fallback on older OS versions.
+        // CoreImage has performance comparable to VTPixelTransferSession, but supports fewer pixel formats.
+        // Warn the user about possibly incorrect results.
+        av_log(avctx, AV_LOG_WARNING, "VTPixelTransferSessionTransferImage is not available on this OS version, falling back to CoreImage\n");
+        av_log(avctx, AV_LOG_WARNING, "Try an overlay with BGRA format if you see no overlay\n");
+        if (@available(macOS 10.15, iOS 13.0, *)) {
+            ctx->ci_ctx = CFBridgingRetain([CIContext contextWithMTLCommandQueue: ctx->mtl_queue]);
+        } else {
+            ctx->ci_ctx = CFBridgingRetain([CIContext contextWithMTLDevice: ctx->mtl_device]);
+        }
+    }
+
+    ctx->fs.on_event = &overlay_vt_blend;
+
+    return 0;
+fail:
+    overlay_videotoolbox_uninit(avctx);
+    return AVERROR_EXTERNAL;
+}
+
+static av_cold int overlay_videotoolbox_init(AVFilterContext *ctx)
+{
+    if (@available(macOS 10.11, iOS 9.0, *)) {
+        // Ensure we calculated OVERLAY_VT_CTX_SIZE correctly
+        static_assert(OVERLAY_VT_CTX_SIZE == sizeof(OverlayVideoToolboxContext), "Incorrect OVERLAY_VT_CTX_SIZE value!");
+        return do_init(ctx);
+    } else {
+        av_log(ctx, AV_LOG_ERROR, "Metal is not available on this OS version\n");
+        return AVERROR(ENOSYS);
+    }
+}
+
+static int do_config_output(AVFilterLink *link) API_AVAILABLE(macos(10.11), ios(9.0))
+{
+    AVFilterContext *avctx = link->src;
+    AVFilterLink *inlink_main = avctx->inputs[0];
+    AVFilterLink *inlink_overlay = avctx->inputs[1];
+    OverlayVideoToolboxContext *ctx = avctx->priv;
+    AVHWFramesContext *main_frames, *output_frames, *overlay_frames;
+    AVBufferRef *input_ref, *overlay_ref;
+    int ret = 0;
+
+    if (!inlink_main->hw_frames_ctx ||
+        !inlink_overlay->hw_frames_ctx) {
+        av_log(avctx, AV_LOG_ERROR, "An input HW frames reference is "
+               "required to associate the processing device.\n");
+        return AVERROR(EINVAL);
+    }
+
+    input_ref = inlink_main->hw_frames_ctx;
+    overlay_ref = inlink_overlay->hw_frames_ctx;
+    main_frames = (AVHWFramesContext*)input_ref->data;
+    overlay_frames = (AVHWFramesContext*)overlay_ref->data;
+    av_assert0(main_frames);
+    av_assert0(overlay_frames);
+
+    if (!format_is_supported(supported_main_formats, main_frames->sw_format)) {
+        av_log(ctx, AV_LOG_ERROR, "Unsupported main input format: %s.\n",
+               av_get_pix_fmt_name(main_frames->sw_format));
+        if (main_frames->sw_format == AV_PIX_FMT_YUV420P) {
+            av_log(ctx, AV_LOG_WARNING, "Hint: Use %s instead of %s.\n",
+                   av_get_pix_fmt_name(AV_PIX_FMT_NV12),
+                   av_get_pix_fmt_name(AV_PIX_FMT_YUV420P));
+        }
+        return AVERROR(ENOSYS);
+    }
+
+    if (!format_is_supported(supported_overlay_formats, overlay_frames->sw_format)) {
+        av_log(ctx, AV_LOG_ERROR, "Unsupported overlay input format: %s.\n",
+               av_get_pix_fmt_name(overlay_frames->sw_format));
+        if (overlay_frames->sw_format == AV_PIX_FMT_YUV420P) {
+            av_log(ctx, AV_LOG_WARNING, "Hint: Use %s instead of %s.\n",
+                   av_get_pix_fmt_name(AV_PIX_FMT_NV12),
+                   av_get_pix_fmt_name(AV_PIX_FMT_YUV420P));
+        }
+        return AVERROR(ENOSYS);
+    }
+
+    ctx->device_ref = av_buffer_ref(main_frames->device_ref);
+    if (!ctx->device_ref) {
+        av_log(ctx, AV_LOG_ERROR, "A device reference create failed.\n");
+        return AVERROR(ENOMEM);
+    }
+
+    link->hw_frames_ctx = av_hwframe_ctx_alloc(ctx->device_ref);
+    if (!link->hw_frames_ctx) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to create HW frame context "
+               "for output.\n");
+        ret = AVERROR(ENOMEM);
+        return ret;
+    }
+    ctx->hwframe_ctx_allocated = 1;
+
+    output_frames = (AVHWFramesContext*)link->hw_frames_ctx->data;
+
+    output_frames->format    = AV_PIX_FMT_VIDEOTOOLBOX;
+    output_frames->sw_format = main_frames->sw_format;
+    output_frames->width     = inlink_main->w;
+    output_frames->height    = inlink_main->h;
+
+    ret = ff_filter_init_hw_frames(avctx, link, 1);
+    if (ret < 0)
+        return ret;
+
+    ret = av_hwframe_ctx_init(link->hw_frames_ctx);
+    if (ret < 0) {
+        av_log(avctx, AV_LOG_ERROR, "Failed to initialise VideoToolbox frame "
+               "context for output: %d\n", ret);
+        return ret;
+    }
+
+    link->time_base = inlink_main->time_base;
+    ctx->fs.time_base = link->time_base;
+
+    ret = ff_framesync_init_dualinput(&ctx->fs, avctx);
+    if (ret < 0)
+        return ret;
+
+    ret = ff_framesync_configure(&ctx->fs);
+    return ret;
+}
+
+static int config_output(AVFilterLink *link)
+{
+    AVFilterContext *ctx = link->src;
+    if (@available(macOS 10.11, iOS 9.0, *)) {
+        return do_config_output(link);
+    } else {
+        av_log(ctx, AV_LOG_ERROR, "Metal is not available on this OS version\n");
+        return AVERROR(ENOSYS);
+    }
+}
+
+static int overlay_videotoolbox_activate(AVFilterContext *avctx)
+{
+    OverlayVideoToolboxContext *ctx = avctx->priv;
+    return ff_framesync_activate(&ctx->fs);
+}
+
+#define FLAGS (AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
+#define OFFSET(x) offsetof(OverlayVideoToolboxContext, x)
+
+static const AVOption overlay_videotoolbox_options[] = {
+    { "x", "Overlay x position",
+      OFFSET(x_position), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, .flags = FLAGS },
+    { "y", "Overlay y position",
+      OFFSET(y_position), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, .flags = FLAGS },
+    { "eof_action", "Action to take when encountering EOF from secondary input",
+      OFFSET(fs.opt_eof_action), AV_OPT_TYPE_INT, { .i64 = EOF_ACTION_REPEAT },
+      EOF_ACTION_REPEAT, EOF_ACTION_PASS, .flags = FLAGS, .unit = "eof_action" },
+    { "repeat", "Repeat the previous frame.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_REPEAT }, .flags = FLAGS, .unit = "eof_action" },
+    { "endall", "End both streams.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_ENDALL }, .flags = FLAGS, .unit = "eof_action" },
+    { "pass", "Pass through the main input.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_PASS }, .flags = FLAGS, .unit = "eof_action" },
+    { "shortest", "force termination when the shortest input terminates", OFFSET(fs.opt_shortest), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS },
+    { "repeatlast", "repeat overlay of the last overlay frame", OFFSET(fs.opt_repeatlast), AV_OPT_TYPE_BOOL, { .i64 = 1 }, 0, 1, FLAGS },
+    { NULL },
+};
+
+FRAMESYNC_DEFINE_CLASS(overlay_videotoolbox, OverlayVideoToolboxContext, fs);
+
+static const AVFilterPad overlay_videotoolbox_inputs[] = {
+    {
+        .name = "main",
+        .type = AVMEDIA_TYPE_VIDEO,
+    },
+    {
+        .name = "overlay",
+        .type = AVMEDIA_TYPE_VIDEO,
+    },
+};
+
+static const AVFilterPad overlay_videotoolbox_outputs[] = {
+    {
+        .name = "default",
+        .type = AVMEDIA_TYPE_VIDEO,
+        .config_props = config_output,
+    },
+};
+
+const AVFilter ff_vf_overlay_videotoolbox = {
+    .name           = "overlay_videotoolbox",
+    .description    = NULL_IF_CONFIG_SMALL("Overlay filter for VideoToolbox frames using Metal compute"),
+    .priv_size      = OVERLAY_VT_CTX_SIZE,
+    .priv_class     = &overlay_videotoolbox_class,
+    .init           = overlay_videotoolbox_init,
+    .uninit         = overlay_videotoolbox_uninit,
+    .activate       = overlay_videotoolbox_activate,
+    .preinit        = overlay_videotoolbox_framesync_preinit,
+    FILTER_SINGLE_PIXFMT(AV_PIX_FMT_VIDEOTOOLBOX),
+    FILTER_INPUTS(overlay_videotoolbox_inputs),
+    FILTER_OUTPUTS(overlay_videotoolbox_outputs),
+    .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE,
+};
Overlay filter for VideoToolbox hwframes. Unlike most hardware
overlay filters, this filter does not require the two inputs to
have the same pixel format; instead, it will perform format
conversion automatically with hardware accelerated methods.

Signed-off-by: Gnattu OC <gnattuoc@me.com>
---
Changes from v3:

- Fixed an issue where 8-bit depth BGRA overlay frames were not correctly converted to 16-bit
- Added a constraint on input pixel formats, as VideoToolbox cannot convert all of its hardware frames

 Changelog                                     |   1 +
 configure                                     |   1 +
 doc/filters.texi                              |  52 ++
 libavfilter/Makefile                          |   3 +
 libavfilter/allfilters.c                      |   1 +
 libavfilter/metal/utils.m                     |   7 +-
 .../metal/vf_overlay_videotoolbox.metal       |  58 ++
 libavfilter/vf_overlay_videotoolbox.m         | 609 ++++++++++++++++++
 8 files changed, 730 insertions(+), 2 deletions(-)
 create mode 100644 libavfilter/metal/vf_overlay_videotoolbox.metal
 create mode 100644 libavfilter/vf_overlay_videotoolbox.m
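For reviewers who want to sanity-check the per-pixel math: inside the overlaid region, the Metal kernel in this patch computes `result = source * (1.0 - mask.a) + mask * mask.a` on float RGBA values. The following Python sketch mirrors that arithmetic on plain (r, g, b, a) tuples in [0, 1]; the `blend_pixel` helper is illustrative only and not part of the patch:

```python
def blend_pixel(src, mask):
    """Mirror of blend_shader's per-pixel formula:
    result = source * (1 - mask.a) + mask * mask.a,
    applied componentwise to (r, g, b, a) tuples in [0, 1]."""
    a = mask[3]
    # The kernel multiplies the whole float4 by the scalar weights,
    # so the alpha channel is blended with the same expression.
    return tuple(s * (1.0 - a) + m * a for s, m in zip(src, mask))

# A fully opaque overlay pixel (a = 1) replaces the source pixel.
print(blend_pixel((1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)))  # (0.0, 1.0, 0.0, 1.0)
# A fully transparent overlay pixel (a = 0) leaves the source untouched.
print(blend_pixel((1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 0.0)))  # (1.0, 0.0, 0.0, 1.0)
```

Pixels outside the overlay rectangle are copied through unchanged, which corresponds to the `dest.write(source_color, gid)` branch of the kernel.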