[FFmpeg-devel] doc/filters: add documentation to all existing OpenCL filters, except tonemap filter

Submitted by Danil Iashchenko on July 5, 2018, 2:11 p.m.

Details

Message ID 1530799900-8675-1-git-send-email-danyaschenko@gmail.com
State New

Commit Message

Danil Iashchenko July 5, 2018, 2:11 p.m.
add documentation to all existing OpenCL filters
---
>I like this patch. One other thing: is there any plan to update the
>OpenCL part, like VAAPI (https://trac.ffmpeg.org/wiki/Hardware/VAAPI),
>in https://trac.ffmpeg.org/wiki/HWAccelIntro? Thanks.
I am going to add it during the next week, thanks!

>Why not for tonemap_opencl?
Added.

>Since these filters are distinct, please list options as well, since their software counterparts could conceivably change. Option entries should include default value, range, and semantic.
Fixed.

>Why are all examples of filters, that operate upon a single stream, shown within -filter_complex? '-vf' is preferred.
Fixed.

>P.S. Once this new section is created, program_opencl entry should be shifted there, in a separate patch.
Shifted to opencl section.

Thanks, Danil.

 doc/filters.texi | 741 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 611 insertions(+), 130 deletions(-)

Comments

Mark Thompson July 11, 2018, 11:03 p.m.
On 05/07/18 15:11, Danil Iashchenko wrote:
> add documentation to all existing OpenCL filters
> ---
>> I like this patch. One other thing: is there any plan to update the
>> OpenCL part, like VAAPI (https://trac.ffmpeg.org/wiki/Hardware/VAAPI),
>> in https://trac.ffmpeg.org/wiki/HWAccelIntro? Thanks.
> I am going to add it during the next week, thanks!
> 
>> Why not for tonemap_opencl?
> Added.

The title still says "except tonemap", though.

>> Since these filters are distinct, please list options as well, since their software counterparts could conceivably change. Option entries should include default value, range, and semantic.
> Fixed.
> 
>> Why are all examples of filters, that operate upon a single stream, shown within -filter_complex? '-vf' is preferred.
> Fixed.
> 
>> P.S. Once this new section is created, program_opencl entry should be shifted there, in a separate patch.
> Shifted to opencl section.

There is openclsrc as well.

I'm not sure I'm completely in favour of that change, anyway?  The section feels more coherent to me as "OpenCL versions of other filters" rather than "all filters using OpenCL" - program_opencl/openclsrc do something really quite different.  Does anyone else have an opinion on that?

(Also, the "in a separate patch" comment from Gyan is probably sensible if you keep this way around.)

>  doc/filters.texi | 741 +++++++++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 611 insertions(+), 130 deletions(-)
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index fb5f3ee..16a350a 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> ...
> @@ -17422,6 +17292,617 @@ pixel format "yuv422p" @var{hsub} is 2 and @var{vsub} is 1.
>  
>  @c man end VIDEO FILTERS
>  
> +@chapter OpenCL Video Filters
> +@c man begin OPENCL VIDEO FILTERS
> +
> +Below is a description of the currently available OpenCL video filters.
> +
> +To enable compilation of these filters you need to configure FFmpeg with
> +@code{--enable-opencl}.

A pointer to the -init_hw_device/-filter_hw_device options would probably be a good idea here, since all of the examples using hwupload/hwdownload require that.

> +@section avgblur_opencl
> +
> +Apply average blur filter.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +@item sizeX
> +Set horizontal kernel size. By default value is @code{1}.

I know it's copied from the avgblur filter, but this sounds very misleading to me.  It's the radius of the kernel, not the size of the kernel itself - it works on 2*sizeX+1 by 2*sizeY+1 rectangle, not a sizeX by sizeY one.  (Could make non-opencl avgblur clearer in a separate patch too.)
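
To make the radius-versus-size distinction concrete, here is a minimal C sketch. It is not code from the patch or from FFmpeg: the helper names `window_extent` and `box_average` are made up for illustration, and skipping out-of-frame samples at the edges is an assumption, since the filter's actual edge behaviour is not stated in this thread.

```c
#include <assert.h>

/* Number of input samples covered along one axis for a given radius. */
static int window_extent(int radius)
{
    assert(radius >= 0);
    return 2 * radius + 1;
}

/*
 * Average of the (2*rx+1) x (2*ry+1) window centred on (x, y) in a
 * single 8-bit plane stored row by row.
 * ASSUMPTION: samples that fall outside the plane are skipped; the real
 * filter may treat edges differently.
 */
static unsigned char box_average(const unsigned char *plane,
                                 int width, int height,
                                 int x, int y, int rx, int ry)
{
    unsigned int sum = 0, count = 0;

    assert(x >= 0 && x < width && y >= 0 && y < height);

    for (int j = y - ry; j <= y + ry; j++) {
        if (j < 0 || j >= height)
            continue;               /* row outside the plane */
        for (int i = x - rx; i <= x + rx; i++) {
            if (i < 0 || i >= width)
                continue;           /* column outside the plane */
            sum += plane[j * width + i];
            count++;
        }
    }
    /* Round to nearest; count is at least 1 because (x, y) is in range. */
    return (unsigned char)((sum + count / 2) / count);
}
```

With a value of 3, `window_extent` gives 7, which matches the 2*sizeX+1 figure: the option behaves as a radius, not as the side length of the averaging window.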

> +
> +@item planes
> +Set which planes to filter. By default all planes are filtered.
> +
> +@item sizeY
> +Set vertical kernel size, if zero it will be same as @code{sizeX}.
> +By default value is @code{0}.
> +@end table
> +
> +@subsection Example
> +
> +@itemize
> +@item
> +Apply average blur filter with sizeX and sizeY set to 3

For an example, I'd say a bit more about exactly what it does - e.g. "Apply average blur filter with horizontal and vertical size of 3, setting each pixel of the output to the average value of the 7x7 region centred on it in the input." (or something like that, maybe a comment about what it does at the edges too).

> +@example
> +-i INPUT -vf "hwupload, avgblur_opencl=3, hwdownload" OUTPUT
> +@end example
> +@end itemize
> +
> +@section boxblur_opencl
> +
> +Apply a boxblur algorithm to the input video.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +
> +@item luma_radius, lr
> +@item luma_power, lp
> +@item chroma_radius, cr
> +@item chroma_power, cp
> +@item alpha_radius, ar
> +@item alpha_power, ap
> +
> +@end table
> +
> +A description of the accepted options follows.
> +
> +@table @option
> +@item luma_radius, lr
> +@item chroma_radius, cr
> +@item alpha_radius, ar
> +Set an expression for the box radius in pixels used for blurring the
> +corresponding input plane.
> +
> +The radius value must be a non-negative number, and must not be
> +greater than the value of the expression @code{min(w,h)/2} for the
> +luma and alpha planes, and of @code{min(cw,ch)/2} for the chroma
> +planes.
> +
> +Default value for @option{luma_radius} is "2". If not specified,
> +@option{chroma_radius} and @option{alpha_radius} default to the
> +corresponding value set for @option{luma_radius}.
> +
> +The expressions can contain the following constants:
> +@table @option
> +@item w
> +@item h
> +The input width and height in pixels.
> +
> +@item cw
> +@item ch
> +The input chroma image width and height in pixels.
> +
> +@item hsub
> +@item vsub
> +The horizontal and vertical chroma subsample values. For example, for the
> +pixel format "yuv422p", @var{hsub} is 2 and @var{vsub} is 1.
> +@end table
> +
> +@item luma_power, lp
> +@item chroma_power, cp
> +@item alpha_power, ap
> +Specify how many times the boxblur filter is applied to the
> +corresponding plane.
> +
> +Default value for @option{luma_power} is 2. If not specified,
> +@option{chroma_power} and @option{alpha_power} default to the
> +corresponding value set for @option{luma_power}.
> +
> +A value of 0 will disable the effect.
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Apply a boxblur filter with the luma, chroma, and alpha radius

Again, try to describe what the examples actually do rather than just the options provided.

> +set to 2:
> +@example
> +-i INPUT -vf "hwupload, boxblur_opencl=luma_radius=2:luma_power=1, hwdownload" OUTPUT
> +-i INPUT -vf "hwupload, boxblur_opencl=2:1, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Set the luma radius to 2, and alpha and chroma radius to 0:
> +@example
> +-i INPUT -vf "hwupload, boxblur_opencl=2:1:cr=0:ar=0, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Set the luma and chroma radius to a fraction of the video dimension:
> +@example
> +-i INPUT -vf "hwupload, boxblur_opencl=luma_radius=min(h\,w)/10:luma_power=1:chroma_radius=min(cw\,ch)/10:chroma_power=1, hwdownload" OUTPUT
> +@end example
> +@end itemize
> +
> +@section convolution_opencl
> +
> +Apply convolution of 3x3, 5x5, 7x7 or horizontal/vertical up to 49 elements.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +@item 0m
> +@item 1m
> +@item 2m
> +@item 3m
> +Set matrix for each plane.
> +Matrix is sequence of 9, 25 or 49 signed integers in @var{square} mode,
> +and from 1 to 49 odd number of signed integers in @var{row} mode.
> +
> +@item 0rdiv
> +@item 1rdiv
> +@item 2rdiv
> +@item 3rdiv
> +Set multiplier for calculated value for each plane.
> +If unset or 0, it will be sum of all matrix elements.
> +
> +@item 0bias
> +@item 1bias
> +@item 2bias
> +@item 3bias
> +Set bias for each plane. This value is added to the result of the multiplication.
> +Useful for making the overall image brighter or darker. Default is 0.0.
> +
> +@item 0mode
> +@item 1mode
> +@item 2mode
> +@item 3mode
> +Set matrix mode for each plane. Can be @var{square}, @var{row} or @var{column}.
> +Default is @var{square}.
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Apply sharpen:
> +@example
> +-i INPUT -vf "hwupload, convolution_opencl=0 -1 0 -1 5 -1 0 -1 0:0 -1 0 -1 5 -1 0 -1 0:0 -1 0 -1 5 -1 0 -1 0:0 -1 0 -1 5 -1 0 -1 0, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Apply blur:
> +@example
> +-i INPUT -vf "hwupload, convolution_opencl=1 1 1 1 1 1 1 1 1:1 1 1 1 1 1 1 1 1:1 1 1 1 1 1 1 1 1:1 1 1 1 1 1 1 1 1:1/9:1/9:1/9:1/9, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Apply edge enhance:
> +@example
> +-i INPUT -vf "hwupload, convolution_opencl=0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Apply edge detect:
> +@example
> +-i INPUT -vf "hwupload, convolution_opencl=0 1 0 1 -4 1 0 1 0:0 1 0 1 -4 1 0 1 0:0 1 0 1 -4 1 0 1 0:0 1 0 1 -4 1 0 1 0:5:5:5:1:0:128:128:128, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Apply laplacian edge detector which includes diagonals:
> +@example
> +-i INPUT -vf "hwupload, convolution_opencl=1 1 1 1 -8 1 1 1 1:1 1 1 1 -8 1 1 1 1:1 1 1 1 -8 1 1 1 1:1 1 1 1 -8 1 1 1 1:5:5:5:1:0:128:128:0, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Apply emboss:
> +@example
> +-i INPUT -vf "hwupload, convolution_opencl=-2 -1 0 -1 1 1 0 1 2:-2 -1 0 -1 1 1 0 1 2:-2 -1 0 -1 1 1 0 1 2:-2 -1 0 -1 1 1 0 1 2, hwdownload" OUTPUT
> +@end example
> +@end itemize
> +
> +@section overlay_opencl
> +
> +Overlay one video on top of another.
> +
> +It takes two inputs and has one output. The first input is the "main"
> +video on which the second input is overlaid.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +
> +@item x
> +Set the x coordinate of the overlaid video on the main video.
> +Default value is @code{0}.
> +
> +@item y
> +Set the y coordinate of the overlaid video on the main video.
> +Default value is @code{0}.
> +
> +@end table
> +
> +@subsection Example
> +
> +@itemize
> +@item
> +Insert a transparent PNG logo in the bottom left corner of the input
> +@example
> +-i INPUT -i LOGO -filter_complex "[0:v]hwupload[a],[1:v]hwupload[b],[a][b]overlay_opencl[out],[out]hwdownload" OUTPUT
> +@end example
> +@end itemize

There is quite a bit of trickiness wrt formats here - if you have a PNG logo and an RGB input this isn't going to do what you expect.  (Try it!)

That also doesn't end up in the bottom left unless the PNG is padded with transparency at the top to the same height as the video.
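
As a rough illustration of the placement point, the sketch below works out the offsets that would put an unpadded overlay flush with the bottom-left corner. It is not taken from the patch: `struct overlay_pos` and `bottom_left` are hypothetical names, and it assumes that x and y give the position of the overlay's top-left corner in pixels on the main frame.

```c
#include <assert.h>

/* Top-left corner at which the overlay is placed on the main frame. */
struct overlay_pos {
    int x;
    int y;
};

/*
 * Offsets that put an overlay of height overlay_h flush with the
 * bottom-left corner of a main frame of height main_h.
 * ASSUMPTION: (x, y) is the overlay's top-left corner, in pixels,
 * measured from the top-left corner of the main frame.
 */
static struct overlay_pos bottom_left(int main_h, int overlay_h)
{
    assert(overlay_h >= 0 && overlay_h <= main_h);

    struct overlay_pos pos = { 0, main_h - overlay_h };
    return pos;
}
```

For a 1080-line input and a 100-line logo this yields x=0 and y=980, so under the stated assumption the example would need something like `overlay_opencl=x=0:y=980` rather than the default position of 0,0.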

> +@section prewitt_opencl
> +
> +Apply prewitt operator to input video stream.
> +
> +The filter accepts the following option:
> +
> +@table @option
> +@item planes
> +Set which planes will be processed, unprocessed planes will be copied.
> +By default value @code{0xf}, all planes will be processed.
> +
> +@item scale
> +Set value which will be multiplied with filtered result.
> +By default value is @code{1}.
> +
> +@item delta
> +Set value which will be added to filtered result.
> +By default value is @code{0}.
> +@end table
> +
> +@subsection Example
> +
> +@itemize
> +@item
> +Apply prewitt operator with scale set to 2 and delta set to 10
> +@example
> +-i INPUT -vf "hwupload, prewitt_opencl=scale=2:delta=10, hwdownload" OUTPUT
> +@end example
> +@end itemize
> +
> ...
> +
> +@end itemize
> +
> +@section roberts_opencl
> +Apply roberts cross operator to input video stream.
> +
> +The filter accepts the following option:
> +
> +@table @option
> +@item planes
> +Set which planes will be processed, unprocessed planes will be copied.
> +By default value @code{0xf}, all planes will be processed.
> +
> +@item scale
> +Set value which will be multiplied with filtered result.
> +By default value is @code{1}.
> +
> +@item delta
> +Set value which will be added to filtered result.
> +By default value is @code{0}.
> +@end table
> +
> +@subsection Example
> +
> +@itemize
> +@item
> +Apply roberts cross operator with scale set to 2 and delta set to 10
> +@example
> +-i INPUT -vf "hwupload, roberts_opencl=scale=2:delta=10, hwdownload" OUTPUT
> +@end example
> +@end itemize
> +
> +@section sobel_opencl
> +
> +Apply sobel operator to input video stream.
> +
> +The filter accepts the following option:
> +
> +@table @option
> +@item planes
> +Set which planes will be processed, unprocessed planes will be copied.
> +By default value @code{0xf}, all planes will be processed.
> +
> +@item scale
> +Set value which will be multiplied with filtered result.
> +By default value is @code{1}.
> +
> +@item delta
> +Set value which will be added to filtered result.
> +By default value is @code{0}.
> +@end table
> +
> +@subsection Example
> +
> +@itemize
> +@item
> +Apply sobel operator with scale set to 2 and delta set to 10
> +@example
> +-i INPUT -vf "hwupload, sobel_opencl=scale=2:delta=10, hwdownload" OUTPUT
> +@end example
> +@end itemize
> +
> +@section tonemap_opencl
> +Tone map colors from different dynamic ranges.
> +
> +@subsection Options
> +The filter accepts the following options.
> +
> +@table @option
> +@item tonemap
> +Set the tone map algorithm to use.
> +
> +Possible values are:
> +@table @var
> +@item none
> +Do not apply any tone map, only desaturate overbright pixels.
> +
> +@item clip
> +Hard-clip any out-of-range values. Use it for perfect color accuracy for
> +in-range values, while distorting out-of-range values.
> +
> +@item linear
> +Stretch the entire reference gamut to a linear multiple of the display.
> +
> +@item gamma
> +Fit a logarithmic transfer between the tone curves.
> +
> +@item reinhard
> +Preserve overall image brightness with a simple curve, using nonlinear
> +contrast, which results in flattening details and degrading color accuracy.
> +
> +@item hable
> +Preserve both dark and bright details better than @var{reinhard}, at the cost
> +of slightly darkening everything. Use it when detail preservation is more
> +important than color and brightness accuracy.
> +
> +@item mobius
> +Smoothly map out-of-range values, while retaining contrast and colors for
> +in-range material as much as possible. Use it when color accuracy is more
> +important than detail preservation.
> +@end table
> +
> +Default is none.
> +
> +@item param
> +Tune the tone mapping algorithm.
> +
> +This affects the following algorithms:
> +@table @var
> +@item none
> +Ignored.
> +
> +@item linear
> +Specifies the scale factor to use while stretching.
> +Default to 1.0.
> +
> +@item gamma
> +Specifies the exponent of the function.
> +Default to 1.8.
> +
> +@item clip
> +Specify an extra linear coefficient to multiply into the signal before clipping.
> +Default to 1.0.
> +
> +@item reinhard
> +Specify the local contrast coefficient at the display peak.
> +Default to 0.5, which means that in-gamut values will be about half as bright
> +as when clipping.
> +
> +@item hable
> +Ignored.
> +
> +@item mobius
> +Specify the transition point from linear to mobius transform. Every value
> +below this point is guaranteed to be mapped 1:1. The higher the value, the
> +more accurate the result will be, at the cost of losing bright details.
> +Default to 0.3, which due to the steep initial slope still preserves in-range
> +colors fairly accurately.
> +@end table
> +
> +@item desat
> +Apply desaturation for highlights that exceed this level of brightness. The
> +higher the parameter, the more color information will be preserved. This
> +setting helps prevent unnaturally blown-out colors for super-highlights, by
> +(smoothly) turning into white instead. This makes images feel more natural,
> +at the cost of reducing information about out-of-range colors.
> +
> +The default of 2.0 is somewhat conservative and will mostly just apply to
> +skies or directly sunlit surfaces. A setting of 0.0 disables this option.
> +
> +This option works only if the input frame has a supported color tag.
> +
> +@item peak
> +Override signal/nominal/reference peak with this value. Useful when the
> +embedded peak information in display metadata is not reliable or when tone
> +mapping from a lower range to a higher range.
> +@end table

I don't think tonemap and tonemap_opencl are similar enough to just copy this.

> +@section unsharp_opencl
> +
> +Sharpen or blur the input video.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item luma_msize_x, lx
> +Set the luma matrix horizontal size. It must be an odd integer between
> +@code{3} and @code{23}. The default value is @code{5}.
> +
> +@item luma_msize_y, ly
> +Set the luma matrix vertical size. It must be an odd integer between @code{3}
> +and @code{23}. The default value is @code{5}.
> +
> +@item luma_amount, la
> +Set the luma effect strength. It must be a floating point number, reasonable
> +values lay between @code{-1.5} and @code{1.5}.
> +
> +Negative values will blur the input video, while positive values will
> +sharpen it, a value of zero will disable the effect.
> +
> +Default value is @code{1.0}.
> +
> +@item chroma_msize_x, cx
> +Set the chroma matrix horizontal size. It must be an odd integer
> +between @code{3} and @code{23}. The default value is @code{5}.
> +
> +@item chroma_msize_y, cy
> +Set the chroma matrix vertical size. It must be an odd integer
> +between @code{3} and @code{23}. The default value is @code{5}.
> +
> +@item chroma_amount, ca
> +Set the chroma effect strength. It must be a floating point number, reasonable
> +values lay between @code{-1.5} and @code{1.5}.
> +
> +Negative values will blur the input video, while positive values will
> +sharpen it, a value of zero will disable the effect.
> +
> +Default value is @code{0.0}.
> +
> +@end table
> +
> +All parameters are optional and default to the equivalent of the
> +string '5:5:1.0:5:5:0.0'.
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Apply strong luma sharpen effect:
> +@example
> +-i INPUT -vf "hwupload, unsharp_opencl=luma_msize_x=7:luma_msize_y=7:luma_amount=2.5, hwdownload" OUTPUT
> +@end example
> +
> +@item
> +Apply a strong blur of both luma and chroma parameters:
> +@example
> +-i INPUT -vf "hwupload, unsharp_opencl=7:7:-2:7:7:-2, hwdownload" OUTPUT
> +@end example
> +@end itemize
> +
> +@c man end OPENCL VIDEO FILTERS
> +
>  @chapter Video Sources
>  @c man begin VIDEO SOURCES
>  
> 

- Mark

Gyan July 12, 2018, 5:32 a.m.
On 12-07-2018 04:33 AM, Mark Thompson wrote:

> I'm not sure I'm completely in favour of that change, anyway?  The section feels more coherent to me as "OpenCL versions of other filters" rather than "all filters using OpenCL" - program_opencl/openclsrc do something really quite different.  Does anyone else have an opinion on that?

If we do have a separate section, it should include all members, and not 
leave a few scattered elsewhere. Once users see a separate section, 
they'll expect all entries there. Source filters are a possible exception.

Of course, if separated, other H/W filters should be as well. Right now, 
CUDA & OpenGL are mixed in; QSV & VAAPI ones aren't documented yet. 
There may be Vulkan filters.

Regards,
Gyan

Patch

diff --git a/doc/filters.texi b/doc/filters.texi
index fb5f3ee..16a350a 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -12917,136 +12917,6 @@  Set value which will be multiplied with filtered result.
 Set value which will be added to filtered result.
 @end table
 
-@anchor{program_opencl}
-@section program_opencl
-
-Filter video using an OpenCL program.
-
-@table @option
-
-@item source
-OpenCL program source file.
-
-@item kernel
-Kernel name in program.
-
-@item inputs
-Number of inputs to the filter.  Defaults to 1.
-
-@item size, s
-Size of output frames.  Defaults to the same as the first input.
-
-@end table
-
-The program source file must contain a kernel function with the given name,
-which will be run once for each plane of the output.  Each run on a plane
-gets enqueued as a separate 2D global NDRange with one work-item for each
-pixel to be generated.  The global ID offset for each work-item is therefore
-the coordinates of a pixel in the destination image.
-
-The kernel function needs to take the following arguments:
-@itemize
-@item
-Destination image, @var{__write_only image2d_t}.
-
-This image will become the output; the kernel should write all of it.
-@item
-Frame index, @var{unsigned int}.
-
-This is a counter starting from zero and increasing by one for each frame.
-@item
-Source images, @var{__read_only image2d_t}.
-
-These are the most recent images on each input.  The kernel may read from
-them to generate the output, but they can't be written to.
-@end itemize
-
-Example programs:
-
-@itemize
-@item
-Copy the input to the output (output must be the same size as the input).
-@verbatim
-__kernel void copy(__write_only image2d_t destination,
-                   unsigned int index,
-                   __read_only  image2d_t source)
-{
-    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE;
-
-    int2 location = (int2)(get_global_id(0), get_global_id(1));
-
-    float4 value = read_imagef(source, sampler, location);
-
-    write_imagef(destination, location, value);
-}
-@end verbatim
-
-@item
-Apply a simple transformation, rotating the input by an amount increasing
-with the index counter.  Pixel values are linearly interpolated by the
-sampler, and the output need not have the same dimensions as the input.
-@verbatim
-__kernel void rotate_image(__write_only image2d_t dst,
-                           unsigned int index,
-                           __read_only  image2d_t src)
-{
-    const sampler_t sampler = (CLK_NORMALIZED_COORDS_FALSE |
-                               CLK_FILTER_LINEAR);
-
-    float angle = (float)index / 100.0f;
-
-    float2 dst_dim = convert_float2(get_image_dim(dst));
-    float2 src_dim = convert_float2(get_image_dim(src));
-
-    float2 dst_cen = dst_dim / 2.0f;
-    float2 src_cen = src_dim / 2.0f;
-
-    int2   dst_loc = (int2)(get_global_id(0), get_global_id(1));
-
-    float2 dst_pos = convert_float2(dst_loc) - dst_cen;
-    float2 src_pos = {
-        cos(angle) * dst_pos.x - sin(angle) * dst_pos.y,
-        sin(angle) * dst_pos.x + cos(angle) * dst_pos.y
-    };
-    src_pos = src_pos * src_dim / dst_dim;
-
-    float2 src_loc = src_pos + src_cen;
-
-    if (src_loc.x < 0.0f      || src_loc.y < 0.0f ||
-        src_loc.x > src_dim.x || src_loc.y > src_dim.y)
-        write_imagef(dst, dst_loc, 0.5f);
-    else
-        write_imagef(dst, dst_loc, read_imagef(src, sampler, src_loc));
-}
-@end verbatim
-
-@item
-Blend two inputs together, with the amount of each input used varying
-with the index counter.
-@verbatim
-__kernel void blend_images(__write_only image2d_t dst,
-                           unsigned int index,
-                           __read_only  image2d_t src1,
-                           __read_only  image2d_t src2)
-{
-    const sampler_t sampler = (CLK_NORMALIZED_COORDS_FALSE |
-                               CLK_FILTER_LINEAR);
-
-    float blend = (cos((float)index / 50.0f) + 1.0f) / 2.0f;
-
-    int2  dst_loc = (int2)(get_global_id(0), get_global_id(1));
-    int2 src1_loc = dst_loc * get_image_dim(src1) / get_image_dim(dst);
-    int2 src2_loc = dst_loc * get_image_dim(src2) / get_image_dim(dst);
-
-    float4 val1 = read_imagef(src1, sampler, src1_loc);
-    float4 val2 = read_imagef(src2, sampler, src2_loc);
-
-    write_imagef(dst, dst_loc, val1 * blend + val2 * (1.0f - blend));
-}
-@end verbatim
-
-@end itemize
-
 @section pseudocolor
 
 Alter frame colors in video with pseudocolors.
@@ -17422,6 +17292,617 @@  pixel format "yuv422p" @var{hsub} is 2 and @var{vsub} is 1.
 
 @c man end VIDEO FILTERS
 
+@chapter OpenCL Video Filters
+@c man begin OPENCL VIDEO FILTERS
+
+Below is a description of the currently available OpenCL video filters.
+
+To enable compilation of these filters you need to configure FFmpeg with
+@code{--enable-opencl}.
+
+@section avgblur_opencl
+
+Apply average blur filter.
+
+The filter accepts the following options:
+
+@table @option
+@item sizeX
+Set horizontal kernel size. By default value is @code{1}.
+
+@item planes
+Set which planes to filter. By default all planes are filtered.
+
+@item sizeY
+Set vertical kernel size, if zero it will be same as @code{sizeX}.
+By default value is @code{0}.
+@end table
+
+@subsection Example
+
+@itemize
+@item
+Apply average blur filter with sizeX and sizeY set to 3
+@example
+-i INPUT -vf "hwupload, avgblur_opencl=3, hwdownload" OUTPUT
+@end example
+@end itemize
+
+@section boxblur_opencl
+
+Apply a boxblur algorithm to the input video.
+
+It accepts the following parameters:
+
+@table @option
+
+@item luma_radius, lr
+@item luma_power, lp
+@item chroma_radius, cr
+@item chroma_power, cp
+@item alpha_radius, ar
+@item alpha_power, ap
+
+@end table
+
+A description of the accepted options follows.
+
+@table @option
+@item luma_radius, lr
+@item chroma_radius, cr
+@item alpha_radius, ar
+Set an expression for the box radius in pixels used for blurring the
+corresponding input plane.
+
+The radius value must be a non-negative number, and must not be
+greater than the value of the expression @code{min(w,h)/2} for the
+luma and alpha planes, and of @code{min(cw,ch)/2} for the chroma
+planes.
+
+Default value for @option{luma_radius} is "2". If not specified,
+@option{chroma_radius} and @option{alpha_radius} default to the
+corresponding value set for @option{luma_radius}.
+
+The expressions can contain the following constants:
+@table @option
+@item w
+@item h
+The input width and height in pixels.
+
+@item cw
+@item ch
+The input chroma image width and height in pixels.
+
+@item hsub
+@item vsub
+The horizontal and vertical chroma subsample values. For example, for the
+pixel format "yuv422p", @var{hsub} is 2 and @var{vsub} is 1.
+@end table
+
+@item luma_power, lp
+@item chroma_power, cp
+@item alpha_power, ap
+Specify how many times the boxblur filter is applied to the
+corresponding plane.
+
+Default value for @option{luma_power} is 2. If not specified,
+@option{chroma_power} and @option{alpha_power} default to the
+corresponding value set for @option{luma_power}.
+
+A value of 0 will disable the effect.
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Apply a boxblur filter with the luma, chroma, and alpha radius
+set to 2:
+@example
+-i INPUT -vf "hwupload, boxblur_opencl=luma_radius=2:luma_power=1, hwdownload" OUTPUT
+-i INPUT -vf "hwupload, boxblur_opencl=2:1, hwdownload" OUTPUT
+@end example
+
+@item
+Set the luma radius to 2, and alpha and chroma radius to 0:
+@example
+-i INPUT -vf "hwupload, boxblur_opencl=2:1:cr=0:ar=0, hwdownload" OUTPUT
+@end example
+
+@item
+Set the luma and chroma radius to a fraction of the video dimension:
+@example
+-i INPUT -vf "hwupload, boxblur_opencl=luma_radius=min(h\,w)/10:luma_power=1:chroma_radius=min(cw\,ch)/10:chroma_power=1, hwdownload" OUTPUT
+@end example
+@end itemize
+
+@section convolution_opencl
+
+Apply convolution of 3x3, 5x5, 7x7 or horizontal/vertical up to 49 elements.
+
+The filter accepts the following options:
+
+@table @option
+@item 0m
+@item 1m
+@item 2m
+@item 3m
+Set matrix for each plane.
+Matrix is sequence of 9, 25 or 49 signed integers in @var{square} mode,
+and from 1 to 49 odd number of signed integers in @var{row} mode.
+
+@item 0rdiv
+@item 1rdiv
+@item 2rdiv
+@item 3rdiv
+Set multiplier for calculated value for each plane.
+If unset or 0, it will be sum of all matrix elements.
+
+@item 0bias
+@item 1bias
+@item 2bias
+@item 3bias
+Set bias for each plane. This value is added to the result of the multiplication.
+Useful for making the overall image brighter or darker. Default is 0.0.
+
+@item 0mode
+@item 1mode
+@item 2mode
+@item 3mode
+Set matrix mode for each plane. Can be @var{square}, @var{row} or @var{column}.
+Default is @var{square}.
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Apply sharpen:
+@example
+-i INPUT -vf "hwupload, convolution_opencl=0 -1 0 -1 5 -1 0 -1 0:0 -1 0 -1 5 -1 0 -1 0:0 -1 0 -1 5 -1 0 -1 0:0 -1 0 -1 5 -1 0 -1 0, hwdownload" OUTPUT
+@end example
+
+@item
+Apply blur:
+@example
+-i INPUT -vf "hwupload, convolution_opencl=1 1 1 1 1 1 1 1 1:1 1 1 1 1 1 1 1 1:1 1 1 1 1 1 1 1 1:1 1 1 1 1 1 1 1 1:1/9:1/9:1/9:1/9, hwdownload" OUTPUT
+@end example
+
+@item
+Apply edge enhance:
+@example
+-i INPUT -vf "hwupload, convolution_opencl=0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128, hwdownload" OUTPUT
+@end example
+
+@item
+Apply edge detect:
+@example
+-i INPUT -vf "hwupload, convolution_opencl=0 1 0 1 -4 1 0 1 0:0 1 0 1 -4 1 0 1 0:0 1 0 1 -4 1 0 1 0:0 1 0 1 -4 1 0 1 0:5:5:5:1:0:128:128:128, hwdownload" OUTPUT
+@end example
+
+@item
+Apply laplacian edge detector which includes diagonals:
+@example
+-i INPUT -vf "hwupload, convolution_opencl=1 1 1 1 -8 1 1 1 1:1 1 1 1 -8 1 1 1 1:1 1 1 1 -8 1 1 1 1:1 1 1 1 -8 1 1 1 1:5:5:5:1:0:128:128:0, hwdownload" OUTPUT
+@end example
+
+@item
+Apply emboss:
+@example
+-i INPUT -vf "hwupload, convolution_opencl=-2 -1 0 -1 1 1 0 1 2:-2 -1 0 -1 1 1 0 1 2:-2 -1 0 -1 1 1 0 1 2:-2 -1 0 -1 1 1 0 1 2, hwdownload" OUTPUT
+@end example
+@end itemize
+
+@section overlay_opencl
+
+Overlay one video on top of another.
+
+It takes two inputs and has one output. The first input is the "main"
+video on which the second input is overlaid.
+
+The filter accepts the following options:
+
+@table @option
+
+@item x
+Set the x coordinate of the overlaid video on the main video.
+Default value is @code{0}.
+
+@item y
+Set the y coordinate of the overlaid video on the main video.
+Default value is @code{0}.
+
+@end table
+
+@subsection Example
+
+@itemize
+@item
+Insert a transparent PNG logo in the bottom left corner of the input
+@example
+-i INPUT -i LOGO -filter_complex "[0:v]hwupload[a],[1:v]hwupload[b],[a][b]overlay_opencl[out],[out]hwdownload" OUTPUT
+@end example
+@end itemize
+
+@section prewitt_opencl
+
+Apply prewitt operator to input video stream.
+
+The filter accepts the following option:
+
+@table @option
+@item planes
+Set which planes will be processed, unprocessed planes will be copied.
+By default value @code{0xf}, all planes will be processed.
+
+@item scale
+Set value which will be multiplied with filtered result.
+By default value is @code{1}.
+
+@item delta
+Set value which will be added to filtered result.
+By default value is @code{0}.
+@end table
+
+@subsection Example
+
+@itemize
+@item
+Apply prewitt operator with scale set to 2 and delta set to 10
+@example
+-i INPUT -vf "hwupload, prewitt_opencl=scale=2:delta=10, hwdownload" OUTPUT
+@end example
+@end itemize
+
+@anchor{program_opencl}
+@section program_opencl
+
+Filter video using an OpenCL program.
+
+@table @option
+
+@item source
+OpenCL program source file.
+
+@item kernel
+Kernel name in program.
+
+@item inputs
+Number of inputs to the filter.  Defaults to 1.
+
+@item size, s
+Size of output frames.  Defaults to the same as the first input.
+
+@end table
+
+The program source file must contain a kernel function with the given name,
+which will be run once for each plane of the output.  Each run on a plane
+gets enqueued as a separate 2D global NDRange with one work-item for each
+pixel to be generated.  The global ID offset for each work-item is therefore
+the coordinates of a pixel in the destination image.
+
+The kernel function needs to take the following arguments:
+@itemize
+@item
+Destination image, @var{__write_only image2d_t}.
+
+This image will become the output; the kernel should write all of it.
+@item
+Frame index, @var{unsigned int}.
+
+This is a counter starting from zero and increasing by one for each frame.
+@item
+Source images, @var{__read_only image2d_t}.
+
+These are the most recent images on each input.  The kernel may read from
+them to generate the output, but they can't be written to.
+@end itemize
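
The dispatch model described above (one run per output plane, one work-item per destination pixel) can be emulated on the CPU as a rough sketch. Everything here is illustrative: `run_program` and `copy_kernel` are hypothetical names, the work-item ID is passed explicitly instead of being read through get_global_id(), and the output is assumed to have the same size as the input.

```python
def run_program(kernel, planes_in, index):
    """Emulate the dispatch: one run per plane, one work-item per pixel.

    planes_in is a list of planes, each a list of rows. The output planes
    have the same dimensions as the inputs (an assumption of this sketch).
    """
    planes_out = []
    for src in planes_in:
        height, width = len(src), len(src[0])
        dst = [[None] * width for _ in range(height)]
        # 2D global range: (gx, gy) stands in for get_global_id(0) and (1).
        for gy in range(height):
            for gx in range(width):
                kernel(dst, index, src, (gx, gy))
        planes_out.append(dst)
    return planes_out

def copy_kernel(dst, index, src, gid):
    """Argument order mirrors the list above: destination, index, source."""
    x, y = gid
    dst[y][x] = src[y][x]
```

Because every work-item writes exactly one destination pixel, a kernel that skips some IDs leaves holes in the output, which is why the destination image should be written in full.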
+
+@subsection Example programs
+
+@itemize
+@item
+Copy the input to the output (output must be the same size as the input).
+@verbatim
+__kernel void copy(__write_only image2d_t destination,
+                   unsigned int index,
+                   __read_only  image2d_t source)
+{
+    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE;
+
+    int2 location = (int2)(get_global_id(0), get_global_id(1));
+
+    float4 value = read_imagef(source, sampler, location);
+
+    write_imagef(destination, location, value);
+}
+@end verbatim
+
+@item
+Apply a simple transformation, rotating the input by an amount increasing
+with the index counter.  Pixel values are linearly interpolated by the
+sampler, and the output need not have the same dimensions as the input.
+@verbatim
+__kernel void rotate_image(__write_only image2d_t dst,
+                           unsigned int index,
+                           __read_only  image2d_t src)
+{
+    const sampler_t sampler = (CLK_NORMALIZED_COORDS_FALSE |
+                               CLK_FILTER_LINEAR);
+
+    float angle = (float)index / 100.0f;
+
+    float2 dst_dim = convert_float2(get_image_dim(dst));
+    float2 src_dim = convert_float2(get_image_dim(src));
+
+    float2 dst_cen = dst_dim / 2.0f;
+    float2 src_cen = src_dim / 2.0f;
+
+    int2   dst_loc = (int2)(get_global_id(0), get_global_id(1));
+
+    float2 dst_pos = convert_float2(dst_loc) - dst_cen;
+    float2 src_pos = {
+        cos(angle) * dst_pos.x - sin(angle) * dst_pos.y,
+        sin(angle) * dst_pos.x + cos(angle) * dst_pos.y
+    };
+    src_pos = src_pos * src_dim / dst_dim;
+
+    float2 src_loc = src_pos + src_cen;
+
+    if (src_loc.x < 0.0f      || src_loc.y < 0.0f ||
+        src_loc.x > src_dim.x || src_loc.y > src_dim.y)
+        write_imagef(dst, dst_loc, 0.5f);
+    else
+        write_imagef(dst, dst_loc, read_imagef(src, sampler, src_loc));
+}
+@end verbatim
+
+@item
+Blend two inputs together, with the amount of each input used varying
+with the index counter.
+@verbatim
+__kernel void blend_images(__write_only image2d_t dst,
+                           unsigned int index,
+                           __read_only  image2d_t src1,
+                           __read_only  image2d_t src2)
+{
+    const sampler_t sampler = (CLK_NORMALIZED_COORDS_FALSE |
+                               CLK_FILTER_LINEAR);
+
+    float blend = (cos((float)index / 50.0f) + 1.0f) / 2.0f;
+
+    int2  dst_loc = (int2)(get_global_id(0), get_global_id(1));
+    int2 src1_loc = dst_loc * get_image_dim(src1) / get_image_dim(dst);
+    int2 src2_loc = dst_loc * get_image_dim(src2) / get_image_dim(dst);
+
+    float4 val1 = read_imagef(src1, sampler, src1_loc);
+    float4 val2 = read_imagef(src2, sampler, src2_loc);
+
+    write_imagef(dst, dst_loc, val1 * blend + val2 * (1.0f - blend));
+}
+@end verbatim
+
+@end itemize
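
For the last example, the blend weight as a function of the frame index can be checked with a small host-side sketch. The formula is copied from the kernel above; the function names are hypothetical.

```python
import math

def blend_weight(index):
    """Weight of the first input, as computed in blend_images."""
    return (math.cos(index / 50.0) + 1.0) / 2.0

def blend(val1, val2, index):
    """Mix two sample values using the weight for this frame index."""
    w = blend_weight(index)
    return val1 * w + val2 * (1.0 - w)
```

The weight starts at 1 (only the first input is visible), reaches roughly 0 near frame 157, and completes one cycle about every 314 frames (2 * pi * 50).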
+
+@section roberts_opencl
+Apply the Roberts cross operator to the input video stream.
+
+The filter accepts the following options:
+
+@table @option
+@item planes
+Set which planes will be processed. Unprocessed planes will be copied.
+The default value is @code{0xf}, which processes all planes.
+
+@item scale
+Set the value by which the filtered result will be multiplied.
+The default value is @code{1}.
+
+@item delta
+Set the value which will be added to the filtered result.
+The default value is @code{0}.
+@end table
+
+@subsection Example
+
+@itemize
+@item
+Apply the Roberts cross operator with scale set to 2 and delta set to 10
+@example
+-i INPUT -vf "hwupload, roberts_opencl=scale=2:delta=10, hwdownload" OUTPUT
+@end example
+@end itemize
+
+@section sobel_opencl
+
+Apply the Sobel operator to the input video stream.
+
+The filter accepts the following options:
+
+@table @option
+@item planes
+Set which planes will be processed. Unprocessed planes will be copied.
+The default value is @code{0xf}, which processes all planes.
+
+@item scale
+Set the value by which the filtered result will be multiplied.
+The default value is @code{1}.
+
+@item delta
+Set the value which will be added to the filtered result.
+The default value is @code{0}.
+@end table
+
+@subsection Example
+
+@itemize
+@item
+Apply the Sobel operator with scale set to 2 and delta set to 10
+@example
+-i INPUT -vf "hwupload, sobel_opencl=scale=2:delta=10, hwdownload" OUTPUT
+@end example
+@end itemize
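
The planes option shared by the three edge-detection filters reads as a bitmask. A short sketch of how such a mask could be decoded follows, assuming (not confirmed by the patch) that bit N selects plane N and that at most four planes exist. `processed_planes` is a hypothetical helper.

```python
def processed_planes(mask, nb_planes=4):
    """List the plane indices selected by the mask (bit N -> plane N).

    The bit-to-plane mapping is an assumption of this sketch.
    """
    return [p for p in range(nb_planes) if mask & (1 << p)]
```

Under that assumption the default `0xf` selects planes 0 to 3, while `0x1` would process only the first plane and copy the others.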
+
+@section tonemap_opencl
+Tone map colors from different dynamic ranges.
+
+@subsection Options
+The filter accepts the following options.
+
+@table @option
+@item tonemap
+Set the tone map algorithm to use.
+
+Possible values are:
+@table @var
+@item none
+Do not apply any tone map, only desaturate overbright pixels.
+
+@item clip
+Hard-clip any out-of-range values. Use it for perfect color accuracy for
+in-range values, while distorting out-of-range values.
+
+@item linear
+Stretch the entire reference gamut to a linear multiple of the display.
+
+@item gamma
+Fit a logarithmic transfer between the tone curves.
+
+@item reinhard
+Preserve overall image brightness with a simple curve, using nonlinear
+contrast, which results in flattening details and degrading color accuracy.
+
+@item hable
+Preserve both dark and bright details better than @var{reinhard}, at the cost
+of slightly darkening everything. Use it when detail preservation is more
+important than color and brightness accuracy.
+
+@item mobius
+Smoothly map out-of-range values, while retaining contrast and colors for
+in-range material as much as possible. Use it when color accuracy is more
+important than detail preservation.
+@end table
+
+Default is none.
+
+@item param
+Tune the tone mapping algorithm.
+
+This affects the following algorithms:
+@table @var
+@item none
+Ignored.
+
+@item linear
+Specify the scale factor to use while stretching.
+Defaults to 1.0.
+
+@item gamma
+Specify the exponent of the function.
+Defaults to 1.8.
+
+@item clip
+Specify an extra linear coefficient to multiply into the signal before clipping.
+Defaults to 1.0.
+
+@item reinhard
+Specify the local contrast coefficient at the display peak.
+Defaults to 0.5, which means that in-gamut values will be about half as bright
+as when clipping.
+
+@item hable
+Ignored.
+
+@item mobius
+Specify the transition point from linear to mobius transform. Every value
+below this point is guaranteed to be mapped 1:1. The higher the value, the
+more accurate the result will be, at the cost of losing bright details.
+Defaults to 0.3, which due to the steep initial slope still preserves in-range
+colors fairly accurately.
+@end table
+
+@item desat
+Apply desaturation for highlights that exceed this level of brightness. The
+higher the parameter, the more color information will be preserved. This
+setting helps prevent unnaturally blown-out colors for super-highlights, by
+smoothly turning them white instead. This makes images feel more natural,
+at the cost of reducing information about out-of-range colors.
+
+The default of 2.0 is somewhat conservative and will mostly just apply to
+skies or directly sunlit surfaces. A setting of 0.0 disables this option.
+
+This option works only if the input frame has a supported color tag.
+
+@item peak
+Override signal/nominal/reference peak with this value. Useful when the
+embedded peak information in display metadata is not reliable or when tone
+mapping from a lower range to a higher range.
+@end table
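
To make the descriptions of the simpler algorithms concrete, here is a sketch of three textbook curves operating on a linear-light signal. These are generic formulas, not code taken from the filter. The function names and the `offset` constant are hypothetical, and how the filter derives its internal constants from param is not shown here.

```python
def tonemap_clip(sig, param=1.0):
    """Multiply by a linear coefficient, then hard-clip to 0..1."""
    return min(max(sig * param, 0.0), 1.0)

def tonemap_linear(sig, peak, param=1.0):
    """Scale the whole range linearly so that peak maps to param."""
    return sig * param / peak

def tonemap_reinhard(sig, peak, offset=1.0):
    """Reinhard-style curve, normalised so that sig == peak maps to 1.0.

    `offset` is an illustrative contrast constant, not the filter's param.
    """
    return sig / (sig + offset) * (peak + offset) / peak
```

Clipping leaves in-range values untouched and discards everything above 1.0, whereas the Reinhard-style curve compresses the whole range so that the signal peak lands exactly on display white.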
+
+@section unsharp_opencl
+
+Sharpen or blur the input video.
+
+It accepts the following parameters:
+
+@table @option
+@item luma_msize_x, lx
+Set the luma matrix horizontal size. It must be an odd integer between
+@code{3} and @code{23}. The default value is @code{5}.
+
+@item luma_msize_y, ly
+Set the luma matrix vertical size. It must be an odd integer between @code{3}
+and @code{23}. The default value is @code{5}.
+
+@item luma_amount, la
+Set the luma effect strength. It must be a floating point number; reasonable
+values lie between @code{-1.5} and @code{1.5}.
+
+Negative values will blur the input video, while positive values will
+sharpen it. A value of zero will disable the effect.
+
+Default value is @code{1.0}.
+
+@item chroma_msize_x, cx
+Set the chroma matrix horizontal size. It must be an odd integer
+between @code{3} and @code{23}. The default value is @code{5}.
+
+@item chroma_msize_y, cy
+Set the chroma matrix vertical size. It must be an odd integer
+between @code{3} and @code{23}. The default value is @code{5}.
+
+@item chroma_amount, ca
+Set the chroma effect strength. It must be a floating point number; reasonable
+values lie between @code{-1.5} and @code{1.5}.
+
+Negative values will blur the input video, while positive values will
+sharpen it. A value of zero will disable the effect.
+
+Default value is @code{0.0}.
+
+@end table
+
+All parameters are optional and default to the equivalent of the
+string '5:5:1.0:5:5:0.0'.
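
A small sketch of how the positional string maps onto the named options, assuming the positional order follows the order in which the options are listed above. `parse_unsharp` is a hypothetical helper, not part of FFmpeg; the size check mirrors the documented constraint (odd integer between 3 and 23).

```python
NAMES = ["luma_msize_x", "luma_msize_y", "luma_amount",
         "chroma_msize_x", "chroma_msize_y", "chroma_amount"]
DEFAULTS = [5, 5, 1.0, 5, 5, 0.0]

def parse_unsharp(args=""):
    """Map a colon-separated positional string onto the named options.

    Missing trailing fields keep their documented defaults.
    """
    values = dict(zip(NAMES, DEFAULTS))
    fields = args.split(":") if args else []
    for name, field in zip(NAMES, fields):
        if name.endswith("amount"):
            values[name] = float(field)
        else:
            size = int(field)
            if size % 2 == 0 or not 3 <= size <= 23:
                raise ValueError(f"{name} must be an odd integer in 3..23")
            values[name] = size
    return values
```

For instance, `parse_unsharp("7:7:-2:7:7:-2")` yields 7x7 matrices with an amount of -2.0 for both luma and chroma.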
+
+@subsection Examples
+
+@itemize
+@item
+Apply strong luma sharpen effect:
+@example
+-i INPUT -vf "hwupload, unsharp_opencl=luma_msize_x=7:luma_msize_y=7:luma_amount=2.5, hwdownload" OUTPUT
+@end example
+
+@item
+Apply a strong blur to both luma and chroma:
+@example
+-i INPUT -vf "hwupload, unsharp_opencl=7:7:-2:7:7:-2, hwdownload" OUTPUT
+@end example
+@end itemize
+
+@c man end OPENCL VIDEO FILTERS
+
 @chapter Video Sources
 @c man begin VIDEO SOURCES