[FFmpeg-devel,v8,06/13] avfilter/overlay_graphicsubs: Add overlay_graphicsubs and graphicsub2video filters

Message ID MN2PR04MB59811D9EE2AF8E1188FC8838BAA19@MN2PR04MB5981.namprd04.prod.outlook.com
State Superseded, archived
Series [FFmpeg-devel,v8,01/13] global: Prepare AVFrame for subtitle handling

Checks

Context               Check    Description
andriy/make_x86       success  Make finished
andriy/make_fate_x86  success  Make fate finished
andriy/make_ppc       success  Make finished
andriy/make_fate_ppc  success  Make fate finished

Commit Message

Soft Works Sept. 21, 2021, 11:54 p.m. UTC
- overlay_graphicsubs (VS -> V)
  Overlay graphic subtitles onto a video stream

- graphicsub2video (S -> V)
  Converts graphic subtitles to video frames (with alpha)
  Gets auto-inserted for retaining compatibility with
  sub2video command lines

Signed-off-by: softworkz <softworkz@hotmail.com>
---
 doc/filters.texi                     | 104 ++++
 libavfilter/Makefile                 |   2 +
 libavfilter/allfilters.c             |   2 +
 libavfilter/vf_overlay_graphicsubs.c | 730 +++++++++++++++++++++++++++
 4 files changed, 838 insertions(+)
 create mode 100644 libavfilter/vf_overlay_graphicsubs.c

Comments

Andreas Rheinhardt Sept. 22, 2021, 4:24 a.m. UTC | #1
Soft Works:
> - overlay_graphicsubs (VS -> V)
>   Overlay graphic subtitles onto a video stream
> 
> - graphicsub2video (S -> V)
>   Converts graphic subtitles to video frames (with alpha)
>   Gets auto-inserted for retaining compatibility with
>   sub2video command lines
> 
> Signed-off-by: softworkz <softworkz@hotmail.com>
> ---
>  doc/filters.texi                     | 104 ++++
>  libavfilter/Makefile                 |   2 +
>  libavfilter/allfilters.c             |   2 +
>  libavfilter/vf_overlay_graphicsubs.c | 730 +++++++++++++++++++++++++++
>  4 files changed, 838 insertions(+)
>  create mode 100644 libavfilter/vf_overlay_graphicsubs.c
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 94161003c3..9ce956e507 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -25079,6 +25079,110 @@ tools.
>  
>  @c man end VIDEO SINKS
>  
> +@chapter Subtitle Filters
> +@c man begin SUBTITLE FILTERS
> +
> +When you configure your FFmpeg build, you can disable any of the
> +existing filters using @code{--disable-filters}.
> +
> +Below is a description of the currently available subtitle filters.
> +
> +@section graphicsub2video
> +
> +Renders graphic subtitles as video frames. 
> +
> +This filter replaces the previous "sub2video" hack which did the conversion implicitly and up-front as subtitle filtering wasn't possible at that time.
> +To retain compatibility with earlier sub2video command lines, this filter is being auto-inserted in those cases.
> +
> +For overlaying graphical subtitles it is recommended to use the 'overlay_graphicsubs' filter, which is more efficient and uses fewer processing resources.
> +
> +This filter is still useful in cases where the overlay is done with hardware acceleration (e.g. overlay_qsv, overlay_vaapi, overlay_cuda) for preparing the overlay frames.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item size, s
> +Set the size of the output video frame.
> +
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Overlay PGS subtitles
> +(not recommended - better use overlay_graphicsubs)
> +@example
> +ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:1]graphicsub2video[subs];[0:0][subs]overlay" output.mp4
> +@end example
> +
> +@item
> +Overlay PGS subtitles implicitly.
> +The graphicsub2video filter is inserted automatically for compatibility with legacy command lines.
> +@example
> +ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:0][0:1]overlay" output.mp4
> +@end example
> +@end itemize
> +
> +@section overlay_graphicsubs
> +
> +Overlay graphic subtitles onto a video stream.
> +
> +This filter can blend graphical subtitles on a video stream directly, i.e. without creating full-size alpha images first.
> +The blending operation is limited to the area of the subtitle rectangles, which also means that no processing is done at times where no subtitles are to be displayed.
> +
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item x
> +@item y
> +Set the expression for the x and y coordinates of the overlaid video
> +on the main video. Default value is "0" for both expressions. In case
> +the expression is invalid, it is set to a huge value (meaning that the
> +overlay will not be displayed within the output visible area).
> +
> +@item eof_action
> +See @ref{framesync}.
> +
> +@item eval
> +Set when the expressions for @option{x}, and @option{y} are evaluated.
> +
> +It accepts the following values:
> +@table @samp
> +@item init
> +only evaluate expressions once during the filter initialization or
> +when a command is processed
> +
> +@item frame
> +evaluate expressions for each incoming frame
> +@end table
> +
> +Default value is @samp{frame}.
> +
> +@item shortest
> +See @ref{framesync}.
> +
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Overlay PGS subtitles
> +@example
> +ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:0][0:1]overlay_graphicsubs" output.mp4
> +@end example
> +
> +@item
> +Overlay PGS subtitles implicitly.
> +The graphicsub2video filter is inserted automatically for compatibility with legacy command lines.
> +@example
> +ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:0][0:1]overlay" output.mp4
> +@end example
> +@end itemize
> +@c man end SUBTITLE FILTERS
> +
>  @chapter Multimedia Filters
>  @c man begin MULTIMEDIA FILTERS
>  
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 041d3c5382..8fcc25989e 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -290,6 +290,7 @@ OBJS-$(CONFIG_FSPP_FILTER)                   += vf_fspp.o qp_table.o
>  OBJS-$(CONFIG_GBLUR_FILTER)                  += vf_gblur.o
>  OBJS-$(CONFIG_GEQ_FILTER)                    += vf_geq.o
>  OBJS-$(CONFIG_GRADFUN_FILTER)                += vf_gradfun.o
> +OBJS-$(CONFIG_GRAPHICSUB2VIDEO_FILTER)       += vf_overlay_graphicsubs.o framesync.o
>  OBJS-$(CONFIG_GRAPHMONITOR_FILTER)           += f_graphmonitor.o
>  OBJS-$(CONFIG_GRAYWORLD_FILTER)              += vf_grayworld.o
>  OBJS-$(CONFIG_GREYEDGE_FILTER)               += vf_colorconstancy.o
> @@ -363,6 +364,7 @@ OBJS-$(CONFIG_OVERLAY_CUDA_FILTER)           += vf_overlay_cuda.o framesync.o vf
>  OBJS-$(CONFIG_OVERLAY_OPENCL_FILTER)         += vf_overlay_opencl.o opencl.o \
>                                                  opencl/overlay.o framesync.o
>  OBJS-$(CONFIG_OVERLAY_QSV_FILTER)            += vf_overlay_qsv.o framesync.o
> +OBJS-$(CONFIG_OVERLAY_GRAPHICSUBS_FILTER)    += vf_overlay_graphicsubs.o framesync.o
>  OBJS-$(CONFIG_OVERLAY_VULKAN_FILTER)         += vf_overlay_vulkan.o vulkan.o
>  OBJS-$(CONFIG_OWDENOISE_FILTER)              += vf_owdenoise.o
>  OBJS-$(CONFIG_PAD_FILTER)                    += vf_pad.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 154eba5bb2..10a310d20d 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -345,6 +345,7 @@ extern const AVFilter ff_vf_oscilloscope;
>  extern const AVFilter ff_vf_overlay;
>  extern const AVFilter ff_vf_overlay_opencl;
>  extern const AVFilter ff_vf_overlay_qsv;
> +extern const AVFilter ff_vf_overlay_graphicsubs;
>  extern const AVFilter ff_vf_overlay_vulkan;
>  extern const AVFilter ff_vf_overlay_cuda;
>  extern const AVFilter ff_vf_owdenoise;
> @@ -524,6 +525,7 @@ extern const AVFilter ff_avf_showvolume;
>  extern const AVFilter ff_avf_showwaves;
>  extern const AVFilter ff_avf_showwavespic;
>  extern const AVFilter ff_vaf_spectrumsynth;
> +extern const AVFilter ff_svf_graphicsub2video;
>  
>  /* multimedia sources */
>  extern const AVFilter ff_avsrc_amovie;
> diff --git a/libavfilter/vf_overlay_graphicsubs.c b/libavfilter/vf_overlay_graphicsubs.c
> new file mode 100644
> index 0000000000..b71b34abc4
> --- /dev/null
> +++ b/libavfilter/vf_overlay_graphicsubs.c
> @@ -0,0 +1,730 @@
> +/*
> + * Copyright (c) 2021 softworkz (derived from vf_overlay)
> + * Copyright (c) 2010 Stefano Sabatini
> + * Copyright (c) 2010 Baptiste Coudurier
> + * Copyright (c) 2007 Bobby Bingham
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * overlay graphical subtitles on top of a video frame
> + */
> +
> +#include "avfilter.h"
> +#include "formats.h"
> +#include "libavutil/common.h"
> +#include "libavutil/eval.h"
> +#include "libavutil/avstring.h"
> +#include "libavutil/pixdesc.h"
> +#include "libavutil/imgutils.h"
> +#include "libavutil/opt.h"
> +#include "internal.h"
> +#include "drawutils.h"
> +#include "framesync.h"
> +
> +#include "libavcodec/avcodec.h"

Why do you still need all those avcodec.h inclusions now that you have
moved the AV_SUBTITLE_FMT_* enum values to libavutil? I guess they are
no longer necessary, are they?

> +
> +enum var_name {
> +    VAR_MAIN_W,    VAR_MW,
> +    VAR_MAIN_H,    VAR_MH,
> +    VAR_OVERLAY_W, VAR_OW,
> +    VAR_OVERLAY_H, VAR_OH,
> +    VAR_HSUB,
> +    VAR_VSUB,
> +    VAR_X,
> +    VAR_Y,
> +    VAR_N,
> +    VAR_POS,
> +    VAR_T,
> +    VAR_VARS_NB
> +};
> +
> +typedef struct OverlaySubsContext {
> +    const AVClass *class;
> +    int x, y;                   ///< position of overlaid picture
> +    int w, h;
> +    AVFrame *outpicref;
> +
> +    int main_is_packed_rgb;
> +    uint8_t main_rgba_map[4];
> +    int main_has_alpha;
> +    uint8_t overlay_rgba_map[4];
> +    int eval_mode;              ///< EvalMode
> +
> +    FFFrameSync fs;
> +
> +    int main_pix_step[4];       ///< steps per pixel for each plane of the main output
> +    int hsub, vsub;             ///< chroma subsampling values
> +    const AVPixFmtDescriptor *main_desc; ///< format descriptor for main input
> +
> +    double var_values[VAR_VARS_NB];
> +    char *x_expr, *y_expr;
> +
> +    AVExpr *x_pexpr, *y_pexpr;
> +} OverlaySubsContext;
> +
> +static const char *const var_names[] = {
> +    "main_w",    "W", ///< width  of the main    video
> +    "main_h",    "H", ///< height of the main    video
> +    "overlay_w", "w", ///< width  of the overlay video
> +    "overlay_h", "h", ///< height of the overlay video
> +    "hsub",
> +    "vsub",
> +    "x",
> +    "y",
> +    "n",            ///< number of frame
> +    "pos",          ///< position in the file
> +    "t",            ///< timestamp expressed in seconds
> +    NULL
> +};
> +
> +#define MAIN    0
> +#define OVERLAY 1
> +
> +#define R 0
> +#define G 1
> +#define B 2
> +#define A 3
> +
> +#define Y 0
> +#define U 1
> +#define V 2
> +
> +enum EvalMode {
> +    EVAL_MODE_INIT,
> +    EVAL_MODE_FRAME,
> +    EVAL_MODE_NB
> +};
> +
> +static av_cold void overlay_graphicsubs_uninit(AVFilterContext *ctx)
> +{
> +    OverlaySubsContext *s = ctx->priv;
> +
> +    ff_framesync_uninit(&s->fs);
> +    av_expr_free(s->x_pexpr); s->x_pexpr = NULL;
> +    av_expr_free(s->y_pexpr); s->y_pexpr = NULL;
> +}
> +
> +static inline int normalize_xy(double d, int chroma_sub)
> +{
> +    if (isnan(d))
> +        return INT_MAX;
> +    return (int)d & ~((1 << chroma_sub) - 1);
> +}
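
For reference, normalize_xy() above clears the low chroma_sub bits of
the evaluated coordinate so that the overlay position lands on a chroma
sample boundary, and maps NaN to INT_MAX so that the overlay ends up
outside the visible area. A minimal standalone sketch, with the function
body copied from the patch and check_normalize_xy() being a hypothetical
helper that is not part of the patch:

```c
#include <assert.h>
#include <limits.h>
#include <math.h>

/* Same logic as normalize_xy() in the patch: truncate to int, then clear
 * the low chroma_sub bits to align to the chroma sampling grid. */
int normalize_xy(double d, int chroma_sub)
{
    if (isnan(d))
        return INT_MAX;                 /* pushes the overlay out of view */
    return (int)d & ~((1 << chroma_sub) - 1);
}

/* Hypothetical self-check, not part of the patch. */
void check_normalize_xy(void)
{
    assert(normalize_xy(5.0, 1) == 4);  /* 4:2:0 / 4:2:2: even positions only */
    assert(normalize_xy(7.9, 2) == 4);  /* truncated to 7, then aligned to 4  */
    assert(normalize_xy(6.0, 0) == 6);  /* no subsampling: unchanged          */
    assert(normalize_xy(NAN, 1) == INT_MAX);
}
```

Note that the cast (int)d is undefined in C for values outside the int
range; the sketch does not guard against that either.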
> +
> +static void eval_expr(AVFilterContext *ctx)
> +{
> +    OverlaySubsContext *s = ctx->priv;
> +
> +    s->var_values[VAR_X] = av_expr_eval(s->x_pexpr, s->var_values, NULL);
> +    s->var_values[VAR_Y] = av_expr_eval(s->y_pexpr, s->var_values, NULL);
> +    /* It is necessary if x is expressed from y  */
> +    s->var_values[VAR_X] = av_expr_eval(s->x_pexpr, s->var_values, NULL);
> +    s->x = normalize_xy(s->var_values[VAR_X], s->hsub);
> +    s->y = normalize_xy(s->var_values[VAR_Y], s->vsub);
> +}
> +
> +static int set_expr(AVExpr **pexpr, const char *expr, const char *option, void *log_ctx)
> +{
> +    int ret;
> +    AVExpr *old = NULL;
> +
> +    if (*pexpr)
> +        old = *pexpr;
> +    ret = av_expr_parse(pexpr, expr, var_names,
> +                        NULL, NULL, NULL, NULL, 0, log_ctx);
> +    if (ret < 0) {
> +        av_log(log_ctx, AV_LOG_ERROR,
> +               "Error when evaluating the expression '%s' for %s\n",
> +               expr, option);
> +        *pexpr = old;
> +        return ret;
> +    }
> +
> +    av_expr_free(old);
> +    return 0;
> +}
> +
> +static int overlay_graphicsubs_query_formats(AVFilterContext *ctx)
> +{
> +    AVFilterFormats *formats;
> +    AVFilterLink *inlink0 = ctx->inputs[0];
> +    AVFilterLink *inlink1 = ctx->inputs[1];
> +    AVFilterLink *outlink = ctx->outputs[0];
> +    int ret;
> +    static const enum AVSubtitleType subtitle_fmts[] = { AV_SUBTITLE_FMT_BITMAP, AV_SUBTITLE_FMT_NONE };
> +    static const enum AVPixelFormat supported_pix_fmts[] = {
> +        AV_PIX_FMT_YUV420P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV444P,
> +        AV_PIX_FMT_ARGB,  AV_PIX_FMT_RGBA,
> +        AV_PIX_FMT_ABGR,  AV_PIX_FMT_BGRA,
> +        AV_PIX_FMT_RGB24, AV_PIX_FMT_BGR24,
> +        AV_PIX_FMT_NONE
> +    };
> +
> +    /* set input0 video formats */
> +    formats = ff_make_format_list(supported_pix_fmts);
> +    if ((ret = ff_formats_ref(formats, &inlink0->outcfg.formats)) < 0)
> +        return ret;
> +
> +    /* set input1 subtitle formats */
> +    formats = ff_make_format_list(subtitle_fmts);
> +    if ((ret = ff_formats_ref(formats, &inlink1->outcfg.formats)) < 0)
> +        return ret;
> +
> +    /* set output0 video formats */
> +    formats = ff_make_format_list(supported_pix_fmts);

You are using different format lists for the video input and output.
This means that your filter advertises that it can accept e.g.
AV_PIX_FMT_YUV420P and output AV_PIX_FMT_YUV444P (or even
AV_PIX_FMT_ARGB). I just don't see the code for that.

> +    if ((ret = ff_formats_ref(formats, &outlink->incfg.formats)) < 0)
> +        return ret;
> +
> +    return 0;
> +}
> +
> +static int config_output(AVFilterLink *outlink)
> +{
> +    AVFilterContext *ctx = outlink->src;
> +    OverlaySubsContext *s = ctx->priv;
> +    int ret;
> +
> +    if ((ret = ff_framesync_init_dualinput(&s->fs, ctx)) < 0)
> +        return ret;
> +
> +    outlink->w = ctx->inputs[MAIN]->w;
> +    outlink->h = ctx->inputs[MAIN]->h;
> +    outlink->time_base = ctx->inputs[MAIN]->time_base;
> +
> +    return ff_framesync_configure(&s->fs);
> +}
> +
> +// divide by 255 and round to nearest
> +// apply a fast variant: (X+127)/255 = ((X+127)*257+257)>>16 = ((X+128)*257)>>16
> +#define FAST_DIV255(x) ((((x) + 128) * 257) >> 16)
> +
> +// calculate the non-pre-multiplied alpha, applying the general equation:
> +// alpha = alpha_overlay / ( (alpha_main + alpha_overlay) - (alpha_main * alpha_overlay) )
> +// (((x) << 16) - ((x) << 9) + (x)) is a faster version of: 255 * 255 * x
> +// ((((x) + (y)) << 8) - ((x) + (y)) - (y) * (x)) is a faster version of: 255 * (x + y) - x * y
> +#define UNPREMULTIPLY_ALPHA(x, y) ((((x) << 16) - ((x) << 9) + (x)) / ((((x) + (y)) << 8) - ((x) + (y)) - (y) * (x)))
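
The FAST_DIV255 comment above claims that the shift form equals a
rounded division by 255. For the products that occur when blending two
8-bit values (0 to 255 * 255) this can be checked exhaustively. The
following is a minimal sketch: the macro is copied from the patch, while
fast_div255_mismatches() and check_fast_div255() are hypothetical
helpers introduced only for illustration.

```c
#include <assert.h>

/* Copied from the patch. */
#define FAST_DIV255(x) ((((x) + 128) * 257) >> 16)

/* Hypothetical helper: count the inputs in 0 .. 255*255 for which the
 * fast form differs from x / 255 rounded to nearest. */
int fast_div255_mismatches(void)
{
    int bad = 0;
    for (int x = 0; x <= 255 * 255; x++) {
        int exact = (x + 127) / 255;    /* 255 is odd, so there are no ties */
        if (FAST_DIV255(x) != exact)
            bad++;
    }
    return bad;
}

/* Hypothetical self-check, not part of the patch. */
void check_fast_div255(void)
{
    assert(fast_div255_mismatches() == 0);
    assert(FAST_DIV255(127) == 0 && FAST_DIV255(128) == 1);
    assert(FAST_DIV255(255 * 255) == 255);
}
```

The check only covers non-negative inputs. The chroma path in
blend_plane_8_8bits() can pass a negative sum to the macro, and the
result of >> on a negative int is implementation-defined in C.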
> +
> +/**
> + * Blend image in src to destination buffer dst at position (x, y).
> + */

This whole code looks largely duplicated from the ordinary overlay filter.

> +static av_always_inline void blend_packed_rgb(const AVFilterContext *ctx,
> +    const AVFrame *dst, const AVSubtitleArea *src,
> +    int x, int y,
> +    int is_straight)
> +{
> +    OverlaySubsContext *s = ctx->priv;
> +    int i, imax, j, jmax;
> +    const int src_w = src->w;
> +    const int src_h = src->h;
> +    const int dst_w = dst->width;
> +    const int dst_h = dst->height;
> +    uint8_t alpha;          ///< the amount of overlay to blend on to main
> +    const int dr = s->main_rgba_map[R];
> +    const int dg = s->main_rgba_map[G];
> +    const int db = s->main_rgba_map[B];
> +    const int da = s->main_rgba_map[A];
> +    const int dstep = s->main_pix_step[0];
> +    const int sr = s->overlay_rgba_map[R];
> +    const int sg = s->overlay_rgba_map[G];
> +    const int sb = s->overlay_rgba_map[B];
> +    const int sa = s->overlay_rgba_map[A];
> +    int slice_start, slice_end;
> +    uint8_t *S, *sp, *d, *dp;
> +
> +    i = FFMAX(-y, 0);
> +    imax = FFMIN3(-y + dst_h, FFMIN(src_h, dst_h), y + src_h);
> +
> +    slice_start = i;
> +    slice_end = i + imax;
> +
> +    sp = src->buf[0]->data + slice_start       * src->linesize[0];
> +    dp = dst->data[0] + (slice_start + y) * dst->linesize[0];
> +
> +    for (i = slice_start; i < slice_end; i++) {
> +        j = FFMAX(-x, 0);
> +        S = sp + j;
> +        d = dp + ((x + j) * dstep);
> +
> +        for (jmax = FFMIN(-x + dst_w, src_w); j < jmax; j++) {
> +            uint32_t val = src->pal[*S];
> +            const uint8_t *sval = (uint8_t *)&val;
> +            alpha = sval[sa];
> +
> +            // if the main channel has an alpha channel, alpha has to be calculated
> +            // to create an un-premultiplied (straight) alpha value
> +            if (s->main_has_alpha && alpha != 0 && alpha != 255) {
> +                const uint8_t alpha_d = d[da];
> +                alpha = UNPREMULTIPLY_ALPHA(alpha, alpha_d);
> +            }
> +
> +            switch (alpha) {
> +            case 0:
> +                break;
> +            case 255:
> +                d[dr] = sval[sr];
> +                d[dg] = sval[sg];
> +                d[db] = sval[sb];
> +                break;
> +            default:
> +                // main_value = main_value * (1 - alpha) + overlay_value * alpha
> +                // since alpha is in the range 0-255, the result must be divided by 255
> +                d[dr] = is_straight ? FAST_DIV255(d[dr] * (255 - alpha) + sval[sr] * alpha) :
> +                        FFMIN(FAST_DIV255(d[dr] * (255 - alpha)) + sval[sr], 255);
> +                d[dg] = is_straight ? FAST_DIV255(d[dg] * (255 - alpha) + sval[sg] * alpha) :
> +                        FFMIN(FAST_DIV255(d[dg] * (255 - alpha)) + sval[sg], 255);
> +                d[db] = is_straight ? FAST_DIV255(d[db] * (255 - alpha) + sval[sb] * alpha) :
> +                        FFMIN(FAST_DIV255(d[db] * (255 - alpha)) + sval[sb], 255);
> +            }
> +
> +            if (s->main_has_alpha) {
> +                switch (alpha) {
> +                case 0:
> +                    break;
> +                case 255:
> +                    d[da] = sval[sa];
> +                    break;
> +                default:
> +                    // apply alpha compositing: main_alpha += (1-main_alpha) * overlay_alpha
> +                    d[da] += FAST_DIV255((255 - d[da]) * sval[sa]);
> +                }
> +            }
> +            d += dstep;
> +            S += 1;
> +        }
> +        dp += dst->linesize[0];
> +        sp += src->linesize[0];
> +    }
> +}
> +
> +static av_always_inline void blend_plane_8_8bits(const AVFilterContext *ctx, const AVFrame *dst, const AVSubtitleArea *area,
> +    const uint32_t *yuv_pal, int src_w, int src_h, int dst_w, int dst_h, int plane, int hsub, int vsub,
> +    int x, int y, int dst_plane, int dst_offset, int dst_step)
> +{
> +    const int src_wp = AV_CEIL_RSHIFT(src_w, hsub);
> +    const int src_hp = AV_CEIL_RSHIFT(src_h, vsub);
> +    const int dst_wp = AV_CEIL_RSHIFT(dst_w, hsub);
> +    const int dst_hp = AV_CEIL_RSHIFT(dst_h, vsub);
> +    const int yp = y >> vsub;
> +    const int xp = x >> hsub;
> +    uint8_t *s, *sp, *d, *dp, *dap;
> +    int imax, i, j, jmax;
> +    int slice_start, slice_end;
> +
> +    i = FFMAX(-yp, 0);
> +    imax = FFMIN3(-yp + dst_hp, FFMIN(src_hp, dst_hp), yp + src_hp);
> +
> +    slice_start = i;
> +    slice_end = i + imax;
> +
> +    sp = area->buf[0]->data + (slice_start << vsub) * area->linesize[0];
> +    dp = dst->data[dst_plane] + (yp + slice_start) * dst->linesize[dst_plane] + dst_offset;
> +
> +    dap = dst->data[3] + ((yp + slice_start) << vsub) * dst->linesize[3];
> +
> +    for (i = slice_start; i < slice_end; i++) {
> +        j = FFMAX(-xp, 0);
> +        d = dp + (xp + j) * dst_step;
> +        s = sp + (j << hsub);
> +        jmax = FFMIN(-xp + dst_wp, src_wp);    
> +
> +        for (; j < jmax; j++) {
> +            uint32_t val = yuv_pal[*s];
> +            const uint8_t *sval = (uint8_t *)&val;
> +            const int alpha = sval[3];
> +            const int max = 255, mid = 128;
> +            const int d_int = *d;
> +            const int sval_int = sval[plane];
> +
> +            switch (alpha) {
> +            case 0:
> +                break;
> +            case 255:
> +                *d = sval[plane];
> +                break;
> +            default:
> +                if (plane > 0)
> +                    *d = av_clip(FAST_DIV255((d_int - mid) * (max - alpha) + (sval_int - mid) * alpha) , -mid, mid) + mid;
> +                else
> +                    *d = FAST_DIV255(d_int * (max - alpha) + sval_int * alpha);
> +                break;
> +            }
> +
> +            d += dst_step;
> +            s += 1 << hsub;
> +        }
> +        dp += dst->linesize[dst_plane];
> +        sp +=  (1 << vsub) * area->linesize[0];
> +        dap += (1 << vsub) * dst->linesize[3];
> +    }
> +}
> +
> +#define RGB2Y(r, g, b) (uint8_t)(((66 * (r) + 129 * (g) +  25 * (b) + 128) >> 8) +  16)
> +#define RGB2U(r, g, b) (uint8_t)(((-38 * (r) - 74 * (g) + 112 * (b) + 128) >> 8) + 128)
> +#define RGB2V(r, g, b) (uint8_t)(((112 * (r) - 94 * (g) -  18 * (b) + 128) >> 8) + 128)
> +/* Converts R8 G8 B8 color to YUV. */
> +static av_always_inline void rgb_2_yuv(uint8_t r, uint8_t g, uint8_t b, uint8_t* y, uint8_t* u, uint8_t* v)
> +{
> +    *y = RGB2Y((int)r, (int)g, (int)b);
> +    *u = RGB2U((int)r, (int)g, (int)b);
> +    *v = RGB2V((int)r, (int)g, (int)b);
> +}
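
The RGB2Y/RGB2U/RGB2V coefficients above look like the common integer
approximation of a BT.601 limited-range conversion (luma 16 to 235,
chroma centered on 128); that reading is an assumption, the patch does
not say so. A small sketch with the macros copied from the patch and a
hypothetical check_rgb2yuv() helper shows what they produce for a few
reference colors:

```c
#include <assert.h>
#include <stdint.h>

/* Macros copied from the patch. */
#define RGB2Y(r, g, b) (uint8_t)(((66 * (r) + 129 * (g) +  25 * (b) + 128) >> 8) +  16)
#define RGB2U(r, g, b) (uint8_t)(((-38 * (r) - 74 * (g) + 112 * (b) + 128) >> 8) + 128)
#define RGB2V(r, g, b) (uint8_t)(((112 * (r) - 94 * (g) -  18 * (b) + 128) >> 8) + 128)

/* Hypothetical self-check, not part of the patch. */
void check_rgb2yuv(void)
{
    /* white: luma at the top of the limited range, neutral chroma */
    assert(RGB2Y(255, 255, 255) == 235);
    assert(RGB2U(255, 255, 255) == 128);
    assert(RGB2V(255, 255, 255) == 128);

    /* black: luma at the bottom of the limited range, neutral chroma */
    assert(RGB2Y(0, 0, 0) == 16);
    assert(RGB2U(0, 0, 0) == 128);
    assert(RGB2V(0, 0, 0) == 128);

    /* pure red: V reaches the top of the chroma range */
    assert(RGB2Y(255, 0, 0) == 82);
    assert(RGB2V(255, 0, 0) == 240);
}
```

U for pure red is left out of the checks on purpose: the intermediate
sum is negative there, so the value depends on how the compiler
implements >> for negative operands.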
> +
> +
> +static av_always_inline void blend_yuv_8_8bits(AVFilterContext *ctx, AVFrame *dst, const AVSubtitleArea *area, int hsub, int vsub, int x, int y)
> +{
> +    OverlaySubsContext *s = ctx->priv;
> +    const int src_w = area->w;
> +    const int src_h = area->h;
> +    const int dst_w = dst->width;
> +    const int dst_h = dst->height;
> +    const int sr = s->overlay_rgba_map[R];
> +    const int sg = s->overlay_rgba_map[G];
> +    const int sb = s->overlay_rgba_map[B];
> +    const int sa = s->overlay_rgba_map[A];
> +    uint32_t yuvpal[256];
> +
> +    for (int i = 0; i < 256; ++i) {
> +        const uint8_t *rgba = (const uint8_t *)&area->pal[i];
> +        uint8_t *yuva = (uint8_t *)&yuvpal[i];
> +        rgb_2_yuv(rgba[sr], rgba[sg], rgba[sb], &yuva[Y], &yuva[U], &yuva[V]);
> +        yuva[3] = rgba[sa];
> +    }
> +
> +    blend_plane_8_8bits(ctx, dst, area, yuvpal, src_w, src_h, dst_w, dst_h, Y, 0,    0,    x, y, s->main_desc->comp[Y].plane, s->main_desc->comp[Y].offset, s->main_desc->comp[Y].step);
> +    blend_plane_8_8bits(ctx, dst, area, yuvpal, src_w, src_h, dst_w, dst_h, U, hsub, vsub, x, y, s->main_desc->comp[U].plane, s->main_desc->comp[U].offset, s->main_desc->comp[U].step);
> +    blend_plane_8_8bits(ctx, dst, area, yuvpal, src_w, src_h, dst_w, dst_h, V, hsub, vsub, x, y, s->main_desc->comp[V].plane, s->main_desc->comp[V].offset, s->main_desc->comp[V].step);
> +}
> +
> +static int config_input_main(AVFilterLink *inlink)
> +{
> +    int ret;
> +    AVFilterContext *ctx  = inlink->dst;
> +    OverlaySubsContext *s = inlink->dst->priv;
> +    const AVPixFmtDescriptor *pix_desc = av_pix_fmt_desc_get(inlink->format);
> +
> +    av_image_fill_max_pixsteps(s->main_pix_step,    NULL, pix_desc);
> +    ff_fill_rgba_map(s->overlay_rgba_map, AV_PIX_FMT_RGB32); // it's actually AV_PIX_FMT_PAL8);
> +
> +    s->hsub = pix_desc->log2_chroma_w;
> +    s->vsub = pix_desc->log2_chroma_h;
> +
> +    s->main_desc = pix_desc;
> +
> +    s->main_is_packed_rgb = ff_fill_rgba_map(s->main_rgba_map, inlink->format) >= 0;
> +    s->main_has_alpha = !!(pix_desc->flags & AV_PIX_FMT_FLAG_ALPHA);
> +
> +    /* Finish the configuration by evaluating the expressions
> +       now when both inputs are configured. */
> +    s->var_values[VAR_MAIN_W   ] = s->var_values[VAR_MW] = ctx->inputs[MAIN   ]->w;
> +    s->var_values[VAR_MAIN_H   ] = s->var_values[VAR_MH] = ctx->inputs[MAIN   ]->h;
> +    s->var_values[VAR_OVERLAY_W] = s->var_values[VAR_OW] = ctx->inputs[OVERLAY]->w;
> +    s->var_values[VAR_OVERLAY_H] = s->var_values[VAR_OH] = ctx->inputs[OVERLAY]->h;
> +    s->var_values[VAR_HSUB]  = 1<<pix_desc->log2_chroma_w;
> +    s->var_values[VAR_VSUB]  = 1<<pix_desc->log2_chroma_h;
> +    s->var_values[VAR_X]     = NAN;
> +    s->var_values[VAR_Y]     = NAN;
> +    s->var_values[VAR_N]     = 0;
> +    s->var_values[VAR_T]     = NAN;
> +    s->var_values[VAR_POS]   = NAN;
> +
> +    if ((ret = set_expr(&s->x_pexpr,      s->x_expr,      "x",      ctx)) < 0 ||
> +        (ret = set_expr(&s->y_pexpr,      s->y_expr,      "y",      ctx)) < 0)
> +        return ret;
> +
> +    if (s->eval_mode == EVAL_MODE_INIT) {
> +        eval_expr(ctx);
> +        av_log(ctx, AV_LOG_VERBOSE, "x:%f xi:%d y:%f yi:%d\n",
> +               s->var_values[VAR_X], s->x,
> +               s->var_values[VAR_Y], s->y);
> +    }
> +
> +    av_log(ctx, AV_LOG_VERBOSE,
> +           "main w:%d h:%d fmt:%s overlay w:%d h:%d fmt:%s\n",
> +           ctx->inputs[MAIN]->w, ctx->inputs[MAIN]->h,
> +           av_get_pix_fmt_name(ctx->inputs[MAIN]->format),
> +           ctx->inputs[OVERLAY]->w, ctx->inputs[OVERLAY]->h,
> +           av_get_pix_fmt_name(ctx->inputs[OVERLAY]->format));
> +    return 0;
> +}
> +
> +static int do_blend(FFFrameSync *fs)
> +{
> +    AVFilterContext *ctx = fs->parent;
> +    AVFrame *mainpic, *second;
> +    OverlaySubsContext *s = ctx->priv;
> +    AVFilterLink *inlink = ctx->inputs[0];
> +    unsigned i;
> +    int ret;
> +
> +    ret = ff_framesync_dualinput_get_writable(fs, &mainpic, &second);
> +    if (ret < 0)
> +        return ret;
> +    if (!second)
> +        return ff_filter_frame(ctx->outputs[0], mainpic);
> +
> +    if (s->eval_mode == EVAL_MODE_FRAME) {
> +        int64_t pos = mainpic->pkt_pos;
> +
> +        s->var_values[VAR_N] = (double)inlink->frame_count_out;
> +        s->var_values[VAR_T] = mainpic->pts == AV_NOPTS_VALUE ?
> +            NAN :(double)mainpic->pts * av_q2d(inlink->time_base);
> +        s->var_values[VAR_POS] = pos == -1 ? NAN : (double)pos;
> +
> +        s->var_values[VAR_OVERLAY_W] = s->var_values[VAR_OW] = second->width;
> +        s->var_values[VAR_OVERLAY_H] = s->var_values[VAR_OH] = second->height;
> +        s->var_values[VAR_MAIN_W   ] = s->var_values[VAR_MW] = mainpic->width;
> +        s->var_values[VAR_MAIN_H   ] = s->var_values[VAR_MH] = mainpic->height;
> +
> +        eval_expr(ctx);
> +        av_log(ctx, AV_LOG_DEBUG, "n:%f t:%f pos:%f x:%f xi:%d y:%f yi:%d\n",
> +               s->var_values[VAR_N], s->var_values[VAR_T], s->var_values[VAR_POS],
> +               s->var_values[VAR_X], s->x,
> +               s->var_values[VAR_Y], s->y);
> +    }
> +
> +    for (i = 0; i < second->num_subtitle_areas; i++) {
> +        const AVSubtitleArea *sub_area = second->subtitle_areas[i];
> +
> +        if (sub_area->type != AV_SUBTITLE_FMT_BITMAP) {
> +            av_log(NULL, AV_LOG_WARNING, "overlay_graphicsub: non-bitmap subtitle\n");
> +            return AVERROR_INVALIDDATA;
> +        }
> +
> +        switch (inlink->format) {
> +        case AV_PIX_FMT_YUV420P:
> +            blend_yuv_8_8bits(ctx, mainpic, sub_area, 1, 1, sub_area->x + s->x, sub_area->y + s->y);
> +            break;
> +        case AV_PIX_FMT_YUV422P:
> +            blend_yuv_8_8bits(ctx, mainpic, sub_area, 1, 0, sub_area->x + s->x, sub_area->y + s->y);
> +            break;
> +        case AV_PIX_FMT_YUV444P:
> +            blend_yuv_8_8bits(ctx, mainpic, sub_area, 0, 0, sub_area->x + s->x, sub_area->y + s->y);
> +            break;
> +        case AV_PIX_FMT_RGB24:
> +        case AV_PIX_FMT_BGR24:
> +        case AV_PIX_FMT_ARGB:
> +        case AV_PIX_FMT_RGBA:
> +        case AV_PIX_FMT_BGRA:
> +        case AV_PIX_FMT_ABGR:
> +            blend_packed_rgb(ctx, mainpic, sub_area, sub_area->x + s->x, sub_area->y + s->y, 1);
> +            break;
> +        default:
> +            av_log(NULL, AV_LOG_ERROR, "Unsupported input pix fmt: %d\n", inlink->format);
> +            return AVERROR(EINVAL);
> +        }
> +    }
> +
> +    return ff_filter_frame(ctx->outputs[0], mainpic);
> +}
> +
> +static av_cold int overlay_graphicsubs_init(AVFilterContext *ctx)
> +{
> +    OverlaySubsContext *s = ctx->priv;
> +
> +    s->fs.on_event = do_blend;
> +    return 0;
> +}
> +
> +static int overlay_graphicsubs_activate(AVFilterContext *ctx)
> +{
> +    OverlaySubsContext *s = ctx->priv;
> +    return ff_framesync_activate(&s->fs);
> +}
> +
> +static int graphicsub2video_query_formats(AVFilterContext *ctx)
> +{
> +    AVFilterFormats *formats;
> +    AVFilterLink *inlink = ctx->inputs[0];
> +    AVFilterLink *outlink = ctx->outputs[0];
> +    static const enum AVSubtitleType subtitle_fmts[] = { AV_SUBTITLE_FMT_BITMAP, AV_SUBTITLE_FMT_NONE };
> +    static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_RGB32, AV_PIX_FMT_NONE };
> +    int ret;
> +
> +    /* set input subtitle formats */
> +    formats = ff_make_format_list(subtitle_fmts);
> +    if ((ret = ff_formats_ref(formats, &inlink->outcfg.formats)) < 0)
> +        return ret;
> +
> +    /* set output video formats */
> +    formats = ff_make_format_list(pix_fmts);
> +    if ((ret = ff_formats_ref(formats, &outlink->incfg.formats)) < 0)
> +        return ret;
> +
> +    return 0;
> +}
> +
> +static int graphicsub2video_config_input(AVFilterLink *inlink)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    OverlaySubsContext *s = ctx->priv;
> +
> +    if (s->w <= 0 || s->h <= 0) {
> +        s->w = inlink->w;
> +        s->h = inlink->h;
> +    }
> +    return 0;
> +}
> +
> +static int graphicsub2video_config_output(AVFilterLink *outlink)
> +{
> +    const AVFilterContext *ctx  = outlink->src;
> +    OverlaySubsContext *s = ctx->priv;
> +    const AVPixFmtDescriptor *pix_desc = av_pix_fmt_desc_get(outlink->format);
> +
> +    outlink->w = s->w;
> +    outlink->h = s->h;
> +    outlink->sample_aspect_ratio = (AVRational){1,1};
> +
> +    av_image_fill_max_pixsteps(s->main_pix_step, NULL, pix_desc);
> +    ff_fill_rgba_map(s->overlay_rgba_map, AV_PIX_FMT_RGB32);
> +
> +    s->hsub = pix_desc->log2_chroma_w;
> +    s->vsub = pix_desc->log2_chroma_h;
> +
> +    s->main_desc = pix_desc;
> +
> +    s->main_is_packed_rgb = ff_fill_rgba_map(s->main_rgba_map, outlink->format) >= 0;
> +    s->main_has_alpha = !!(pix_desc->flags & AV_PIX_FMT_FLAG_ALPHA);
> +
> +    return 0;
> +}
> +
> +static int graphicsub2video_filter_frame(AVFilterLink *inlink, AVFrame *src_frame)
> +{
> +    AVFilterLink *outlink = inlink->dst->outputs[0];
> +    AVFrame *out;
> +    const unsigned num_rects = src_frame->num_subtitle_areas;
> +    unsigned int i;
> +
> +    out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
> +    if (!out)
> +        return AVERROR(ENOMEM);

src_frame leaks on error.

> +
> +    memset(out->data[0], 0, out->linesize[0] * out->height);
> +
> +    out->pts = src_frame->pts;
> +
> +
> +    for (i = 0; i < num_rects; i++) {
> +        const AVSubtitleArea  *sub_rect = src_frame->subtitle_areas[i];
> +
> +        if (sub_rect->type != AV_SUBTITLE_FMT_BITMAP) {
> +            av_log(NULL, AV_LOG_WARNING, "sub2video: non-bitmap subtitle\n");
> +            return AVERROR_INVALIDDATA;

Leak: neither out nor src_frame is freed on this path.
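
A minimal sketch of a cleanup pattern that would cover both error paths flagged above. It assumes the callback owns its input frame, and it uses hypothetical stand-ins (Frame, frame_alloc, frame_free, filter_frame_sketch) instead of the real AVFrame API so that it compiles on its own. Plain negative errno values stand in for AVERROR().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for AVFrame, only so the sketch is self-contained. */
typedef struct Frame { int is_bitmap; } Frame;

static int live_frames; /* number of frames allocated but not yet freed */

static Frame *frame_alloc(int is_bitmap)
{
    Frame *f = malloc(sizeof(*f));
    if (f) {
        f->is_bitmap = is_bitmap;
        live_frames++;
    }
    return f;
}

/* Modelled on av_frame_free(): frees the frame and clears the pointer. */
static void frame_free(Frame **f)
{
    if (*f) {
        free(*f);
        *f = NULL;
        live_frames--;
    }
}

/* The callback owns src, so every exit path has to release it. */
static int filter_frame_sketch(Frame *src, int fail_alloc)
{
    Frame *out = fail_alloc ? NULL : frame_alloc(1);
    int ret;

    if (!out) {
        ret = -ENOMEM;      /* first path: output allocation failed */
        goto fail;
    }
    if (!src->is_bitmap) {
        ret = -EINVAL;      /* second path: non-bitmap subtitle */
        goto fail;
    }

    /* ... blend src into out here ... */
    frame_free(&src);
    frame_free(&out);       /* stands in for passing out downstream */
    return 0;

fail:
    frame_free(&src);
    frame_free(&out);       /* harmless when out is NULL */
    return ret;
}
```

A single fail label keeps the two frees in one place, so a later early return is less likely to reintroduce a leak.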

> +        }
> +
> +        blend_packed_rgb(inlink->dst, out, sub_rect, sub_rect->x, sub_rect->y, 1);
> +    }
> +
> +    av_frame_free(&src_frame);
> +    return ff_filter_frame(outlink, out);
> +}
> +
> +#define OFFSET(x) offsetof(OverlaySubsContext, x)
> +#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_FILTERING_PARAM)
> +
> +static const AVOption overlay_graphicsubs_options[] = {
> +    { "x", "set the x expression", OFFSET(x_expr), AV_OPT_TYPE_STRING, {.str = "0"}, 0, 0, FLAGS },
> +    { "y", "set the y expression", OFFSET(y_expr), AV_OPT_TYPE_STRING, {.str = "0"}, 0, 0, FLAGS },
> +    { "eof_action", "Action to take when encountering EOF from secondary input ",
> +        OFFSET(fs.opt_eof_action), AV_OPT_TYPE_INT, { .i64 = EOF_ACTION_REPEAT },
> +        EOF_ACTION_REPEAT, EOF_ACTION_PASS, .flags = FLAGS, "eof_action" },
> +        { "repeat", "Repeat the previous frame.",   0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_REPEAT }, .flags = FLAGS, "eof_action" },
> +        { "endall", "End both streams.",            0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_ENDALL }, .flags = FLAGS, "eof_action" },
> +        { "pass",   "Pass through the main input.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_PASS },   .flags = FLAGS, "eof_action" },
> +    { "eval", "specify when to evaluate expressions", OFFSET(eval_mode), AV_OPT_TYPE_INT, {.i64 = EVAL_MODE_FRAME}, 0, EVAL_MODE_NB-1, FLAGS, "eval" },
> +         { "init",  "eval expressions once during initialization", 0, AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_INIT},  .flags = FLAGS, .unit = "eval" },
> +         { "frame", "eval expressions per-frame",                  0, AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_FRAME}, .flags = FLAGS, .unit = "eval" },
> +    { "shortest", "force termination when the shortest input terminates", OFFSET(fs.opt_shortest), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS },
> +    { "repeatlast", "repeat overlay of the last overlay frame", OFFSET(fs.opt_repeatlast), AV_OPT_TYPE_BOOL, {.i64=1}, 0, 1, FLAGS },
> +    { NULL }
> +};
> +
> +static const AVOption graphicsub2video_options[] = {
> +    { "size", "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = NULL}, 0, 0, FLAGS },
> +    { "s",    "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = NULL}, 0, 0, FLAGS },
> +    { NULL }
> +};
> +
> +FRAMESYNC_DEFINE_CLASS(overlay_graphicsubs, OverlaySubsContext, fs);
> +
> +static const AVFilterPad overlay_graphicsubs_inputs[] = {
> +    {
> +        .name         = "main",
> +        .type         = AVMEDIA_TYPE_VIDEO,
> +        .config_props = config_input_main,
> +        .flags        = AVFILTERPAD_FLAG_NEEDS_WRITABLE,
> +    },
> +    {
> +        .name         = "overlay",
> +        .type         = AVMEDIA_TYPE_SUBTITLE,
> +    },
> +};
> +
> +static const AVFilterPad overlay_graphicsubs_outputs[] = {
> +    {
> +        .name          = "default",
> +        .type          = AVMEDIA_TYPE_VIDEO,
> +        .config_props  = config_output,
> +    },
> +};
> +
> +const AVFilter ff_vf_overlay_graphicsubs = {
> +    .name          = "overlay_graphicsubs",
> +    .description   = NULL_IF_CONFIG_SMALL("Overlay graphical subtitles on top of the input."),
> +    .preinit       = overlay_graphicsubs_framesync_preinit,
> +    .init          = overlay_graphicsubs_init,
> +    .uninit        = overlay_graphicsubs_uninit,
> +    .priv_size     = sizeof(OverlaySubsContext),
> +    .priv_class    = &overlay_graphicsubs_class,
> +    .query_formats = overlay_graphicsubs_query_formats,
> +    .activate      = overlay_graphicsubs_activate,
> +    FILTER_INPUTS(overlay_graphicsubs_inputs),
> +    FILTER_OUTPUTS(overlay_graphicsubs_outputs),
> +};
> +
> +AVFILTER_DEFINE_CLASS(graphicsub2video);
> +
> +static const AVFilterPad graphicsub2video_inputs[] = {
> +    {
> +        .name         = "default",
> +        .type         = AVMEDIA_TYPE_SUBTITLE,
> +        .filter_frame = graphicsub2video_filter_frame,
> +        .config_props = graphicsub2video_config_input,
> +    },
> +};
> +
> +static const AVFilterPad graphicsub2video_outputs[] = {
> +    {
> +        .name          = "default",
> +        .type          = AVMEDIA_TYPE_VIDEO,
> +        .config_props  = graphicsub2video_config_output,
> +    },
> +};
> +
> +AVFilter ff_svf_graphicsub2video = {

Missing const.

> +    .name          = "graphicsub2video",
> +    .description   = NULL_IF_CONFIG_SMALL("Convert graphical subtitles to video"),
> +    .query_formats = graphicsub2video_query_formats,
> +    .priv_size     = sizeof(OverlaySubsContext),
> +    .priv_class    = &graphicsub2video_class,
> +    FILTER_INPUTS(graphicsub2video_inputs),
> +    FILTER_OUTPUTS(graphicsub2video_outputs),
> +};
>
Soft Works Sept. 24, 2021, 1:18 a.m. UTC | #2
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Andreas
> Rheinhardt
> Sent: Wednesday, 22 September 2021 06:24
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v8 06/13] avfilter/overlay_graphicsubs:
> Add overlay_graphicsubs and graphicsub2video filters
> 
> Soft Works:
> > - overlay_graphicsubs (VS -> V)
> >   Overlay graphic subtitles onto a video stream
> >
> > - graphicsub2video {S -> V)
> >   Converts graphic subtitles to video frames (with alpha)
> >   Gets auto-inserted for retaining compatibility with
> >   sub2video command lines
> >
> > Signed-off-by: softworkz <softworkz@hotmail.com>
> > ---
> >  doc/filters.texi                     | 104 ++++
> >  libavfilter/Makefile                 |   2 +
> >  libavfilter/allfilters.c             |   2 +
> >  libavfilter/vf_overlay_graphicsubs.c | 730 +++++++++++++++++++++++++++
> >  4 files changed, 838 insertions(+)
> >  create mode 100644 libavfilter/vf_overlay_graphicsubs.c
> >
> > +
> > +/**
> > + * Blend image in src to destination buffer dst at position (x, y).
> > + */
> 
> This whole code looks quite duplicated from the ordinary overlay.

Yes, it looks similar and it is derived from vf_overlay, but it's
not really a duplication.

The code in vf_overlay is for blending images of similar formats which
may only differ by an alpha component, while the code here is
about blending PAL8 images over multiple main formats.

As an example, when blending a PAL8 image over yuv420, I only
convert the palette to YUV rather than every pixel, which is more
efficient, and with chroma subsampling the main and overlay data
are stepped differently.
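
A minimal sketch of the palette-only conversion described above. The coefficients are the ones used by the RGB2Y/RGB2U/RGB2V macros in the patch; the YUVA struct, the function names and the 0xAARRGGBB entry layout are assumptions made for illustration, not the patch's actual code.

```c
#include <stdint.h>

/* Integer RGB -> YUV approximation with the same coefficients as the
 * RGB2Y/RGB2U/RGB2V macros in the patch. Like the macros, this relies on
 * arithmetic right shift of negative intermediates. */
static uint8_t rgb2y(int r, int g, int b)
{
    return (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}

static uint8_t rgb2u(int r, int g, int b)
{
    return (uint8_t)(((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128);
}

static uint8_t rgb2v(int r, int g, int b)
{
    return (uint8_t)(((112 * r - 94 * g - 18 * b + 128) >> 8) + 128);
}

/* Hypothetical container for one converted palette entry. */
typedef struct YUVA { uint8_t y, u, v, a; } YUVA;

/* Convert the 256 palette entries once per subtitle area.
 * Assumption: each entry is packed as 0xAARRGGBB. */
static void palette_to_yuva(const uint32_t rgba_pal[256], YUVA yuva_pal[256])
{
    for (int i = 0; i < 256; i++) {
        const int a = (rgba_pal[i] >> 24) & 0xff;
        const int r = (rgba_pal[i] >> 16) & 0xff;
        const int g = (rgba_pal[i] >>  8) & 0xff;
        const int b =  rgba_pal[i]        & 0xff;

        yuva_pal[i].y = rgb2y(r, g, b);
        yuva_pal[i].u = rgb2u(r, g, b);
        yuva_pal[i].v = rgb2v(r, g, b);
        yuva_pal[i].a = (uint8_t)a;
    }
}

/* Per pixel, blending then needs only a table lookup by PAL8 index. */
static YUVA lookup_pixel(const uint8_t *indices, int pos, const YUVA pal[256])
{
    return pal[indices[pos]];
}
```

The conversion cost is fixed at 256 entries per subtitle area, independent of how many pixels the area covers.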

Merging those specifics into the existing vf_overlay code would
improve neither execution performance nor code readability, and
it would not ease maintenance either.

Paul had initially criticised this as well, but eventually
agreed that it's better to keep it separate from vf_overlay.

This is the only comment of yours on which I didn’t take action,
and I wanted to explain why.

Kind regards,
softworkz

Patch

diff --git a/doc/filters.texi b/doc/filters.texi
index 94161003c3..9ce956e507 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -25079,6 +25079,110 @@  tools.
 
 @c man end VIDEO SINKS
 
+@chapter Subtitle Filters
+@c man begin SUBTITLE FILTERS
+
+When you configure your FFmpeg build, you can disable any of the
+existing filters using @code{--disable-filters}.
+
+Below is a description of the currently available subtitle filters.
+
+@section graphicsub2video
+
+Renders graphic subtitles as video frames. 
+
+This filter replaces the previous "sub2video" hack which did the conversion implicitly and up-front as subtitle filtering wasn't possible at that time.
+To retain compatibility with earlier sub2video command lines, this filter is being auto-inserted in those cases.
+
+For overlaying graphical subtitles it is recommended to use the 'overlay_graphicsubs' filter which is more efficient and takes less processing resources.
+
+This filter is still useful in cases where the overlay is done with hardware acceleration (e.g. overlay_qsv, overlay_vaapi, overlay_cuda) for preparing the overlay frames.
+
+It accepts the following parameters:
+
+@table @option
+@item size, s
+Set the size of the output video frame.
+
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Overlay PGS subtitles
+(not recommended - better use overlay_graphicsubs)
+@example
+ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:1]graphicsub2video[subs];[0:0][subs]overlay" output.mp4
+@end example
+
+@item
+Overlay PGS subtitles implicitly 
+The graphicsub2video is inserted automatically for compatibility with legacy command lines. 
+@example
+ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:0][0:1]overlay" output.mp4
+@end example
+@end itemize
+
+@section overlay_graphicsubs
+
+Overlay graphic subtitles onto a video stream.
+
+This filter can blend graphical subtitles on a video stream directly, i.e. without creating full-size alpha images first.
+The blending operation is limited to the area of the subtitle rectangles, which also means that no processing is done at times where no subtitles are to be displayed.
+
+
+It accepts the following parameters:
+
+@table @option
+@item x
+@item y
+Set the expression for the x and y coordinates of the overlaid video
+on the main video. Default value is "0" for both expressions. In case
+the expression is invalid, it is set to a huge value (meaning that the
+overlay will not be displayed within the output visible area).
+
+@item eof_action
+See @ref{framesync}.
+
+@item eval
+Set when the expressions for @option{x}, and @option{y} are evaluated.
+
+It accepts the following values:
+@table @samp
+@item init
+only evaluate expressions once during the filter initialization or
+when a command is processed
+
+@item frame
+evaluate expressions for each incoming frame
+@end table
+
+Default value is @samp{frame}.
+
+@item shortest
+See @ref{framesync}.
+
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Overlay PGS subtitles
+@example
+ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:1]graphicsub2video[subs];[0:0][subs]overlay" output.mp4
+@end example
+
+@item
+Overlay PGS subtitles implicitly 
+The graphicsub2video is inserted automatically for compatibility with legacy command lines. 
+@example
+ffmpeg -i "https://streams.videolan.org/samples/sub/PGS/Girl_With_The_Dragon_Tattoo_2%3A23%3A56.mkv" -filter_complex "[0:0][0:1]overlay" output.mp4
+@end example
+@end itemize
+@c man end SUBTITLE FILTERS
+
 @chapter Multimedia Filters
 @c man begin MULTIMEDIA FILTERS
 
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 041d3c5382..8fcc25989e 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -290,6 +290,7 @@  OBJS-$(CONFIG_FSPP_FILTER)                   += vf_fspp.o qp_table.o
 OBJS-$(CONFIG_GBLUR_FILTER)                  += vf_gblur.o
 OBJS-$(CONFIG_GEQ_FILTER)                    += vf_geq.o
 OBJS-$(CONFIG_GRADFUN_FILTER)                += vf_gradfun.o
+OBJS-$(CONFIG_GRAPHICSUB2VIDEO_FILTER)       += vf_overlay_graphicsubs.o framesync.o
 OBJS-$(CONFIG_GRAPHMONITOR_FILTER)           += f_graphmonitor.o
 OBJS-$(CONFIG_GRAYWORLD_FILTER)              += vf_grayworld.o
 OBJS-$(CONFIG_GREYEDGE_FILTER)               += vf_colorconstancy.o
@@ -363,6 +364,7 @@  OBJS-$(CONFIG_OVERLAY_CUDA_FILTER)           += vf_overlay_cuda.o framesync.o vf
 OBJS-$(CONFIG_OVERLAY_OPENCL_FILTER)         += vf_overlay_opencl.o opencl.o \
                                                 opencl/overlay.o framesync.o
 OBJS-$(CONFIG_OVERLAY_QSV_FILTER)            += vf_overlay_qsv.o framesync.o
+OBJS-$(CONFIG_OVERLAY_GRAPHICSUBS_FILTER)    += vf_overlay_graphicsubs.o framesync.o
 OBJS-$(CONFIG_OVERLAY_VULKAN_FILTER)         += vf_overlay_vulkan.o vulkan.o
 OBJS-$(CONFIG_OWDENOISE_FILTER)              += vf_owdenoise.o
 OBJS-$(CONFIG_PAD_FILTER)                    += vf_pad.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 154eba5bb2..10a310d20d 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -345,6 +345,7 @@  extern const AVFilter ff_vf_oscilloscope;
 extern const AVFilter ff_vf_overlay;
 extern const AVFilter ff_vf_overlay_opencl;
 extern const AVFilter ff_vf_overlay_qsv;
+extern const AVFilter ff_vf_overlay_graphicsubs;
 extern const AVFilter ff_vf_overlay_vulkan;
 extern const AVFilter ff_vf_overlay_cuda;
 extern const AVFilter ff_vf_owdenoise;
@@ -524,6 +525,7 @@  extern const AVFilter ff_avf_showvolume;
 extern const AVFilter ff_avf_showwaves;
 extern const AVFilter ff_avf_showwavespic;
 extern const AVFilter ff_vaf_spectrumsynth;
+extern const AVFilter ff_svf_graphicsub2video;
 
 /* multimedia sources */
 extern const AVFilter ff_avsrc_amovie;
diff --git a/libavfilter/vf_overlay_graphicsubs.c b/libavfilter/vf_overlay_graphicsubs.c
new file mode 100644
index 0000000000..b71b34abc4
--- /dev/null
+++ b/libavfilter/vf_overlay_graphicsubs.c
@@ -0,0 +1,730 @@ 
+/*
+ * Copyright (c) 2021 softworkz (derived from vf_overlay)
+ * Copyright (c) 2010 Stefano Sabatini
+ * Copyright (c) 2010 Baptiste Coudurier
+ * Copyright (c) 2007 Bobby Bingham
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * overlay graphical subtitles on top of a video frame
+ */
+
+#include "avfilter.h"
+#include "formats.h"
+#include "libavutil/common.h"
+#include "libavutil/eval.h"
+#include "libavutil/avstring.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/opt.h"
+#include "internal.h"
+#include "drawutils.h"
+#include "framesync.h"
+
+#include "libavcodec/avcodec.h"
+
+enum var_name {
+    VAR_MAIN_W,    VAR_MW,
+    VAR_MAIN_H,    VAR_MH,
+    VAR_OVERLAY_W, VAR_OW,
+    VAR_OVERLAY_H, VAR_OH,
+    VAR_HSUB,
+    VAR_VSUB,
+    VAR_X,
+    VAR_Y,
+    VAR_N,
+    VAR_POS,
+    VAR_T,
+    VAR_VARS_NB
+};
+
+typedef struct OverlaySubsContext {
+    const AVClass *class;
+    int x, y;                   ///< position of overlaid picture
+    int w, h;
+    AVFrame *outpicref;
+
+    int main_is_packed_rgb;
+    uint8_t main_rgba_map[4];
+    int main_has_alpha;
+    uint8_t overlay_rgba_map[4];
+    int eval_mode;              ///< EvalMode
+
+    FFFrameSync fs;
+
+    int main_pix_step[4];       ///< steps per pixel for each plane of the main output
+    int hsub, vsub;             ///< chroma subsampling values
+    const AVPixFmtDescriptor *main_desc; ///< format descriptor for main input
+
+    double var_values[VAR_VARS_NB];
+    char *x_expr, *y_expr;
+
+    AVExpr *x_pexpr, *y_pexpr;
+} OverlaySubsContext;
+
+static const char *const var_names[] = {
+    "main_w",    "W", ///< width  of the main    video
+    "main_h",    "H", ///< height of the main    video
+    "overlay_w", "w", ///< width  of the overlay video
+    "overlay_h", "h", ///< height of the overlay video
+    "hsub",
+    "vsub",
+    "x",
+    "y",
+    "n",            ///< number of frame
+    "pos",          ///< position in the file
+    "t",            ///< timestamp expressed in seconds
+    NULL
+};
+
+#define MAIN    0
+#define OVERLAY 1
+
+#define R 0
+#define G 1
+#define B 2
+#define A 3
+
+#define Y 0
+#define U 1
+#define V 2
+
+enum EvalMode {
+    EVAL_MODE_INIT,
+    EVAL_MODE_FRAME,
+    EVAL_MODE_NB
+};
+
+static av_cold void overlay_graphicsubs_uninit(AVFilterContext *ctx)
+{
+    OverlaySubsContext *s = ctx->priv;
+
+    ff_framesync_uninit(&s->fs);
+    av_expr_free(s->x_pexpr); s->x_pexpr = NULL;
+    av_expr_free(s->y_pexpr); s->y_pexpr = NULL;
+}
+
+static inline int normalize_xy(double d, int chroma_sub)
+{
+    if (isnan(d))
+        return INT_MAX;
+    return (int)d & ~((1 << chroma_sub) - 1);
+}
+
+static void eval_expr(AVFilterContext *ctx)
+{
+    OverlaySubsContext *s = ctx->priv;
+
+    s->var_values[VAR_X] = av_expr_eval(s->x_pexpr, s->var_values, NULL);
+    s->var_values[VAR_Y] = av_expr_eval(s->y_pexpr, s->var_values, NULL);
+    /* It is necessary if x is expressed from y  */
+    s->var_values[VAR_X] = av_expr_eval(s->x_pexpr, s->var_values, NULL);
+    s->x = normalize_xy(s->var_values[VAR_X], s->hsub);
+    s->y = normalize_xy(s->var_values[VAR_Y], s->vsub);
+}
+
+static int set_expr(AVExpr **pexpr, const char *expr, const char *option, void *log_ctx)
+{
+    int ret;
+    AVExpr *old = NULL;
+
+    if (*pexpr)
+        old = *pexpr;
+    ret = av_expr_parse(pexpr, expr, var_names,
+                        NULL, NULL, NULL, NULL, 0, log_ctx);
+    if (ret < 0) {
+        av_log(log_ctx, AV_LOG_ERROR,
+               "Error when evaluating the expression '%s' for %s\n",
+               expr, option);
+        *pexpr = old;
+        return ret;
+    }
+
+    av_expr_free(old);
+    return 0;
+}
+
+static int overlay_graphicsubs_query_formats(AVFilterContext *ctx)
+{
+    AVFilterFormats *formats;
+    AVFilterLink *inlink0 = ctx->inputs[0];
+    AVFilterLink *inlink1 = ctx->inputs[1];
+    AVFilterLink *outlink = ctx->outputs[0];
+    int ret;
+    static const enum AVSubtitleType subtitle_fmts[] = { AV_SUBTITLE_FMT_BITMAP, AV_SUBTITLE_FMT_NONE };
+    static const enum AVPixelFormat supported_pix_fmts[] = {
+        AV_PIX_FMT_YUV420P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV444P,
+        AV_PIX_FMT_ARGB,  AV_PIX_FMT_RGBA,
+        AV_PIX_FMT_ABGR,  AV_PIX_FMT_BGRA,
+        AV_PIX_FMT_RGB24, AV_PIX_FMT_BGR24,
+        AV_PIX_FMT_NONE
+    };
+
+    /* set input0 video formats */
+    formats = ff_make_format_list(supported_pix_fmts);
+    if ((ret = ff_formats_ref(formats, &inlink0->outcfg.formats)) < 0)
+        return ret;
+
+    /* set input1 subtitle formats */
+    formats = ff_make_format_list(subtitle_fmts);
+    if ((ret = ff_formats_ref(formats, &inlink1->outcfg.formats)) < 0)
+        return ret;
+
+    /* set output0 video formats */
+    formats = ff_make_format_list(supported_pix_fmts);
+    if ((ret = ff_formats_ref(formats, &outlink->incfg.formats)) < 0)
+        return ret;
+
+    return 0;
+}
+
+static int config_output(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    OverlaySubsContext *s = ctx->priv;
+    int ret;
+
+    if ((ret = ff_framesync_init_dualinput(&s->fs, ctx)) < 0)
+        return ret;
+
+    outlink->w = ctx->inputs[MAIN]->w;
+    outlink->h = ctx->inputs[MAIN]->h;
+    outlink->time_base = ctx->inputs[MAIN]->time_base;
+
+    return ff_framesync_configure(&s->fs);
+}
+
+// divide by 255 and round to nearest
+// apply a fast variant: (X+127)/255 = ((X+127)*257+257)>>16 = ((X+128)*257)>>16
+#define FAST_DIV255(x) ((((x) + 128) * 257) >> 16)
+
+// calculate the non-pre-multiplied alpha, applying the general equation:
+// alpha = alpha_overlay / ( (alpha_main + alpha_overlay) - (alpha_main * alpha_overlay) )
+// (((x) << 16) - ((x) << 9) + (x)) is a faster version of: 255 * 255 * x
+// ((((x) + (y)) << 8) - ((x) + (y)) - (y) * (x)) is a faster version of: 255 * (x + y) - x * y
+#define UNPREMULTIPLY_ALPHA(x, y) ((((x) << 16) - ((x) << 9) + (x)) / ((((x) + (y)) << 8) - ((x) + (y)) - (y) * (x)))
+
+/**
+ * Blend image in src to destination buffer dst at position (x, y).
+ */
+static av_always_inline void blend_packed_rgb(const AVFilterContext *ctx,
+    const AVFrame *dst, const AVSubtitleArea *src,
+    int x, int y,
+    int is_straight)
+{
+    OverlaySubsContext *s = ctx->priv;
+    int i, imax, j, jmax;
+    const int src_w = src->w;
+    const int src_h = src->h;
+    const int dst_w = dst->width;
+    const int dst_h = dst->height;
+    uint8_t alpha;          ///< the amount of overlay to blend on to main
+    const int dr = s->main_rgba_map[R];
+    const int dg = s->main_rgba_map[G];
+    const int db = s->main_rgba_map[B];
+    const int da = s->main_rgba_map[A];
+    const int dstep = s->main_pix_step[0];
+    const int sr = s->overlay_rgba_map[R];
+    const int sg = s->overlay_rgba_map[G];
+    const int sb = s->overlay_rgba_map[B];
+    const int sa = s->overlay_rgba_map[A];
+    int slice_start, slice_end;
+    uint8_t *S, *sp, *d, *dp;
+
+    i = FFMAX(-y, 0);
+    imax = FFMIN3(-y + dst_h, FFMIN(src_h, dst_h), y + src_h);
+
+    slice_start = i;
+    slice_end = i + imax;
+
+    sp = src->buf[0]->data + slice_start       * src->linesize[0];
+    dp = dst->data[0] + (slice_start + y) * dst->linesize[0];
+
+    for (i = slice_start; i < slice_end; i++) {
+        j = FFMAX(-x, 0);
+        S = sp + j;
+        d = dp + ((x + j) * dstep);
+
+        for (jmax = FFMIN(-x + dst_w, src_w); j < jmax; j++) {
+            uint32_t val = src->pal[*S];
+            const uint8_t *sval = (uint8_t *)&val;
+            alpha = sval[sa];
+
+            // if the main channel has an alpha channel, alpha has to be calculated
+            // to create an un-premultiplied (straight) alpha value
+            if (s->main_has_alpha && alpha != 0 && alpha != 255) {
+                const uint8_t alpha_d = d[da];
+                alpha = UNPREMULTIPLY_ALPHA(alpha, alpha_d);
+            }
+
+            switch (alpha) {
+            case 0:
+                break;
+            case 255:
+                d[dr] = sval[sr];
+                d[dg] = sval[sg];
+                d[db] = sval[sb];
+                break;
+            default:
+                // main_value = main_value * (1 - alpha) + overlay_value * alpha
+                // since alpha is in the range 0-255, the result must be divided by 255
+                d[dr] = is_straight ? FAST_DIV255(d[dr] * (255 - alpha) + sval[sr] * alpha) :
+                        FFMIN(FAST_DIV255(d[dr] * (255 - alpha)) + sval[sr], 255);
+                d[dg] = is_straight ? FAST_DIV255(d[dg] * (255 - alpha) + sval[sg] * alpha) :
+                        FFMIN(FAST_DIV255(d[dg] * (255 - alpha)) + sval[sg], 255);
+                d[db] = is_straight ? FAST_DIV255(d[db] * (255 - alpha) + sval[sb] * alpha) :
+                        FFMIN(FAST_DIV255(d[db] * (255 - alpha)) + sval[sb], 255);
+            }
+
+            if (s->main_has_alpha) {
+                switch (alpha) {
+                case 0:
+                    break;
+                case 255:
+                    d[da] = sval[sa];
+                    break;
+                default:
+                    // apply alpha compositing: main_alpha += (1-main_alpha) * overlay_alpha
+                    d[da] += FAST_DIV255((255 - d[da]) * sval[sa]);
+                }
+            }
+            d += dstep;
+            S += 1;
+        }
+        dp += dst->linesize[0];
+        sp += src->linesize[0];
+    }
+}
+
+static av_always_inline void blend_plane_8_8bits(const AVFilterContext *ctx, const AVFrame *dst, const AVSubtitleArea *area,
+    const uint32_t *yuv_pal, int src_w, int src_h, int dst_w, int dst_h, int plane, int hsub, int vsub,
+    int x, int y, int dst_plane, int dst_offset, int dst_step)
+{
+    const int src_wp = AV_CEIL_RSHIFT(src_w, hsub);
+    const int src_hp = AV_CEIL_RSHIFT(src_h, vsub);
+    const int dst_wp = AV_CEIL_RSHIFT(dst_w, hsub);
+    const int dst_hp = AV_CEIL_RSHIFT(dst_h, vsub);
+    const int yp = y >> vsub;
+    const int xp = x >> hsub;
+    uint8_t *s, *sp, *d, *dp, *dap;
+    int imax, i, j, jmax;
+    int slice_start, slice_end;
+
+    i = FFMAX(-yp, 0);
+    imax = FFMIN3(-yp + dst_hp, FFMIN(src_hp, dst_hp), yp + src_hp);
+
+    slice_start = i;
+    slice_end = i + imax;
+
+    sp = area->buf[0]->data + (slice_start << vsub) * area->linesize[0];
+    dp = dst->data[dst_plane] + (yp + slice_start) * dst->linesize[dst_plane] + dst_offset;
+
+    dap = dst->data[3] + ((yp + slice_start) << vsub) * dst->linesize[3];
+
+    for (i = slice_start; i < slice_end; i++) {
+        j = FFMAX(-xp, 0);
+        d = dp + (xp + j) * dst_step;
+        s = sp + (j << hsub);
+        jmax = FFMIN(-xp + dst_wp, src_wp);    
+
+        for (; j < jmax; j++) {
+            uint32_t val = yuv_pal[*s];
+            const uint8_t *sval = (uint8_t *)&val;
+            const int alpha = sval[3];
+            const int max = 255, mid = 128;
+            const int d_int = *d;
+            const int sval_int = sval[plane];
+
+            switch (alpha) {
+            case 0:
+                break;
+            case 255:
+                *d = sval[plane];
+                break;
+            default:
+                if (plane > 0)
+                    *d = av_clip(FAST_DIV255((d_int - mid) * (max - alpha) + (sval_int - mid) * alpha) , -mid, mid) + mid;
+                else
+                    *d = FAST_DIV255(d_int * (max - alpha) + sval_int * alpha);
+                break;
+            }
+
+            d += dst_step;
+            s += 1 << hsub;
+        }
+        dp += dst->linesize[dst_plane];
+        sp +=  (1 << vsub) * area->linesize[0];
+        dap += (1 << vsub) * dst->linesize[3];
+    }
+}
+
+#define RGB2Y(r, g, b) (uint8_t)(((66 * (r) + 129 * (g) +  25 * (b) + 128) >> 8) +  16)
+#define RGB2U(r, g, b) (uint8_t)(((-38 * (r) - 74 * (g) + 112 * (b) + 128) >> 8) + 128)
+#define RGB2V(r, g, b) (uint8_t)(((112 * (r) - 94 * (g) -  18 * (b) + 128) >> 8) + 128)
+/* Converts R8 G8 B8 color to YUV. */
+static av_always_inline void rgb_2_yuv(uint8_t r, uint8_t g, uint8_t b, uint8_t* y, uint8_t* u, uint8_t* v)
+{
+    *y = RGB2Y((int)r, (int)g, (int)b);
+    *u = RGB2U((int)r, (int)g, (int)b);
+    *v = RGB2V((int)r, (int)g, (int)b);
+}
+
+
+static av_always_inline void blend_yuv_8_8bits(AVFilterContext *ctx, AVFrame *dst, const AVSubtitleArea *area, int hsub, int vsub, int x, int y)
+{
+    OverlaySubsContext *s = ctx->priv;
+    const int src_w = area->w;
+    const int src_h = area->h;
+    const int dst_w = dst->width;
+    const int dst_h = dst->height;
+    const int sr = s->overlay_rgba_map[R];
+    const int sg = s->overlay_rgba_map[G];
+    const int sb = s->overlay_rgba_map[B];
+    const int sa = s->overlay_rgba_map[A];
+    uint32_t yuvpal[256];
+
+    for (int i = 0; i < 256; ++i) {
+        const uint8_t *rgba = (const uint8_t *)&area->pal[i];
+        uint8_t *yuva = (uint8_t *)&yuvpal[i];
+        rgb_2_yuv(rgba[sr], rgba[sg], rgba[sb], &yuva[Y], &yuva[U], &yuva[V]);
+        yuva[3] = rgba[sa];
+    }
+
+    blend_plane_8_8bits(ctx, dst, area, yuvpal, src_w, src_h, dst_w, dst_h, Y, 0,    0,    x, y, s->main_desc->comp[Y].plane, s->main_desc->comp[Y].offset, s->main_desc->comp[Y].step);
+    blend_plane_8_8bits(ctx, dst, area, yuvpal, src_w, src_h, dst_w, dst_h, U, hsub, vsub, x, y, s->main_desc->comp[U].plane, s->main_desc->comp[U].offset, s->main_desc->comp[U].step);
+    blend_plane_8_8bits(ctx, dst, area, yuvpal, src_w, src_h, dst_w, dst_h, V, hsub, vsub, x, y, s->main_desc->comp[V].plane, s->main_desc->comp[V].offset, s->main_desc->comp[V].step);
+}
+
+static int config_input_main(AVFilterLink *inlink)
+{
+    int ret;
+    AVFilterContext *ctx  = inlink->dst;
+    OverlaySubsContext *s = inlink->dst->priv;
+    const AVPixFmtDescriptor *pix_desc = av_pix_fmt_desc_get(inlink->format);
+
+    av_image_fill_max_pixsteps(s->main_pix_step,    NULL, pix_desc);
+    ff_fill_rgba_map(s->overlay_rgba_map, AV_PIX_FMT_RGB32); // the overlay is actually AV_PIX_FMT_PAL8; its palette entries are RGB32
+
+    s->hsub = pix_desc->log2_chroma_w;
+    s->vsub = pix_desc->log2_chroma_h;
+
+    s->main_desc = pix_desc;
+
+    s->main_is_packed_rgb = ff_fill_rgba_map(s->main_rgba_map, inlink->format) >= 0;
+    s->main_has_alpha = !!(pix_desc->flags & AV_PIX_FMT_FLAG_ALPHA);
+
+    /* Finish the configuration by evaluating the expressions
+       now when both inputs are configured. */
+    s->var_values[VAR_MAIN_W   ] = s->var_values[VAR_MW] = ctx->inputs[MAIN   ]->w;
+    s->var_values[VAR_MAIN_H   ] = s->var_values[VAR_MH] = ctx->inputs[MAIN   ]->h;
+    s->var_values[VAR_OVERLAY_W] = s->var_values[VAR_OW] = ctx->inputs[OVERLAY]->w;
+    s->var_values[VAR_OVERLAY_H] = s->var_values[VAR_OH] = ctx->inputs[OVERLAY]->h;
+    s->var_values[VAR_HSUB]  = 1<<pix_desc->log2_chroma_w;
+    s->var_values[VAR_VSUB]  = 1<<pix_desc->log2_chroma_h;
+    s->var_values[VAR_X]     = NAN;
+    s->var_values[VAR_Y]     = NAN;
+    s->var_values[VAR_N]     = 0;
+    s->var_values[VAR_T]     = NAN;
+    s->var_values[VAR_POS]   = NAN;
+
+    if ((ret = set_expr(&s->x_pexpr,      s->x_expr,      "x",      ctx)) < 0 ||
+        (ret = set_expr(&s->y_pexpr,      s->y_expr,      "y",      ctx)) < 0)
+        return ret;
+
+    if (s->eval_mode == EVAL_MODE_INIT) {
+        eval_expr(ctx);
+        av_log(ctx, AV_LOG_VERBOSE, "x:%f xi:%d y:%f yi:%d\n",
+               s->var_values[VAR_X], s->x,
+               s->var_values[VAR_Y], s->y);
+    }
+
+    av_log(ctx, AV_LOG_VERBOSE,
+           "main w:%d h:%d fmt:%s overlay w:%d h:%d fmt:%s\n",
+           ctx->inputs[MAIN]->w, ctx->inputs[MAIN]->h,
+           av_get_pix_fmt_name(ctx->inputs[MAIN]->format),
+           ctx->inputs[OVERLAY]->w, ctx->inputs[OVERLAY]->h,
+           av_get_pix_fmt_name(ctx->inputs[OVERLAY]->format));
+    return 0;
+}
+
+static int do_blend(FFFrameSync *fs)
+{
+    AVFilterContext *ctx = fs->parent;
+    AVFrame *mainpic, *second;
+    OverlaySubsContext *s = ctx->priv;
+    AVFilterLink *inlink = ctx->inputs[0];
+    unsigned i;
+    int ret;
+
+    ret = ff_framesync_dualinput_get_writable(fs, &mainpic, &second);
+    if (ret < 0)
+        return ret;
+    if (!second)
+        return ff_filter_frame(ctx->outputs[0], mainpic);
+
+    if (s->eval_mode == EVAL_MODE_FRAME) {
+        int64_t pos = mainpic->pkt_pos;
+
+        s->var_values[VAR_N] = (double)inlink->frame_count_out;
+        s->var_values[VAR_T] = mainpic->pts == AV_NOPTS_VALUE ?
+            NAN :(double)mainpic->pts * av_q2d(inlink->time_base);
+        s->var_values[VAR_POS] = pos == -1 ? NAN : (double)pos;
+
+        s->var_values[VAR_OVERLAY_W] = s->var_values[VAR_OW] = second->width;
+        s->var_values[VAR_OVERLAY_H] = s->var_values[VAR_OH] = second->height;
+        s->var_values[VAR_MAIN_W   ] = s->var_values[VAR_MW] = mainpic->width;
+        s->var_values[VAR_MAIN_H   ] = s->var_values[VAR_MH] = mainpic->height;
+
+        eval_expr(ctx);
+        av_log(ctx, AV_LOG_DEBUG, "n:%f t:%f pos:%f x:%f xi:%d y:%f yi:%d\n",
+               s->var_values[VAR_N], s->var_values[VAR_T], s->var_values[VAR_POS],
+               s->var_values[VAR_X], s->x,
+               s->var_values[VAR_Y], s->y);
+    }
+
+    for (i = 0; i < second->num_subtitle_areas; i++) {
+        const AVSubtitleArea *sub_area = second->subtitle_areas[i];
+
+        if (sub_area->type != AV_SUBTITLE_FMT_BITMAP) {
+            av_log(NULL, AV_LOG_WARNING, "overlay_graphicsub: non-bitmap subtitle\n");
+            return AVERROR_INVALIDDATA;
+        }
+
+        switch (inlink->format) {
+        case AV_PIX_FMT_YUV420P:
+            blend_yuv_8_8bits(ctx, mainpic, sub_area, 1, 1, sub_area->x + s->x, sub_area->y + s->y);
+            break;
+        case AV_PIX_FMT_YUV422P:
+            blend_yuv_8_8bits(ctx, mainpic, sub_area, 1, 0, sub_area->x + s->x, sub_area->y + s->y);
+            break;
+        case AV_PIX_FMT_YUV444P:
+            blend_yuv_8_8bits(ctx, mainpic, sub_area, 0, 0, sub_area->x + s->x, sub_area->y + s->y);
+            break;
+        case AV_PIX_FMT_RGB24:
+        case AV_PIX_FMT_BGR24:
+        case AV_PIX_FMT_ARGB:
+        case AV_PIX_FMT_RGBA:
+        case AV_PIX_FMT_BGRA:
+        case AV_PIX_FMT_ABGR:
+            blend_packed_rgb(ctx, mainpic, sub_area, sub_area->x + s->x, sub_area->y + s->y, 1);
+            break;
+        default:
+            av_log(ctx, AV_LOG_ERROR, "Unsupported input pix fmt: %s\n",
+                   av_get_pix_fmt_name(inlink->format));
+            av_frame_free(&mainpic); /* we own mainpic; do not leak it on error */
+            return AVERROR(EINVAL);
+        }
+    }
+
+    return ff_filter_frame(ctx->outputs[0], mainpic);
+}
+
+static av_cold int overlay_graphicsubs_init(AVFilterContext *ctx)
+{
+    OverlaySubsContext *s = ctx->priv;
+
+    s->fs.on_event = do_blend;
+    return 0;
+}
+
+static int overlay_graphicsubs_activate(AVFilterContext *ctx)
+{
+    OverlaySubsContext *s = ctx->priv;
+    return ff_framesync_activate(&s->fs);
+}
+
+static int graphicsub2video_query_formats(AVFilterContext *ctx)
+{
+    AVFilterFormats *formats;
+    AVFilterLink *inlink = ctx->inputs[0];
+    AVFilterLink *outlink = ctx->outputs[0];
+    static const enum AVSubtitleType subtitle_fmts[] = { AV_SUBTITLE_FMT_BITMAP, AV_SUBTITLE_FMT_NONE };
+    static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_RGB32, AV_PIX_FMT_NONE };
+    int ret;
+
+    /* set input subtitle formats */
+    formats = ff_make_format_list(subtitle_fmts);
+    if ((ret = ff_formats_ref(formats, &inlink->outcfg.formats)) < 0)
+        return ret;
+
+    /* set output video formats */
+    formats = ff_make_format_list(pix_fmts);
+    if ((ret = ff_formats_ref(formats, &outlink->incfg.formats)) < 0)
+        return ret;
+
+    return 0;
+}
+
+static int graphicsub2video_config_input(AVFilterLink *inlink)
+{
+    AVFilterContext *ctx = inlink->dst;
+    OverlaySubsContext *s = ctx->priv;
+
+    if (s->w <= 0 || s->h <= 0) {
+        s->w = inlink->w;
+        s->h = inlink->h;
+    }
+    return 0;
+}
+
+static int graphicsub2video_config_output(AVFilterLink *outlink)
+{
+    const AVFilterContext *ctx  = outlink->src;
+    OverlaySubsContext *s = ctx->priv;
+    const AVPixFmtDescriptor *pix_desc = av_pix_fmt_desc_get(outlink->format);
+
+    outlink->w = s->w;
+    outlink->h = s->h;
+    outlink->sample_aspect_ratio = (AVRational){1,1};
+
+    av_image_fill_max_pixsteps(s->main_pix_step, NULL, pix_desc);
+    ff_fill_rgba_map(s->overlay_rgba_map, AV_PIX_FMT_RGB32);
+
+    s->hsub = pix_desc->log2_chroma_w;
+    s->vsub = pix_desc->log2_chroma_h;
+
+    s->main_desc = pix_desc;
+
+    s->main_is_packed_rgb = ff_fill_rgba_map(s->main_rgba_map, outlink->format) >= 0;
+    s->main_has_alpha = !!(pix_desc->flags & AV_PIX_FMT_FLAG_ALPHA);
+
+    return 0;
+}
+
+static int graphicsub2video_filter_frame(AVFilterLink *inlink, AVFrame *src_frame)
+{
+    AVFilterLink *outlink = inlink->dst->outputs[0];
+    AVFrame *out;
+    const unsigned num_rects = src_frame->num_subtitle_areas;
+    unsigned int i;
+
+    out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+    if (!out) {
+        av_frame_free(&src_frame);
+        return AVERROR(ENOMEM);
+    }
+
+    /* start from a fully transparent frame */
+    memset(out->data[0], 0, out->linesize[0] * out->height);
+
+    out->pts = src_frame->pts;
+
+    for (i = 0; i < num_rects; i++) {
+        const AVSubtitleArea *sub_rect = src_frame->subtitle_areas[i];
+
+        if (sub_rect->type != AV_SUBTITLE_FMT_BITMAP) {
+            av_log(inlink->dst, AV_LOG_ERROR, "non-bitmap subtitle\n");
+            av_frame_free(&src_frame);
+            av_frame_free(&out);
+            return AVERROR_INVALIDDATA;
+        }
+
+        blend_packed_rgb(inlink->dst, out, sub_rect, sub_rect->x, sub_rect->y, 1);
+    }
+
+    av_frame_free(&src_frame);
+    return ff_filter_frame(outlink, out);
+}
+
+#define OFFSET(x) offsetof(OverlaySubsContext, x)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_FILTERING_PARAM)
+
+static const AVOption overlay_graphicsubs_options[] = {
+    { "x", "set the x expression", OFFSET(x_expr), AV_OPT_TYPE_STRING, {.str = "0"}, 0, 0, FLAGS },
+    { "y", "set the y expression", OFFSET(y_expr), AV_OPT_TYPE_STRING, {.str = "0"}, 0, 0, FLAGS },
+    { "eof_action", "Action to take when encountering EOF from secondary input",
+        OFFSET(fs.opt_eof_action), AV_OPT_TYPE_INT, { .i64 = EOF_ACTION_REPEAT },
+        EOF_ACTION_REPEAT, EOF_ACTION_PASS, .flags = FLAGS, "eof_action" },
+        { "repeat", "Repeat the previous frame.",   0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_REPEAT }, .flags = FLAGS, "eof_action" },
+        { "endall", "End both streams.",            0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_ENDALL }, .flags = FLAGS, "eof_action" },
+        { "pass",   "Pass through the main input.", 0, AV_OPT_TYPE_CONST, { .i64 = EOF_ACTION_PASS },   .flags = FLAGS, "eof_action" },
+    { "eval", "specify when to evaluate expressions", OFFSET(eval_mode), AV_OPT_TYPE_INT, {.i64 = EVAL_MODE_FRAME}, 0, EVAL_MODE_NB-1, FLAGS, "eval" },
+         { "init",  "eval expressions once during initialization", 0, AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_INIT},  .flags = FLAGS, .unit = "eval" },
+         { "frame", "eval expressions per-frame",                  0, AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_FRAME}, .flags = FLAGS, .unit = "eval" },
+    { "shortest", "force termination when the shortest input terminates", OFFSET(fs.opt_shortest), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS },
+    { "repeatlast", "repeat overlay of the last overlay frame", OFFSET(fs.opt_repeatlast), AV_OPT_TYPE_BOOL, {.i64=1}, 0, 1, FLAGS },
+    { NULL }
+};
+
+static const AVOption graphicsub2video_options[] = {
+    { "size", "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = NULL}, 0, 0, FLAGS },
+    { "s",    "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = NULL}, 0, 0, FLAGS },
+    { NULL }
+};
+
+FRAMESYNC_DEFINE_CLASS(overlay_graphicsubs, OverlaySubsContext, fs);
+
+static const AVFilterPad overlay_graphicsubs_inputs[] = {
+    {
+        .name         = "main",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .config_props = config_input_main,
+        .flags        = AVFILTERPAD_FLAG_NEEDS_WRITABLE,
+    },
+    {
+        .name         = "overlay",
+        .type         = AVMEDIA_TYPE_SUBTITLE,
+    },
+};
+
+static const AVFilterPad overlay_graphicsubs_outputs[] = {
+    {
+        .name          = "default",
+        .type          = AVMEDIA_TYPE_VIDEO,
+        .config_props  = config_output,
+    },
+};
+
+const AVFilter ff_vf_overlay_graphicsubs = {
+    .name          = "overlay_graphicsubs",
+    .description   = NULL_IF_CONFIG_SMALL("Overlay graphical subtitles on top of the input."),
+    .preinit       = overlay_graphicsubs_framesync_preinit,
+    .init          = overlay_graphicsubs_init,
+    .uninit        = overlay_graphicsubs_uninit,
+    .priv_size     = sizeof(OverlaySubsContext),
+    .priv_class    = &overlay_graphicsubs_class,
+    .query_formats = overlay_graphicsubs_query_formats,
+    .activate      = overlay_graphicsubs_activate,
+    FILTER_INPUTS(overlay_graphicsubs_inputs),
+    FILTER_OUTPUTS(overlay_graphicsubs_outputs),
+};
+
+AVFILTER_DEFINE_CLASS(graphicsub2video);
+
+static const AVFilterPad graphicsub2video_inputs[] = {
+    {
+        .name         = "default",
+        .type         = AVMEDIA_TYPE_SUBTITLE,
+        .filter_frame = graphicsub2video_filter_frame,
+        .config_props = graphicsub2video_config_input,
+    },
+};
+
+static const AVFilterPad graphicsub2video_outputs[] = {
+    {
+        .name          = "default",
+        .type          = AVMEDIA_TYPE_VIDEO,
+        .config_props  = graphicsub2video_config_output,
+    },
+};
+
+const AVFilter ff_svf_graphicsub2video = {
+    .name          = "graphicsub2video",
+    .description   = NULL_IF_CONFIG_SMALL("Convert graphical subtitles to video"),
+    .query_formats = graphicsub2video_query_formats,
+    .priv_size     = sizeof(OverlaySubsContext),
+    .priv_class    = &graphicsub2video_class,
+    FILTER_INPUTS(graphicsub2video_inputs),
+    FILTER_OUTPUTS(graphicsub2video_outputs),
+};