From patchwork Sun Oct 31 13:35:52 2021
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 31265
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Cc: ygupta@nvidia.com, rmonteiro@nvidia.com, rarzumanyan@nvidia.com, Timo Rothenpieler
Date: Sun, 31 Oct 2021 14:35:52 +0100
Message-Id: <20211031133552.25570-1-timo@rothenpieler.org>
X-Mailer: git-send-email 2.25.1
Subject: [FFmpeg-devel] [PATCH] avfilter/scale_npp: add scale2ref_npp filter

From: Roman Arzumanyan

Signed-off-by: Timo Rothenpieler
---
Here's my revised version of the patch. It brings the order of operations
much more in line with the normal scale/scale2ref filters, and it gets rid
of the differences in option parsing between the 2ref and non-2ref versions
of the filter.

I also exposed some more options, like the eval option and the size
shorthand; size was used internally anyway, just not exposed.

If no further issues or comments arise, I will apply this patch in a few
days. I also plan to come up with a similar patch for scale_cuda.
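
Not part of the patch itself, just a quick way to try it out: a minimal
sketch of an invocation using the new filter, assuming an FFmpeg build with
libnpp enabled (--enable-libnpp, typically together with --enable-nonfree),
an NVIDIA GPU, and hypothetical input files main.mp4/logo.mp4. The
filtergraph reuses the expression from the documentation example below:

    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i main.mp4 \
           -hwaccel cuda -hwaccel_output_format cuda -i logo.mp4 \
           -filter_complex \
             "[1:v][0:v]scale2ref_npp=w=oh*mdar:h=ih/10[logo][base];[base][logo]overlay_cuda" \
           -c:v h264_nvenc -an out.mp4

As with scale2ref, the first input is the stream that gets scaled and the
second is the reference; both outputs stay CUDA frames, so they can be fed
straight into overlay_cuda and h264_nvenc.
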
 doc/filters.texi           | 111 ++++++++
 libavfilter/allfilters.c   |   1 +
 libavfilter/version.h      |   2 +-
 libavfilter/vf_scale_npp.c | 544 ++++++++++++++++++++++++++++++++++---
 4 files changed, 618 insertions(+), 40 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 177f0774fc..8eae567f01 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -18494,6 +18494,7 @@ scale_cuda=passthrough=0
 @end example
 @end itemize
 
+@anchor{scale_npp}
 @section scale_npp
 
 Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
@@ -18570,6 +18571,61 @@ This option can be handy if you need to have a video fit within or exceed
 a defined resolution using @option{force_original_aspect_ratio} but also have
 encoder restrictions on width or height divisibility.
 
+@item eval
+Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
+
+@table @samp
+@item init
+Only evaluate expressions once during the filter initialization or when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame.
+
+@end table
+
+@end table
+
+The values of the @option{w} and @option{h} options are expressions
+containing the following constants:
+
+@table @var
+@item in_w
+@item in_h
+The input width and height
+
+@item iw
+@item ih
+These are the same as @var{in_w} and @var{in_h}.
+
+@item out_w
+@item out_h
+The output (scaled) width and height
+
+@item ow
+@item oh
+These are the same as @var{out_w} and @var{out_h}
+
+@item a
+The same as @var{iw} / @var{ih}
+
+@item sar
+input sample aspect ratio
+
+@item dar
+The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
+
+@item n
+The (sequential) number of the input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item t
+The presentation timestamp of the input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item pos
+The position (byte offset) of the frame in the input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
 @end table
 
 @section scale2ref
@@ -18645,6 +18701,61 @@
 If the specified expression is not valid, it is kept at its current value.
 @end table
 
+@section scale2ref_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
+video, based on a reference video.
+
+See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
+but uses the reference video instead of the main input as basis. scale2ref_npp
+also supports the following additional constants for the @option{w} and
+@option{h} options:
+
+@table @var
+@item main_w
+@item main_h
+The main input video's width and height
+
+@item main_a
+The same as @var{main_w} / @var{main_h}
+
+@item main_sar
+The main input video's sample aspect ratio
+
+@item main_dar, mdar
+The main input video's display aspect ratio. Calculated from
+@code{(main_w / main_h) * main_sar}.
+
+@item main_n
+The (sequential) number of the main input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item main_t
+The presentation timestamp of the main input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item main_pos
+The position (byte offset) of the frame in the main input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Scale a subtitle stream (b) to match the main video (a) in size before overlaying
+@example
+'scale2ref_npp[b][a];[a][b]overlay_cuda'
+@end example
+
+@item
+Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
+@example
+[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
+@end example
+@end itemize
+
 @section scharr
 Apply scharr operator to input video stream.
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 409ab5d3c4..667b6fc246 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -396,6 +396,7 @@ extern const AVFilter ff_vf_scale_qsv;
 extern const AVFilter ff_vf_scale_vaapi;
 extern const AVFilter ff_vf_scale_vulkan;
 extern const AVFilter ff_vf_scale2ref;
+extern const AVFilter ff_vf_scale2ref_npp;
 extern const AVFilter ff_vf_scdet;
 extern const AVFilter ff_vf_scharr;
 extern const AVFilter ff_vf_scroll;
diff --git a/libavfilter/version.h b/libavfilter/version.h
index f6bb4fbe69..3bd3816698 100644
--- a/libavfilter/version.h
+++ b/libavfilter/version.h
@@ -31,7 +31,7 @@
 #define LIBAVFILTER_VERSION_MAJOR 8
 #define LIBAVFILTER_VERSION_MINOR 16
-#define LIBAVFILTER_VERSION_MICRO 100
+#define LIBAVFILTER_VERSION_MICRO 101
 
 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
diff --git a/libavfilter/vf_scale_npp.c b/libavfilter/vf_scale_npp.c
index 8da335154c..6ce82e5302 100644
--- a/libavfilter/vf_scale_npp.c
+++ b/libavfilter/vf_scale_npp.c
@@ -25,13 +25,13 @@
 #include 
 #include 
 
-#include "libavutil/avstring.h"
-#include "libavutil/common.h"
 #include "libavutil/hwcontext.h"
 #include "libavutil/hwcontext_cuda_internal.h"
 #include "libavutil/cuda_check.h"
 #include "libavutil/internal.h"
 #include "libavutil/opt.h"
+#include "libavutil/parseutils.h"
+#include "libavutil/eval.h"
 #include "libavutil/pixdesc.h"
 
 #include "avfilter.h"
@@ -44,6 +44,7 @@ static const enum AVPixelFormat supported_formats[] = {
     AV_PIX_FMT_YUV420P,
+    AV_PIX_FMT_YUVA420P,
     AV_PIX_FMT_NV12,
     AV_PIX_FMT_YUV444P,
 };
@@ -67,12 +68,62 @@ typedef struct NPPScaleStageContext {
     struct {
         int width;
         int height;
-    } planes_in[3], planes_out[3];
+    } planes_in[4], planes_out[4];
 
     AVBufferRef *frames_ctx;
     AVFrame *frame;
 } NPPScaleStageContext;
 
+static const char *const var_names[] = {
+    "in_w", "iw",
+    "in_h", "ih",
+    "out_w", "ow",
+    "out_h", "oh",
+    "a",
+    "sar",
+    "dar",
+    "n",
+    "t",
+    "pos",
+    "main_w",
+    "main_h",
+    "main_a",
+    "main_sar",
+    "main_dar", "mdar",
+    "main_n",
+    "main_t",
+    "main_pos",
+    NULL
+};
+
+enum var_name {
+    VAR_IN_W, VAR_IW,
+    VAR_IN_H, VAR_IH,
+    VAR_OUT_W, VAR_OW,
+    VAR_OUT_H, VAR_OH,
+    VAR_A,
+    VAR_SAR,
+    VAR_DAR,
+    VAR_N,
+    VAR_T,
+    VAR_POS,
+    VAR_S2R_MAIN_W,
+    VAR_S2R_MAIN_H,
+    VAR_S2R_MAIN_A,
+    VAR_S2R_MAIN_SAR,
+    VAR_S2R_MAIN_DAR, VAR_S2R_MDAR,
+    VAR_S2R_MAIN_N,
+    VAR_S2R_MAIN_T,
+    VAR_S2R_MAIN_POS,
+    VARS_NB
+};
+
+enum EvalMode {
+    EVAL_MODE_INIT,
+    EVAL_MODE_FRAME,
+    EVAL_MODE_NB
+};
+
 typedef struct NPPScaleContext {
     const AVClass *class;
@@ -102,35 +153,269 @@ typedef struct NPPScaleContext {
     int force_divisible_by;
 
     int interp_algo;
+
+    char* size_str;
+
+    AVExpr* w_pexpr;
+    AVExpr* h_pexpr;
+
+    double var_values[VARS_NB];
+
+    int eval_mode;
 } NPPScaleContext;
 
-static int nppscale_init(AVFilterContext *ctx)
+const AVFilter ff_vf_scale2ref_npp;
+
+static int config_props(AVFilterLink *outlink);
+
+static int check_exprs(AVFilterContext* ctx)
 {
-    NPPScaleContext *s = ctx->priv;
-    int i;
+    NPPScaleContext* scale = ctx->priv;
+
unsigned vars_w[VARS_NB] = {0}, vars_h[VARS_NB] = {0}; + + if (!scale->w_pexpr && !scale->h_pexpr) + return AVERROR(EINVAL); + + if (scale->w_pexpr) + av_expr_count_vars(scale->w_pexpr, vars_w, VARS_NB); + if (scale->h_pexpr) + av_expr_count_vars(scale->h_pexpr, vars_h, VARS_NB); + + if (vars_w[VAR_OUT_W] || vars_w[VAR_OW]) { + av_log(ctx, AV_LOG_ERROR, "Width expression cannot be self-referencing: '%s'.\n", scale->w_expr); + return AVERROR(EINVAL); + } + + if (vars_h[VAR_OUT_H] || vars_h[VAR_OH]) { + av_log(ctx, AV_LOG_ERROR, "Height expression cannot be self-referencing: '%s'.\n", scale->h_expr); + return AVERROR(EINVAL); + } + + if ((vars_w[VAR_OUT_H] || vars_w[VAR_OH]) && + (vars_h[VAR_OUT_W] || vars_h[VAR_OW])) { + av_log(ctx, AV_LOG_WARNING, "Circular references detected for width '%s' and height '%s' - possibly invalid.\n", scale->w_expr, scale->h_expr); + } + + if (ctx->filter != &ff_vf_scale2ref_npp && + (vars_w[VAR_S2R_MAIN_W] || vars_h[VAR_S2R_MAIN_W] || + vars_w[VAR_S2R_MAIN_H] || vars_h[VAR_S2R_MAIN_H] || + vars_w[VAR_S2R_MAIN_A] || vars_h[VAR_S2R_MAIN_A] || + vars_w[VAR_S2R_MAIN_SAR] || vars_h[VAR_S2R_MAIN_SAR] || + vars_w[VAR_S2R_MAIN_DAR] || vars_h[VAR_S2R_MAIN_DAR] || + vars_w[VAR_S2R_MDAR] || vars_h[VAR_S2R_MDAR] || + vars_w[VAR_S2R_MAIN_N] || vars_h[VAR_S2R_MAIN_N] || + vars_w[VAR_S2R_MAIN_T] || vars_h[VAR_S2R_MAIN_T] || + vars_w[VAR_S2R_MAIN_POS] || vars_h[VAR_S2R_MAIN_POS])) { + av_log(ctx, AV_LOG_ERROR, "Expressions with scale2ref_npp variables are not valid in scale_npp filter.\n"); + return AVERROR(EINVAL); + } + + if (scale->eval_mode == EVAL_MODE_INIT && + (vars_w[VAR_N] || vars_h[VAR_N] || + vars_w[VAR_T] || vars_h[VAR_T] || + vars_w[VAR_POS] || vars_h[VAR_POS] || + vars_w[VAR_S2R_MAIN_N] || vars_h[VAR_S2R_MAIN_N] || + vars_w[VAR_S2R_MAIN_T] || vars_h[VAR_S2R_MAIN_T] || + vars_w[VAR_S2R_MAIN_POS] || vars_h[VAR_S2R_MAIN_POS]) ) { + av_log(ctx, AV_LOG_ERROR, "Expressions with frame variables 'n', 't', 'pos' are not valid in init eval_mode.\n"); + return AVERROR(EINVAL); + } + + return 0; +} + +static int nppscale_parse_expr(AVFilterContext* ctx, char* str_expr, + AVExpr** pexpr_ptr, const char* var, + const char* args) +{ + NPPScaleContext* scale = ctx->priv; + int ret, is_inited = 0; + char* old_str_expr = NULL; + AVExpr* old_pexpr = NULL; + + if (str_expr) { + old_str_expr = av_strdup(str_expr); + if (!old_str_expr) + return AVERROR(ENOMEM); + av_opt_set(scale, var, args, 0); + } + + if (*pexpr_ptr) { + old_pexpr = *pexpr_ptr; + *pexpr_ptr = NULL; + is_inited = 1; + } + + ret = av_expr_parse(pexpr_ptr, args, var_names, NULL, NULL, NULL, NULL, 0, + ctx); + if (ret < 0) { + av_log(ctx, AV_LOG_ERROR, "Cannot parse expression for %s: '%s'\n", var, + args); + goto revert; + } + + ret = check_exprs(ctx); + if (ret < 0) + goto revert; + + if (is_inited && (ret = config_props(ctx->outputs[0])) < 0) + goto revert; + + av_expr_free(old_pexpr); + old_pexpr = NULL; + av_freep(&old_str_expr); + + return 0; + +revert: + av_expr_free(*pexpr_ptr); + *pexpr_ptr = NULL; + if (old_str_expr) { + av_opt_set(scale, var, old_str_expr, 0); + av_free(old_str_expr); + } + if (old_pexpr) + *pexpr_ptr = old_pexpr; + + return ret; +} + +static av_cold int nppscale_init(AVFilterContext* ctx) +{ + NPPScaleContext* scale = ctx->priv; + int i, ret; - if (!strcmp(s->format_str, "same")) { - s->format = AV_PIX_FMT_NONE; + if (!strcmp(scale->format_str, "same")) { + scale->format = AV_PIX_FMT_NONE; } else { - s->format = av_get_pix_fmt(s->format_str); - if (s->format == AV_PIX_FMT_NONE) { - 
av_log(ctx, AV_LOG_ERROR, "Unrecognized pixel format: %s\n", s->format_str); + scale->format = av_get_pix_fmt(scale->format_str); + if (scale->format == AV_PIX_FMT_NONE) { + av_log(ctx, AV_LOG_ERROR, "Unrecognized pixel format: %s\n", scale->format_str); return AVERROR(EINVAL); } } - for (i = 0; i < FF_ARRAY_ELEMS(s->stages); i++) { - s->stages[i].frame = av_frame_alloc(); - if (!s->stages[i].frame) + if (scale->size_str && (scale->w_expr || scale->h_expr)) { + av_log(ctx, AV_LOG_ERROR, + "Size and width/height exprs cannot be set at the same time.\n"); + return AVERROR(EINVAL); + } + + if (scale->w_expr && !scale->h_expr) + FFSWAP(char*, scale->w_expr, scale->size_str); + + if (scale->size_str) { + char buf[32]; + ret = av_parse_video_size(&scale->w, &scale->h, scale->size_str); + if (0 > ret) { + av_log(ctx, AV_LOG_ERROR, "Invalid size '%s'\n", scale->size_str); + return ret; + } + + snprintf(buf, sizeof(buf) - 1, "%d", scale->w); + ret = av_opt_set(scale, "w", buf, 0); + if (ret < 0) + return ret; + + snprintf(buf, sizeof(buf) - 1, "%d", scale->h); + ret = av_opt_set(scale, "h", buf, 0); + if (ret < 0) + return ret; + } + + if (!scale->w_expr) { + ret = av_opt_set(scale, "w", "iw", 0); + if (ret < 0) + return ret; + } + + if (!scale->h_expr) { + ret = av_opt_set(scale, "h", "ih", 0); + if (ret < 0) + return ret; + } + + ret = nppscale_parse_expr(ctx, NULL, &scale->w_pexpr, "width", scale->w_expr); + if (ret < 0) + return ret; + + ret = nppscale_parse_expr(ctx, NULL, &scale->h_pexpr, "height", scale->h_expr); + if (ret < 0) + return ret; + + for (i = 0; i < FF_ARRAY_ELEMS(scale->stages); i++) { + scale->stages[i].frame = av_frame_alloc(); + if (!scale->stages[i].frame) return AVERROR(ENOMEM); } - s->tmp_frame = av_frame_alloc(); - if (!s->tmp_frame) + scale->tmp_frame = av_frame_alloc(); + if (!scale->tmp_frame) return AVERROR(ENOMEM); return 0; } +static int nppscale_eval_dimensions(AVFilterContext* ctx) +{ + NPPScaleContext* scale = ctx->priv; + const char scale2ref = ctx->filter == &ff_vf_scale2ref_npp; + const AVFilterLink* inlink = ctx->inputs[scale2ref ? 1 : 0]; + char* expr; + int eval_w, eval_h; + int ret; + double res; + + scale->var_values[VAR_IN_W] = scale->var_values[VAR_IW] = inlink->w; + scale->var_values[VAR_IN_H] = scale->var_values[VAR_IH] = inlink->h; + scale->var_values[VAR_OUT_W] = scale->var_values[VAR_OW] = NAN; + scale->var_values[VAR_OUT_H] = scale->var_values[VAR_OH] = NAN; + scale->var_values[VAR_A] = (double)inlink->w / inlink->h; + scale->var_values[VAR_SAR] = inlink->sample_aspect_ratio.num ? + (double)inlink->sample_aspect_ratio.num / inlink->sample_aspect_ratio.den : 1; + scale->var_values[VAR_DAR] = scale->var_values[VAR_A] * scale->var_values[VAR_SAR]; + + if (scale2ref) { + const AVFilterLink* main_link = ctx->inputs[0]; + + scale->var_values[VAR_S2R_MAIN_W] = main_link->w; + scale->var_values[VAR_S2R_MAIN_H] = main_link->h; + scale->var_values[VAR_S2R_MAIN_A] = (double)main_link->w / main_link->h; + scale->var_values[VAR_S2R_MAIN_SAR] = main_link->sample_aspect_ratio.num ? + (double)main_link->sample_aspect_ratio.num / main_link->sample_aspect_ratio.den : 1; + scale->var_values[VAR_S2R_MAIN_DAR] = scale->var_values[VAR_S2R_MDAR] = + scale->var_values[VAR_S2R_MAIN_A] * scale->var_values[VAR_S2R_MAIN_SAR]; + } + + res = av_expr_eval(scale->w_pexpr, scale->var_values, NULL); + eval_w = scale->var_values[VAR_OUT_W] = scale->var_values[VAR_OW] = (int)res == 0 ? 
inlink->w : (int)res; + + res = av_expr_eval(scale->h_pexpr, scale->var_values, NULL); + if (isnan(res)) { + expr = scale->h_expr; + ret = AVERROR(EINVAL); + goto fail; + } + eval_h = scale->var_values[VAR_OUT_H] = scale->var_values[VAR_OH] = (int)res == 0 ? inlink->h : (int)res; + + res = av_expr_eval(scale->w_pexpr, scale->var_values, NULL); + if (isnan(res)) { + expr = scale->w_expr; + ret = AVERROR(EINVAL); + goto fail; + } + eval_w = scale->var_values[VAR_OUT_W] = scale->var_values[VAR_OW] = (int)res == 0 ? inlink->w : (int)res; + + scale->w = eval_w; + scale->h = eval_h; + + return 0; + +fail: + av_log(ctx, AV_LOG_ERROR, "Error when evaluating the expression '%s'.\n", + expr); + return ret; +} + static void nppscale_uninit(AVFilterContext *ctx) { NPPScaleContext *s = ctx->priv; @@ -141,6 +426,10 @@ static void nppscale_uninit(AVFilterContext *ctx) av_buffer_unref(&s->stages[i].frames_ctx); } av_frame_free(&s->tmp_frame); + + av_expr_free(s->w_pexpr); + av_expr_free(s->h_pexpr); + s->w_pexpr = s->h_pexpr = NULL; } static int init_stage(NPPScaleStageContext *stage, AVBufferRef *device_ctx) @@ -164,6 +453,13 @@ static int init_stage(NPPScaleStageContext *stage, AVBufferRef *device_ctx) stage->planes_out[i].height = stage->planes_out[0].height >> out_sh; } + if (AV_PIX_FMT_YUVA420P == stage->in_fmt) { + stage->planes_in[3].width = stage->planes_in[0].width; + stage->planes_in[3].height = stage->planes_in[0].height; + stage->planes_out[3].width = stage->planes_out[0].width; + stage->planes_out[3].height = stage->planes_out[0].height; + } + out_ref = av_hwframe_ctx_alloc(device_ctx); if (!out_ref) return AVERROR(ENOMEM); @@ -326,31 +622,32 @@ static int init_processing_chain(AVFilterContext *ctx, int in_width, int in_heig return 0; } -static int nppscale_config_props(AVFilterLink *outlink) +static int config_props(AVFilterLink *outlink) { AVFilterContext *ctx = outlink->src; - AVFilterLink *inlink = outlink->src->inputs[0]; + AVFilterLink *inlink0 = outlink->src->inputs[0]; + AVFilterLink *inlink = ctx->filter == &ff_vf_scale2ref_npp ? 
+ outlink->src->inputs[1] : + outlink->src->inputs[0]; NPPScaleContext *s = ctx->priv; - int w, h; int ret; - if ((ret = ff_scale_eval_dimensions(s, - s->w_expr, s->h_expr, - inlink, outlink, - &w, &h)) < 0) + if ((ret = nppscale_eval_dimensions(ctx)) < 0) goto fail; - ff_scale_adjust_dimensions(inlink, &w, &h, - s->force_original_aspect_ratio, s->force_divisible_by); + ff_scale_adjust_dimensions(inlink, &s->w, &s->h, + s->force_original_aspect_ratio, + s->force_divisible_by); - if (((int64_t)h * inlink->w) > INT_MAX || - ((int64_t)w * inlink->h) > INT_MAX) + if (s->w > INT_MAX || s->h > INT_MAX || + (s->h * inlink->w) > INT_MAX || + (s->w * inlink->h) > INT_MAX) av_log(ctx, AV_LOG_ERROR, "Rescaled value for width or height is too big.\n"); - outlink->w = w; - outlink->h = h; + outlink->w = s->w; + outlink->h = s->h; - ret = init_processing_chain(ctx, inlink->w, inlink->h, w, h); + ret = init_processing_chain(ctx, inlink0->w, inlink0->h, outlink->w, outlink->h); if (ret < 0) return ret; @@ -370,6 +667,22 @@ fail: return ret; } +static int config_props_ref(AVFilterLink *outlink) +{ + AVFilterLink *inlink = outlink->src->inputs[1]; + AVFilterContext *ctx = outlink->src; + + outlink->w = inlink->w; + outlink->h = inlink->h; + outlink->sample_aspect_ratio = inlink->sample_aspect_ratio; + outlink->time_base = inlink->time_base; + outlink->frame_rate = inlink->frame_rate; + + ctx->outputs[1]->hw_frames_ctx = av_buffer_ref(ctx->inputs[1]->hw_frames_ctx); + + return 0; +} + static int nppscale_deinterleave(AVFilterContext *ctx, NPPScaleStageContext *stage, AVFrame *out, AVFrame *in) { @@ -454,12 +767,71 @@ static int (*const nppscale_process[])(AVFilterContext *ctx, NPPScaleStageContex [STAGE_INTERLEAVE] = nppscale_interleave, }; -static int nppscale_scale(AVFilterContext *ctx, AVFrame *out, AVFrame *in) +static int nppscale_scale(AVFilterLink *link, AVFrame *out, AVFrame *in) { + AVFilterContext *ctx = link->dst; NPPScaleContext *s = ctx->priv; + AVFilterLink *outlink = ctx->outputs[0]; AVFrame *src = in; + char buf[32]; int i, ret, last_stage = -1; + int frame_changed; + + frame_changed = in->width != link->w || + in->height != link->h || + in->format != link->format || + in->sample_aspect_ratio.den != link->sample_aspect_ratio.den || + in->sample_aspect_ratio.num != link->sample_aspect_ratio.num; + + if (s->eval_mode == EVAL_MODE_FRAME || frame_changed) { + unsigned vars_w[VARS_NB] = { 0 }, vars_h[VARS_NB] = { 0 }; + + av_expr_count_vars(s->w_pexpr, vars_w, VARS_NB); + av_expr_count_vars(s->h_pexpr, vars_h, VARS_NB); + + if (s->eval_mode == EVAL_MODE_FRAME && !frame_changed && ctx->filter != &ff_vf_scale2ref_npp && + !(vars_w[VAR_N] || vars_w[VAR_T] || vars_w[VAR_POS]) && + !(vars_h[VAR_N] || vars_h[VAR_T] || vars_h[VAR_POS]) && + s->w && s->h) + goto scale; + + if (s->eval_mode == EVAL_MODE_INIT) { + snprintf(buf, sizeof(buf)-1, "%d", outlink->w); + av_opt_set(s, "w", buf, 0); + snprintf(buf, sizeof(buf)-1, "%d", outlink->h); + av_opt_set(s, "h", buf, 0); + + ret = nppscale_parse_expr(ctx, NULL, &s->w_pexpr, "width", s->w_expr); + if (ret < 0) + return ret; + + ret = nppscale_parse_expr(ctx, NULL, &s->h_pexpr, "height", s->h_expr); + if (ret < 0) + return ret; + } + + if (ctx->filter == &ff_vf_scale2ref_npp) { + s->var_values[VAR_S2R_MAIN_N] = link->frame_count_out; + s->var_values[VAR_S2R_MAIN_T] = TS2T(in->pts, link->time_base); + s->var_values[VAR_S2R_MAIN_POS] = in->pkt_pos == -1 ? 
NAN : in->pkt_pos; + } else { + s->var_values[VAR_N] = link->frame_count_out; + s->var_values[VAR_T] = TS2T(in->pts, link->time_base); + s->var_values[VAR_POS] = in->pkt_pos == -1 ? NAN : in->pkt_pos; + } + + link->format = in->format; + link->w = in->width; + link->h = in->height; + + link->sample_aspect_ratio.den = in->sample_aspect_ratio.den; + link->sample_aspect_ratio.num = in->sample_aspect_ratio.num; + if ((ret = config_props(outlink)) < 0) + return ret; + } + +scale: for (i = 0; i < FF_ARRAY_ELEMS(s->stages); i++) { if (!s->stages[i].stage_needed) continue; @@ -516,7 +888,7 @@ static int nppscale_filter_frame(AVFilterLink *link, AVFrame *in) if (ret < 0) goto fail; - ret = nppscale_scale(ctx, out, in); + ret = nppscale_scale(link, out, in); CHECK_CU(device_hwctx->internal->cuda_dl->cuCtxPopCurrent(&dummy)); if (ret < 0) @@ -535,12 +907,54 @@ fail: return ret; } +static int nppscale_filter_frame_ref(AVFilterLink *link, AVFrame *in) +{ + NPPScaleContext *scale = link->dst->priv; + AVFilterLink *outlink = link->dst->outputs[1]; + int frame_changed; + + frame_changed = in->width != link->w || + in->height != link->h || + in->format != link->format || + in->sample_aspect_ratio.den != link->sample_aspect_ratio.den || + in->sample_aspect_ratio.num != link->sample_aspect_ratio.num; + + if (frame_changed) { + link->format = in->format; + link->w = in->width; + link->h = in->height; + link->sample_aspect_ratio.num = in->sample_aspect_ratio.num; + link->sample_aspect_ratio.den = in->sample_aspect_ratio.den; + + config_props_ref(outlink); + } + + if (scale->eval_mode == EVAL_MODE_FRAME) { + scale->var_values[VAR_N] = link->frame_count_out; + scale->var_values[VAR_T] = TS2T(in->pts, link->time_base); + scale->var_values[VAR_POS] = in->pkt_pos == -1 ? NAN : in->pkt_pos; + } + + return ff_filter_frame(outlink, in); +} + +static int request_frame(AVFilterLink *outlink) +{ + return ff_request_frame(outlink->src->inputs[0]); +} + +static int request_frame_ref(AVFilterLink *outlink) +{ + return ff_request_frame(outlink->src->inputs[1]); +} + #define OFFSET(x) offsetof(NPPScaleContext, x) #define FLAGS (AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM) static const AVOption options[] = { - { "w", "Output video width", OFFSET(w_expr), AV_OPT_TYPE_STRING, { .str = "iw" }, .flags = FLAGS }, - { "h", "Output video height", OFFSET(h_expr), AV_OPT_TYPE_STRING, { .str = "ih" }, .flags = FLAGS }, + { "w", "Output video width", OFFSET(w_expr), AV_OPT_TYPE_STRING, .flags = FLAGS }, + { "h", "Output video height", OFFSET(h_expr), AV_OPT_TYPE_STRING, .flags = FLAGS }, { "format", "Output pixel format", OFFSET(format_str), AV_OPT_TYPE_STRING, { .str = "same" }, .flags = FLAGS }, + { "s", "Output video size", OFFSET(size_str), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = FLAGS }, { "interp_algo", "Interpolation algorithm used for resizing", OFFSET(interp_algo), AV_OPT_TYPE_INT, { .i64 = NPPI_INTER_CUBIC }, 0, INT_MAX, FLAGS, "interp_algo" }, { "nn", "nearest neighbour", 0, AV_OPT_TYPE_CONST, { .i64 = NPPI_INTER_NN }, 0, 0, FLAGS, "interp_algo" }, @@ -551,11 +965,14 @@ static const AVOption options[] = { { "cubic2p_b05c03", "2-parameter cubic (B=1/2, C=3/10)", 0, AV_OPT_TYPE_CONST, { .i64 = NPPI_INTER_CUBIC2P_B05C03 }, 0, 0, FLAGS, "interp_algo" }, { "super", "supersampling", 0, AV_OPT_TYPE_CONST, { .i64 = NPPI_INTER_SUPER }, 0, 0, FLAGS, "interp_algo" }, { "lanczos", "Lanczos", 0, AV_OPT_TYPE_CONST, { .i64 = NPPI_INTER_LANCZOS }, 0, 0, FLAGS, "interp_algo" }, - { "force_original_aspect_ratio", "decrease or 
increase w/h if necessary to keep the original AR", OFFSET(force_original_aspect_ratio), AV_OPT_TYPE_INT, { .i64 = 0}, 0, 2, FLAGS, "force_oar" }, + { "force_original_aspect_ratio", "decrease or increase w/h if necessary to keep the original AR", OFFSET(force_original_aspect_ratio), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 2, FLAGS, "force_oar" }, { "disable", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = 0 }, 0, 0, FLAGS, "force_oar" }, { "decrease", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = 1 }, 0, 0, FLAGS, "force_oar" }, { "increase", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = 2 }, 0, 0, FLAGS, "force_oar" }, - { "force_divisible_by", "enforce that the output resolution is divisible by a defined integer when force_original_aspect_ratio is used", OFFSET(force_divisible_by), AV_OPT_TYPE_INT, { .i64 = 1}, 1, 256, FLAGS }, + { "force_divisible_by", "enforce that the output resolution is divisible by a defined integer when force_original_aspect_ratio is used", OFFSET(force_divisible_by), AV_OPT_TYPE_INT, { .i64 = 1 }, 1, 256, FLAGS }, + { "eval", "specify when to evaluate expressions", OFFSET(eval_mode), AV_OPT_TYPE_INT, { .i64 = EVAL_MODE_INIT }, 0, EVAL_MODE_NB-1, FLAGS, "eval" }, + { "init", "eval expressions once during initialization", 0, AV_OPT_TYPE_CONST, { .i64 = EVAL_MODE_INIT }, 0, 0, FLAGS, "eval" }, + { "frame", "eval expressions during initialization and per-frame", 0, AV_OPT_TYPE_CONST, { .i64 = EVAL_MODE_FRAME }, 0, 0, FLAGS, "eval" }, { NULL }, }; @@ -564,6 +981,7 @@ static const AVClass nppscale_class = { .item_name = av_default_item_name, .option = options, .version = LIBAVUTIL_VERSION_INT, + .category = AV_CLASS_CATEGORY_FILTER, }; static const AVFilterPad nppscale_inputs[] = { @@ -571,15 +989,15 @@ static const AVFilterPad nppscale_inputs[] = { .name = "default", .type = AVMEDIA_TYPE_VIDEO, .filter_frame = nppscale_filter_frame, - }, + } }; static const AVFilterPad nppscale_outputs[] = { { .name = "default", .type = AVMEDIA_TYPE_VIDEO, - .config_props = nppscale_config_props, - }, + .config_props = config_props, + } }; const AVFilter ff_vf_scale_npp = { @@ -600,3 +1018,51 @@ const AVFilter ff_vf_scale_npp = { .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE, }; + +static const AVFilterPad nppscale2ref_inputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_VIDEO, + .filter_frame = nppscale_filter_frame, + }, + { + .name = "ref", + .type = AVMEDIA_TYPE_VIDEO, + .filter_frame = nppscale_filter_frame_ref, + } +}; + +static const AVFilterPad nppscale2ref_outputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_VIDEO, + .config_props = config_props, + .request_frame= request_frame, + }, + { + .name = "ref", + .type = AVMEDIA_TYPE_VIDEO, + .config_props = config_props_ref, + .request_frame= request_frame_ref, + } +}; + +const AVFilter ff_vf_scale2ref_npp = { + .name = "scale2ref_npp", + .description = NULL_IF_CONFIG_SMALL("NVIDIA Performance Primitives video " + "scaling and format conversion to the " + "given reference."), + + .init = nppscale_init, + .uninit = nppscale_uninit, + + .priv_size = sizeof(NPPScaleContext), + .priv_class = &nppscale_class, + + FILTER_INPUTS(nppscale2ref_inputs), + FILTER_OUTPUTS(nppscale2ref_outputs), + + FILTER_SINGLE_PIXFMT(AV_PIX_FMT_CUDA), + + .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE, +};
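
Side note, not part of the patch: the new s and eval options that this patch
adds to plain scale_npp can also be exercised on their own. A rough sketch
with a hypothetical input file, assuming the same libnpp-enabled build as in
the example above:

    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i in.mp4 \
           -vf "scale_npp=s=1280x720:interp_algo=lanczos:eval=init" \
           -c:v h264_nvenc out.mp4

eval=init evaluates the width/height expressions once at initialization,
while eval=frame re-evaluates them for each incoming frame and is what makes
the frame constants (n, t, pos and the main_* variants in scale2ref_npp)
available.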