Message ID | 58A0BCA5.7030009@email.cz |
---|---|
State | New |
On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň <thunder.m@email.cz> wrote:
> This patch is for discussion only, not ready to commit yet.
>
> 1. The cuvid decoder actually supports scaling the input to a requested
> resolution without any performance penalty (like libnpp does), so this
> patch is a proof of concept that it works as expected.

I don't think scaling is something a decoder should be doing; we don't
really want all sorts of video processing jumbled up into one monolithic
cuvid thing, but rather to keep tasks separated.

- Hendrik
On 12.2.2017 at 20:59, Hendrik Leppkes wrote:
> On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň <thunder.m@email.cz> wrote:
>> This patch is for discussion only, not ready to commit yet.
>>
>> 1. The cuvid decoder actually supports scaling the input to a requested
>> resolution without any performance penalty (like libnpp does), so this
>> patch is a proof of concept that it works as expected.
>>
> I don't think scaling is something a decoder should be doing; we don't
> really want all sorts of video processing jumbled up into one monolithic
> cuvid thing, but rather to keep tasks separated.
>
> - Hendrik
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Yes, but when you are transcoding from FHD or 4K to SD quality it can save
a lot of GPU resources.

We have one example where *one* Quadro P5000 (2x NVENC) is downscaling
about 74 FHD streams to SD in real time.

I know it is not something that is acceptable in current ffmpeg; maybe
libav could adopt this patch.

M.
On Sun, 12 Feb 2017 21:07:40 +0100, Miroslav Slugeň <thunder.m@email.cz> wrote:
> On 12.2.2017 at 20:59, Hendrik Leppkes wrote:
>> I don't think scaling is something a decoder should be doing; we don't
>> really want all sorts of video processing jumbled up into one monolithic
>> cuvid thing, but rather to keep tasks separated.
> Yes, but when you are transcoding from FHD or 4K to SD quality it can
> save a lot of GPU resources.
>
> We have one example where *one* Quadro P5000 (2x NVENC) is downscaling
> about 74 FHD streams to SD in real time.
>
> I know it is not something that is acceptable in current ffmpeg; maybe
> libav could adopt this patch.

You mean the Libav project? They'd be even less likely to accept such a
patch.

Anyway, I don't think this would be slower than doing it in some sort of
separate cuda video filter.
On 13.2.2017 at 05:03, wm4 wrote:
> On Sun, 12 Feb 2017 21:07:40 +0100, Miroslav Slugeň <thunder.m@email.cz> wrote:
>> Yes, but when you are transcoding from FHD or 4K to SD quality it can
>> save a lot of GPU resources.
>>
>> We have one example where *one* Quadro P5000 (2x NVENC) is downscaling
>> about 74 FHD streams to SD in real time.
>>
>> I know it is not something that is acceptable in current ffmpeg; maybe
>> libav could adopt this patch.
> You mean the Libav project? They'd be even less likely to accept such a
> patch.
>
> Anyway, I don't think this would be slower than doing it in some sort
> of separate cuda video filter.

This is not true: NVDEC (cuvid) is a separate chip and has its own NVDEC
load readout in the nvidia-smi monitoring tool, while resizing with libnpp
is done completely on the CUDA cores. In NVDEC only ADAPTIVE deinterlacing
uses the CUDA cores more intensively; cropping and resizing in NVDEC are
free. :)

M.
On Mon, 13 Feb 2017 09:03:09 +0100, Miroslav Slugeň <thunder.m@email.cz> wrote:
> On 13.2.2017 at 05:03, wm4 wrote:
>> Anyway, I don't think this would be slower than doing it in some sort
>> of separate cuda video filter.
> This is not true: NVDEC (cuvid) is a separate chip and has its own
> NVDEC load readout in the nvidia-smi monitoring tool, while resizing
> with libnpp is done completely on the CUDA cores. In NVDEC only
> ADAPTIVE deinterlacing uses the CUDA cores more intensively; cropping
> and resizing in NVDEC are free. :)

I wasn't talking about libnpp. I'm assuming they provide their processing
stuff as separate APIs somewhere.
On 12.02.2017 at 20:59, Hendrik Leppkes wrote:
> On Sun, Feb 12, 2017 at 8:51 PM, Miroslav Slugeň <thunder.m@email.cz> wrote:
>> This patch is for discussion only, not ready to commit yet.
>>
>> 1. The cuvid decoder actually supports scaling the input to a requested
>> resolution without any performance penalty (like libnpp does), so this
>> patch is a proof of concept that it works as expected.
>>
> I don't think scaling is something a decoder should be doing; we don't
> really want all sorts of video processing jumbled up into one monolithic
> cuvid thing, but rather to keep tasks separated.

I'm generally in favor of adding this, but I don't see why ffmpeg.c needs
changes for this. The decoder should already be free to return any video
size it likes.

CUVID is kind of a huge special case with its deinterlacing already;
cropping/resizing the output is quite trivial compared to that.
On Mon, Feb 13, 2017 at 11:36 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> I'm generally in favor of adding this, but I don't see why ffmpeg.c
> needs changes for this. The decoder should already be free to return
> any video size it likes.
>
> CUVID is kind of a huge special case with its deinterlacing already;
> cropping/resizing the output is quite trivial compared to that.

We recently had all sorts of discussions about what decoders should and
should not do; I don't think scaling in a decoder is a good thing to start
doing here.

- Hendrik
On Mon, Feb 13, 2017 at 12:43:51PM +0100, Hendrik Leppkes wrote:
> We recently had all sorts of discussions about what decoders should and
> should not do; I don't think scaling in a decoder is a good thing to
> start doing here.

Scaling in some decoders is mandated by some specs: some standards support
reduced resolution which can switch from frame to frame without the
decoder output changing. There is also the possibility of scalability
where the reference stream has a lower resolution, IIRC.

This is kind of different of course, but scaling code in decoders is part
of some specifications.

[...]
>> We recently had all sorts of discussions about what decoders should
>> and should not do; I don't think scaling in a decoder is a good thing
>> to start doing here.
>
> Scaling in some decoders is mandated by some specs: some standards
> support reduced resolution which can switch from frame to frame without
> the decoder output changing. There is also the possibility of
> scalability where the reference stream has a lower resolution, IIRC.
>
> This is kind of different of course, but scaling code in decoders is
> part of some specifications.

I would like to bring this back up. I'd like to merge this, especially as
the scaling is done for free by the video ASIC, offering a possibility to
scale without requiring the non-free libnpp. And cropping is so far not
possible at all.

Yes, scaling and cropping are not something a decoder usually does, but
this exposes a hardware feature that has no other way of being accessed,
which offers valuable functionality to users.
On Wed, Mar 01, 2017 at 11:58:39AM +0100, Timo Rothenpieler wrote:
> I would like to bring this back up. I'd like to merge this, especially
> as the scaling is done for free by the video ASIC, offering a
> possibility to scale without requiring the non-free libnpp. And
> cropping is so far not possible at all.
>
> Yes, scaling and cropping are not something a decoder usually does, but
> this exposes a hardware feature that has no other way of being
> accessed, which offers valuable functionality to users.

I am fine with this, but I am not sure others are.

[...]
On Wed, 1 Mar 2017 11:58:39 +0100, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> I would like to bring this back up. I'd like to merge this, especially
> as the scaling is done for free by the video ASIC, offering a
> possibility to scale without requiring the non-free libnpp. And
> cropping is so far not possible at all.
>
> Yes, scaling and cropping are not something a decoder usually does, but
> this exposes a hardware feature that has no other way of being
> accessed, which offers valuable functionality to users.

I'm ok with it. I agree it's ugly, but if this is the only way, so be it.

For what it's worth, there's precedent in crystalhd: I exposed the
hardware's ability to do downscaling, which was valuable because it
allowed you to downscale before the memcpy, which made the difference
between playable and unplayable on some low-end machines.

--phil
On Sat, Mar 04, 2017 at 09:16:30AM -0800, Philip Langdale wrote:
> On Wed, 1 Mar 2017 11:58:39 +0100, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>> I would like to bring this back up. I'd like to merge this, especially
>> as the scaling is done for free by the video ASIC, offering a
>> possibility to scale without requiring the non-free libnpp. And
>> cropping is so far not possible at all.
>>
>> Yes, scaling and cropping are not something a decoder usually does,
>> but this exposes a hardware feature that has no other way of being
>> accessed, which offers valuable functionality to users.
>
> I'm ok with it. I agree it's ugly, but if this is the only way, so be
> it.

I find it kind of intriguing that doing an operation at the place where it
is most efficient (and where it seems to belong by the codec or hardware
design) is being called "ugly".

> For what it's worth, there's precedent in crystalhd: I exposed the
> hardware's ability to do downscaling, which was valuable because it
> allowed you to downscale before the memcpy, which made the difference
> between playable and unplayable on some low-end machines.

The cinepak decoder is another precedent of the same kind, even if
regarding pixel formats rather than scaling.

"Doing the operation where it costs least" looks like a reasonable
criterion, doesn't it?

Which criterion would make a decoder (or any tool) the wrong place for
something it does much better than anyone else?

Regards, Rune
> "Doing the operation where it costs least" looks like a reasonable
> criterion, doesn't it?
>
> Which criterion would make a decoder (or any tool) the wrong place for
> something it does much better than anyone else?

It's about having scaling functionality in libavcodec when it belongs in
libavfilter, but the cuvid API does not offer that possibility.
On Sat, Mar 04, 2017 at 08:33:03PM +0100, Timo Rothenpieler wrote:
> > Which criterion would make a decoder (or any tool) the wrong place
> > for something it does much better than anyone else?
>
> It's about having scaling functionality in libavcodec when it belongs
> in libavfilter, but the cuvid API does not offer that possibility.

You take for granted that it belongs "there", but my question was not
about "where" but "why". In these particular cases (cuvid, cinepak) a
libxxxx can perform at best only a small fraction as well as the decoder
itself.

So, again, what is our criterion for choosing the most suitable place?

The libxxxxxxx exist for a good reason; in many cases they are the best
providers of a certain functionality, compared to multiple spread-out
ad-hoc implementations. OTOH, when they are _not_ good at providing a
functionality, and for fundamental reasons cannot be made as good as an
alternative, then why insist on using them?

Regards, Rune
On 01/03/17 10:58, Timo Rothenpieler wrote:
> I would like to bring this back up. I'd like to merge this, especially
> as the scaling is done for free by the video ASIC, offering a
> possibility to scale without requiring the non-free libnpp. And
> cropping is so far not possible at all.
>
> Yes, scaling and cropping are not something a decoder usually does, but
> this exposes a hardware feature that has no other way of being
> accessed, which offers valuable functionality to users.

To offer an alternative approach to this:

* Make a new CUVID hwcontext implementation - each frame in it consists
  of some decode parameters (including the input bitstream) and a
  reference to a decoder instance.
* The CUVID decoder in lavc would create a decoder instance, but when
  asked to decode a packet it would create a new CUVID frame with the
  appropriate decoding parameters attached to it and return that.
* CUVID scale/crop/deinterlace filters could then be written which just
  tag the frame with the appropriate transformation to happen later.
* The decoder then actually runs when you try to get the frame data -
  either by mapping to CUDA (av_hwframe_map() / vf_hwmap) or by actually
  downloading the frame to system memory (av_hwframe_transfer_data() /
  vf_hwdownload).

Now, while this has rather nice outward behaviour, in that the API works
like all other hwcontext implementations, it also has a number of
difficulties:

* It's even less clear how to get asynchronicity for performance than it
  is now - decodes are only issued when you try to use the output, so
  pretty much all overlap possibilities are lost. Maybe that could be
  avoided by adding some sort of "crystallise frame" call to hwcontext,
  but it's still somewhat clumsy.
* The decoder has to be able to determine the intrinsic delay of the
  stream in advance, because it can't output a frame until it will
  definitely be decodable without more packets on the input
  (av_hwframe_transfer_data() can't return AVERROR(EAGAIN) to indicate
  that you should supply more data with avcodec_send_packet()).
* The non-native output formats of the decoder in lavc (i.e. all current
  ones - system memory and CUDA) become unwanted, but compatibility
  would force them to continue to exist as some sort of auto-download
  setup. (ffmpeg.c wouldn't use it - the download would happen there (or
  not) like it does with the true hwaccels, since like them the decoder
  doesn't actually support system memory or even CUDA frame output
  without copying at all.)
* This multiple-library approach, putting the decoder in lavu, might be
  regarded as madness.

I'm not really advocating this solution exactly (I rather agree with the
final point above), but I think something like this should be considered
so that CUVID doesn't end up behaving entirely differently from all other
decoders in this respect.

- Mark
On 01.03.2017 at 11:58, Timo Rothenpieler wrote:
> I would like to bring this back up. I'd like to merge this, especially
> as the scaling is done for free by the video ASIC, offering a
> possibility to scale without requiring the non-free libnpp. And
> cropping is so far not possible at all.
>
> Yes, scaling and cropping are not something a decoder usually does, but
> this exposes a hardware feature that has no other way of being
> accessed, which offers valuable functionality to users.

With the lazy filter init now merged, this patch can be simplified. I
rewrote most of it; the current version is on github:
https://github.com/BtbN/FFmpeg/commit/f856fa509278392a88c754b8c7755a575e5aeb41

I'm still doing some testing with it, but intend to push it if no issues
are found.
From 9f5dfd6e9cabd3d419a3a58f7bfa3b3c1e179638 Mon Sep 17 00:00:00 2001
From: Miroslav Slugen <thunder.m@email.cz>
Date: Sun, 12 Feb 2017 20:29:34 +0100
Subject: [PATCH 1/1] cuvid: add resize and crop features

---
 ffmpeg.h           |  2 ++
 ffmpeg_opt.c       | 12 +++++++
 libavcodec/cuvid.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 96 insertions(+), 13 deletions(-)

diff --git a/ffmpeg.h b/ffmpeg.h
index 85a8f18..0374f11 100644
--- a/ffmpeg.h
+++ b/ffmpeg.h
@@ -132,6 +132,8 @@ typedef struct OptionsContext {
     int nb_hwaccel_output_formats;
     SpecifierOpt *autorotate;
     int nb_autorotate;
+    SpecifierOpt *resize;
+    int nb_resize;
 
     /* output options */
     StreamMap *stream_maps;
diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c
index 6a47d32..fcf4792 100644
--- a/ffmpeg_opt.c
+++ b/ffmpeg_opt.c
@@ -659,6 +659,7 @@ static void add_input_streams(OptionsContext *o, AVFormatContext *ic)
         char *codec_tag = NULL;
         char *next;
         char *discard_str = NULL;
+        char *resize_str = NULL;
         const AVClass *cc = avcodec_get_class();
         const AVOption *discard_opt = av_opt_find(&cc, "skip_frame", NULL, 0, 0);
 
@@ -722,6 +723,14 @@ static void add_input_streams(OptionsContext *o, AVFormatContext *ic)
         case AVMEDIA_TYPE_VIDEO:
             if(!ist->dec)
                 ist->dec = avcodec_find_decoder(par->codec_id);
+
+            MATCH_PER_STREAM_OPT(resize, str, resize_str, ic, st);
+            if (resize_str) {
+                av_parse_video_size(&ist->dec_ctx->width, &ist->dec_ctx->height, resize_str);
+                ist->dec_ctx->coded_width  = ist->dec_ctx->width;
+                ist->dec_ctx->coded_height = ist->dec_ctx->height;
+            }
+
 #if FF_API_EMU_EDGE
             if (av_codec_get_lowres(st->codec)) {
                 av_codec_set_lowres(ist->dec_ctx, av_codec_get_lowres(st->codec));
@@ -3591,6 +3600,9 @@ const OptionDef options[] = {
     { "hwaccel_output_format", OPT_VIDEO | OPT_STRING | HAS_ARG | OPT_EXPERT |
                           OPT_SPEC | OPT_INPUT,              { .off = OFFSET(hwaccel_output_formats) },
         "select output format used with HW accelerated decoding", "format" },
+    { "resize", OPT_VIDEO | OPT_STRING | HAS_ARG | OPT_EXPERT |
+                          OPT_SPEC | OPT_INPUT | OPT_OUTPUT, { .off = OFFSET(resize) },
+        "resizer builtin input or output" },
 #if CONFIG_VDA || CONFIG_VIDEOTOOLBOX
     { "videotoolbox_pixfmt", HAS_ARG | OPT_STRING | OPT_EXPERT, { &videotoolbox_pixfmt}, "" },
 #endif
diff --git a/libavcodec/cuvid.c b/libavcodec/cuvid.c
index a2e125d..7370ed1 100644
--- a/libavcodec/cuvid.c
+++ b/libavcodec/cuvid.c
@@ -21,6 +21,7 @@
 
 #include "compat/cuda/dynlink_loader.h"
 
+#include "libavutil/avstring.h"
 #include "libavutil/buffer.h"
 #include "libavutil/mathematics.h"
 #include "libavutil/hwcontext.h"
@@ -43,6 +44,15 @@ typedef struct CuvidContext
     char *cu_gpu;
     int nb_surfaces;
     int drop_second_field;
+    char *crop;
+    char *resize;
+
+    struct {
+        short left;
+        short top;
+        short right;
+        short bottom;
+    } offset;
 
     AVBufferRef *hwdevice;
     AVBufferRef *hwframe;
@@ -57,6 +67,10 @@ typedef struct CuvidContext
     int internal_error;
     int decoder_flushing;
 
+    int width;
+    int height;
+    int coded_width;
+    int coded_height;
 
     cudaVideoCodec codec_type;
     cudaVideoChromaFormat chroma_format;
@@ -105,6 +119,7 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
     AVHWFramesContext *hwframe_ctx = (AVHWFramesContext*)ctx->hwframe->data;
     CUVIDDECODECREATEINFO cuinfo;
     int surface_fmt;
+    int width, height;
 
     enum AVPixelFormat pix_fmts[3] = { AV_PIX_FMT_CUDA,
                                        AV_PIX_FMT_NONE,  // Will be updated below
@@ -144,8 +159,8 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
     avctx->pix_fmt = surface_fmt;
 
-    avctx->width = format->display_area.right;
-    avctx->height = format->display_area.bottom;
+    width  = format->display_area.right - format->display_area.left;
+    height = format->display_area.bottom - format->display_area.top;
 
     ff_set_sar(avctx, av_div_q(
         (AVRational){ format->display_aspect_ratio.x, format->display_aspect_ratio.y },
@@ -174,8 +189,10 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
     }
 
     if (ctx->cudecoder
-        && avctx->coded_width == format->coded_width
-        && avctx->coded_height == format->coded_height
+        && ctx->width == width
+        && ctx->height == height
+        && ctx->coded_width == format->coded_width
+        && ctx->coded_height == format->coded_height
         && ctx->chroma_format == format->chroma_format
         && ctx->codec_type == format->codec)
         return 1;
@@ -204,11 +221,15 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
         return 0;
     }
 
-    avctx->coded_width = format->coded_width;
-    avctx->coded_height = format->coded_height;
-
+    ctx->width = width;
+    ctx->height = height;
+    ctx->coded_width = format->coded_width;
+    ctx->coded_height = format->coded_height;
     ctx->chroma_format = format->chroma_format;
 
+    avctx->coded_width = avctx->width;
+    avctx->coded_height = avctx->height;
+
     memset(&cuinfo, 0, sizeof(cuinfo));
 
     cuinfo.CodecType = ctx->codec_type = format->codec;
@@ -228,15 +249,24 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
         return 0;
     }
 
-    cuinfo.ulWidth = avctx->coded_width;
-    cuinfo.ulHeight = avctx->coded_height;
-    cuinfo.ulTargetWidth = cuinfo.ulWidth;
-    cuinfo.ulTargetHeight = cuinfo.ulHeight;
+    cuinfo.ulWidth = ctx->coded_width;
+    cuinfo.ulHeight = ctx->coded_height;
+
+    /* cropping depends on original resolution */
+    cuinfo.display_area.left = ctx->offset.left;
+    cuinfo.display_area.top = ctx->offset.top;
+    cuinfo.display_area.right = cuinfo.ulWidth - ctx->offset.right;
+    cuinfo.display_area.bottom = cuinfo.ulHeight - ctx->offset.bottom;
 
+    /* scaling to requested resolution */
+    cuinfo.ulTargetWidth = avctx->width;
+    cuinfo.ulTargetHeight = avctx->height;
+
+    /* aspect ratio conversion, 1:1, depends on scaled resolution */
     cuinfo.target_rect.left = 0;
     cuinfo.target_rect.top = 0;
-    cuinfo.target_rect.right = cuinfo.ulWidth;
-    cuinfo.target_rect.bottom = cuinfo.ulHeight;
+    cuinfo.target_rect.right = cuinfo.ulTargetWidth;
+    cuinfo.target_rect.bottom = cuinfo.ulTargetHeight;
 
     cuinfo.ulNumDecodeSurfaces = ctx->nb_surfaces;
     cuinfo.ulNumOutputSurfaces = 1;
@@ -636,6 +666,11 @@ static int cuvid_test_dummy_decoder(AVCodecContext *avctx,
     cuinfo.ulTargetWidth = cuinfo.ulWidth;
     cuinfo.ulTargetHeight = cuinfo.ulHeight;
 
+    cuinfo.display_area.left = 0;
+    cuinfo.display_area.top = 0;
+    cuinfo.display_area.right = cuinfo.ulWidth;
+    cuinfo.display_area.bottom = cuinfo.ulHeight;
+
     cuinfo.target_rect.left = 0;
     cuinfo.target_rect.top = 0;
     cuinfo.target_rect.right = cuinfo.ulWidth;
@@ -822,6 +857,38 @@ static av_cold int cuvid_decode_init(AVCodecContext *avctx)
                FFMIN(sizeof(ctx->cuparse_ext.raw_seqhdr_data), avctx->extradata_size));
     }
 
+    ctx->offset.top = 0;
+    ctx->offset.bottom = 0;
+    ctx->offset.left = 0;
+    ctx->offset.right = 0;
+
+    if (ctx->crop) {
+        /* crop option is "(top)x(bottom)x(left)x(right)" */
+        char *crop_buf, *crop_str, *saveptr;
+        int crop_idx = 0;
+        crop_buf = av_strdup(ctx->crop);
+        crop_str = av_strtok(crop_buf, "x", &saveptr);
+        while (crop_str) {
+            switch (crop_idx++) {
+            case 0:
+                ctx->offset.top = atoi(crop_str);
+                break;
+            case 1:
+                ctx->offset.bottom = atoi(crop_str);
+                break;
+            case 2:
+                ctx->offset.left = atoi(crop_str);
+                break;
+            case 3:
+                ctx->offset.right = atoi(crop_str);
+                break;
+            default:
+                break;
+            }
+            crop_str = av_strtok(NULL, "x", &saveptr);
+        }
+        av_freep(&crop_buf);
+    }
+
     ctx->cuparseinfo.ulMaxNumDecodeSurfaces = ctx->nb_surfaces;
     ctx->cuparseinfo.ulMaxDisplayDelay = 4;
     ctx->cuparseinfo.pUserData = avctx;
@@ -934,6 +1001,8 @@ static const AVOption options[] = {
     { "gpu", "GPU to be used for decoding", OFFSET(cu_gpu), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, VD },
     { "surfaces", "Maximum surfaces to be used for decoding", OFFSET(nb_surfaces), AV_OPT_TYPE_INT, { .i64 = 25 }, 0, INT_MAX, VD },
     { "drop_second_field", "Drop second field when deinterlacing", OFFSET(drop_second_field), AV_OPT_TYPE_BOOL, { .i64 = 1 }, 0, 1, VD },
+    { "crop", "Crop (top)x(bottom)x(left)x(right)", OFFSET(crop), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, VD },
+    { "resize", "Resize (width)x(height)", OFFSET(resize), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, VD },
     { NULL }
 };
-- 
2.1.4