Message ID | CAOSt7Dm_2vmvPRpj3EeX1RRpu6+tH7Z56Bg--Z+vZU5QivccmA@mail.gmail.com |
---|---|
State | New |
Series | [FFmpeg-devel] avcodec/nvdec: support resizing while decoding |
Context | Check | Description |
---|---|---|
yinshiyou/configure_loongarch64 | warning | Failed to apply patch |
andriy/configure_x86 | warning | Failed to apply patch |
On Fri, Sep 20, 2024 at 1:24 AM Carlos Ruiz <carlos.r.domin@gmail.com> wrote:
>
> Hi!
>
> This is my first contribution to the project so please excuse any bad
> etiquette, I tried to read all the FAQs before posting. Would love to start
> by thanking everyone for such an amazing framework you've built!
>
> Anyway, here's my proposed patch to support video resizing when using NVDEC
> hwaccel to decode hevc video (I could look into a similar patch for h264,
> av1, etc if this looks useful). There's a bit more context/explanation in
> the commit description in the patch, but please let me know if the use case
> isn't clear.
>

We don't really leverage these extra functions of NVDEC because it
breaks many assumptions about hwaccels, which are meant to be exact
decoders.
If anything, just fudging the width/height is certainly an API
violation and will likely not be possible as it breaks many
assumptions in the code otherwise, see below.

> ---
>  libavcodec/hevc/hevcdec.c |  8 ++++++--
>  libavcodec/nvdec.c        | 21 +++++++++++++++++----
>  2 files changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
> index d915d74d22..d63fc5875f 100644
> --- a/libavcodec/hevc/hevcdec.c
> +++ b/libavcodec/hevc/hevcdec.c
> @@ -351,8 +351,12 @@ static void export_stream_params(HEVCContext *s, const HEVCSPS *sps)
>      avctx->pix_fmt = sps->pix_fmt;
>      avctx->coded_width = sps->width;
>      avctx->coded_height = sps->height;
> -    avctx->width = sps->width - ow->left_offset - ow->right_offset;
> -    avctx->height = sps->height - ow->top_offset - ow->bottom_offset;
> +    if (avctx->width <= 0 || avctx->height <= 0) {
> +        avctx->width = sps->width;
> +        avctx->height = sps->height;
> +    }
> +    avctx->width = avctx->width - ow->left_offset - ow->right_offset;
> +    avctx->height = avctx->height - ow->top_offset - ow->bottom_offset;

You cannot do that here. The frame size can change mid-stream, and
this would suppress any such change.
Additionally, if this code runs more than once, then the offset is
applied repeatedly without actually resetting width/height.

- Hendrik
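The two review points can be reproduced in isolation. The following is a minimal sketch, not FFmpeg code: `Ctx`, `Window`, `export_patched` and `export_original` are hypothetical stand-ins that model only the width arithmetic from the hunk above (height behaves the same way).

```c
#include <assert.h>

/* Hypothetical stand-ins for the fields touched by the hunk; not FFmpeg types. */
typedef struct Window { int left_offset, right_offset; } Window;
typedef struct Ctx    { int width; } Ctx;

/* Models the patched logic: the SPS width is used only while width is unset,
 * then the cropping offsets are subtracted from whatever width currently holds. */
static void export_patched(Ctx *c, int sps_width, const Window *ow)
{
    assert(sps_width > 0);
    if (c->width <= 0)
        c->width = sps_width;
    c->width = c->width - ow->left_offset - ow->right_offset;
}

/* Models the original logic: width is always recomputed from the SPS. */
static void export_original(Ctx *c, int sps_width, const Window *ow)
{
    assert(sps_width > 0);
    c->width = sps_width - ow->left_offset - ow->right_offset;
}
```

With offsets of 4 on each side and an SPS width of 1920, calling `export_patched` twice on a zero-initialized context yields 1912 and then 1904, because the second call subtracts the offsets again. A later SPS with width 1280 is ignored by the patched version (it keeps shrinking the old value), whereas `export_original` yields 1912 on every call and 1272 after the change.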
> We don't really leverage these extra functions of NVDEC because it
> breaks many assumptions about hwaccels, which are meant to be exact
> decoders.

Yeah I understand that, and expected this kind of feedback. Do you envision
a way for hwaccels (perhaps in the future) to support some additional level
of *hardware acceleration*, such as resizing and cropping?

Let me try to motivate the rationale. Think about a common use case in the
AI world we live in: you receive a bunch of simultaneous 4k (or 1080p)
incoming RTSP streams and you want to decode the video and pass it through
some ML model. The native hevc codec doesn't support resizing, so you decode
video at full 4k on the GPU by leveraging the nvdec hwaccel, which allocates
something like 5-10 surfaces at 3840x2160, totalling around 250MB of VRAM
(GPU memory), and then you have to immediately take all of those frames,
pass them through a filterchain, and scale them down to e.g. 640x360.
Leaving aside the waste of additional CUDA cores and the added latency of
having to synchronize the CUDA stream twice (once to receive the frame out
of the decoder, the second to pull from the filterchain), do this for 50
camera streams and you'll quickly run out of GPU memory with a GPU
utilization under 10%...

Instead, my patch allows decoding at 640x360 (or whatever you choose)
directly, allocating only 6MB of GPU VRAM (instead of 250MB at 4k), so now
you can fit 40x more video decoding streams in the same GPU card.

I understand the challenges of allowing hwaccels to resize/crop as part of
the decoding process and how it currently breaks the API in several ways I'm
of course unfamiliar with. However, I'd love to help support this feature in
the future, since it would add tremendous value and enable more efficient
real-time Computer Vision pipelines to leverage the amazing framework that
FFmpeg is.

> You cannot do that here. The frame size can change mid-stream, and
> this would suppress any such change.
> Additionally, if this code runs more than once, then the offset is
> applied repeatedly without actually resetting width/height.

Very good points! Indeed I'm not yet super familiar with the structure of
the hevc codec implementation, nor with the available functions to
reconfigure a hwaccel (I know the Video Codec SDK supports this, but have
never implemented such functionality), but I see how this would break.

Would you see value in me trying to figure out a way to support
reconfiguring the codec if the width/height/cropping changes mid-stream or
would you reject the patch/feature regardless?

Thanks!

On Fri, Sep 20, 2024 at 7:18 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote:

> On Fri, Sep 20, 2024 at 1:24 AM Carlos Ruiz <carlos.r.domin@gmail.com> wrote:
> >
> > Hi!
> >
> > This is my first contribution to the project so please excuse any bad
> > etiquette, I tried to read all the FAQs before posting. Would love to start
> > by thanking everyone for such an amazing framework you've built!
> >
> > Anyway, here's my proposed patch to support video resizing when using NVDEC
> > hwaccel to decode hevc video (I could look into a similar patch for h264,
> > av1, etc if this looks useful). There's a bit more context/explanation in
> > the commit description in the patch, but please let me know if the use case
> > isn't clear.
> >
>
> We don't really leverage these extra functions of NVDEC because it
> breaks many assumptions about hwaccels, which are meant to be exact
> decoders.
> If anything, just fudging the width/height is certainly an API
> violation and will likely not be possible as it breaks many
> assumptions in the code otherwise, see below.
>
> > ---
> >  libavcodec/hevc/hevcdec.c |  8 ++++++--
> >  libavcodec/nvdec.c        | 21 +++++++++++++++++----
> >  2 files changed, 23 insertions(+), 6 deletions(-)
> >
> > diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
> > index d915d74d22..d63fc5875f 100644
> > --- a/libavcodec/hevc/hevcdec.c
> > +++ b/libavcodec/hevc/hevcdec.c
> > @@ -351,8 +351,12 @@ static void export_stream_params(HEVCContext *s, const HEVCSPS *sps)
> >      avctx->pix_fmt = sps->pix_fmt;
> >      avctx->coded_width = sps->width;
> >      avctx->coded_height = sps->height;
> > -    avctx->width = sps->width - ow->left_offset - ow->right_offset;
> > -    avctx->height = sps->height - ow->top_offset - ow->bottom_offset;
> > +    if (avctx->width <= 0 || avctx->height <= 0) {
> > +        avctx->width = sps->width;
> > +        avctx->height = sps->height;
> > +    }
> > +    avctx->width = avctx->width - ow->left_offset - ow->right_offset;
> > +    avctx->height = avctx->height - ow->top_offset - ow->bottom_offset;
>
> You cannot do that here. The frame size can change mid-stream, and
> this would suppress any such change.
> Additionally, if this code runs more than once, then the offset is
> applied repeatedly without actually resetting width/height.
>
> - Hendrik
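The memory argument in the reply above can be checked with a rough estimate. This is a sketch under stated assumptions: it assumes 8-bit 4:2:0 surfaces in an NV12-style layout (1.5 bytes per pixel) and ignores pitch alignment and any driver overhead; `nv12_surface_bytes` and `pool_bytes` are hypothetical helper names, not FFmpeg or NVDEC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes for one 8-bit 4:2:0 surface: a full-resolution luma plane plus a
 * half-height interleaved chroma plane, i.e. 1.5 bytes per pixel.
 * Assumption: no pitch/row alignment is modeled. */
static uint64_t nv12_surface_bytes(uint32_t width, uint32_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return (uint64_t)width * height * 3 / 2;
}

/* Total bytes for a pool of identically sized surfaces. */
static uint64_t pool_bytes(uint32_t width, uint32_t height, uint32_t num_surfaces)
{
    return nv12_surface_bytes(width, height) * num_surfaces;
}
```

Under these assumptions one 3840x2160 surface takes 12,441,600 bytes and one 640x360 surface takes 345,600 bytes, a ratio of 36 that does not depend on the number of surfaces. That is in the same range as the roughly 40x quoted in the email; the absolute totals depend on the pool size, bit depth and alignment, none of which this sketch tries to match.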
Hi!

This is my first contribution to the project so please excuse any bad
etiquette, I tried to read all the FAQs before posting. Would love to start
by thanking everyone for such an amazing framework you've built!

Anyway, here's my proposed patch to support video resizing when using NVDEC
hwaccel to decode hevc video (I could look into a similar patch for h264,
av1, etc if this looks useful). There's a bit more context/explanation in
the commit description in the patch, but please let me know if the use case
isn't clear.

I tested locally and all of these 4 scenarios work as expected:

* Using hevc codec with nvdec hwaccel, leaving avctx->width and
  avctx->height unset. On a 1920x1080 input video, I get 1920x1080 cuda
  frames out.
* Using hevc codec with nvdec hwaccel, setting avctx->width and
  avctx->height to some arbitrary value (e.g. 640x360). On the same input
  video, I get 640x360 cuda frames out.
* Using hevc codec without hwaccel, leaving avctx->width and avctx->height
  unset. I get 1920x1080 yuvj420p frames (in cpu) out.
* Using hevc codec without hwaccel, setting avctx->width and avctx->height
  to some arbitrary value (e.g. 640x360). The values get ignored (as in
  FFmpeg master) and I again get 1920x1080 yuvj420p frames out.

I'm not extremely familiar with hevcdec.c so I'm not sure if this would
accidentally break something else.

Looking forward to hearing your thoughts!

From 850afda5f6479064c75a4b905f12e48f97b6d551 Mon Sep 17 00:00:00 2001
From: Carlos Ruiz <carlos.r.domin@gmail.com>
Date: Thu, 19 Sep 2024 14:00:05 +0200
Subject: [PATCH] avcodec/nvdec: support resizing while decoding

Nvidia chips support accelerated resizing while decoding video.
The *_cuvid codecs (cuviddec.c) already support resizing and cropping,
but have two big downsides:
1) they have a minimum latency of two packets (even with the LOW_DELAY
   flag enabled)
2) AV_CODEC_FLAG_COPY_OPAQUE is not respected (opaque and opaque_ref
   aren't transferred from packets to frames)

Instead, parsing the video using a non-accelerated codec (hevcdec.c)
solves both downsides above.
This commit brings resizing capabilities to the *_nvdec hwaccel, similar
to what *_cuvid does, to combine the best of both worlds (proper parsing +
accelerated decoding and resizing).
---
 libavcodec/hevc/hevcdec.c |  8 ++++++--
 libavcodec/nvdec.c        | 21 +++++++++++++++++----
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
index d915d74d22..d63fc5875f 100644
--- a/libavcodec/hevc/hevcdec.c
+++ b/libavcodec/hevc/hevcdec.c
@@ -351,8 +351,12 @@ static void export_stream_params(HEVCContext *s, const HEVCSPS *sps)
     avctx->pix_fmt = sps->pix_fmt;
     avctx->coded_width = sps->width;
     avctx->coded_height = sps->height;
-    avctx->width = sps->width - ow->left_offset - ow->right_offset;
-    avctx->height = sps->height - ow->top_offset - ow->bottom_offset;
+    if (avctx->width <= 0 || avctx->height <= 0) {
+        avctx->width = sps->width;
+        avctx->height = sps->height;
+    }
+    avctx->width = avctx->width - ow->left_offset - ow->right_offset;
+    avctx->height = avctx->height - ow->top_offset - ow->bottom_offset;
     avctx->has_b_frames = sps->temporal_layer[sps->max_sub_layers - 1].num_reorder_pics;
     avctx->profile = sps->ptl.general_ptl.profile_idc;
     avctx->level = sps->ptl.general_ptl.level_idc;
diff --git a/libavcodec/nvdec.c b/libavcodec/nvdec.c
index 932544564a..86143de74c 100644
--- a/libavcodec/nvdec.c
+++ b/libavcodec/nvdec.c
@@ -324,6 +324,18 @@ static int nvdec_init_hwframes(AVCodecContext *avctx, AVBufferRef **out_frames_r
     return 0;
 }
 
+static int get_buffer2(AVCodecContext *avctx, AVFrame *frame, int flags) {
+    /*
+     * HEVC codec includes FF_CODEC_CAP_EXPORTS_CROPPING in its caps_internal, so by default frames will be set
+     * to width=avctx->coded_width and height=avctx->coded_height. Now that we support resizing as part of decoding,
+     * overwrite the frame dimensions with display values rather than coded.
+     */
+    int ret = avcodec_default_get_buffer2(avctx, frame, flags);
+    frame->width = avctx->width;
+    frame->height = avctx->height;
+    return ret;
+}
+
 int ff_nvdec_decode_init(AVCodecContext *avctx)
 {
     NVDECContext *ctx = avctx->internal->hwaccel_priv_data;
@@ -393,8 +405,9 @@ int ff_nvdec_decode_init(AVCodecContext *avctx)
 
     params.ulWidth = avctx->coded_width;
     params.ulHeight = avctx->coded_height;
-    params.ulTargetWidth = avctx->coded_width;
-    params.ulTargetHeight = avctx->coded_height;
+    avctx->get_buffer2 = get_buffer2;
+    params.ulTargetWidth = avctx->width;
+    params.ulTargetHeight = avctx->height;
     params.bitDepthMinus8 = sw_desc->comp[0].depth - 8;
     params.OutputFormat = output_format;
     params.CodecType = cuvid_codec_type;
@@ -719,8 +732,8 @@ int ff_nvdec_frame_params(AVCodecContext *avctx,
     chroma_444 = supports_444 && cuvid_chroma_format == cudaVideoChromaFormat_444;
 
     frames_ctx->format = AV_PIX_FMT_CUDA;
-    frames_ctx->width = (avctx->coded_width + 1) & ~1;
-    frames_ctx->height = (avctx->coded_height + 1) & ~1;
+    frames_ctx->width = (avctx->width + 1) & ~1;
+    frames_ctx->height = (avctx->height + 1) & ~1;
     /*
      * We add two extra frames to the pool to account for deinterlacing filters
      * holding onto their frames.
-- 
2.43.0
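For readers unfamiliar with the `(x + 1) & ~1` expression in the `ff_nvdec_frame_params` hunk: it rounds a non-negative dimension up to the next even value. The sketch below isolates that arithmetic; `round_up_even` is a hypothetical helper name used only for illustration and does not exist in FFmpeg.

```c
#include <assert.h>

/* Rounds a non-negative integer up to the nearest even value.
 * Adding 1 pushes odd inputs to the next even number, and clearing
 * bit 0 leaves already-even inputs unchanged. */
static int round_up_even(int x)
{
    assert(x >= 0);
    return (x + 1) & ~1;
}
```

For example, 1080 stays 1080, 1081 becomes 1082, and 359 becomes 360. With the patch applied, this rounding operates on the requested output size (`avctx->width`, `avctx->height`) rather than on the coded size.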