Message ID | 007701db0a4e$7adc8200$70958600$@samsung.com |
---|---|
State | New |
Series | [FFmpeg-devel] avcodec/cuviddec: Add handling HDR10+ sidedata on cuviddec. |

Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On 19.09.2024 06:43, yoonjoo wrote:
> Implemented decoding of NAL units and handling HDR10+ sidedata
> by referring to hevcdec.

Why? Can't you just use the native decoder with nvdec hwaccel?
What?

On Fri, Sep 20, 2024, 12:47 AM Timo Rothenpieler <timo@rothenpieler.org> wrote:
> On 19.09.2024 06:43, yoonjoo wrote:
> > Implemented decoding of NAL units and handling HDR10+ sidedata
> > by referring to hevcdec.
>
> Why? Can't you just use the native decoder with nvdec hwaccel?
Does native decoder refer to hevc (hevcdec.c)?
I tried using hevc and in environments with low CPU performance, hevc_cuvid was much faster.
So, I used hevc_cuvid for decoding but encountered an issue where HDR10+ sidedata did not exist.
That's why I wrote this patch.

I thought that the original content's sidedata should be preserved even after decoding regardless of which decoder is used.
And I thought hevc_cuvid was the only way to get Nvidia hwaccel support.
If I'm mistaken about anything, please let me know.

Also, is it correct to respond to your comments like this?
It seems quite different from the format you sent.

Apologies, as I'm still relatively new to the FFmpeg community and have a lot to learn.
Any additional guidance would be greatly appreciated.

-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Timo Rothenpieler
Sent: Friday, September 20, 2024 4:48 AM
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH] avcodec/cuviddec: Add handling HDR10+ sidedata on cuviddec.

On 19.09.2024 06:43, yoonjoo wrote:
> Implemented decoding of NAL units and handling HDR10+ sidedata by
> referring to hevcdec.

Why? Can't you just use the native decoder with nvdec hwaccel?
On 20/09/2024 04:08, 김윤주 wrote:
> Does native decoder refer to hevc (hevcdec.c)?
> I tried using hevc and in environments with low CPU performance, hevc_cuvid
> was much faster.
> So, I used hevc_cuvid for decoding but encountered an issue where HDR10+
> sidedata did not exist.
> That's why I wrote this patch.

You did turn on hwaccel, right?
I don't see why the native decoder would be much slower at parsing than NVIDIA's parsers.

> I thought that the original content's sidedata should be preserved even
> after decoding regardless of which decoder is used.
> And I thought hevc_cuvid was the only way to get Nvidia hwaccel support.
> If I'm mistaken about anything, please let me know.

It's just that the cuviddec decoder is more of a relic, from before the native hwaccel existed.
The only reason it's not straight up deprecated is that it's sometimes nice to have a "second opinion" when issues crop up, and there are a few specific features like hardware-deinterlacing that can't be exposed via the native hwaccel.

But I'd normally not like to expand it even further and add complex and large features to it. Whenever possible, the native nvdec hwaccel should be used.

> Also, is it correct to respond to your comments like this?
> It seems quite different from the format you sent.

Top-posting isn't exactly liked here, though I don't really have a strong opinion on it.
> It's just that the cuviddec decoder is more of a relic, from before the
> native hwaccel existed.
> The only reason it's not straight up deprecated is that it's sometimes
> nice to have a "second opinion" when issues crop up, and there are a few
> specific features like hardware-deinterlacing that can't be exposed via
> the native hwaccel.
>
> But I'd normally not like to expand it even further and add complex and
> large features to it. Whenever possible, the native nvdec hwaccel should
> be used.

As someone who recently switched from *_cuvid to * + hwaccel=cuda, I have to agree that the performance (timing-wise at least) has been very comparable for decoding.

The only comment I'd like to make, and this might be a bit of an unpopular opinion based on some other threads I read, is that there are huge advantages to the hwaccel accelerating not just decoding but also resizing (and I guess optionally cropping).
I know that ideally a decoder should only decode, but think about a common use case in the AI world we live in: you get a bunch of simultaneous 4k (or 1080p) incoming rtsp streams and you want to decode the video and pass it through some ML model, e.g. in TensorRT (to stick with the Nvidia example). The native hevc codec doesn't support resizing, so you decode video at full 4k on the gpu, which means allocating something like 5-10 surfaces at 3840x2160, which becomes 250MB of GPU memory, and then you have to immediately take all of those frames, pass them through a filterchain, scale them down to e.g. 640x360, and waste CUDA cores instead of leveraging the dedicated video downsizing inside the NVDEC chip. Now do that for 50 camera streams and you'll quickly run out of GPU memory with a GPU utilization under 10% haha.

This is exactly why I submitted a patch yesterday that would allow using the hevc codec with nvdec hwaccel while resizing on the gpu like hevc_cuvid does, and the memory (and GPU) consumption goes waaay down (e.g. 6MB of GPU VRAM instead of 250MB per camera). I know this is a different discussion, but I thought it was appropriate to share because deprecating cuviddec or rejecting my patch would leave part of the community out.
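As a rough sanity check of the memory figures in the message above, here is a minimal sketch in C. It assumes 4:2:0 semi-planar surfaces (NV12 for 8-bit, P010/P016 for HDR content), ignores pitch alignment and any driver-side padding, and uses hypothetical helper names (`surface_bytes`, `pool_bytes`); it is not how FFmpeg or the NVIDIA driver actually accounts for memory.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes for one 4:2:0 semi-planar surface: a full-size luma plane plus an
 * interleaved chroma plane that is half its size. Assumes even dimensions
 * and no row padding (both are simplifying assumptions). */
static uint64_t surface_bytes(uint32_t width, uint32_t height,
                              uint32_t bytes_per_sample)
{
    assert(width > 0 && height > 0);
    assert(bytes_per_sample == 1 || bytes_per_sample == 2);

    uint64_t luma = (uint64_t)width * height * bytes_per_sample;
    return luma + luma / 2;
}

/* Bytes for a pool of identically sized surfaces. */
static uint64_t pool_bytes(uint32_t width, uint32_t height,
                           uint32_t bytes_per_sample, uint32_t num_surfaces)
{
    return surface_bytes(width, height, bytes_per_sample) * num_surfaces;
}
```

Under these assumptions, `pool_bytes(3840, 2160, 2, 10)` gives 248,832,000 bytes and `pool_bytes(640, 360, 2, 10)` gives 6,912,000 bytes, which is in the same ballpark as the 250MB and 6MB quoted above for ten 16-bit surfaces.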
On 20/09/2024 at 20:41, Carlos Ruiz wrote:
> The native hevc codec doesn't support resizing, so you decode video at
> full 4k on the gpu, which means allocating something like 5-10 surfaces
> at 3840x2160 which becomes 250MB of GPU memory, and then you have to
> immediately take all of those frames, pass them through a filterchain,
> scale them down to e.g. 640x360, and waste CUDA cores instead of
> leveraging the dedicated video downsizing inside the NVDEC chip. Now do
> that for 50 camera streams and you'll quickly run out of GPU memory with
> a GPU utilization under 10% haha.

NVDEC does not implement fixed-function downscaling; in fact, none of the desktop cards have any hardware dedicated to that.
As far as I know, scaling, deinterlacing, and generally all post-processing is done on the compute engine via CUDA. This is still pretty efficient, since the data can be shared between the decode and compute engines without a copy.

Tegra chips are the only ones that come with a VIC engine (Video & Image Compositor), which can do scaling, deinterlacing, spatial/temporal filtering, and basic compositing.
On 20/09/2024 20:41, Carlos Ruiz wrote:
> The native hevc codec doesn't support resizing, so you decode video at
> full 4k on the gpu, which means allocating something like 5-10 surfaces
> at 3840x2160 which becomes 250MB of GPU memory, [...]
>
> This is exactly why I submitted a patch yesterday that would allow using
> the hevc codec with nvdec hwaccel, while resizing on the gpu like
> hevc_cuvid does, and the memory (and GPU) consumption goes waaay down
> (e.g. 6MB of GPU VRAM instead of 250MB per camera).

It's not possible for a decoder to save memory by scaling references. Decoding would simply break.
The only way of saving memory while decoding is to simply get rid of output frames as soon as possible.
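One way to picture the point about references is to split a decoder's footprint into two pools. The sketch below is a simplified model, not a description of what NVDEC or FFmpeg actually allocate: it assumes reference surfaces always stay at the coded resolution, that only output surfaces can be smaller, and that all surfaces are 4:2:0 semi-planar without padding. The type and function names (`FrameSize`, `reference_pool_bytes`, `output_pool_bytes`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct FrameSize {
    uint32_t width;
    uint32_t height;
} FrameSize;

/* One 4:2:0 semi-planar surface: luma plus a half-size chroma plane.
 * Even dimensions and no row padding are assumed. */
static uint64_t surface_bytes_420(FrameSize s, uint32_t bytes_per_sample)
{
    assert(s.width > 0 && s.height > 0);
    assert(bytes_per_sample == 1 || bytes_per_sample == 2);

    uint64_t luma = (uint64_t)s.width * s.height * bytes_per_sample;
    return luma + luma / 2;
}

/* Reference frames must match the coded size, since later frames predict
 * from them; scaling the output does not change this term. */
static uint64_t reference_pool_bytes(FrameSize coded, uint32_t bytes_per_sample,
                                     uint32_t num_refs)
{
    return surface_bytes_420(coded, bytes_per_sample) * num_refs;
}

/* Output frames are copies handed to the caller, so their size (and how
 * many are kept alive) is the part that can shrink. */
static uint64_t output_pool_bytes(FrameSize output, uint32_t bytes_per_sample,
                                  uint32_t num_outputs)
{
    return surface_bytes_420(output, bytes_per_sample) * num_outputs;
}
```

For 8-bit 1080p with five references and ten output surfaces, this model gives 15,552,000 bytes of references in both cases, while the output pool drops from 31,104,000 bytes at 1920x1080 to 3,456,000 bytes at 640x360. Holding fewer output frames shrinks the same term.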
> It's not possible for a decoder to save memory by scaling references.
> Decoding would simply break.
> The only way of saving memory while decoding is to simply get rid of
> output frames as soon as possible.

Maybe I didn't explain myself well, but if I set a breakpoint and step over the code while looking at nvidia-smi, I can show you that decoding the video with cuviddec while leaving the "resize" option untouched (so decoding outputs frames at 1080p) uses over 100MB more GPU memory than setting the "resize" option to 640x360, and the decoding doesn't break at all. I just tested this on my NVIDIA GeForce RTX 4070 Ti with ffmpeg 7.0.2, and we used to get similar results back when we were using ffmpeg 4.3.6.

> NVDEC does not implement fixed-function downscaling, in fact none of the
> desktop cards have any hardware dedicated to that.
> As far as I know, scaling, deinterlacing, and generally all post-processing
> is done on the compute engine via cuda. This is still pretty efficient
> since the data can be shared between the decode/compute engines without
> copy.

This is a good point, and something I haven't verified but would be curious to check. Still, the biggest downside we see is GPU VRAM consumption, as explained above :(
Implemented decoding of NAL units and handling HDR10+ sidedata by referring to hevcdec.

Signed-off-by: yoonjoo <yoonjoo.kim@samsung.com>
---
 libavcodec/cuviddec.c | 69 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index 3fae9c1..1b80b81 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -33,6 +33,7 @@
 #include "libavutil/mem.h"
 #include "libavutil/opt.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/hdr_dynamic_metadata.h"
 
 #include "avcodec.h"
 #include "bsf.h"
@@ -41,6 +42,11 @@
 #include "hwconfig.h"
 #include "nvdec.h"
 #include "internal.h"
+#include "h2645_parse.h"
+#include "bytestream.h"
+#include "hevc/hevc.h"
+#include "hevc/sei.h"
+#include "hevc/ps.h"
 
 #if !NVDECAPI_CHECK_VERSION(9, 0)
 #define cudaVideoSurfaceFormat_YUV444 2
@@ -488,6 +494,64 @@ error:
     return 0;
 }
 
+static void decode_nal_units(AVCodecContext* avctx, const uint8_t *buf, int length, AVFrame* frame)
+{
+    H2645Packet hpkt = { 0 };
+    int is_nalff = 1;
+    int nal_length_size = 4;
+    HEVCSEI sei = { 0 };
+    HEVCParamSets ps = { 0 };
+
+    av_log(avctx, AV_LOG_TRACE, "decode_nal_units\n");
+    ff_h2645_packet_split(&hpkt, buf, length, avctx, is_nalff, nal_length_size, avctx->codec_id, 1, 0);
+
+    for (int i = 0; i < hpkt.nb_nals; i++) {
+        H2645NAL* nal = &hpkt.nals[i];
+        GetBitContext gb = nal->gb;
+
+        av_log(avctx, AV_LOG_TRACE, "[%d/%d] NAL type = %d\n", i + 1, hpkt.nb_nals, nal->type);
+
+        switch (nal->type) {
+        case HEVC_NAL_SEI_PREFIX:
+        {
+            int ret = ff_hevc_decode_nal_sei(&gb, avctx, &sei, &ps, nal->type);
+            if (ret < 0) {
+                av_log(avctx, AV_LOG_WARNING, "Skipping invalid undecodable NALU: %d\n", nal->type);
+                return;
+            }
+
+            if (sei.common.dynamic_hdr_plus.info)
+                av_frame_new_side_data_from_buf(frame, AV_FRAME_DATA_DYNAMIC_HDR_PLUS, sei.common.dynamic_hdr_plus.info);
+            break;
+        }
+        case HEVC_NAL_VPS:
+        case HEVC_NAL_SPS:
+        case HEVC_NAL_PPS:
+        case HEVC_NAL_TRAIL_R:
+        case HEVC_NAL_TRAIL_N:
+        case HEVC_NAL_TSA_N:
+        case HEVC_NAL_TSA_R:
+        case HEVC_NAL_STSA_N:
+        case HEVC_NAL_STSA_R:
+        case HEVC_NAL_BLA_W_LP:
+        case HEVC_NAL_BLA_W_RADL:
+        case HEVC_NAL_BLA_N_LP:
+        case HEVC_NAL_IDR_W_RADL:
+        case HEVC_NAL_IDR_N_LP:
+        case HEVC_NAL_CRA_NUT:
+        case HEVC_NAL_RADL_N:
+        case HEVC_NAL_RADL_R:
+        case HEVC_NAL_RASL_N:
+        case HEVC_NAL_RASL_R:
+            // these Nal types will be handled in 'cuvid_decode_packet()'
+            break;
+        default:
+            av_log(avctx, AV_LOG_INFO, "Skipping NAL unit %d\n", nal->type);
+            break;
+        }
+    }
+}
+
 static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
 {
     CuvidContext *ctx = avctx->priv_data;
@@ -497,6 +561,8 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
     CuvidParsedFrame parsed_frame;
     CUdeviceptr mapped_frame = 0;
     int ret = 0, eret = 0;
+    uint8_t *pkt_data = avctx->internal->buffer_pkt->data;
+    int pkt_size = avctx->internal->buffer_pkt->size;
 
     av_log(avctx, AV_LOG_TRACE, "cuvid_output_frame\n");
 
@@ -512,6 +578,9 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
         if (ret < 0 && ret != AVERROR_EOF)
             return ret;
         ret = cuvid_decode_packet(avctx, pkt);
+
+        decode_nal_units(avctx, pkt_data, pkt_size, frame);
+
         av_packet_unref(pkt);
         // cuvid_is_buffer_full() should avoid this.
         if (ret == AVERROR(EAGAIN))
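The patch hard-codes `is_nalff = 1` and `nal_length_size = 4`, i.e. it treats every packet as a sequence of NAL units that each carry a 4-byte big-endian length prefix. To illustrate what that assumption means, here is a minimal stand-alone sketch in plain C. It is not FFmpeg's `ff_h2645_packet_split()`: it does no emulation-prevention handling, and the names `hevc_nal_type` and `count_nals_of_type` are hypothetical. The type value 39 for a prefix SEI NAL unit and the position of `nal_unit_type` in the first header byte follow the HEVC NAL unit header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NAL_TYPE_VPS = 32, NAL_TYPE_SEI_PREFIX = 39 };

/* The HEVC NAL unit header is two bytes; nal_unit_type is the six bits
 * that follow the forbidden_zero_bit in the first byte. */
static int hevc_nal_type(const uint8_t *nal)
{
    return (nal[0] >> 1) & 0x3F;
}

/* Walks a buffer of NAL units that each carry a 4-byte big-endian length
 * prefix and counts those whose type equals `wanted`.
 * Returns -1 if the buffer is truncated or a length is implausible. */
static int count_nals_of_type(const uint8_t *buf, size_t size, int wanted)
{
    size_t pos = 0;
    int count = 0;

    assert(buf != NULL || size == 0);

    while (pos < size) {
        if (size - pos < 4)
            return -1;              /* not enough bytes for a length prefix */

        uint32_t len = (uint32_t)buf[pos] << 24 | (uint32_t)buf[pos + 1] << 16 |
                       (uint32_t)buf[pos + 2] << 8 | (uint32_t)buf[pos + 3];
        pos += 4;

        if (len < 2 || len > size - pos)
            return -1;              /* header missing or payload truncated */

        if (hevc_nal_type(buf + pos) == wanted)
            count++;
        pos += len;
    }
    return count;
}
```

An Annex B stream, which separates NAL units with start codes instead of length prefixes, would be rejected or misread by a walker like this one, so the input format is worth checking before relying on fixed values.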