[FFmpeg-devel] avcodec/cuviddec: Add handling HDR10+ sidedata on cuviddec.

Message ID 007701db0a4e$7adc8200$70958600$@samsung.com
State New

Checks

Context                          Check    Description
yinshiyou/make_loongarch64       success  Make finished
yinshiyou/make_fate_loongarch64  success  Make fate finished
andriy/make_x86                  success  Make finished
andriy/make_fate_x86             success  Make fate finished

Commit Message

김윤주 Sept. 19, 2024, 4:43 a.m. UTC
Implemented decoding of NAL units and handling HDR10+ sidedata
by referring to hevcdec.

Signed-off-by: yoonjoo <yoonjoo.kim@samsung.com>
---
 libavcodec/cuviddec.c | 69 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

+        }
+    }
+}
+
 static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
 {
     CuvidContext *ctx = avctx->priv_data;
@@ -497,6 +561,8 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
     CuvidParsedFrame parsed_frame;
     CUdeviceptr mapped_frame = 0;
     int ret = 0, eret = 0;
+    uint8_t *pkt_data = avctx->internal->buffer_pkt->data;
+    int pkt_size = avctx->internal->buffer_pkt->size;
 
     av_log(avctx, AV_LOG_TRACE, "cuvid_output_frame\n");
 
@@ -512,6 +578,9 @@ static int cuvid_output_frame(AVCodecContext *avctx, AVFrame *frame)
         if (ret < 0 && ret != AVERROR_EOF)
             return ret;
         ret = cuvid_decode_packet(avctx, pkt);
+
+        decode_nal_units(avctx, pkt_data, pkt_size, frame);
+
         av_packet_unref(pkt);
         // cuvid_is_buffer_full() should avoid this.
         if (ret == AVERROR(EAGAIN))

Comments

Timo Rothenpieler Sept. 19, 2024, 7:47 p.m. UTC | #1
On 19.09.2024 06:43, yoonjoo wrote:
> Implemented decoding of NAL units and handling HDR10+ sidedata
> by referring to hevcdec.

Why? Can't you just use the native decoder with nvdec hwaccel?
Fahad Mustafa Sept. 19, 2024, 7:53 p.m. UTC | #2
What?

김윤주 Sept. 20, 2024, 2:08 a.m. UTC | #3
Does native decoder refer to hevc (hevcdec.c)? 
I tried using hevc and in environments with low CPU performance, hevc_cuvid
was much faster. 
So, I used hevc_cuvid for decoding but encountered an issue where HDR10+
sidedata did not exist. 
That's why I wrote this patch. 

I thought that the original content's sidedata should be preserved even
after decoding regardless of which decoder is used. 
And I thought hevc_cuvid was the only way to get Nvidia hwaccel support. 
If I'm mistaken about anything, please let me know. 

Also, is it correct to respond to your comments like this? 
It seems quite different from the format you sent. 

Apologies, as I'm still relatively new to the FFmpeg community and have a
lot to learn. 
Any additional guidance would be greatly appreciated.

Timo Rothenpieler Sept. 20, 2024, 10:48 a.m. UTC | #4
On 20/09/2024 04:08, 김윤주 wrote:
> Does native decoder refer to hevc (hevcdec.c)?
> I tried using hevc and in environments with low CPU performance, hevc_cuvid
> was much faster.
> So, I used hevc_cuvid for decoding but encountered an issue where HDR10+
> sidedata did not exist.
> That's why I wrote this patch.

You did turn on hwaccel, right?
I don't see why the native decoder would be much slower at parsing than 
NVIDIA's parsers.

> I thought that the original content's sidedata should be preserved even
> after decoding regardless of which decoder is used.
> And I thought hevc_cuvid was the only way to get Nvidia hwaccel support.
> If I'm mistaken about anything, please let me know.

It's just that the cuviddec decoder is more of a relic, from before the 
native hwaccel existed.
The only reason it's not straight up deprecated is that it's sometimes 
nice to have a "second opinion" when issues crop up, and there are a few 
specific features like hardware-deinterlacing that can't be exposed via 
the native hwaccel.

But I'd normally not like to expand it even further and add complex and 
large features to it. Whenever possible, the native nvdec hwaccel should 
be used.

> Also, is it correct to respond to your comments like this?
> It seems quite different from the format you sent.

Top-posting isn't exactly liked here, though I don't really have a 
strong opinion on it.

Carlos Ruiz Sept. 20, 2024, 6:41 p.m. UTC | #5
> It's just that the cuviddec decoder is more of a relic, from before the
> native hwaccel existed.
> The only reason it's not straight up deprecated is that it's sometimes
> nice to have a "second opinion" when issues crop up, and there are a few
> specific features like hardware-deinterlacing that can't be exposed via
> the native hwaccel.
>
> But I'd normally not like to expand it even further and add complex and
> large features to it. Whenever possible, the native nvdec hwaccel should
> be used.

As someone who recently switched from *_cuvid to * + hwaccel=cuda, I have to
agree that the performance (timing-wise at least) has been very comparable
for decoding.

The only comment I'd like to make, and this might be a bit of an unpopular
opinion based on some other threads I read, is that there are huge
advantages to hwaccel accelerating not just decoding but also resizing (and
I guess optionally cropping).
I know that ideally a decoder should only decode, but think about a common
use case in the AI world we live in: you get a bunch of simultaneous 4K (or
1080p) incoming RTSP streams and you want to decode the video and pass it
through some ML model, e.g. in TensorRT (to stick with the Nvidia example).
The native hevc codec doesn't support resizing, so you decode video at full
4K on the GPU, which means allocating something like 5-10 surfaces at
3840x2160, which becomes 250MB of GPU memory, and then you have to
immediately take all of those frames, pass them through a filterchain, scale
them down to e.g. 640x360, and waste CUDA cores instead of leveraging the
dedicated video downsizing inside the NVDEC chip. Now do that for 50 camera
streams and you'll quickly run out of GPU memory with a GPU utilization
under 10%, haha.

This is exactly why I submitted a patch yesterday that would allow using the
hevc codec with nvdec hwaccel while resizing on the GPU like hevc_cuviddec
does, and the memory (and GPU) consumption goes way down (e.g. 6MB of GPU
VRAM instead of 250MB per camera). I know this is a different discussion,
but I thought it was appropriate to share because deprecating cuviddec or
rejecting my patch would leave part of the community out.


averne Sept. 20, 2024, 7:06 p.m. UTC | #6
On 20/09/2024 at 20:41, Carlos Ruiz wrote:
> The native hevc codec doesn't support resizing, so you decode video at full
> 4k on the gpu, which means allocating something like 5-10 surfaces at
> 3840x2160 which becomes 250MB of GPU memory, and then you have immediately
> take all of those frames, pass them through a filterchain, scale them down
> to e.g. 640x360, and waste CUDA cores instead of leveraging the dedicated
> video downsizing inside the NVDEC chip. Now do that for 50 camera streams
> and you'll quickly run out of GPU memory with a GPU utilization under 10%
> haha.

NVDEC does not implement fixed-function downscaling, in fact none of the 
desktop cards have any hardware dedicated to that.  
As far as I know, scaling, deinterlacing, and generally all post-processing
is done on the compute engine via cuda. This is still pretty efficient 
since the data can be shared between the decode/compute engines without
copy.
Tegra chips are the only ones that come with a VIC engine (Video & Image
Compositor) which can do scaling, deinterlacing, spatial/temporal filtering, 
and basic compositing.
Lynne Sept. 20, 2024, 7:14 p.m. UTC | #7
On 20/09/2024 20:41, Carlos Ruiz wrote:
> This is exactly why I submitted a patch yesterday that would allow using
> the hevc
> codec with nvdec hwaccel, while resizing on the gpu like hevc_cuviddec
> does, and
> the memory (and GPU) consuption goes waaay down (e.g. 6MB of GPU VRAM
> instead of 250MB per camera). I know this is a different discussion but
> thought
> it was appropriate to share because deprecating cuviddec or rejecting my
> patch
> would leave part of the community out.

It's not possible for a decoder to save memory by scaling references. 
Decoding would simply break.
The only way of saving memory while decoding is to get rid of output 
frames as soon as possible.
Carlos Ruiz Sept. 23, 2024, 11:37 a.m. UTC | #8
> It's not possible for a decoder to save memory by scaling references.
> Decoding would simply break.
> The only way of saving memory while decoding is to get rid of output
> frames as soon as possible.

Maybe I didn't explain myself well, but if I set a breakpoint and step over
the code while looking at nvidia-smi, I can show you that decoding the
video with cuviddec leaving the "resize" option untouched (so decoding
outputs frames at 1080p) uses over 100MB more GPU Memory than
setting the "resize" flag to 640x360, and the decoding doesn't break at all
(just tested it on my NVIDIA GeForce RTX 4070 Ti with ffmpeg 7.0.2, and
we used to have similar results back when we were using ffmpeg 4.3.6).

> NVDEC does not implement fixed-function downscaling, in fact none of the
> desktop cards have any hardware dedicated to that.
> As far as I know, scaling, deinterlacing, and generally all post-processing
> is done on the compute engine via cuda. This is still pretty efficient
> since the data can be shared between the decode/compute engines without
> copy.

This is a good point, and something I haven't verified but would be curious
to check. Still, the biggest downside we see is GPU VRAM consumption, as
explained above :(


Patch

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index 3fae9c1..1b80b81 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -33,6 +33,7 @@ 
 #include "libavutil/mem.h"
 #include "libavutil/opt.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/hdr_dynamic_metadata.h"
 
 #include "avcodec.h"
 #include "bsf.h"
@@ -41,6 +42,11 @@ 
 #include "hwconfig.h"
 #include "nvdec.h"
 #include "internal.h"
+#include "h2645_parse.h"
+#include "bytestream.h"
+#include "hevc/hevc.h"
+#include "hevc/sei.h"
+#include "hevc/ps.h"
 
 #if !NVDECAPI_CHECK_VERSION(9, 0)
 #define cudaVideoSurfaceFormat_YUV444 2
@@ -488,6 +494,64 @@  error:
         return 0;
 }
 
+static void decode_nal_units(AVCodecContext* avctx, const uint8_t *buf, int length, AVFrame* frame)
+{
+    H2645Packet hpkt = { 0 };
+    int is_nalff = 1;
+    int nal_length_size = 4;
+    HEVCSEI sei = { 0 };
+    HEVCParamSets ps = { 0 };
+
+    av_log(avctx, AV_LOG_TRACE, "decode_nal_units\n");
+    ff_h2645_packet_split(&hpkt, buf, length, avctx, is_nalff, nal_length_size, avctx->codec_id, 1, 0);
+
+    for (int i = 0; i < hpkt.nb_nals; i++) {
+        H2645NAL* nal = &hpkt.nals[i];
+        GetBitContext gb = nal->gb;
+
+        av_log(avctx, AV_LOG_TRACE, "[%d/%d] NAL type = %d\n", i + 1, hpkt.nb_nals, nal->type);
+
+        switch (nal->type) {
+        case HEVC_NAL_SEI_PREFIX:
+        {
+            int ret = ff_hevc_decode_nal_sei(&gb, avctx, &sei, &ps, nal->type);
+            if (ret < 0) {
+                av_log(avctx, AV_LOG_WARNING, "Skipping invalid undecodable NALU: %d\n", nal->type);
+                return;
+            }
+
+            if (sei.common.dynamic_hdr_plus.info)
+                av_frame_new_side_data_from_buf(frame, AV_FRAME_DATA_DYNAMIC_HDR_PLUS, sei.common.dynamic_hdr_plus.info);
+            break;
+        }
+        case HEVC_NAL_VPS:
+        case HEVC_NAL_SPS:
+        case HEVC_NAL_PPS:
+        case HEVC_NAL_TRAIL_R:
+        case HEVC_NAL_TRAIL_N:
+        case HEVC_NAL_TSA_N:
+        case HEVC_NAL_TSA_R:
+        case HEVC_NAL_STSA_N:
+        case HEVC_NAL_STSA_R:
+        case HEVC_NAL_BLA_W_LP:
+        case HEVC_NAL_BLA_W_RADL:
+        case HEVC_NAL_BLA_N_LP:
+        case HEVC_NAL_IDR_W_RADL:
+        case HEVC_NAL_IDR_N_LP:
+        case HEVC_NAL_CRA_NUT:
+        case HEVC_NAL_RADL_N:
+        case HEVC_NAL_RADL_R:
+        case HEVC_NAL_RASL_N:
+        case HEVC_NAL_RASL_R:
+            // these Nal types will be handled in 'cuvid_decode_packet()'
+            break;
+        default:
+            av_log(avctx, AV_LOG_INFO, "Skipping NAL unit %d\n", nal->type);
+            break;