
[FFmpeg-devel] avcodec/cuviddec: Guess pixel format based on probed bit depth

Message ID CAJZRZVnDwe2-DS_TxYEEvLqXEBdsK=zhoyva2ZHZQRVXKj97PA@mail.gmail.com
State New

Checks

Context | Check | Description
andriy/commit_msg_x86 | warning | The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
yinshiyou/commit_msg_loongarch64 | warning | The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
andriy/make_x86 | success | Make finished
andriy/make_fate_x86 | success | Make fate finished

Commit Message

Roman Arzumanyan Aug. 1, 2024, 1:54 p.m. UTC
Hello world,

This patch adds a pixel format guess based on probed bit depth.
With the current FFmpeg tip of tree (ToT), when the cuvid codec is opened,
the input sw_pix_fmt is AV_PIX_FMT_NV12 until the first frame is decoded.
Even if the input has a 10- or 12-bit depth, the format stays NV12 until then.

What's the need for this patch?
Applications that rely on libavcodec will have a chance to calculate the
proper amount of vRAM required to store a reconstructed video frame before
decoding begins.
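
To illustrate the sizing problem, here is a minimal sketch that is not part of the patch. The helper name `semi_planar_420_bytes` is hypothetical and not an FFmpeg API. It assumes a 4:2:0 semi-planar layout (as in NV12, P010 and P016), assumes samples deeper than 8 bits occupy 16-bit containers, and ignores any pitch alignment the driver may add.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an FFmpeg API: bytes needed for one 4:2:0
 * semi-planar frame (one luma plane plus one interleaved UV plane),
 * ignoring pitch alignment and padding. */
size_t semi_planar_420_bytes(int width, int height, int bit_depth)
{
    assert(width > 0 && height > 0 && bit_depth >= 8 && bit_depth <= 16);

    /* 8-bit samples take one byte; deeper samples are assumed to sit
     * in 16-bit containers, as in P010 and P016. */
    size_t bytes_per_sample = bit_depth > 8 ? 2 : 1;

    size_t luma     = (size_t)width * (size_t)height;
    size_t chroma_w = ((size_t)width  + 1) / 2;
    size_t chroma_h = ((size_t)height + 1) / 2;
    size_t chroma   = 2 * chroma_w * chroma_h; /* interleaved U and V */

    return (luma + chroma) * bytes_per_sample;
}
```

Under these assumptions, a 1920x1080 frame needs 3110400 bytes at 8 bits and 6220800 bytes at 10 or 12 bits, so an application that sizes its buffers from an NV12 guess would allocate only half of what a 10-bit stream needs.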

Comments

Timo Rothenpieler Aug. 1, 2024, 6:14 p.m. UTC | #1
On 01.08.2024 15:54, Roman Arzumanyan wrote:
> Hello world,
> 
> This patch adds a pixel format guess based on probed bit depth.
> With current FFMpeg ToT, when the cuvid codec is opened, input sw_pix_fmt
> is AV_PIX_FMT_NV12 until the first frame is decoded. Even if input has 10
> or 12 bit depth, the format will be NV12 for some time.
> 
> What's the need for this patch ?
> Applications that rely on libavcodec will have a chance to calculate the
> proper amount of vRAM required to store a reconstructed video frame before
> decoding begins.

The 12 bit format should be AV_PIX_FMT_P016.
Also, might as well take probe_desc->log2_chroma_w/log2_chroma_h into 
account.
If they're 0, it's 444, and the formats change to AV_PIX_FMT_YUV444P(16).
Akin to the switch() on format->bit_depth_luma_minus8 in the probe function.

Patch looks fine on first glance, though relying on a rather arbitrary 
second field there does not seem like a good idea to me.
Why can't the application simply also look at the probed format?
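
The selection suggested in this comment could be sketched roughly as follows. This is an illustration only, not the submitted patch or the existing probe code: the enum is a local stand-in for the AVPixelFormat values named above (real code would use libavutil/pixfmt.h), and `guess_sw_format` is a hypothetical name.

```c
/* Local stand-ins for the AVPixelFormat values named in the thread;
 * a real patch would use the enum from libavutil/pixfmt.h instead. */
enum sketch_pix_fmt {
    SK_NONE,
    SK_NV12,
    SK_P010,
    SK_P016,
    SK_YUV444P,
    SK_YUV444P16
};

/* Hypothetical helper: pick a software pixel format from the probed
 * bit depth and chroma shifts. Both shifts being 0 means 4:4:4. */
enum sketch_pix_fmt guess_sw_format(int bit_depth,
                                    int log2_chroma_w,
                                    int log2_chroma_h)
{
    int is_444 = (log2_chroma_w == 0 && log2_chroma_h == 0);

    switch (bit_depth) {
    case 8:
        return is_444 ? SK_YUV444P : SK_NV12;
    case 10:
        return is_444 ? SK_YUV444P16 : SK_P010;
    case 12:
        return is_444 ? SK_YUV444P16 : SK_P016;
    default:
        return SK_NONE; /* unknown depth: make no guess */
    }
}
```

With this mapping, 12-bit 4:2:0 input yields the P016 stand-in rather than P012, and any 4:4:4 input deeper than 8 bits yields the YUV444P16 stand-in.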
Roman Arzumanyan Aug. 2, 2024, 7:46 a.m. UTC | #2
Hi Timo,

> Why can't the application simply also look at the probed format?
It's certainly possible, but in my opinion it makes sense to improve the
codec behavior.
All required information is already there, so why not return the correct
value?

> The 12 bit format should be AV_PIX_FMT_P016.
> Also, might as well take probe_desc->log2_chroma_w/log2_chroma_h into
> account.
> If they're 0, it's 444, and the formats change to AV_PIX_FMT_YUV444P(16).
> Akin to the switch() on format->bit_depth_luma_minus8 in the probe
> function.

Thanks, I'll fix this and submit v2 of the patch.


On Thu, Aug 1, 2024 at 21:14, Timo Rothenpieler <timo@rothenpieler.org> wrote:

> On 01.08.2024 15:54, Roman Arzumanyan wrote:
> > Hello world,
> >
> > This patch adds a pixel format guess based on probed bit depth.
> > With current FFMpeg ToT, when the cuvid codec is opened, input sw_pix_fmt
> > is AV_PIX_FMT_NV12 until the first frame is decoded. Even if input has 10
> > or 12 bit depth, the format will be NV12 for some time.
> >
> > What's the need for this patch ?
> > Applications that rely on libavcodec will have a chance to calculate the
> > proper amount of vRAM required to store a reconstructed video frame
> > before decoding begins.
>
> The 12 bit format should be AV_PIX_FMT_P016.
> Also, might as well take probe_desc->log2_chroma_w/log2_chroma_h into
> account.
> If they're 0, it's 444, and the formats change to AV_PIX_FMT_YUV444P(16).
> Akin to the switch() on format->bit_depth_luma_minus8 in the probe
> function.
>
> Patch looks fine on first glance, though relying on a rather arbitrary
> second field there does not seem like a good idea to me.
> Why can't the application simply also look at the probed format?

Patch

From ae80b12d10a4de4aa96a4670b72accbfc5a87631 Mon Sep 17 00:00:00 2001
From: Roman Arzumanyan <r.arzumanyan@visionlabs.ai>
Date: Thu, 1 Aug 2024 16:35:22 +0300
Subject: [PATCH] Guess pixel format based on bit depth

---
 libavcodec/cuviddec.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index f88ad75e88..2205b1536a 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -834,7 +834,7 @@  static av_cold int cuvid_decode_init(AVCodecContext *avctx)
     int ret = 0;
 
     enum AVPixelFormat pix_fmts[3] = { AV_PIX_FMT_CUDA,
-                                       AV_PIX_FMT_NV12,
+                                       AV_PIX_FMT_NONE,
                                        AV_PIX_FMT_NONE };
 
     int probed_width = avctx->coded_width ? avctx->coded_width : 1280;
@@ -845,11 +845,26 @@  static av_cold int cuvid_decode_init(AVCodecContext *avctx)
     if (probe_desc && probe_desc->nb_components)
         probed_bit_depth = probe_desc->comp[0].depth;
 
+    // Arbitrarily pick pixel format based on bit depth.
+    switch (probed_bit_depth) {
+    case 8:
+        pix_fmts[1] = AV_PIX_FMT_NV12;
+        break;
+    case 10:
+        pix_fmts[1] = AV_PIX_FMT_P010;
+        break;
+    case 12:
+        pix_fmts[1] = AV_PIX_FMT_P012;
+        break;
+    default:
+        break;
+    }
+
     ctx->pkt = avctx->internal->in_pkt;
     // Accelerated transcoding scenarios with 'ffmpeg' require that the
     // pix_fmt be set to AV_PIX_FMT_CUDA early. The sw_pix_fmt, and the
     // pix_fmt for non-accelerated transcoding, do not need to be correct
-    // but need to be set to something. We arbitrarily pick NV12.
+    // but need to be set to something.
     ret = ff_get_format(avctx, pix_fmts);
     if (ret < 0) {
         av_log(avctx, AV_LOG_ERROR, "ff_get_format failed: %d\n", ret);
-- 
2.34.1