
[FFmpeg-devel,v2,1/4] lavu/frame: Add Dolby Vision metadata side data type

Message ID 20211208101237.18407-1-ffmpeg@haasn.xyz
State New

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished
andriy/make_ppc success Make finished
andriy/make_fate_ppc success Make fate finished

Commit Message

Niklas Haas Dec. 8, 2021, 10:12 a.m. UTC
From: Niklas Haas <git@haasn.dev>

Signed-off-by: Niklas Haas <git@haasn.dev>
---
 doc/APIchanges        |   3 ++
 libavutil/dovi_meta.c |  23 ++++++++
 libavutil/dovi_meta.h | 122 ++++++++++++++++++++++++++++++++++++++++++
 libavutil/frame.c     |   1 +
 libavutil/frame.h     |   9 +++-
 libavutil/version.h   |   2 +-
 6 files changed, 158 insertions(+), 2 deletions(-)

Comments

Derek Buitenhuis Dec. 9, 2021, 5:21 p.m. UTC | #1
On 12/8/2021 10:12 AM, Niklas Haas wrote:
> +/* based on guesswork, see mkvtoolnix and dovi_tool */
> +int av_dovi_profile(const AVDOVIRpuDataHeader *hdr)
> +{

The correct way to find the profile is from the stream level
DOVI configuration record side data, if available.

- Derek
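
A minimal sketch of the lookup order suggested above: take the profile from the stream-level configuration record when one is available, and fall back to guessing from the RPU header only when it is not. `DoviConfig` and `RpuHeader` are cut-down stand-ins for `AVDOVIDecoderConfigurationRecord` and `AVDOVIRpuDataHeader` so that the example compiles without FFmpeg, and `dovi_profile()` is a hypothetical helper, not part of the patch. The guessing logic is copied from the patch's `av_dovi_profile()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the one field of AVDOVIDecoderConfigurationRecord used here. */
typedef struct DoviConfig {
    uint8_t dv_profile;
} DoviConfig;

/* Stand-in for the AVDOVIRpuDataHeader fields the heuristic reads. */
typedef struct RpuHeader {
    uint8_t vdr_rpu_profile;
    uint8_t vdr_bit_depth;
    int bl_video_full_range_flag;
    int el_spatial_resampling_filter_flag;
    int disable_residual_flag;
} RpuHeader;

/* Same guesswork as av_dovi_profile() in the patch; returns 0 if unknown. */
static int guess_profile(const RpuHeader *hdr)
{
    assert(hdr != NULL);

    switch (hdr->vdr_rpu_profile) {
    case 0:
        if (hdr->bl_video_full_range_flag)
            return 5;
        break;
    case 1:
        if (hdr->el_spatial_resampling_filter_flag &&
            !hdr->disable_residual_flag)
            return hdr->vdr_bit_depth == 12 ? 7 : 4;
        return 8;
    }

    return 0;
}

/*
 * Hypothetical helper: the configuration record is authoritative when
 * present; the RPU header is only consulted as a fallback.
 */
static int dovi_profile(const DoviConfig *cfg, const RpuHeader *hdr)
{
    if (cfg)
        return cfg->dv_profile;
    if (hdr)
        return guess_profile(hdr);
    return 0;
}
```

With this ordering the heuristic only matters for streams that carry no configuration record at all.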
Niklas Haas Dec. 9, 2021, 9:46 p.m. UTC | #2
On Thu, 09 Dec 2021 17:21:58 +0000 Derek Buitenhuis <derek.buitenhuis@gmail.com> wrote:
> On 12/8/2021 10:12 AM, Niklas Haas wrote:
> > +/* based on guesswork, see mkvtoolnix and dovi_tool */
> > +int av_dovi_profile(const AVDOVIRpuDataHeader *hdr)
> > +{
> 
> The correct way to find the profile is from the stream level
> DOVI configuration record side data, if available.

So, I changed the API to accept the DOVI configuration record, but
actually getting that record into hevcdec is impossible given the current
design of FFmpeg where stream-level properties are not made available to
the decoder. Possibilities:

1. Add patch to propagate stream-level properties to
   AVCodecContext.coded_side_data automatically
2. Add patch to automatically propagate stream-level properties to each
   AVPacket somehow?
3. Have the code read the profile from the AVPacket even though the side
   data doesn't exist for them, and let this just be an open bug.
4. Something else?

Thoughts?
Derek Buitenhuis Dec. 10, 2021, 4:49 p.m. UTC | #3
On 12/9/2021 9:46 PM, Niklas Haas wrote:
> So, I changed the API to accept the DOVI configuration record, but
> actually getting that record into hevcdec is impossible given the current
> design of FFmpeg where stream-level properties are not made available to
> the decoder. Possibilities:

Ugh, yes, now I remember - I ran into this same issue when adding RPU buffer export.

> 1. Add patch to propagate stream-level properties to
>    AVCodecContext.coded_side_data automatically

I don't know enough about what coded_side_data is to comment, I think.

> 2. Add patch to automatically propagate stream-level properties to each
>    AVPacket somehow?

This seems excessive if it means attaching stream side data to every single packet...

> 3. Have the code read the profile from the AVPacket even though the side
>    data doesn't exist for them, and let this just be an open bug.

If you mean just leaving the level guessing in, I guess it is somehow the least
bad idea... somehow.

> 4. Something else?

I hope someone else does have an idea ;)

- Derek
Marton Balint Dec. 10, 2021, 6:06 p.m. UTC | #4
On Thu, 9 Dec 2021, Niklas Haas wrote:

> On Thu, 09 Dec 2021 17:21:58 +0000 Derek Buitenhuis <derek.buitenhuis@gmail.com> wrote:
>> On 12/8/2021 10:12 AM, Niklas Haas wrote:
>>> +/* based on guesswork, see mkvtoolnix and dovi_tool */
>>> +int av_dovi_profile(const AVDOVIRpuDataHeader *hdr)
>>> +{
>>
>> The correct way to find the profile is from the stream level
>> DOVI configuration record side data, if available.
>
> So, I changed the API to accept the DOVI configuration record, but
> actually getting that record into hevcdec is impossible given the current
> design of FFmpeg where stream-level properties are not made available to
> the decoder. Possibilities:
>
> 1. Add patch to propagate stream-level properties to
>   AVCodecContext.coded_side_data automatically
> 2. Add patch to automatically propagate stream-level properties to each
>   AVPacket somehow?

Don't we already have av_format_inject_global_side_data() for something 
like this? It is not enabled by default however, only for ffplay, but I am 
not sure why.

Regards,
Marton

> 3. Have the code read the profile from the AVPacket even though the side
>   data doesn't exist for them, and let this just be an open bug.
> 4. Something else?
>
> Thoughts?
Niklas Haas Dec. 11, 2021, 11:37 a.m. UTC | #5
On Fri, 10 Dec 2021 19:06:15 +0100 Marton Balint <cus@passwd.hu> wrote:
> Don't we already have av_format_inject_global_side_data() for something 
> like this? It is not enabled by default however, only for ffplay, but I am 
> not sure why.

Indeed, that seems to do exactly what we need - while testing though I
noticed that it only adds the side data on the first AVPacket. So my
current solution was *almost* right, I just need to persist it across
frames inside the decoder. That solves the problem elegantly, IMO.
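
A rough sketch of the "persist it across frames inside the decoder" idea, assuming the record only shows up on the first packet. All types and function names here are hypothetical stand-ins, so the example compiles without FFmpeg. In a real decoder the record would presumably be fetched from the packet with `av_packet_get_side_data()` and `AV_PKT_DATA_DOVI_CONF` rather than read from a struct field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for AVDOVIDecoderConfigurationRecord (only two fields kept). */
typedef struct DoviConfig {
    uint8_t dv_profile;
    uint8_t dv_level;
} DoviConfig;

/* Stand-in for a packet; dovi_conf is NULL when no side data is attached. */
typedef struct Packet {
    const DoviConfig *dovi_conf;
} Packet;

/* Hypothetical per-decoder state that outlives individual packets. */
typedef struct DecoderState {
    int have_cfg;
    DoviConfig cfg;
} DecoderState;

/* Call once per packet: copy the record whenever one is attached. */
static void decoder_update_dovi(DecoderState *s, const Packet *pkt)
{
    assert(s != NULL && pkt != NULL);

    if (pkt->dovi_conf) {
        s->cfg = *pkt->dovi_conf; /* copy, the packet may be freed later */
        s->have_cfg = 1;
    }
}

/* Profile to report for the current frame, or 0 if no record was seen. */
static int decoder_dovi_profile(const DecoderState *s)
{
    assert(s != NULL);
    return s->have_cfg ? s->cfg.dv_profile : 0;
}
```

Copying the record by value rather than keeping a pointer means the cached state does not depend on the lifetime of the first packet.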

Patch

diff --git a/doc/APIchanges b/doc/APIchanges
index 2914ad6734..422874e3b9 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -14,6 +14,9 @@  libavutil:     2021-04-27
 
 API changes, most recent first:
 
+2021-12-06 - xxxxxxxxxx - lavu 57.11.100 - frame.h
+  Add AV_FRAME_DATA_DOVI_METADATA.
+
 2021-11-xx - xxxxxxxxxx - lavfi 8.19.100 - avfilter.h
   Add AVFILTER_FLAG_METADATA_ONLY.
 
diff --git a/libavutil/dovi_meta.c b/libavutil/dovi_meta.c
index 7bd08f6c54..e2ef8ac3a4 100644
--- a/libavutil/dovi_meta.c
+++ b/libavutil/dovi_meta.c
@@ -33,3 +33,26 @@  AVDOVIDecoderConfigurationRecord *av_dovi_alloc(size_t *size)
 
     return dovi;
 }
+
+/* based on guesswork, see mkvtoolnix and dovi_tool */
+int av_dovi_profile(const AVDOVIRpuDataHeader *hdr)
+{
+    switch (hdr->vdr_rpu_profile) {
+    case 0:
+        if (hdr->bl_video_full_range_flag)
+            return 5;
+        break;
+    case 1:
+        if (hdr->el_spatial_resampling_filter_flag && !hdr->disable_residual_flag) {
+            if (hdr->vdr_bit_depth == 12) {
+                return 7;
+            } else {
+                return 4;
+            }
+        } else {
+            return 8;
+        }
+    }
+
+    return 0; /* unknown */
+}
diff --git a/libavutil/dovi_meta.h b/libavutil/dovi_meta.h
index 299911d434..a3a97463ac 100644
--- a/libavutil/dovi_meta.h
+++ b/libavutil/dovi_meta.h
@@ -29,6 +29,7 @@ 
 
 #include <stdint.h>
 #include <stddef.h>
+#include "rational.h"
 
 /*
  * DOVI configuration
@@ -67,4 +68,125 @@  typedef struct AVDOVIDecoderConfigurationRecord {
  */
 AVDOVIDecoderConfigurationRecord *av_dovi_alloc(size_t *size);
 
+/**
+ * Dolby Vision RPU data header.
+ */
+typedef struct AVDOVIRpuDataHeader {
+    uint8_t rpu_type;
+    uint16_t rpu_format;
+    uint8_t vdr_rpu_profile;
+    uint8_t vdr_rpu_level;
+    int chroma_resampling_explicit_filter_flag;
+    uint8_t coef_data_type; /* informative, lavc always converts to fixed */
+    uint8_t coef_log2_denom;
+    uint8_t vdr_rpu_normalized_idc;
+    int bl_video_full_range_flag;
+    uint8_t bl_bit_depth; /* [8, 16] */
+    uint8_t el_bit_depth; /* [8, 16] */
+    uint8_t vdr_bit_depth; /* [8, 16] */
+    int spatial_resampling_filter_flag;
+    int el_spatial_resampling_filter_flag;
+    int disable_residual_flag;
+} AVDOVIRpuDataHeader;
+
+/**
+ * Return the Dolby Vision profile number derived from a given RPU data header,
+ * or 0 for unknown/unrecognized profiles.
+ */
+int av_dovi_profile(const AVDOVIRpuDataHeader *hdr);
+
+enum AVDOVIMappingMethod {
+    AV_DOVI_MAPPING_POLYNOMIAL = 0,
+    AV_DOVI_MAPPING_MMR = 1,
+};
+
+/**
+ * Coefficients of a piece-wise function. The pieces of the function span the
+ * value ranges between two adjacent pivot values.
+ */
+#define FF_DOVI_MAX_PIECES 8
+typedef struct AVDOVIReshapingCurve {
+    uint8_t num_pivots;                         /* [2, 9] */
+    uint16_t pivots[FF_DOVI_MAX_PIECES + 1];    /* sorted ascending */
+    enum AVDOVIMappingMethod mapping_idc[FF_DOVI_MAX_PIECES];
+    /* AV_DOVI_MAPPING_POLYNOMIAL */
+    uint8_t poly_order[FF_DOVI_MAX_PIECES];     /* [1, 2] */
+    int64_t poly_coef[FF_DOVI_MAX_PIECES][3];   /* x^0, x^1, x^2 */
+    /* AV_DOVI_MAPPING_MMR */
+    uint8_t mmr_order[FF_DOVI_MAX_PIECES];      /* [1, 3] */
+    int64_t mmr_constant[FF_DOVI_MAX_PIECES];
+    int64_t mmr_coef[FF_DOVI_MAX_PIECES][3/* order - 1 */][7];
+} AVDOVIReshapingCurve;
+
+enum AVDOVINLQMethod {
+    AV_DOVI_NLQ_NONE = -1,
+    AV_DOVI_NLQ_LINEAR_DZ = 0,
+};
+
+/**
+ * Coefficients of the non-linear inverse quantization. For the interpretation
+ * of these, see ETSI GS CCM 001.
+ */
+typedef struct AVDOVINLQParams {
+    uint64_t nlq_offset;
+    uint64_t vdr_in_max;
+    /* AV_DOVI_NLQ_LINEAR_DZ */
+    uint64_t linear_deadzone_slope;
+    uint64_t linear_deadzone_threshold;
+} AVDOVINLQParams;
+
+/**
+ * Dolby Vision RPU data mapping parameters.
+ */
+typedef struct AVDOVIDataMapping {
+    uint8_t vdr_rpu_id;
+    uint8_t mapping_color_space;
+    uint8_t mapping_chroma_format_idc;
+    AVDOVIReshapingCurve curves[3]; /* per component */
+
+    /* Non-linear inverse quantization */
+    enum AVDOVINLQMethod nlq_method_idc;
+    uint32_t num_x_partitions;
+    uint32_t num_y_partitions;
+    AVDOVINLQParams nlq[3]; /* per component */
+} AVDOVIDataMapping;
+
+typedef struct AVDOVIColorMetadata {
+    uint8_t dm_metadata_id;
+    int scene_refresh_flag;
+
+    /**
+     * Coefficients of the custom Dolby Vision IPT-PQ matrices. These are to be
+     * used instead of the matrices indicated by the frame's colorspace tags.
+     * The output of rgb_to_lms_matrix is to be fed into a BT.2020 LMS->RGB
+     * matrix based on a Hunt-Pointer-Estevez transform, but without any
+     * crosstalk. (See the definition of the ICtCp colorspace for more
+     * information.)
+     */
+    AVRational ycc_to_rgb_matrix[9]; /* before PQ linearization */
+    AVRational ycc_to_rgb_offset[3]; /* input offset of neutral value */
+    AVRational rgb_to_lms_matrix[9]; /* after PQ linearization */
+
+    /**
+     * Extra signal metadata (see Dolby patents for more info).
+     */
+    uint16_t signal_eotf;
+    uint16_t signal_eotf_param0;
+    uint16_t signal_eotf_param1;
+    uint32_t signal_eotf_param2;
+    uint8_t signal_bit_depth;
+    uint8_t signal_color_space;
+    uint8_t signal_chroma_format;
+    uint8_t signal_full_range_flag; /* [0, 3] */
+    uint16_t source_min_pq;
+    uint16_t source_max_pq;
+    uint16_t source_diagonal;
+} AVDOVIColorMetadata;
+
+typedef struct AVDOVIMetadata {
+    AVDOVIRpuDataHeader header;
+    AVDOVIDataMapping mapping;
+    AVDOVIColorMetadata color;
+} AVDOVIMetadata;
+
 #endif /* AVUTIL_DOVI_META_H */
diff --git a/libavutil/frame.c b/libavutil/frame.c
index 0912ad9131..8997c85e35 100644
--- a/libavutil/frame.c
+++ b/libavutil/frame.c
@@ -729,6 +729,7 @@  const char *av_frame_side_data_name(enum AVFrameSideDataType type)
     case AV_FRAME_DATA_FILM_GRAIN_PARAMS:           return "Film grain parameters";
     case AV_FRAME_DATA_DETECTION_BBOXES:            return "Bounding boxes for object detection and classification";
     case AV_FRAME_DATA_DOVI_RPU_BUFFER:             return "Dolby Vision RPU Data";
+    case AV_FRAME_DATA_DOVI_METADATA:               return "Dolby Vision Metadata";
     }
     return NULL;
 }
diff --git a/libavutil/frame.h b/libavutil/frame.h
index 3f295f6b9e..18e239f870 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -189,11 +189,18 @@  enum AVFrameSideDataType {
     AV_FRAME_DATA_DETECTION_BBOXES,
 
     /**
-     * Dolby Vision RPU data, suitable for passing to x265
+     * Dolby Vision RPU raw data, suitable for passing to x265
      * or other libraries. Array of uint8_t, with NAL emulation
      * bytes intact.
      */
     AV_FRAME_DATA_DOVI_RPU_BUFFER,
+
+    /**
+     * Parsed Dolby Vision metadata, suitable for passing to a software
+     * implementation. The payload is the AVDOVIMetadata struct defined in
+     * libavutil/dovi_meta.h.
+     */
+    AV_FRAME_DATA_DOVI_METADATA,
 };
 
 enum AVActiveFormatDescription {
diff --git a/libavutil/version.h b/libavutil/version.h
index 017fc277a6..678401fcf5 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@ 
  */
 
 #define LIBAVUTIL_VERSION_MAJOR  57
-#define LIBAVUTIL_VERSION_MINOR  10
+#define LIBAVUTIL_VERSION_MINOR  11
 #define LIBAVUTIL_VERSION_MICRO 101
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \