Message ID | 20211021134843.1436241-2-derek.buitenhuis@gmail.com |
---|---|
State | New |
Headers | show |
Series | Dolby Vision RPU Side Data | expand |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
andriy/make_ppc | success | Make finished |
andriy/make_fate_ppc | success | Make fate finished |
On Thu, Oct 21, 2021 at 4:49 PM Derek Buitenhuis <derek.buitenhuis@gmail.com> wrote: > > Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> > --- > All known encoders, etc. expect RPUs in this format, to my knowledge. > (There is no spec for parsing the RPUs, even privately, and not a single > piece of software that would benefit from that DoVi Profile 5 and 7 rendering requires the data within these RPU entities, AFAIK (as it contains various values to correctly configure the custom color space). So software such as libplacebo where haasn has made efforts towards implementing the non-standard color space utilized would benefit from having the contents. So while there clearly are two separate use cases, I would note that saying "nothing would see any use from the bytes within these structures" is maybe a bit disingenuous. The two use cases being: 1. Just re-encoding the content. The contents of these RPU buffers being unrelated and only passthrough is necessary. 2. Trying to gather the information of these metadata units regarding interpretation of the received image. Both are clearly valid use cases, and we should IMHO support both, at least on the AVFrame level. Not that I mean that you should implement the parsing of these metadata units, since it is clear you want to keep away from that part of the work :) . > - and it is a legal minefield > which I am not going to touch.) While with this I agree with completely. Or at least given the D company's track record :P . That said, just parsing the metadata units is probably much less of a swamp than actually trying to utilize the metadata to properly render Profile 5 or 7 video. For the record, there are at least two known to me OSS implementations of RPU parsing/writing, one being made in Java and the other being made in Rust. Jan
On 10/22/2021 11:02 PM, Jan Ekström wrote: > DoVi Profile 5 and 7 rendering requires the data within these RPU > entities, AFAIK (as it contains various values to correctly configure > the custom color space). So software such as libplacebo where haasn > has made efforts towards implementing the non-standard color space > utilized would benefit from having the contents. > > So while there clearly are two separate use cases, I would note that > saying "nothing would see any use from the bytes within these > structures" is maybe a bit disingenuous. Do any of thse actually work or is it just gusswork based on reverse engineering and patents (0% chance I am readng any patent or IPR). We could rename this or DOVI_PROFILE8 or RPU_BUFFER or something if you prefer. > The two use cases being: > 1. Just re-encoding the content. The contents of these RPU buffers > being unrelated and only passthrough is necessary. Yes, supported today by e.g. x265, which I imagine is the most common usecase. > 2. Trying to gather the information of these metadata units regarding > interpretation of the received image. To what end is a bit iffy - nothing can make much use of 99% of it. > Both are clearly valid use cases, and we should IMHO support both, at > least on the AVFrame level. Not that I mean that you should implement > the parsing of these metadata units, since it is clear you want to > keep away from that part of the work :) . Indeed, I was hoping this patchset would not be bogged down with "yeah implement this undocumented proprietary parser from RE or patent instead" which I obviously cannot do (dayjob, etc.) - and I think the value is ... eh ... at best. I'm not particularily fond of the idea of indefinitely killing this functionality while waiting for someone to take up the parsing. >> - and it is a legal minefield >> which I am not going to touch.) > > While with this I agree with completely. Or at least given the D > company's track record :P . That said, just parsing the metadata units > is probably much less of a swamp than actually trying to utilize the > metadata to properly render Profile 5 or 7 video. Which is why parsing it is iffy value given the risk. - Derek
diff --git a/doc/APIchanges b/doc/APIchanges index 7b267a79ac..8bc9c7eb6e 100644 --- a/doc/APIchanges +++ b/doc/APIchanges @@ -14,6 +14,9 @@ libavutil: 2021-04-27 API changes, most recent first: +2021-XX-XX - xxxxxxxxxx - lavf 57.8.100 - frame.h + Add AV_FRAME_DATA_DOVI_RPU. + 2021-09-21 - xxxxxxxxxx - lavu 57.7.100 - pixfmt.h Add AV_PIX_FMT_X2BGR10. diff --git a/libavutil/frame.c b/libavutil/frame.c index 2617cda2b5..23b52f7b66 100644 --- a/libavutil/frame.c +++ b/libavutil/frame.c @@ -728,6 +728,7 @@ const char *av_frame_side_data_name(enum AVFrameSideDataType type) case AV_FRAME_DATA_SEI_UNREGISTERED: return "H.26[45] User Data Unregistered SEI message"; case AV_FRAME_DATA_FILM_GRAIN_PARAMS: return "Film grain parameters"; case AV_FRAME_DATA_DETECTION_BBOXES: return "Bounding boxes for object detection and classification"; + case AV_FRAME_DATA_DOVI_RPU: return "Dolby Vision RPU Data"; } return NULL; } diff --git a/libavutil/frame.h b/libavutil/frame.h index ff2540a20f..29839211ca 100644 --- a/libavutil/frame.h +++ b/libavutil/frame.h @@ -187,6 +187,13 @@ enum AVFrameSideDataType { * as described by AVDetectionBBoxHeader. */ AV_FRAME_DATA_DETECTION_BBOXES, + + /** + * Dolby Vision RPU data, suitable for passing to x265 + * or other libraries. Array of uint8_t, with NAL emulation + * bytes intact. + */ + AV_FRAME_DATA_DOVI_RPU, }; enum AVActiveFormatDescription { diff --git a/libavutil/version.h b/libavutil/version.h index 896e348d80..eeb33b388c 100644 --- a/libavutil/version.h +++ b/libavutil/version.h @@ -79,7 +79,7 @@ */ #define LIBAVUTIL_VERSION_MAJOR 57 -#define LIBAVUTIL_VERSION_MINOR 7 +#define LIBAVUTIL_VERSION_MINOR 8 #define LIBAVUTIL_VERSION_MICRO 100 #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> --- All known encoders, etc. expect RPUs in this format, to my knowledge. (There is no spec for parsing the RPUs, even privately, and not a single piece of software that would benefit from that - and it is a legal minefield which I am not going to touch.) The comment says "with emulation bytes intact" but technically, RPUs do not have emulation bytes, since they are not actually NALs, but just masquerade as them, and as such, encoder APIs expect them to be there. --- doc/APIchanges | 3 +++ libavutil/frame.c | 1 + libavutil/frame.h | 7 +++++++ libavutil/version.h | 2 +- 4 files changed, 12 insertions(+), 1 deletion(-)