diff mbox series

[FFmpeg-devel] avcodec: Pass the HDR10+ metadata to the packet side data in VP9 encoder

Message ID 20210607173919.503279-1-izadi@google.com
State Superseded
Headers show
Series [FFmpeg-devel] avcodec: Pass the HDR10+ metadata to the packet side data in VP9 encoder
Related show

Checks

Context Check Description
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished
andriy/PPC64_make success Make finished
andriy/PPC64_make_fate success Make fate finished

Commit Message

Mohammad Izadi June 7, 2021, 5:39 p.m. UTC
HDR10+ metadata is stored in the bit stream for HEVC. The story is different for VP9 and cannot store the metadata in the bit stream. HDR10+ should be passed to packet side data an stored in the container (mkv) for VP9.

This CL is taking HDR10+ from AVFrame side data in libvpxenc and is passing it to the AVPacket side data.
---
 doc/APIchanges         |  2 +
 libavcodec/avpacket.c  |  1 +
 libavcodec/decode.c    |  1 +
 libavcodec/libvpxenc.c | 98 ++++++++++++++++++++++++++++++++++++++++++
 libavcodec/packet.h    |  8 ++++
 libavcodec/version.h   |  4 +-
 6 files changed, 112 insertions(+), 2 deletions(-)

Comments

Michael Niedermayer June 7, 2021, 7:55 p.m. UTC | #1
On Mon, Jun 07, 2021 at 10:39:19AM -0700, Mohammad Izadi wrote:
> HDR10+ metadata is stored in the bit stream for HEVC. The story is different for VP9 and cannot store the metadata in the bit stream. HDR10+ should be passed to packet side data an stored in the container (mkv) for VP9.
> 
> This CL is taking HDR10+ from AVFrame side data in libvpxenc and is passing it to the AVPacket side data.
> ---
>  doc/APIchanges         |  2 +
>  libavcodec/avpacket.c  |  1 +
>  libavcodec/decode.c    |  1 +
>  libavcodec/libvpxenc.c | 98 ++++++++++++++++++++++++++++++++++++++++++
>  libavcodec/packet.h    |  8 ++++
>  libavcodec/version.h   |  4 +-
>  6 files changed, 112 insertions(+), 2 deletions(-)

still crashing  ...
 
 heres a test with a fate sample:
 
 -i ~/fate/msmpeg4v1/mpg4.avi -vcodec libvpx -y   vpx.webm


frame=   50 fps=0.7 q=0.0 Lsize=      58kB time=00:00:01.63 bitrate= 288.0kbits/s speed=0.0229x    
video:57kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.558648%
==26333== Invalid read of size 4
==26333==    at 0x1220B5E: av_fifo_size (fifo.c:79)
==26333==    by 0x9F64C5: free_hdr10_plus_fifo (libvpxenc.c:346)
==26333==    by 0x9F699D: vpx_free (libvpxenc.c:443)
==26333==    by 0x7A20D6: avcodec_close (avcodec.c:472)
==26333==    by 0xAE5002: avcodec_free_context (options.c:163)
==26333==    by 0x24AC0F: ffmpeg_cleanup (ffmpeg.c:609)
==26333==    by 0x24258A: exit_program (cmdutils.c:135)
==26333==    by 0x25C5AF: main (ffmpeg.c:5034)
==26333==  Address 0x24 is not stack'd, malloc'd or (recently) free'd



[...]
Andreas Rheinhardt June 7, 2021, 7:58 p.m. UTC | #2
Michael Niedermayer:
> On Mon, Jun 07, 2021 at 10:39:19AM -0700, Mohammad Izadi wrote:
>> HDR10+ metadata is stored in the bit stream for HEVC. The story is different for VP9 and cannot store the metadata in the bit stream. HDR10+ should be passed to packet side data an stored in the container (mkv) for VP9.
>>
>> This CL is taking HDR10+ from AVFrame side data in libvpxenc and is passing it to the AVPacket side data.
>> ---
>>  doc/APIchanges         |  2 +
>>  libavcodec/avpacket.c  |  1 +
>>  libavcodec/decode.c    |  1 +
>>  libavcodec/libvpxenc.c | 98 ++++++++++++++++++++++++++++++++++++++++++
>>  libavcodec/packet.h    |  8 ++++
>>  libavcodec/version.h   |  4 +-
>>  6 files changed, 112 insertions(+), 2 deletions(-)
> 
> still crashing  ...
>  
>  heres a test with a fate sample:
>  
>  -i ~/fate/msmpeg4v1/mpg4.avi -vcodec libvpx -y   vpx.webm
> 
> 
> frame=   50 fps=0.7 q=0.0 Lsize=      58kB time=00:00:01.63 bitrate= 288.0kbits/s speed=0.0229x    
> video:57kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.558648%
> ==26333== Invalid read of size 4
> ==26333==    at 0x1220B5E: av_fifo_size (fifo.c:79)
> ==26333==    by 0x9F64C5: free_hdr10_plus_fifo (libvpxenc.c:346)
> ==26333==    by 0x9F699D: vpx_free (libvpxenc.c:443)
> ==26333==    by 0x7A20D6: avcodec_close (avcodec.c:472)
> ==26333==    by 0xAE5002: avcodec_free_context (options.c:163)
> ==26333==    by 0x24AC0F: ffmpeg_cleanup (ffmpeg.c:609)
> ==26333==    by 0x24258A: exit_program (cmdutils.c:135)
> ==26333==    by 0x25C5AF: main (ffmpeg.c:5034)
> ==26333==  Address 0x24 is not stack'd, malloc'd or (recently) free'd
> 
> 
The reason is that the fifo is not allocated unconditionally, but when
freeing it is checked whether the fifo is empty even when the fifo has
never been allocated at all.

- Andreas
Mohammad Izadi June 8, 2021, 6:25 p.m. UTC | #3
On Mon, Jun 7, 2021 at 12:59 PM Andreas Rheinhardt <
andreas.rheinhardt@outlook.com> wrote:

> Michael Niedermayer:
> > On Mon, Jun 07, 2021 at 10:39:19AM -0700, Mohammad Izadi wrote:
> >> HDR10+ metadata is stored in the bit stream for HEVC. The story is
> different for VP9 and cannot store the metadata in the bit stream. HDR10+
> should be passed to packet side data an stored in the container (mkv) for
> VP9.
> >>
> >> This CL is taking HDR10+ from AVFrame side data in libvpxenc and is
> passing it to the AVPacket side data.
> >> ---
> >>  doc/APIchanges         |  2 +
> >>  libavcodec/avpacket.c  |  1 +
> >>  libavcodec/decode.c    |  1 +
> >>  libavcodec/libvpxenc.c | 98 ++++++++++++++++++++++++++++++++++++++++++
> >>  libavcodec/packet.h    |  8 ++++
> >>  libavcodec/version.h   |  4 +-
> >>  6 files changed, 112 insertions(+), 2 deletions(-)
> >
> > still crashing  ...
> >
> >  heres a test with a fate sample:
> >
> >  -i ~/fate/msmpeg4v1/mpg4.avi -vcodec libvpx -y   vpx.webm
> >
> >
> > frame=   50 fps=0.7 q=0.0 Lsize=      58kB time=00:00:01.63 bitrate=
> 288.0kbits/s speed=0.0229x
> > video:57kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB
> muxing overhead: 1.558648%
> > ==26333== Invalid read of size 4
> > ==26333==    at 0x1220B5E: av_fifo_size (fifo.c:79)
> > ==26333==    by 0x9F64C5: free_hdr10_plus_fifo (libvpxenc.c:346)
> > ==26333==    by 0x9F699D: vpx_free (libvpxenc.c:443)
> > ==26333==    by 0x7A20D6: avcodec_close (avcodec.c:472)
> > ==26333==    by 0xAE5002: avcodec_free_context (options.c:163)
> > ==26333==    by 0x24AC0F: ffmpeg_cleanup (ffmpeg.c:609)
> > ==26333==    by 0x24258A: exit_program (cmdutils.c:135)
> > ==26333==    by 0x25C5AF: main (ffmpeg.c:5034)
> > ==26333==  Address 0x24 is not stack'd, malloc'd or (recently) free'd
> >
> >
> The reason is that the fifo is not allocated unconditionally, but when
> freeing it is checked whether the fifo is empty even when the fifo has
> never been allocated at all.
>
Fixed. Verified and passes now.

>
> - Andreas
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
diff mbox series

Patch

diff --git a/doc/APIchanges b/doc/APIchanges
index c46f4d5304..60995579e5 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -13,6 +13,8 @@  libavutil:     2021-04-27
 
 
 API changes, most recent first:
+2021-05-25 - 8c88a66d3c - lavc 59.2.100 - packet.h
+  Add AV_PKT_DATA_DYNAMIC_HDR10_PLUS
 
 2021-04-27 - cb3ac722f4 - lavc 59.0.100 - avcodec.h
   Constified AVCodecParserContext.parser.
diff --git a/libavcodec/avpacket.c b/libavcodec/avpacket.c
index 7383d12d3e..800bee3489 100644
--- a/libavcodec/avpacket.c
+++ b/libavcodec/avpacket.c
@@ -289,6 +289,7 @@  const char *av_packet_side_data_name(enum AVPacketSideDataType type)
     case AV_PKT_DATA_ICC_PROFILE:                return "ICC Profile";
     case AV_PKT_DATA_DOVI_CONF:                  return "DOVI configuration record";
     case AV_PKT_DATA_S12M_TIMECODE:              return "SMPTE ST 12-1:2014 timecode";
+    case AV_PKT_DATA_DYNAMIC_HDR10_PLUS:         return "HDR10+ Dynamic Metadata (SMPTE 2094-40)";
     }
     return NULL;
 }
diff --git a/libavcodec/decode.c b/libavcodec/decode.c
index 75bc7ad98e..40f688e40c 100644
--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -1488,6 +1488,7 @@  int ff_decode_frame_props(AVCodecContext *avctx, AVFrame *frame)
         { AV_PKT_DATA_A53_CC,                     AV_FRAME_DATA_A53_CC },
         { AV_PKT_DATA_ICC_PROFILE,                AV_FRAME_DATA_ICC_PROFILE },
         { AV_PKT_DATA_S12M_TIMECODE,              AV_FRAME_DATA_S12M_TIMECODE },
+        { AV_PKT_DATA_DYNAMIC_HDR10_PLUS,         AV_FRAME_DATA_DYNAMIC_HDR_PLUS },
     };
 
     if (IS_EMPTY(pkt) && av_fifo_size(avctx->internal->pkt_props) >= sizeof(*pkt))
diff --git a/libavcodec/libvpxenc.c b/libavcodec/libvpxenc.c
index 66bad444d0..cf7e25a4bc 100644
--- a/libavcodec/libvpxenc.c
+++ b/libavcodec/libvpxenc.c
@@ -64,6 +64,11 @@  struct FrameListData {
     struct FrameListData *next;
 };
 
+typedef struct FrameHDR10Plus {
+    int64_t pts;
+    AVBufferRef *hdr10_plus;
+} FrameHDR10Plus;
+
 typedef struct VPxEncoderContext {
     AVClass *class;
     struct vpx_codec_ctx encoder;
@@ -121,6 +126,8 @@  typedef struct VPxEncoderContext {
     int tune_content;
     int corpus_complexity;
     int tpl_model;
+    int discard_hdr10_plus;
+    AVFifoBuffer *hdr10_plus_fifo;
     /**
      * If the driver does not support ROI then warn the first time we
      * encounter a frame with ROI side data.
@@ -316,6 +323,55 @@  static av_cold void free_frame_list(struct FrameListData *list)
     }
 }
 
+static av_cold int add_hdr10_plus(AVFifoBuffer *fifo, struct FrameHDR10Plus *data)
+{
+    int err = av_fifo_grow(fifo, sizeof(*data));
+    if (err < 0)
+        return err;
+    av_fifo_generic_write(fifo, data, sizeof(*data), NULL);
+    return 0;
+}
+
+static av_cold void free_hdr10_plus(struct FrameHDR10Plus *p)
+{
+    if (!p)
+        return;
+    av_buffer_unref(&p->hdr10_plus);
+    av_free(p);
+}
+
+static av_cold void free_hdr10_plus_fifo(AVFifoBuffer **fifo)
+{
+    FrameHDR10Plus *frame_hdr10_plus = NULL;
+    while (av_fifo_size(*fifo) >= sizeof(FrameHDR10Plus)) {
+        av_fifo_generic_read(*fifo, frame_hdr10_plus, sizeof(FrameHDR10Plus), NULL);
+        free_hdr10_plus(frame_hdr10_plus);
+    }
+    av_fifo_freep(fifo);
+}
+
+static int copy_hdr10_plus_to_pkt(AVFifoBuffer *fifo, AVPacket *pkt)
+{
+    FrameHDR10Plus *frame_hdr10_plus;
+    uint8_t *data;
+    if (av_fifo_size(fifo) < 1)
+        return 0;
+
+    av_fifo_generic_read(fifo, frame_hdr10_plus, sizeof(*frame_hdr10_plus), NULL);
+    if (!frame_hdr10_plus || !pkt || !frame_hdr10_plus->hdr10_plus || frame_hdr10_plus->pts != pkt->pts)
+        return 0;
+
+    data = av_packet_new_side_data(pkt, AV_PKT_DATA_DYNAMIC_HDR10_PLUS, frame_hdr10_plus->hdr10_plus->size);
+    if (!data) {
+        free_hdr10_plus(frame_hdr10_plus);
+        return AVERROR(ENOMEM);
+    }
+    memcpy(data, frame_hdr10_plus->hdr10_plus->data, frame_hdr10_plus->hdr10_plus->size);
+    free_hdr10_plus(frame_hdr10_plus);
+
+    return 0;
+}
+
 static av_cold int codecctl_int(AVCodecContext *avctx,
                                 enum vp8e_enc_control_id id, int val)
 {
@@ -384,6 +440,7 @@  static av_cold int vpx_free(AVCodecContext *avctx)
     av_freep(&ctx->twopass_stats.buf);
     av_freep(&avctx->stats_out);
     free_frame_list(ctx->coded_frame_list);
+    free_hdr10_plus_fifo(&ctx->hdr10_plus_fifo);
     return 0;
 }
 
@@ -835,6 +892,7 @@  static av_cold int vpx_init(AVCodecContext *avctx,
 #endif
     AVDictionaryEntry* en = NULL;
 
+    ctx->discard_hdr10_plus = 1;
     av_log(avctx, AV_LOG_INFO, "%s\n", vpx_codec_version_str());
     av_log(avctx, AV_LOG_VERBOSE, "%s\n", vpx_codec_build_config());
 
@@ -851,6 +909,14 @@  static av_cold int vpx_init(AVCodecContext *avctx,
     if (avctx->codec_id == AV_CODEC_ID_VP9) {
         if (set_pix_fmt(avctx, codec_caps, &enccfg, &flags, &img_fmt))
             return AVERROR(EINVAL);
+        // Keep HDR10+ if it has bit depth higher than 8 and
+        // it has PQ trc (SMPTE2084).
+        if (enccfg.g_bit_depth > 8 && avctx->color_trc == AVCOL_TRC_SMPTE2084) {
+            ctx->discard_hdr10_plus = 0;
+            ctx->hdr10_plus_fifo = av_fifo_alloc(sizeof(FrameHDR10Plus));
+            if (!ctx->hdr10_plus_fifo)
+                return AVERROR(ENOMEM);
+        }
     }
 #endif
 
@@ -1211,6 +1277,15 @@  static int storeframe(AVCodecContext *avctx, struct FrameListData *cx_frame,
         AV_WB64(side_data, 1);
         memcpy(side_data + 8, cx_frame->buf_alpha, cx_frame->sz_alpha);
     }
+    if (cx_frame->frame_number != -1) {
+        VPxContext *ctx = avctx->priv_data;
+        if (!ctx->discard_hdr10_plus) {
+            int err = copy_hdr10_plus_to_pkt(ctx->hdr10_plus_fifo, pkt);
+            if (err < 0)
+                return err;
+        }
+    }
+
     return pkt->size;
 }
 
@@ -1618,6 +1693,29 @@  static int vpx_encode(AVCodecContext *avctx, AVPacket *pkt,
                 vp9_encode_set_roi(avctx, frame->width, frame->height, sd);
             }
         }
+
+        if (!ctx->discard_hdr10_plus) {
+            AVFrameSideData *hdr10_plus_metadata;
+            // Add HDR10+ metadata to queue.
+            hdr10_plus_metadata = av_frame_get_side_data(frame, AV_FRAME_DATA_DYNAMIC_HDR_PLUS);
+            if (hdr10_plus_metadata) {
+                int err;
+                struct FrameHDR10Plus *data =  av_malloc(sizeof(*data));
+                if (!data)
+                    return AVERROR(ENOMEM);
+                data->pts = frame->pts;
+                data->hdr10_plus = av_buffer_ref(hdr10_plus_metadata->buf);
+                if (!data->hdr10_plus) {
+                    av_freep(&data);
+                    return AVERROR(ENOMEM);
+                }
+                err = add_hdr10_plus(ctx->hdr10_plus_fifo, data);
+                if (err < 0) {
+                    av_freep(&data);
+                    return err;
+                }
+            }
+        }
     }
 
     // this is for encoding with preset temporal layering patterns defined in
diff --git a/libavcodec/packet.h b/libavcodec/packet.h
index fad8341c12..a9d3a9b596 100644
--- a/libavcodec/packet.h
+++ b/libavcodec/packet.h
@@ -290,6 +290,14 @@  enum AVPacketSideDataType {
      */
     AV_PKT_DATA_S12M_TIMECODE,
 
+    /**
+     * HDR10+ dynamic metadata associated with a video frame. The metadata is in
+     * the form of the AVDynamicHDRPlus struct and contains
+     * information for color volume transform - application 4 of
+     * SMPTE 2094-40:2016 standard.
+     */
+    AV_PKT_DATA_DYNAMIC_HDR10_PLUS,
+
     /**
      * The number of side data types.
      * This is not part of the public API/ABI in the sense that it may
diff --git a/libavcodec/version.h b/libavcodec/version.h
index 5b1e9e77f3..1288cecebe 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,8 +28,8 @@ 
 #include "libavutil/version.h"
 
 #define LIBAVCODEC_VERSION_MAJOR  59
-#define LIBAVCODEC_VERSION_MINOR   1
-#define LIBAVCODEC_VERSION_MICRO 101
+#define LIBAVCODEC_VERSION_MINOR   2
+#define LIBAVCODEC_VERSION_MICRO 100
 
 #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
                                                LIBAVCODEC_VERSION_MINOR, \