diff mbox series

[FFmpeg-devel] lavu: add AVVideoHint API

Message ID 20230709110529.29490-1-anton@khirnov.net
State New
Headers show
Series [FFmpeg-devel] lavu: add AVVideoHint API | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Anton Khirnov July 9, 2023, 11:05 a.m. UTC
From: Elias Carotti <eliascrt _at_ amazon _dot_ it>

Add side data type to provide hint to the video encoders about unchanged
portions of each frame.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
---
I've made couple small changes:
* rebased against current master
* consistently refer to rectangles or AVVideoRect, not blocks
* use size_t instead of int for AVVideoHint.nb_rects
* build unconditionally -- this is public API, it must always be
  available regardless of what encoders are or are not available
* tweaked documentation

If you have no objections to the changes, I'll push this in a few days.
---
 doc/APIchanges         |   3 ++
 libavutil/Makefile     |   2 +
 libavutil/frame.h      |  10 ++++
 libavutil/version.h    |   2 +-
 libavutil/video_hint.c |  82 +++++++++++++++++++++++++++++++
 libavutil/video_hint.h | 108 +++++++++++++++++++++++++++++++++++++++++
 6 files changed, 206 insertions(+), 1 deletion(-)
 create mode 100644 libavutil/video_hint.c
 create mode 100644 libavutil/video_hint.h

Comments

Lynne July 9, 2023, 1:10 p.m. UTC | #1
Jul 9, 2023, 13:05 by anton@khirnov.net:

> From: Elias Carotti <eliascrt _at_ amazon _dot_ it>
>
> Add side data type to provide hint to the video encoders about unchanged
> portions of each frame.
>
> Signed-off-by: Anton Khirnov <anton@khirnov.net>
> ---
> I've made couple small changes:
> * rebased against current master
> * consistently refer to rectangles or AVVideoRect, not blocks
> * use size_t instead of int for AVVideoHint.nb_rects
> * build unconditionally -- this is public API, it must always be
>  available regardless of what encoders are or are not available
> * tweaked documentation
>
> If you have no objections to the changes, I'll push this in a few days.
> ---
>  doc/APIchanges         |   3 ++
>  libavutil/Makefile     |   2 +
>  libavutil/frame.h      |  10 ++++
>  libavutil/version.h    |   2 +-
>  libavutil/video_hint.c |  82 +++++++++++++++++++++++++++++++
>  libavutil/video_hint.h | 108 +++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 206 insertions(+), 1 deletion(-)
>  create mode 100644 libavutil/video_hint.c
>  create mode 100644 libavutil/video_hint.h
>

AVVideoHint is a bad name for something like this.
Could you borrow some wording from graphics and call it
AVVideoDamagedHint or maybe AVVideoChangedAreaHint
or a combination of both?
I'd prefer the former, damage is standard language in graphics
circles about what has changed since the last frame.
Carotti, Elias July 10, 2023, 8:09 a.m. UTC | #2
-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Anton Khirnov
Sent: Sunday, July 9, 2023 1:05 PM
To: ffmpeg-devel@ffmpeg.org
Subject: [EXTERNAL] [FFmpeg-devel] [PATCH] lavu: add AVVideoHint API

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.



From: Elias Carotti <eliascrt _at_ amazon _dot_ it>

Add side data type to provide hint to the video encoders about unchanged portions of each frame.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
---
I've made couple small changes:
* rebased against current master
* consistently refer to rectangles or AVVideoRect, not blocks
* use size_t instead of int for AVVideoHint.nb_rects
* build unconditionally -- this is public API, it must always be
  available regardless of what encoders are or are not available
* tweaked documentation

If you have no objections to the changes, I'll push this in a few days.

Hi Anton,
I'm perfectly fine with all the changes, so feel free to proceed.
Best,
Elias



---
 doc/APIchanges         |   3 ++
 libavutil/Makefile     |   2 +
 libavutil/frame.h      |  10 ++++
 libavutil/version.h    |   2 +-
 libavutil/video_hint.c |  82 +++++++++++++++++++++++++++++++  libavutil/video_hint.h | 108 +++++++++++++++++++++++++++++++++++++++++
 6 files changed, 206 insertions(+), 1 deletion(-)  create mode 100644 libavutil/video_hint.c  create mode 100644 libavutil/video_hint.h

diff --git a/doc/APIchanges b/doc/APIchanges index 27f835cfce..0cda51fdee 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@ The last version increases of all libraries were on 2023-02-09

 API changes, most recent first:

+2023-07-xx - xxxxxxxxxx - lavu 58.15.100 - video_hint.h
+  Add AVVideoHint API.
+
 2023-07-05 - xxxxxxxxxx - lavu 58.14.100 - random_seed.h
   Add av_random_bytes()

diff --git a/libavutil/Makefile b/libavutil/Makefile index bd9c6f9e32..7828c94dc5 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -91,6 +91,7 @@ HEADERS = adler32.h                                                     \
           tea.h                                                         \
           tx.h                                                          \
           film_grain_params.h                                           \
+          video_hint.h

 ARCH_HEADERS = bswap.h                                                  \
                intmath.h                                                \
@@ -181,6 +182,7 @@ OBJS = adler32.o                                                        \
        uuid.o                                                           \
        version.o                                                        \
        video_enc_params.o                                               \
+       video_hint.o                                                     \
        film_grain_params.o                                              \


diff --git a/libavutil/frame.h b/libavutil/frame.h index a491315f25..c0c1b23db7 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -214,6 +214,16 @@ enum AVFrameSideDataType {
      * Ambient viewing environment metadata, as defined by H.274.
      */
     AV_FRAME_DATA_AMBIENT_VIEWING_ENVIRONMENT,
+
+    /**
+     * Provide encoder-specific hinting information about changed/unchanged
+     * portions of a frame.  It can be used to pass information about which
+     * macroblocks can be skipped because they didn't change from the
+     * corresponding ones in the previous frame. This could be useful for
+     * applications which know this information in advance to speed up
+     * encoding.
+     */
+    AV_FRAME_DATA_VIDEO_HINT,
 };

 enum AVActiveFormatDescription {
diff --git a/libavutil/version.h b/libavutil/version.h index 24af520e08..9e798b0e3f 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@
  */

 #define LIBAVUTIL_VERSION_MAJOR  58
-#define LIBAVUTIL_VERSION_MINOR  14
+#define LIBAVUTIL_VERSION_MINOR  15
 #define LIBAVUTIL_VERSION_MICRO 100

 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
diff --git a/libavutil/video_hint.c b/libavutil/video_hint.c new file mode 100644 index 0000000000..5730ed6cfb
--- /dev/null
+++ b/libavutil/video_hint.c
@@ -0,0 +1,82 @@
+/*
+ * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 
+02110-1301 USA  */
+
+#include <string.h>
+
+#include "avstring.h"
+#include "frame.h"
+#include "macros.h"
+#include "mem.h"
+#include "video_hint.h"
+
+AVVideoHint *av_video_hint_alloc(size_t nb_rects,
+                                 size_t *out_size) {
+    struct TestStruct {
+        AVVideoHint    hint;
+        AVVideoRect    rect;
+    };
+    const size_t rect_offset = offsetof(struct TestStruct, rect);
+    size_t size = rect_offset;
+    AVVideoHint *hint;
+
+    *out_size = 0;
+    if (nb_rects > (SIZE_MAX - size) / sizeof(AVVideoRect))
+        return NULL;
+    size += sizeof(AVVideoRect) * nb_rects;
+
+    hint = av_mallocz(size);
+    if (!hint)
+        return NULL;
+
+    hint->nb_rects    = nb_rects;
+    hint->rect_offset = rect_offset;
+    hint->rect_size   = sizeof(AVVideoRect);
+
+    *out_size = size;
+
+    return hint;
+}
+
+AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
+                                            size_t nb_rects) {
+    AVVideoHint *hint;
+    AVBufferRef *buf;
+    size_t size = 0;
+
+    hint = av_video_hint_alloc(nb_rects, &size);
+    if (!hint)
+        return NULL;
+
+    buf = av_buffer_create((uint8_t *)hint, size, NULL, NULL, 0);
+    if (!buf) {
+        av_freep(&hint);
+        return NULL;
+    }
+
+    if (!av_frame_new_side_data_from_buf(frame, AV_FRAME_DATA_VIDEO_HINT, buf)) {
+        av_buffer_unref(&buf);
+        return NULL;
+    }
+
+    return hint;
+}
+
diff --git a/libavutil/video_hint.h b/libavutil/video_hint.h new file mode 100644 index 0000000000..86addf5ceb
--- /dev/null
+++ b/libavutil/video_hint.h
@@ -0,0 +1,108 @@
+/**
+ * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 
+02110-1301 USA  */
+
+#ifndef AVUTIL_VIDEO_HINT_H
+#define AVUTIL_VIDEO_HINT_H
+
+#include <stddef.h>
+#include <stdint.h>
+#include "libavutil/avassert.h"
+#include "libavutil/frame.h"
+
+typedef struct AVVideoRect {
+    uint32_t x, y;
+    uint32_t width, height;
+} AVVideoRect;
+
+typedef enum AVVideoHintType {
+    /* rectangled delimit the constant areas (unchanged), default is changed */
+    AV_VIDEO_HINT_CONSTANT,
+
+    /* rectangled delimit the constant areas (changed), default is not changed */
+    AV_VIDEO_HINT_CHANGED,
+} AVVideoHintType;
+
+typedef struct AVVideoHint {
+    /**
+     * Number of AVVideoRect present.
+     *
+     * May be 0, in which case no per-rectangle information is present. In this
+     * case the values of rect_offset / rect_size are unspecified and should
+     * not be accessed.
+     */
+    size_t nb_rects;
+
+    /**
+     * Offset in bytes from the beginning of this structure at which the array
+     * of AVVideoRect starts.
+     */
+    size_t rect_offset;
+
+    /**
+     * Size in bytes of AVVideoRect.
+     */
+    size_t rect_size;
+
+    AVVideoHintType type;
+} AVVideoHint;
+
+static av_always_inline AVVideoRect*
+av_video_hint_rects(const AVVideoHint *hints) {
+    return (AVVideoRect *)((uint8_t *)hints + hints->rect_offset); }
+
+static av_always_inline AVVideoRect*
+av_video_hint_get_rect(const AVVideoHint *hints, size_t idx) {
+    return (AVVideoRect *)((uint8_t *)hints + hints->rect_offset + idx 
+* hints->rect_size); }
+
+/**
+ * Allocate memory for the AVVideoHint struct along with an 
+nb_rects-sized
+ * arrays of AVVideoRect.
+ *
+ * The side data contains a list of rectangles for the portions of the 
+frame
+ * which changed from the last encoded one (and the remainder are 
+assumed to be
+ * changed), or, alternately (depending on the type parameter) the 
+unchanged
+ * ones (and the remanining ones are those which changed).
+ * Macroblocks will thus be hinted either to be P_SKIP-ped or go 
+through the
+ * regular encoding procedure.
+ *
+ * It's responsibility of the caller to fill the AVRects accordingly, 
+and to set
+ * the proper AVVideoHintType field.
+ *
+ * @param out_size if non-NULL, the size in bytes of the resulting data array is
+ *                 written here
+ *
+ * @return newly allocated AVVideoHint struct (must be freed by the caller using
+ *         av_free()) on success, NULL on memory allocation failure
+ */
+AVVideoHint *av_video_hint_alloc(size_t nb_rects,
+                                 size_t *out_size);
+/**
+ * Same as av_video_hint_alloc(), except newly-allocated AVVideoHint is 
+attached
+ * as side data of type AV_FRAME_DATA_VIDEO_HINT_INFO to frame.
+ */
+AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
+                                            size_t nb_rects);
+
+
+#endif /* AVUTIL_VIDEO_HINT_H */
--
2.40.1
Carotti, Elias July 10, 2023, 8:13 a.m. UTC | #3
-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Lynne
Sent: Sunday, July 9, 2023 3:11 PM
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: RE: [EXTERNAL][FFmpeg-devel] [PATCH] lavu: add AVVideoHint API

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.



Jul 9, 2023, 13:05 by anton@khirnov.net:

> From: Elias Carotti <eliascrt _at_ amazon _dot_ it>
>
> Add side data type to provide hint to the video encoders about 
> unchanged portions of each frame.
>
> Signed-off-by: Anton Khirnov <anton@khirnov.net>
> ---
> I've made couple small changes:
> * rebased against current master
> * consistently refer to rectangles or AVVideoRect, not blocks
> * use size_t instead of int for AVVideoHint.nb_rects
> * build unconditionally -- this is public API, it must always be  
> available regardless of what encoders are or are not available
> * tweaked documentation
>
> If you have no objections to the changes, I'll push this in a few days.
> ---
>  doc/APIchanges         |   3 ++
>  libavutil/Makefile     |   2 +
>  libavutil/frame.h      |  10 ++++
>  libavutil/version.h    |   2 +-
>  libavutil/video_hint.c |  82 +++++++++++++++++++++++++++++++  
> libavutil/video_hint.h | 108 +++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 206 insertions(+), 1 deletion(-)  create mode 100644 
> libavutil/video_hint.c  create mode 100644 libavutil/video_hint.h
>

AVVideoHint is a bad name for something like this.
Could you borrow some wording from graphics and call it AVVideoDamagedHint or maybe AVVideoChangedAreaHint or a combination of both?
I'd prefer the former, damage is standard language in graphics circles about what has changed since the last frame.

Hi,
I have no strong objections on this. Personally I also like the AVVideoDamagedHint name best, my only concern is that it is strictly related to the current use/implementation 
(it's true right now that's the only kind of hint) while it may turn out to be a bad naming decision should other forms of hinting for the encoder be added in the future.
That said, I am fine with the change too.
Elias
Carotti, Elias July 10, 2023, 12:51 p.m. UTC | #4
On Mon, 2023-07-10 at 08:13 +0000, Carotti, Elias wrote:
> 

> AVVideoHint is a bad name for something like this.
> Could you borrow some wording from graphics and call it
> AVVideoDamagedHint or maybe AVVideoChangedAreaHint or a combination
> of both?
> I'd prefer the former, damage is standard language in graphics
> circles about what has changed since the last frame.
> 
> Hi,
> I have no strong objections on this. Personally I also like the
> AVVideoDamagedHint name best, my only concern is that it is strictly
> related to the current use/implementation 
> (it's true right now that's the only kind of hint) while it may turn
> out to be a bad naming decision should other forms of hinting for the
> encoder be added in the future.
> That said, I am fine with the change too.
> Elias
> 

I added a type to the AVVideoRect struct. This should solve the naming
issue above while preserving the possibility to extend this to
different hinting types.
These are the only changes to Anton's version.
Best
Elias

 





NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
Stefano Sabatini July 17, 2023, 10:19 p.m. UTC | #5
On date Monday 2023-07-10 12:51:25 +0000, Carotti, Elias wrote:
> On Mon, 2023-07-10 at 08:13 +0000, Carotti, Elias wrote:
> > 
> 
> > AVVideoHint is a bad name for something like this.
> > Could you borrow some wording from graphics and call it
> > AVVideoDamagedHint or maybe AVVideoChangedAreaHint or a combination
> > of both?
> > I'd prefer the former, damage is standard language in graphics
> > circles about what has changed since the last frame.
> > 
> > Hi,
> > I have no strong objections on this. Personally I also like the
> > AVVideoDamagedHint name best, my only concern is that it is strictly
> > related to the current use/implementation 
> > (it's true right now that's the only kind of hint) while it may turn
> > out to be a bad naming decision should other forms of hinting for the
> > encoder be added in the future.
> > That said, I am fine with the change too.

AVVideoDamagedHint would be too specific (what if we want to add
"hinting" for different things?).

> > Elias
> > 
> 
> I added a type to the AVVideoRect struct. This should solve the naming
> issue above while preserving the possibility to extend this to
> different hinting types.
> These are the only changes to Anton's version.
> Best
> Elias
> 
>  
> 
> 
> 
> 
> 
> NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
> 
> 

> From 8ef4f97410a6b78df048b71d9921a763da6255b3 Mon Sep 17 00:00:00 2001
> From: Elias Carotti <eliascrt _at_ amazon _dot_ it>
> Date: Mon, 10 Jul 2023 14:34:53 +0200
> Subject: [PATCH] Add side data type to provide hint to the video encoders
>  about unchanged portions of each frame.
> 
> Signed-off-by: Anton Khirnov <anton@khirnov.net>
> ---
>  doc/APIchanges         |   3 ++
>  libavutil/Makefile     |   2 +
>  libavutil/frame.h      |  10 ++++
>  libavutil/version.h    |   2 +-
>  libavutil/video_hint.c |  82 +++++++++++++++++++++++++++++
>  libavutil/video_hint.h | 117 +++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 215 insertions(+), 1 deletion(-)
>  create mode 100644 libavutil/video_hint.c
>  create mode 100644 libavutil/video_hint.h
> 
> diff --git a/doc/APIchanges b/doc/APIchanges
> index 27f835cfce..0cda51fdee 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2023-02-09
>  
>  API changes, most recent first:
>  
> +2023-07-xx - xxxxxxxxxx - lavu 58.15.100 - video_hint.h
> +  Add AVVideoHint API.
> +
>  2023-07-05 - xxxxxxxxxx - lavu 58.14.100 - random_seed.h
>    Add av_random_bytes()
>  
> diff --git a/libavutil/Makefile b/libavutil/Makefile
> index bd9c6f9e32..7828c94dc5 100644
> --- a/libavutil/Makefile
> +++ b/libavutil/Makefile
> @@ -91,6 +91,7 @@ HEADERS = adler32.h                                                     \
>            tea.h                                                         \
>            tx.h                                                          \
>            film_grain_params.h                                           \
> +          video_hint.h
>  
>  ARCH_HEADERS = bswap.h                                                  \
>                 intmath.h                                                \
> @@ -181,6 +182,7 @@ OBJS = adler32.o                                                        \
>         uuid.o                                                           \
>         version.o                                                        \
>         video_enc_params.o                                               \
> +       video_hint.o                                                     \
>         film_grain_params.o                                              \
>  
>  
> diff --git a/libavutil/frame.h b/libavutil/frame.h
> index a491315f25..c0c1b23db7 100644
> --- a/libavutil/frame.h
> +++ b/libavutil/frame.h
> @@ -214,6 +214,16 @@ enum AVFrameSideDataType {
>       * Ambient viewing environment metadata, as defined by H.274.
>       */
>      AV_FRAME_DATA_AMBIENT_VIEWING_ENVIRONMENT,
> +
> +    /**
> +     * Provide encoder-specific hinting information about changed/unchanged
> +     * portions of a frame.  It can be used to pass information about which
> +     * macroblocks can be skipped because they didn't change from the
> +     * corresponding ones in the previous frame. This could be useful for
> +     * applications which know this information in advance to speed up
> +     * encoding.
> +     */
> +    AV_FRAME_DATA_VIDEO_HINT,
>  };
>  
>  enum AVActiveFormatDescription {
> diff --git a/libavutil/version.h b/libavutil/version.h
> index 24af520e08..9e798b0e3f 100644
> --- a/libavutil/version.h
> +++ b/libavutil/version.h
> @@ -79,7 +79,7 @@
>   */
>  
>  #define LIBAVUTIL_VERSION_MAJOR  58
> -#define LIBAVUTIL_VERSION_MINOR  14
> +#define LIBAVUTIL_VERSION_MINOR  15
>  #define LIBAVUTIL_VERSION_MICRO 100
>  
>  #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
> diff --git a/libavutil/video_hint.c b/libavutil/video_hint.c
> new file mode 100644
> index 0000000000..5730ed6cfb
> --- /dev/null
> +++ b/libavutil/video_hint.c
> @@ -0,0 +1,82 @@
> +/*
> + * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <string.h>
> +
> +#include "avstring.h"
> +#include "frame.h"
> +#include "macros.h"
> +#include "mem.h"
> +#include "video_hint.h"
> +
> +AVVideoHint *av_video_hint_alloc(size_t nb_rects,
> +                                 size_t *out_size)
> +{
> +    struct TestStruct {
> +        AVVideoHint    hint;
> +        AVVideoRect    rect;
> +    };
> +    const size_t rect_offset = offsetof(struct TestStruct, rect);
> +    size_t size = rect_offset;
> +    AVVideoHint *hint;
> +
> +    *out_size = 0;
> +    if (nb_rects > (SIZE_MAX - size) / sizeof(AVVideoRect))
> +        return NULL;
> +    size += sizeof(AVVideoRect) * nb_rects;
> +
> +    hint = av_mallocz(size);
> +    if (!hint)
> +        return NULL;
> +
> +    hint->nb_rects    = nb_rects;
> +    hint->rect_offset = rect_offset;
> +    hint->rect_size   = sizeof(AVVideoRect);
> +
> +    *out_size = size;
> +
> +    return hint;
> +}
> +
> +AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> +                                            size_t nb_rects)
> +{
> +    AVVideoHint *hint;
> +    AVBufferRef *buf;
> +    size_t size = 0;
> +
> +    hint = av_video_hint_alloc(nb_rects, &size);
> +    if (!hint)
> +        return NULL;
> +
> +    buf = av_buffer_create((uint8_t *)hint, size, NULL, NULL, 0);
> +    if (!buf) {
> +        av_freep(&hint);
> +        return NULL;
> +    }
> +
> +    if (!av_frame_new_side_data_from_buf(frame, AV_FRAME_DATA_VIDEO_HINT, buf)) {
> +        av_buffer_unref(&buf);
> +        return NULL;
> +    }
> +
> +    return hint;
> +}
> +
> diff --git a/libavutil/video_hint.h b/libavutil/video_hint.h
> new file mode 100644
> index 0000000000..14bfe20aa6
> --- /dev/null
> +++ b/libavutil/video_hint.h
> @@ -0,0 +1,117 @@
> +/**
> + * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_VIDEO_HINT_H
> +#define AVUTIL_VIDEO_HINT_H
> +
> +#include <stddef.h>
> +#include <stdint.h>
> +#include "libavutil/avassert.h"
> +#include "libavutil/frame.h"
> +

> +/**
> +  * The type of the hinting information provided by an AVVideoRect.
> +  * Currently only damage information, i.e., changed/unchanged portions of the frame.
> +  **/

> +typedef enum AVVideoRectType {
> +    AV_VIDEO_RECT_DAMAGE,

AV_VIDEO_RECT_TYPE_DAMAGE for consistency

> +} AVVideoRectType;

> +
> +typedef struct AVVideoRect {
> +    uint32_t x, y;
> +    uint32_t width, height;
> +    AVVideoRectType type;
> +} AVVideoRect;
> +
> +typedef enum AVVideoHintType {
> +    /* rectangled delimit the constant areas (unchanged), default is changed */

rectangles?

> +    AV_VIDEO_HINT_CONSTANT,

AV_VIDEO_HINT_TYPE_...

> +
> +    /* rectangled delimit the constant areas (changed), default is not changed */

rectangles?

> +    AV_VIDEO_HINT_CHANGED,
> +} AVVideoHintType;
>
> +typedef struct AVVideoHint {
> +    /**
> +     * Number of AVVideoRect present.
> +     *
> +     * May be 0, in which case no per-rectangle information is present. In this
> +     * case the values of rect_offset / rect_size are unspecified and should
> +     * not be accessed.
> +     */
> +    size_t nb_rects;
> +
> +    /**
> +     * Offset in bytes from the beginning of this structure at which the array
> +     * of AVVideoRect starts.
> +     */
> +    size_t rect_offset;
> +
> +    /**
> +     * Size in bytes of AVVideoRect.
> +     */
> +    size_t rect_size;
> +

> +    AVVideoHintType type;

sorry to nitpick, but if we have this information in the single rect
should not we use that instead?

Basically I see two options:
1. generic VideoHint, storing single hint type and rects

for example we might have type=changed, and the rects storing only the
changed rects

If we want to extend this then we need to add a new type. The problem
might be if we want to store more than one VideoHint in the side data
(it it possible to store more than one AVVideoHint, each one with a
different type, as side data?). Suppose for example we want to add a
new hint type, e.g. ROI, then we can add a new AVHideoHint with
type=roi - but I don't know if we can have several frame hint side
data (each one with different individual types).

2. generic VideoHint, storing rects each one containing the hint type

for example we might have a list of rects, each one with
type=changed|constant|roi. This would allow one to store different
kind of hints in the same AVVideoHint structure, but at the same time
would enable inconsistent data (for example we might have changed and
unchanged rects in the same list of rects)

In each case, storing the type in the generic AVVideHint and in each
rect might lead to inconsistent states (e.g. you might have generic
type=damage, and have type=roi and type=changed in the rects), in this
case having the type in both the general structure and in each rects
seems redundant and leads to possible data inconsistencies.

...

I would rather go to solution 1, but I'm not sure if having more than
one hint in the side data (with different subtypes) is possible and
wanted. If this is the case, probably the best solution would be to
store more than one AVVideoHint (each one with an individual type and
list of rects) in the single side data object.

> +} AVVideoHint;
> +
> +static av_always_inline AVVideoRect*
> +av_video_hint_rects(const AVVideoHint *hints)
> +{
> +    return (AVVideoRect *)((uint8_t *)hints + hints->rect_offset);
> +}
> +
> +static av_always_inline AVVideoRect*
> +av_video_hint_get_rect(const AVVideoHint *hints, size_t idx)
> +{
> +    return (AVVideoRect *)((uint8_t *)hints + hints->rect_offset + idx * hints->rect_size);
> +}
> +
> +/**
> + * Allocate memory for the AVVideoHint struct along with an nb_rects-sized
> + * arrays of AVVideoRect.
> + *
> + * The side data contains a list of rectangles for the portions of the frame
> + * which changed from the last encoded one (and the remainder are assumed to be
> + * changed), or, alternately (depending on the type parameter) the unchanged
> + * ones (and the remanining ones are those which changed).
> + * Macroblocks will thus be hinted either to be P_SKIP-ped or go through the
> + * regular encoding procedure.
> + *
> + * It's responsibility of the caller to fill the AVRects accordingly, and to set
> + * the proper AVVideoHintType field.
> + *
> + * @param out_size if non-NULL, the size in bytes of the resulting data array is
> + *                 written here
> + *
> + * @return newly allocated AVVideoHint struct (must be freed by the caller using
> + *         av_free()) on success, NULL on memory allocation failure
> + */
> +AVVideoHint *av_video_hint_alloc(size_t nb_rects,
> +                                 size_t *out_size);
> +/**
> + * Same as av_video_hint_alloc(), except newly-allocated AVVideoHint is attached
> + * as side data of type AV_FRAME_DATA_VIDEO_HINT_INFO to frame.
> + */
> +AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> +                                            size_t nb_rects);
> +
> +
> +#endif /* AVUTIL_VIDEO_HINT_H */
> -- 
> 2.34.1
> 

> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Stefano Sabatini July 23, 2023, 11:27 p.m. UTC | #6
On date Tuesday 2023-07-18 00:19:40 +0200, Stefano Sabatini wrote:
> On date Monday 2023-07-10 12:51:25 +0000, Carotti, Elias wrote:
> > On Mon, 2023-07-10 at 08:13 +0000, Carotti, Elias wrote:
> > > 
> > 
> > > AVVideoHint is a bad name for something like this.
> > > Could you borrow some wording from graphics and call it
> > > AVVideoDamagedHint or maybe AVVideoChangedAreaHint or a combination
> > > of both?
> > > I'd prefer the former, damage is standard language in graphics
> > > circles about what has changed since the last frame.
> > > 
> > > Hi,
> > > I have no strong objections on this. Personally I also like the
> > > AVVideoDamagedHint name best, my only concern is that it is strictly
> > > related to the current use/implementation 
> > > (it's true right now that's the only kind of hint) while it may turn
> > > out to be a bad naming decision should other forms of hinting for the
> > > encoder be added in the future.
> > > That said, I am fine with the change too.
> 
> AVVideoDamagedHint would be too specific (what if we want to add
> "hinting" for different things?).
> 
> > > Elias
> > > 
> > 
> > I added a type to the AVVideoRect struct. This should solve the naming
> > issue above while preserving the possibility to extend this to
> > different hinting types.
> > These are the only changes to Anton's version.
> > Best
> > Elias
> > 
> >  
> > 
> > 
> > 
> > 
> > 
> > NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
> > 
> > 
> 
> > From 8ef4f97410a6b78df048b71d9921a763da6255b3 Mon Sep 17 00:00:00 2001
> > From: Elias Carotti <eliascrt _at_ amazon _dot_ it>
> > Date: Mon, 10 Jul 2023 14:34:53 +0200
> > Subject: [PATCH] Add side data type to provide hint to the video encoders
> >  about unchanged portions of each frame.
> > 
> > Signed-off-by: Anton Khirnov <anton@khirnov.net>
> > ---
> >  doc/APIchanges         |   3 ++
> >  libavutil/Makefile     |   2 +
> >  libavutil/frame.h      |  10 ++++
> >  libavutil/version.h    |   2 +-
> >  libavutil/video_hint.c |  82 +++++++++++++++++++++++++++++
> >  libavutil/video_hint.h | 117 +++++++++++++++++++++++++++++++++++++++++
> >  6 files changed, 215 insertions(+), 1 deletion(-)
> >  create mode 100644 libavutil/video_hint.c
> >  create mode 100644 libavutil/video_hint.h
> > 
> > diff --git a/doc/APIchanges b/doc/APIchanges
> > index 27f835cfce..0cda51fdee 100644
> > --- a/doc/APIchanges
> > +++ b/doc/APIchanges
> > @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2023-02-09
> >  
> >  API changes, most recent first:
> >  
> > +2023-07-xx - xxxxxxxxxx - lavu 58.15.100 - video_hint.h
> > +  Add AVVideoHint API.
> > +
> >  2023-07-05 - xxxxxxxxxx - lavu 58.14.100 - random_seed.h
> >    Add av_random_bytes()
> >  
> > diff --git a/libavutil/Makefile b/libavutil/Makefile
> > index bd9c6f9e32..7828c94dc5 100644
> > --- a/libavutil/Makefile
> > +++ b/libavutil/Makefile
> > @@ -91,6 +91,7 @@ HEADERS = adler32.h                                                     \
> >            tea.h                                                         \
> >            tx.h                                                          \
> >            film_grain_params.h                                           \
> > +          video_hint.h
> >  
> >  ARCH_HEADERS = bswap.h                                                  \
> >                 intmath.h                                                \
> > @@ -181,6 +182,7 @@ OBJS = adler32.o                                                        \
> >         uuid.o                                                           \
> >         version.o                                                        \
> >         video_enc_params.o                                               \
> > +       video_hint.o                                                     \
> >         film_grain_params.o                                              \
> >  
> >  
> > diff --git a/libavutil/frame.h b/libavutil/frame.h
> > index a491315f25..c0c1b23db7 100644
> > --- a/libavutil/frame.h
> > +++ b/libavutil/frame.h
> > @@ -214,6 +214,16 @@ enum AVFrameSideDataType {
> >       * Ambient viewing environment metadata, as defined by H.274.
> >       */
> >      AV_FRAME_DATA_AMBIENT_VIEWING_ENVIRONMENT,
> > +
> > +    /**
> > +     * Provide encoder-specific hinting information about changed/unchanged
> > +     * portions of a frame.  It can be used to pass information about which
> > +     * macroblocks can be skipped because they didn't change from the
> > +     * corresponding ones in the previous frame. This could be useful for
> > +     * applications which know this information in advance to speed up
> > +     * encoding.
> > +     */
> > +    AV_FRAME_DATA_VIDEO_HINT,
> >  };
> >  
> >  enum AVActiveFormatDescription {
> > diff --git a/libavutil/version.h b/libavutil/version.h
> > index 24af520e08..9e798b0e3f 100644
> > --- a/libavutil/version.h
> > +++ b/libavutil/version.h
> > @@ -79,7 +79,7 @@
> >   */
> >  
> >  #define LIBAVUTIL_VERSION_MAJOR  58
> > -#define LIBAVUTIL_VERSION_MINOR  14
> > +#define LIBAVUTIL_VERSION_MINOR  15
> >  #define LIBAVUTIL_VERSION_MICRO 100
> >  
> >  #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
> > diff --git a/libavutil/video_hint.c b/libavutil/video_hint.c
> > new file mode 100644
> > index 0000000000..5730ed6cfb
> > --- /dev/null
> > +++ b/libavutil/video_hint.c
> > @@ -0,0 +1,82 @@
> > +/*
> > + * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +#include <string.h>
> > +
> > +#include "avstring.h"
> > +#include "frame.h"
> > +#include "macros.h"
> > +#include "mem.h"
> > +#include "video_hint.h"
> > +
> > +AVVideoHint *av_video_hint_alloc(size_t nb_rects,
> > +                                 size_t *out_size)
> > +{
> > +    struct TestStruct {
> > +        AVVideoHint    hint;
> > +        AVVideoRect    rect;
> > +    };
> > +    const size_t rect_offset = offsetof(struct TestStruct, rect);
> > +    size_t size = rect_offset;
> > +    AVVideoHint *hint;
> > +
> > +    *out_size = 0;
> > +    if (nb_rects > (SIZE_MAX - size) / sizeof(AVVideoRect))
> > +        return NULL;
> > +    size += sizeof(AVVideoRect) * nb_rects;
> > +
> > +    hint = av_mallocz(size);
> > +    if (!hint)
> > +        return NULL;
> > +
> > +    hint->nb_rects    = nb_rects;
> > +    hint->rect_offset = rect_offset;
> > +    hint->rect_size   = sizeof(AVVideoRect);
> > +
> > +    *out_size = size;
> > +
> > +    return hint;
> > +}
> > +
> > +AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> > +                                            size_t nb_rects)
> > +{
> > +    AVVideoHint *hint;
> > +    AVBufferRef *buf;
> > +    size_t size = 0;
> > +
> > +    hint = av_video_hint_alloc(nb_rects, &size);
> > +    if (!hint)
> > +        return NULL;
> > +
> > +    buf = av_buffer_create((uint8_t *)hint, size, NULL, NULL, 0);
> > +    if (!buf) {
> > +        av_freep(&hint);
> > +        return NULL;
> > +    }
> > +
> > +    if (!av_frame_new_side_data_from_buf(frame, AV_FRAME_DATA_VIDEO_HINT, buf)) {
> > +        av_buffer_unref(&buf);
> > +        return NULL;
> > +    }
> > +
> > +    return hint;
> > +}
> > +
> > diff --git a/libavutil/video_hint.h b/libavutil/video_hint.h
> > new file mode 100644
> > index 0000000000..14bfe20aa6
> > --- /dev/null
> > +++ b/libavutil/video_hint.h
> > @@ -0,0 +1,117 @@
> > +/**
> > + * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +#ifndef AVUTIL_VIDEO_HINT_H
> > +#define AVUTIL_VIDEO_HINT_H
> > +
> > +#include <stddef.h>
> > +#include <stdint.h>
> > +#include "libavutil/avassert.h"
> > +#include "libavutil/frame.h"
> > +
> 
> > +/**
> > +  * The type of the hinting information provided by an AVVideoRect.
> > +  * Currently only damage information, i.e., changed/unchanged portions of the frame.
> > +  **/
> 
> > +typedef enum AVVideoRectType {
> > +    AV_VIDEO_RECT_DAMAGE,
> 
> AV_VIDEO_RECT_TYPE_DAMAGE for consistency
> 
> > +} AVVideoRectType;
> 
> > +
> > +typedef struct AVVideoRect {
> > +    uint32_t x, y;
> > +    uint32_t width, height;
> > +    AVVideoRectType type;
> > +} AVVideoRect;
> > +
> > +typedef enum AVVideoHintType {
> > +    /* rectangled delimit the constant areas (unchanged), default is changed */
> 
> rectangles?
> 
> > +    AV_VIDEO_HINT_CONSTANT,
> 
> AV_VIDEO_HINT_TYPE_...
> 
> > +
> > +    /* rectangled delimit the constant areas (changed), default is not changed */
> 
> rectangles?
> 
> > +    AV_VIDEO_HINT_CHANGED,
> > +} AVVideoHintType;
> >
> > +typedef struct AVVideoHint {
> > +    /**
> > +     * Number of AVVideoRect present.
> > +     *
> > +     * May be 0, in which case no per-rectangle information is present. In this
> > +     * case the values of rect_offset / rect_size are unspecified and should
> > +     * not be accessed.
> > +     */
> > +    size_t nb_rects;
> > +
> > +    /**
> > +     * Offset in bytes from the beginning of this structure at which the array
> > +     * of AVVideoRect starts.
> > +     */
> > +    size_t rect_offset;
> > +
> > +    /**
> > +     * Size in bytes of AVVideoRect.
> > +     */
> > +    size_t rect_size;
> > +
> 
> > +    AVVideoHintType type;
> 
> sorry to nitpick, but if we have this information in the single rect
> should not we use that instead?
> 
> Basically I see two options:
> 1. generic VideoHint, storing single hint type and rects
> 
> for example we might have type=changed, and the rects storing only the
> changed rects
> 
> If we want to extend this then we need to add a new type. The problem
> might be if we want to store more than one VideoHint in the side data
> (it it possible to store more than one AVVideoHint, each one with a
> different type, as side data?). Suppose for example we want to add a
> new hint type, e.g. ROI, then we can add a new AVHideoHint with
> type=roi - but I don't know if we can have several frame hint side
> data (each one with different individual types).
> 
> 2. generic VideoHint, storing rects each one containing the hint type
> 
> for example we might have a list of rects, each one with
> type=changed|constant|roi. This would allow one to store different
> kind of hints in the same AVVideoHint structure, but at the same time
> would enable inconsistent data (for example we might have changed and
> unchanged rects in the same list of rects)
> 
> In each case, storing the type in the generic AVVideHint and in each
> rect might lead to inconsistent states (e.g. you might have generic
> type=damage, and have type=roi and type=changed in the rects), in this
> case having the type in both the general structure and in each rects
> seems redundant and leads to possible data inconsistencies.
> 
> ...
> 

> I would rather go to solution 1, but I'm not sure if having more than
> one hint in the side data (with different subtypes) is possible and
> wanted. If this is the case, probably the best solution would be to
> store more than one AVVideoHint (each one with an individual type and
> list of rects) in the single side data object.

Elaborating on this.

1. Ideally we might want to extend the av_frame_get_side_data API to
be able to store more than one entry per side data, by adding a unique
label to the side data, but this requires probably more effort and
extends the scope even further.

2. Alternatively, we might extend the hint API like this:

AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
                                            size_t nb_rects);

AVVideoHint *av_video_hint_extend_side_data(AVFrame *frame,
                                            size_t nb_rects);

by marking the continuation on the previous hint (but this would
complicate deserialization as you need then to iterate to get all the
side data).

Or we could make a variadic function like this:
AVVideoHint *av_video_hint_create_side_data(AVFrame *frame, size_t nb_rects, ...);

Alternatively, we might store in AVVideoHint more than one hint, but
this somehow conflict with the general design.

...

As a low effort approach, I would keep the initial design of a single
hint type per side-data (dropping the possibility of having more than
one hint type in the same side data), which should be fixed
generically by implementing 1.
Carotti, Elias July 26, 2023, 10:52 a.m. UTC | #7
On Mon, 2023-07-24 at 01:27 +0200, Stefano Sabatini wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender
> and know the content is safe.
> 
> 
<snip>

> > 
> > sorry to nitpick, but if we have this information in the single
> > rect
> > should not we use that instead?
> > 
> > Basically I see two options:
> > 1. generic VideoHint, storing single hint type and rects
> > 
> > for example we might have type=changed, and the rects storing only
> > the
> > changed rects
> > 
> > If we want to extend this then we need to add a new type. The
> > problem
> > might be if we want to store more than one VideoHint in the side
> > data
> > (it it possible to store more than one AVVideoHint, each one with a
> > different type, as side data?). Suppose for example we want to add
> > a
> > new hint type, e.g. ROI, then we can add a new AVHideoHint with
> > type=roi - but I don't know if we can have several frame hint side
> > data (each one with different individual types).
> > 
> > 2. generic VideoHint, storing rects each one containing the hint
> > type
> > 
> > for example we might have a list of rects, each one with
> > type=changed|constant|roi. This would allow one to store different
> > kind of hints in the same AVVideoHint structure, but at the same
> > time
> > would enable inconsistent data (for example we might have changed
> > and
> > unchanged rects in the same list of rects)
> > 

I don't think it allowed such inconsistency the way it was made.
Basically, the semantics was that the Hint type could be either
CONSTANT or CHANGED and that was the basis to interpret the rects with
type DAMAGE. 

> > In each case, storing the type in the generic AVVideHint and in
> > each
> > rect might lead to inconsistent states (e.g. you might have generic
> > type=damage, and have type=roi and type=changed in the rects), in
> > this
> > case having the type in both the general structure and in each
> > rects
> > seems redundant and leads to possible data inconsistencies.
> > 
> > ...
> > 
> 
> > I would rather go to solution 1, but I'm not sure if having more
> > than
> > one hint in the side data (with different subtypes) is possible and
> > wanted. If this is the case, probably the best solution would be to
> > store more than one AVVideoHint (each one with an individual type
> > and
> > list of rects) in the single side data object.
> 
> Elaborating on this.
> 
> 1. Ideally we might want to extend the av_frame_get_side_data API to
> be able to store more than one entry per side data, by adding a
> unique
> label to the side data, but this requires probably more effort and
> extends the scope even further.
> 
> 2. Alternatively, we might extend the hint API like this:
> 
> AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
>                                             size_t nb_rects);
> 
> AVVideoHint *av_video_hint_extend_side_data(AVFrame *frame,
>                                             size_t nb_rects);
> 
> by marking the continuation on the previous hint (but this would
> complicate deserialization as you need then to iterate to get all the
> side data).
> 
> Or we could make a variadic function like this:
> AVVideoHint *av_video_hint_create_side_data(AVFrame *frame, size_t
> nb_rects, ...);
> 
> Alternatively, we might store in AVVideoHint more than one hint, but
> this somehow conflict with the general design.
> 
> ...
> 
> As a low effort approach, I would keep the initial design of a single
> hint type per side-data (dropping the possibility of having more than
> one hint type in the same side data), which should be fixed
> generically by implementing 1.

Agreed.
For clarity I am attaching again the patch with Anton's changes just
rebased on the current master branch.
Anton, is this still ok to be merged?


Best, 
Elias







NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
Stefano Sabatini July 28, 2023, 7:44 a.m. UTC | #8
On date Wednesday 2023-07-26 10:52:57 +0000, Carotti, Elias wrote:
> On Mon, 2023-07-24 at 01:27 +0200, Stefano Sabatini wrote:
> > CAUTION: This email originated from outside of the organization. Do
> > not click links or open attachments unless you can confirm the sender
> > and know the content is safe.
> > 
> > 
> <snip>
> 
> > > 
> > > sorry to nitpick, but if we have this information in the single
> > > rect
> > > should not we use that instead?
> > > 
> > > Basically I see two options:
> > > 1. generic VideoHint, storing single hint type and rects
> > > 
> > > for example we might have type=changed, and the rects storing only
> > > the
> > > changed rects
> > > 
> > > If we want to extend this then we need to add a new type. The
> > > problem
> > > might be if we want to store more than one VideoHint in the side
> > > data
> > > (it it possible to store more than one AVVideoHint, each one with a
> > > different type, as side data?). Suppose for example we want to add
> > > a
> > > new hint type, e.g. ROI, then we can add a new AVHideoHint with
> > > type=roi - but I don't know if we can have several frame hint side
> > > data (each one with different individual types).
> > > 
> > > 2. generic VideoHint, storing rects each one containing the hint
> > > type
> > > 
> > > for example we might have a list of rects, each one with
> > > type=changed|constant|roi. This would allow one to store different
> > > kind of hints in the same AVVideoHint structure, but at the same
> > > time
> > > would enable inconsistent data (for example we might have changed
> > > and
> > > unchanged rects in the same list of rects)
> > > 
> 

> I don't think it allowed such inconsistency the way it was made.
> Basically, the semantics was that the Hint type could be either
> CONSTANT or CHANGED and that was the basis to interpret the rects with
> type DAMAGE. 

Yes, but then there is no guarantee that you'll have all rects with
the same subtype, and you need to iterate and check each one to
check the subtype (so you might have mixed constant/changed and
unclear behavior to apply in this case).

> > > In each case, storing the type in the generic AVVideHint and in
> > > each
> > > rect might lead to inconsistent states (e.g. you might have generic
> > > type=damage, and have type=roi and type=changed in the rects), in
> > > this
> > > case having the type in both the general structure and in each
> > > rects
> > > seems redundant and leads to possible data inconsistencies.
> > > 
> > > ...
> > > 
> > 
> > > I would rather go to solution 1, but I'm not sure if having more
> > > than
> > > one hint in the side data (with different subtypes) is possible and
> > > wanted. If this is the case, probably the best solution would be to
> > > store more than one AVVideoHint (each one with an individual type
> > > and
> > > list of rects) in the single side data object.
> > 
> > Elaborating on this.
> > 
> > 1. Ideally we might want to extend the av_frame_get_side_data API to
> > be able to store more than one entry per side data, by adding a
> > unique
> > label to the side data, but this requires probably more effort and
> > extends the scope even further.
> > 
> > 2. Alternatively, we might extend the hint API like this:
> > 
> > AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> >                                             size_t nb_rects);
> > 
> > AVVideoHint *av_video_hint_extend_side_data(AVFrame *frame,
> >                                             size_t nb_rects);
> > 
> > by marking the continuation on the previous hint (but this would
> > complicate deserialization as you need then to iterate to get all the
> > side data).
> > 
> > Or we could make a variadic function like this:
> > AVVideoHint *av_video_hint_create_side_data(AVFrame *frame, size_t
> > nb_rects, ...);
> > 
> > Alternatively, we might store in AVVideoHint more than one hint, but
> > this somehow conflict with the general design.
> > 
> > ...
> > 
> > As a low effort approach, I would keep the initial design of a single
> > hint type per side-data (dropping the possibility of having more than
> > one hint type in the same side data), which should be fixed
> > generically by implementing 1.
> 
> Agreed.
> For clarity I am attaching again the patch with Anton's changes just
> rebased on the current master branch.
> Anton, is this still ok to be merged?

I'm fine with the current patch, the behavior can then be extended
by extending side data with multiple-label for the same type.

Anton, if you are fine with it I would merge this one (and maybe get
included in the next release if we are not too late already).

Thanks.
Stefano Sabatini Aug. 2, 2023, 5:36 a.m. UTC | #9
On date Friday 2023-07-28 09:44:37 +0200, Stefano Sabatini wrote:
> On date Wednesday 2023-07-26 10:52:57 +0000, Carotti, Elias wrote:
> > On Mon, 2023-07-24 at 01:27 +0200, Stefano Sabatini wrote:
> > > CAUTION: This email originated from outside of the organization. Do
> > > not click links or open attachments unless you can confirm the sender
> > > and know the content is safe.
> > > 
> > > 
> > <snip>
> > 
> > > > 
> > > > sorry to nitpick, but if we have this information in the single
> > > > rect
> > > > should not we use that instead?
> > > > 
> > > > Basically I see two options:
> > > > 1. generic VideoHint, storing single hint type and rects
> > > > 
> > > > for example we might have type=changed, and the rects storing only
> > > > the
> > > > changed rects
> > > > 
> > > > If we want to extend this then we need to add a new type. The
> > > > problem
> > > > might be if we want to store more than one VideoHint in the side
> > > > data
> > > > (it it possible to store more than one AVVideoHint, each one with a
> > > > different type, as side data?). Suppose for example we want to add
> > > > a
> > > > new hint type, e.g. ROI, then we can add a new AVHideoHint with
> > > > type=roi - but I don't know if we can have several frame hint side
> > > > data (each one with different individual types).
> > > > 
> > > > 2. generic VideoHint, storing rects each one containing the hint
> > > > type
> > > > 
> > > > for example we might have a list of rects, each one with
> > > > type=changed|constant|roi. This would allow one to store different
> > > > kind of hints in the same AVVideoHint structure, but at the same
> > > > time
> > > > would enable inconsistent data (for example we might have changed
> > > > and
> > > > unchanged rects in the same list of rects)
> > > > 
> > 
> 
> > I don't think it allowed such inconsistency the way it was made.
> > Basically, the semantics was that the Hint type could be either
> > CONSTANT or CHANGED and that was the basis to interpret the rects with
> > type DAMAGE. 
> 
> Yes, but then there is no guarantee that you'll have all rects with
> the same subtype, and you need to iterate and check each one to
> check the subtype (so you might have mixed constant/changed and
> unclear behavior to apply in this case).
> 
> > > > In each case, storing the type in the generic AVVideHint and in
> > > > each
> > > > rect might lead to inconsistent states (e.g. you might have generic
> > > > type=damage, and have type=roi and type=changed in the rects), in
> > > > this
> > > > case having the type in both the general structure and in each
> > > > rects
> > > > seems redundant and leads to possible data inconsistencies.
> > > > 
> > > > ...
> > > > 
> > > 
> > > > I would rather go to solution 1, but I'm not sure if having more
> > > > than
> > > > one hint in the side data (with different subtypes) is possible and
> > > > wanted. If this is the case, probably the best solution would be to
> > > > store more than one AVVideoHint (each one with an individual type
> > > > and
> > > > list of rects) in the single side data object.
> > > 
> > > Elaborating on this.
> > > 
> > > 1. Ideally we might want to extend the av_frame_get_side_data API to
> > > be able to store more than one entry per side data, by adding a
> > > unique
> > > label to the side data, but this requires probably more effort and
> > > extends the scope even further.
> > > 
> > > 2. Alternatively, we might extend the hint API like this:
> > > 
> > > AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> > >                                             size_t nb_rects);
> > > 
> > > AVVideoHint *av_video_hint_extend_side_data(AVFrame *frame,
> > >                                             size_t nb_rects);
> > > 
> > > by marking the continuation on the previous hint (but this would
> > > complicate deserialization as you need then to iterate to get all the
> > > side data).
> > > 
> > > Or we could make a variadic function like this:
> > > AVVideoHint *av_video_hint_create_side_data(AVFrame *frame, size_t
> > > nb_rects, ...);
> > > 
> > > Alternatively, we might store in AVVideoHint more than one hint, but
> > > this somehow conflict with the general design.
> > > 
> > > ...
> > > 
> > > As a low effort approach, I would keep the initial design of a single
> > > hint type per side-data (dropping the possibility of having more than
> > > one hint type in the same side data), which should be fixed
> > > generically by implementing 1.
> > 
> > Agreed.
> > For clarity I am attaching again the patch with Anton's changes just
> > rebased on the current master branch.
> > Anton, is this still ok to be merged?
> 
> I'm fine with the current patch, the behavior can then be extended
> by extending side data with multiple-label for the same type.
> 
> Anton, if you are fine with it I would merge this one (and maybe get
> included in the next release if we are not too late already).

I'll push this in a few days unless there are more comments.
Stefano Sabatini Aug. 8, 2023, 8:16 a.m. UTC | #10
On date Wednesday 2023-08-02 07:36:43 +0200, Stefano Sabatini wrote:
> On date Friday 2023-07-28 09:44:37 +0200, Stefano Sabatini wrote:
> > On date Wednesday 2023-07-26 10:52:57 +0000, Carotti, Elias wrote:
[...]
> > > Agreed.
> > > For clarity I am attaching again the patch with Anton's changes just
> > > rebased on the current master branch.
> > > Anton, is this still ok to be merged?
> > 
> > I'm fine with the current patch, the behavior can then be extended
> > by extending side data with multiple-label for the same type.
> > 
> > Anton, if you are fine with it I would merge this one (and maybe get
> > included in the next release if we are not too late already).
> 
> I'll push this in a few days unless there are more comments.

Rebased and applied with a minor change (AV AV_VIDEO_HINT_* =>
AV_VIDEO_HINT_TYPE_*).

Thanks Elias for the contribution!
diff mbox series

Patch

diff --git a/doc/APIchanges b/doc/APIchanges
index 27f835cfce..0cda51fdee 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@  The last version increases of all libraries were on 2023-02-09
 
 API changes, most recent first:
 
+2023-07-xx - xxxxxxxxxx - lavu 58.15.100 - video_hint.h
+  Add AVVideoHint API.
+
 2023-07-05 - xxxxxxxxxx - lavu 58.14.100 - random_seed.h
   Add av_random_bytes()
 
diff --git a/libavutil/Makefile b/libavutil/Makefile
index bd9c6f9e32..7828c94dc5 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -91,6 +91,7 @@  HEADERS = adler32.h                                                     \
           tea.h                                                         \
           tx.h                                                          \
           film_grain_params.h                                           \
+          video_hint.h
 
 ARCH_HEADERS = bswap.h                                                  \
                intmath.h                                                \
@@ -181,6 +182,7 @@  OBJS = adler32.o                                                        \
        uuid.o                                                           \
        version.o                                                        \
        video_enc_params.o                                               \
+       video_hint.o                                                     \
        film_grain_params.o                                              \
 
 
diff --git a/libavutil/frame.h b/libavutil/frame.h
index a491315f25..c0c1b23db7 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -214,6 +214,16 @@  enum AVFrameSideDataType {
      * Ambient viewing environment metadata, as defined by H.274.
      */
     AV_FRAME_DATA_AMBIENT_VIEWING_ENVIRONMENT,
+
+    /**
+     * Provide encoder-specific hinting information about changed/unchanged
+     * portions of a frame.  It can be used to pass information about which
+     * macroblocks can be skipped because they didn't change from the
+     * corresponding ones in the previous frame. This could be useful for
+     * applications which know this information in advance to speed up
+     * encoding.
+     */
+    AV_FRAME_DATA_VIDEO_HINT,
 };
 
 enum AVActiveFormatDescription {
diff --git a/libavutil/version.h b/libavutil/version.h
index 24af520e08..9e798b0e3f 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@ 
  */
 
 #define LIBAVUTIL_VERSION_MAJOR  58
-#define LIBAVUTIL_VERSION_MINOR  14
+#define LIBAVUTIL_VERSION_MINOR  15
 #define LIBAVUTIL_VERSION_MICRO 100
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
diff --git a/libavutil/video_hint.c b/libavutil/video_hint.c
new file mode 100644
index 0000000000..5730ed6cfb
--- /dev/null
+++ b/libavutil/video_hint.c
@@ -0,0 +1,82 @@ 
+/*
+ * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <string.h>
+
+#include "avstring.h"
+#include "frame.h"
+#include "macros.h"
+#include "mem.h"
+#include "video_hint.h"
+
+AVVideoHint *av_video_hint_alloc(size_t nb_rects,
+                                 size_t *out_size)
+{
+    struct TestStruct {
+        AVVideoHint    hint;
+        AVVideoRect    rect;
+    };
+    const size_t rect_offset = offsetof(struct TestStruct, rect);
+    size_t size = rect_offset;
+    AVVideoHint *hint;
+
+    *out_size = 0;
+    if (nb_rects > (SIZE_MAX - size) / sizeof(AVVideoRect))
+        return NULL;
+    size += sizeof(AVVideoRect) * nb_rects;
+
+    hint = av_mallocz(size);
+    if (!hint)
+        return NULL;
+
+    hint->nb_rects    = nb_rects;
+    hint->rect_offset = rect_offset;
+    hint->rect_size   = sizeof(AVVideoRect);
+
+    *out_size = size;
+
+    return hint;
+}
+
+AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
+                                            size_t nb_rects)
+{
+    AVVideoHint *hint;
+    AVBufferRef *buf;
+    size_t size = 0;
+
+    hint = av_video_hint_alloc(nb_rects, &size);
+    if (!hint)
+        return NULL;
+
+    buf = av_buffer_create((uint8_t *)hint, size, NULL, NULL, 0);
+    if (!buf) {
+        av_freep(&hint);
+        return NULL;
+    }
+
+    if (!av_frame_new_side_data_from_buf(frame, AV_FRAME_DATA_VIDEO_HINT, buf)) {
+        av_buffer_unref(&buf);
+        return NULL;
+    }
+
+    return hint;
+}
+
diff --git a/libavutil/video_hint.h b/libavutil/video_hint.h
new file mode 100644
index 0000000000..86addf5ceb
--- /dev/null
+++ b/libavutil/video_hint.h
@@ -0,0 +1,108 @@ 
+/**
+ * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_VIDEO_HINT_H
+#define AVUTIL_VIDEO_HINT_H
+
+#include <stddef.h>
+#include <stdint.h>
+#include "libavutil/avassert.h"
+#include "libavutil/frame.h"
+
+typedef struct AVVideoRect {
+    uint32_t x, y;
+    uint32_t width, height;
+} AVVideoRect;
+
+typedef enum AVVideoHintType {
+    /* rectangled delimit the constant areas (unchanged), default is changed */
+    AV_VIDEO_HINT_CONSTANT,
+
+    /* rectangled delimit the constant areas (changed), default is not changed */
+    AV_VIDEO_HINT_CHANGED,
+} AVVideoHintType;
+
+typedef struct AVVideoHint {
+    /**
+     * Number of AVVideoRect present.
+     *
+     * May be 0, in which case no per-rectangle information is present. In this
+     * case the values of rect_offset / rect_size are unspecified and should
+     * not be accessed.
+     */
+    size_t nb_rects;
+
+    /**
+     * Offset in bytes from the beginning of this structure at which the array
+     * of AVVideoRect starts.
+     */
+    size_t rect_offset;
+
+    /**
+     * Size in bytes of AVVideoRect.
+     */
+    size_t rect_size;
+
+    AVVideoHintType type;
+} AVVideoHint;
+
+static av_always_inline AVVideoRect*
+av_video_hint_rects(const AVVideoHint *hints)
+{
+    return (AVVideoRect *)((uint8_t *)hints + hints->rect_offset);
+}
+
+static av_always_inline AVVideoRect*
+av_video_hint_get_rect(const AVVideoHint *hints, size_t idx)
+{
+    return (AVVideoRect *)((uint8_t *)hints + hints->rect_offset + idx * hints->rect_size);
+}
+
+/**
+ * Allocate memory for the AVVideoHint struct along with an nb_rects-sized
+ * arrays of AVVideoRect.
+ *
+ * The side data contains a list of rectangles for the portions of the frame
+ * which changed from the last encoded one (and the remainder are assumed to be
+ * changed), or, alternately (depending on the type parameter) the unchanged
+ * ones (and the remanining ones are those which changed).
+ * Macroblocks will thus be hinted either to be P_SKIP-ped or go through the
+ * regular encoding procedure.
+ *
+ * It's responsibility of the caller to fill the AVRects accordingly, and to set
+ * the proper AVVideoHintType field.
+ *
+ * @param out_size if non-NULL, the size in bytes of the resulting data array is
+ *                 written here
+ *
+ * @return newly allocated AVVideoHint struct (must be freed by the caller using
+ *         av_free()) on success, NULL on memory allocation failure
+ */
+AVVideoHint *av_video_hint_alloc(size_t nb_rects,
+                                 size_t *out_size);
+/**
+ * Same as av_video_hint_alloc(), except newly-allocated AVVideoHint is attached
+ * as side data of type AV_FRAME_DATA_VIDEO_HINT_INFO to frame.
+ */
+AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
+                                            size_t nb_rects);
+
+
+#endif /* AVUTIL_VIDEO_HINT_H */