[FFmpeg-devel,2/4,v8] avformat/mov: add support for tile HEIF still images

Message ID 20240211185701.9327-2-jamrial@gmail.com
State New
Series [FFmpeg-devel,1/4,v10] avformat: add a Tile Grid stream group type

Checks

Context                          Check    Description
yinshiyou/make_loongarch64       success  Make finished
yinshiyou/make_fate_loongarch64  success  Make fate finished
andriy/make_x86                  success  Make finished
andriy/make_fate_x86             success  Make fate finished

Commit Message

James Almer Feb. 11, 2024, 6:56 p.m. UTC
Export tiles as streams, and the grid information as a Stream Group of type
TILE_GRID.
This also enables exporting other stream items like thumbnails, which may be
present in non tiled HEIF images too.

Based on a patch by Swaraj Hota

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavformat/avformat.c |   8 +
 libavformat/avformat.h |   6 +
 libavformat/dump.c     |   2 +
 libavformat/internal.h |   5 +
 libavformat/isom.h     |  16 +-
 libavformat/mov.c      | 492 +++++++++++++++++++++++++++++++++++++----
 6 files changed, 484 insertions(+), 45 deletions(-)
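
For readers following the thread, the row-major tile placement that
read_image_grid() performs in the patch can be sketched in isolation.
This is a minimal illustration under stated assumptions, not the code
from the patch: ExTile, ExOffset and ex_layout_tiles() are hypothetical
names, and the sketch assumes the tiles are listed in row-major order
and cover the coded canvas exactly.

```c
/* Hypothetical stand-ins for illustration; these are not FFmpeg types. */
typedef struct ExTile   { int width, height; } ExTile;
typedef struct ExOffset { int horizontal, vertical; } ExOffset;

/*
 * Walk the tiles in row-major order, advancing x across each row and
 * y by the height of the row's leftmost tile.
 * Returns 0 on success, -1 if the tiles do not fill a
 * coded_width x coded_height canvas exactly.
 */
static int ex_layout_tiles(const ExTile *tiles, int nb_tiles,
                           int coded_width, int coded_height,
                           ExOffset *offsets)
{
    int x = 0, y = 0, i = 0;

    if (coded_width <= 0 || coded_height <= 0)
        return -1;

    while (y < coded_height) {
        int left_col = i;

        while (x < coded_width) {
            /* Ran out of tiles before the row was complete. */
            if (i == nb_tiles)
                return -1;
            /* Reject degenerate tiles so both loops always advance. */
            if (tiles[i].width <= 0 || tiles[i].height <= 0)
                return -1;

            offsets[i].horizontal = x;
            offsets[i].vertical   = y;
            x += tiles[i++].width;
        }

        /* The row overshoots the canvas: tiles are not uniform. */
        if (x > coded_width)
            return -1;

        x  = 0;
        y += tiles[left_col].height;
    }

    return (y == coded_height && i == nb_tiles) ? 0 : -1;
}
```

With four 512x512 tiles and a 1024x1024 coded canvas, the function
places the tiles at (0,0), (512,0), (0,512) and (512,512). Returning an
error for rows that overshoot mirrors the "Non uniform HEIF tiles"
rejection in the patch, although the patch reports it through
avpriv_request_sample() rather than a bare -1.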

Comments

Michael Niedermayer Feb. 11, 2024, 8:41 p.m. UTC | #1
On Sun, Feb 11, 2024 at 03:56:59PM -0300, James Almer wrote:
> Export tiles as streams, and the grid information as a Stream Group of type
> TILE_GRID.
> This also enables exporting other stream items like thumbnails, which may be
> present in non tiled HEIF images too.
> 
> Based on a patch by Swaraj Hota
> 
> Signed-off-by: James Almer <jamrial@gmail.com>
> ---
>  libavformat/avformat.c |   8 +
>  libavformat/avformat.h |   6 +
>  libavformat/dump.c     |   2 +
>  libavformat/internal.h |   5 +
>  libavformat/isom.h     |  16 +-
>  libavformat/mov.c      | 492 +++++++++++++++++++++++++++++++++++++----
>  6 files changed, 484 insertions(+), 45 deletions(-)

git dislikes this:

git am -3

Applying: avformat/mov: add support for tile HEIF still images
error: sha1 information is lacking or useless (libavformat/avformat.c).
error: could not build fake ancestor
Patch failed at 0001 avformat/mov: add support for tile HEIF still images
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


[...]
James Almer Feb. 11, 2024, 9:08 p.m. UTC | #2
On 2/11/2024 5:41 PM, Michael Niedermayer wrote:
> git dislikes this:
> 
> git am -3
> 
> Applying: avformat/mov: add support for tile HEIF still images
> error: sha1 information is lacking or useless (libavformat/avformat.c).
> error: could not build fake ancestor
> Patch failed at 0001 avformat/mov: add support for tile HEIF still images
> Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".

I can't reproduce this. Tried to apply the patches as sent to the ml and 
they still apply.
James Almer Feb. 11, 2024, 9:10 p.m. UTC | #3
On 2/11/2024 6:08 PM, James Almer wrote:
> I can't reproduce this. Tried to apply the patches as sent to the ml and 
> they still apply.

https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=10765 shows no 
issues.
I also pushed it to https://github.com/jamrial/FFmpeg/commits/heif_new/ 
so you can test it.
Michael Niedermayer Feb. 11, 2024, 10:14 p.m. UTC | #4
On Sun, Feb 11, 2024 at 06:10:06PM -0300, James Almer wrote:
> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=10765 shows no
> issues.
> I also pushed it to https://github.com/jamrial/FFmpeg/commits/heif_new/ so
> you can test it.

It seems this was caused by previously applied patches.

thx

[...]
Anton Khirnov Feb. 16, 2024, 12:50 p.m. UTC | #5
Quoting James Almer (2024-02-11 19:56:59)
> +/**
> + * The video stream is intended to be merged with another stream before
> + * presentation.
> + * Used for example to signal the stream contains a tile from a HEIF grid.
> + */
> +#define AV_DISPOSITION_TILE                 (1 << 21)

The notion of "this stream needs to be combined with another one for
presentation" seems more general than just tiling video, could just as
well describe a set of audio tracks to be mixed together.

And since we're running out of easily usable disposition bits, we
shouldn't waste them. How about AV_DISPOSITION_SUBSTREAM?
James Almer Feb. 16, 2024, 4:35 p.m. UTC | #6
On 2/16/2024 9:50 AM, Anton Khirnov wrote:
> Quoting James Almer (2024-02-11 19:56:59)
>> +/**
>> + * The video stream is intended to be merged with another stream before
>> + * presentation.
>> + * Used for example to signal the stream contains a tile from a HEIF grid.
>> + */
>> +#define AV_DISPOSITION_TILE                 (1 << 21)
> 
> The notion of "this stream needs to be combined with another one for
> presentation" seems more general than just tiling video, could just as
> well describe a set of audio tracks to be mixed together.

That's why I stated it's for video. For audio there's 
AV_DISPOSITION_DEPENDENT.

> 
> And since we're running out of easily usable disposition bits, we
> shouldn't waste them. How about AV_DISPOSITION_SUBSTREAM?

Maybe we could redefine and reuse AV_DISPOSITION_DEPENDENT for this?
Anton Khirnov Feb. 16, 2024, 5:32 p.m. UTC | #7
Quoting James Almer (2024-02-16 17:35:13)
> > 
> > And since we're running out of easily usable disposition bits, we
> > shouldn't waste them. How about AV_DISPOSITION_SUBSTREAM?
> 
> Maybe we could redefine and reuse AV_DISPOSITION_DEPENDENT for this?

Works for me.
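
To make the concern about disposition bits concrete, here is a minimal
sketch of how such a flag is tested and how few bits remain above it.
The EX_ macros are hypothetical copies of the bit positions shown in
the posted hunk (bits 20 and 21); the helper names are made up for
illustration, and the sketch assumes the flags are stored in a plain
signed int, so the sign bit is not counted as usable. It uses the TILE
bit as posted, not the reuse of AV_DISPOSITION_DEPENDENT agreed on
above.

```c
#include <limits.h>

/* Hypothetical mirrors of the bit positions in the posted hunk. */
#define EX_DISPOSITION_STILL_IMAGE (1 << 20)
#define EX_DISPOSITION_TILE        (1 << 21)

/* Nonzero if the stream is meant to be merged before presentation. */
static int ex_needs_merging(int disposition)
{
    return (disposition & EX_DISPOSITION_TILE) != 0;
}

/*
 * Number of non-sign bits still free above highest_used_bit when the
 * flags are stored in an int.
 */
static int ex_remaining_flag_bits(int highest_used_bit)
{
    int value_bits = (int)(sizeof(int) * CHAR_BIT) - 1; /* skip sign bit */

    return value_bits - (highest_used_bit + 1);
}
```

A caller would check ex_needs_merging(st->disposition) before deciding
whether to present a stream on its own. Each new flag reduces the
result of ex_remaining_flag_bits() by one, which is the cost being
weighed in this exchange.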

Patch

diff --git a/libavformat/avformat.c b/libavformat/avformat.c
index f53ba4ce58..eb898223d2 100644
--- a/libavformat/avformat.c
+++ b/libavformat/avformat.c
@@ -119,6 +119,14 @@  void ff_remove_stream(AVFormatContext *s, AVStream *st)
     ff_free_stream(&s->streams[ --s->nb_streams ]);
 }
 
+void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg)
+{
+    av_assert0(s->nb_stream_groups > 0);
+    av_assert0(s->stream_groups[ s->nb_stream_groups - 1 ] == stg);
+
+    ff_free_stream_group(&s->stream_groups[ --s->nb_stream_groups ]);
+}
+
 /* XXX: suppress the packet queue */
 void ff_flush_packet_queue(AVFormatContext *s)
 {
diff --git a/libavformat/avformat.h b/libavformat/avformat.h
index 92751e5aee..27b052bad5 100644
--- a/libavformat/avformat.h
+++ b/libavformat/avformat.h
@@ -811,6 +811,12 @@  typedef struct AVIndexEntry {
  * The video stream contains still images.
  */
 #define AV_DISPOSITION_STILL_IMAGE          (1 << 20)
+/**
+ * The video stream is intended to be merged with another stream before
+ * presentation.
+ * Used for example to signal the stream contains a tile from a HEIF grid.
+ */
+#define AV_DISPOSITION_TILE                 (1 << 21)
 
 /**
  * @return The AV_DISPOSITION_* flag corresponding to disp or a negative error
diff --git a/libavformat/dump.c b/libavformat/dump.c
index 756db8c87e..6123ca58a5 100644
--- a/libavformat/dump.c
+++ b/libavformat/dump.c
@@ -657,6 +657,8 @@  static void dump_stream_format(const AVFormatContext *ic, int i,
         av_log(NULL, log_level, " (still image)");
     if (st->disposition & AV_DISPOSITION_NON_DIEGETIC)
         av_log(NULL, log_level, " (non-diegetic)");
+    if (st->disposition & AV_DISPOSITION_TILE)
+        av_log(NULL, log_level, " (tile)");
     av_log(NULL, log_level, "\n");
 
     dump_metadata(NULL, st->metadata, extra_indent, log_level);
diff --git a/libavformat/internal.h b/libavformat/internal.h
index c66f959e9f..5603ca1ab5 100644
--- a/libavformat/internal.h
+++ b/libavformat/internal.h
@@ -637,6 +637,11 @@  void ff_remove_stream(AVFormatContext *s, AVStream *st);
  * is not yet attached to an AVFormatContext.
  */
 void ff_free_stream_group(AVStreamGroup **pstg);
+/**
+ * Remove a stream group from its AVFormatContext and free it.
+ * The stream group must be the last stream group of the AVFormatContext.
+ */
+void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg);
 
 unsigned int ff_codec_get_tag(const AVCodecTag *tags, enum AVCodecID id);
 
diff --git a/libavformat/isom.h b/libavformat/isom.h
index a4cca4c798..d96722fe79 100644
--- a/libavformat/isom.h
+++ b/libavformat/isom.h
@@ -267,15 +267,25 @@  typedef struct MOVStreamContext {
 
 typedef struct HEIFItem {
     AVStream *st;
+    char *name;
     int item_id;
     int64_t extent_length;
     int64_t extent_offset;
-    int64_t size;
+    int tile_rows;
+    int tile_cols;
     int width;
     int height;
     int type;
+    int is_idat_relative;
 } HEIFItem;
 
+typedef struct HEIFGrid {
+    HEIFItem *item;
+    unsigned int *tile_idx_list;
+    int16_t *tile_id_list;
+    int nb_tiles;
+} HEIFGrid;
+
 typedef struct MOVContext {
     const AVClass *class; ///< class for private options
     AVFormatContext *fc;
@@ -339,6 +349,10 @@  typedef struct MOVContext {
     int cur_item_id;
     HEIFItem *heif_item;
     int nb_heif_item;
+    HEIFGrid *heif_grid;
+    int nb_heif_grid;
+    int thmb_item_id;
+    int64_t idat_offset;
     int interleaved_read;
 } MOVContext;
 
diff --git a/libavformat/mov.c b/libavformat/mov.c
index 42b0135987..23343c7ae2 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -185,6 +185,30 @@  static int mov_read_mac_string(MOVContext *c, AVIOContext *pb, int len,
     return p - dst;
 }
 
+static AVStream *get_curr_st(MOVContext *c)
+{
+    AVStream *st = NULL;
+
+    if (c->fc->nb_streams < 1)
+        return NULL;
+
+    for (int i = 0; i < c->nb_heif_item; i++) {
+        HEIFItem *item = &c->heif_item[i];
+
+        if (!item->st)
+            continue;
+        if (item->st->id != c->cur_item_id)
+            continue;
+
+        st = item->st;
+        break;
+    }
+    if (!st)
+        st = c->fc->streams[c->fc->nb_streams-1];
+
+    return st;
+}
+
 static int mov_read_covr(MOVContext *c, AVIOContext *pb, int type, int len)
 {
     AVStream *st;
@@ -1767,9 +1791,9 @@  static int mov_read_colr(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     uint16_t color_primaries, color_trc, color_matrix;
     int ret;
 
-    if (c->fc->nb_streams < 1)
+    st = get_curr_st(c);
+    if (!st)
         return 0;
-    st = c->fc->streams[c->fc->nb_streams - 1];
 
     ret = ffio_read_size(pb, color_parameter_type, 4);
     if (ret < 0)
@@ -2117,9 +2141,9 @@  static int mov_read_glbl(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     AVStream *st;
     int ret;
 
-    if (c->fc->nb_streams < 1)
+    st = get_curr_st(c);
+    if (!st)
         return 0;
-    st = c->fc->streams[c->fc->nb_streams-1];
 
     if ((uint64_t)atom.size > (1<<30))
         return AVERROR_INVALIDDATA;
@@ -4951,12 +4975,10 @@  static int heif_add_stream(MOVContext *c, HEIFItem *item)
     st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
     st->codecpar->codec_id = mov_codec_id(st, item->type);
     sc->ffindex = st->index;
-    c->trak_index = st->index;
     st->avg_frame_rate.num = st->avg_frame_rate.den = 1;
     st->time_base.num = st->time_base.den = 1;
     st->nb_frames = 1;
     sc->time_scale = 1;
-    sc = st->priv_data;
     sc->pb = c->fc->pb;
     sc->pb_is_copied = 1;
 
@@ -7809,11 +7831,18 @@  static int mov_read_pitm(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     return atom.size;
 }
 
+static int mov_read_idat(MOVContext *c, AVIOContext *pb, MOVAtom atom)
+{
+    c->idat_offset = avio_tell(pb);
+    return 0;
+}
+
 static int mov_read_iloc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 {
+    HEIFItem *heif_item;
     int version, offset_size, length_size, base_offset_size, index_size;
     int item_count, extent_count;
-    uint64_t base_offset, extent_offset, extent_length;
+    int64_t base_offset, extent_offset, extent_length;
     uint8_t value;
 
     if (c->found_moov) {
@@ -7832,17 +7861,20 @@  static int mov_read_iloc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     base_offset_size = (value >> 4) & 0xF;
     index_size = !version ? 0 : (value & 0xF);
     if (index_size) {
-        av_log(c->fc, AV_LOG_ERROR, "iloc: index_size != 0 not supported.\n");
+        avpriv_report_missing_feature(c->fc, "iloc: index_size != 0");
         return AVERROR_PATCHWELCOME;
     }
     item_count = (version < 2) ? avio_rb16(pb) : avio_rb32(pb);
 
-    if (!c->heif_item) {
-        c->heif_item = av_calloc(item_count, sizeof(*c->heif_item));
-        if (!c->heif_item)
-            return AVERROR(ENOMEM);
-        c->nb_heif_item = item_count;
-    }
+    heif_item = av_realloc_array(c->heif_item, item_count, sizeof(*c->heif_item));
+    if (!heif_item)
+        return AVERROR(ENOMEM);
+    c->heif_item = heif_item;
+    if (item_count > c->nb_heif_item)
+        memset(c->heif_item + c->nb_heif_item, 0,
+               sizeof(*c->heif_item) * (item_count - c->nb_heif_item));
+    c->nb_heif_item = FFMAX(c->nb_heif_item, item_count);
+    c->cur_item_id = 0;
 
     av_log(c->fc, AV_LOG_TRACE, "iloc: item_count %d\n", item_count);
     for (int i = 0; i < item_count; i++) {
@@ -7852,7 +7884,7 @@  static int mov_read_iloc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
         if (avio_feof(pb))
             return AVERROR_INVALIDDATA;
         if (offset_type > 1) {
-            avpriv_request_sample(c->fc, "iloc offset type %d", offset_type);
+            avpriv_report_missing_feature(c->fc, "iloc offset type %d", offset_type);
             return AVERROR_PATCHWELCOME;
         }
         c->heif_item[i].item_id = item_id;
@@ -7863,13 +7895,15 @@  static int mov_read_iloc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
         extent_count = avio_rb16(pb);
         if (extent_count > 1) {
             // For still AVIF images, we only support one extent item.
-            av_log(c->fc, AV_LOG_ERROR, "iloc: extent_count > 1 not supported\n");
+            avpriv_report_missing_feature(c->fc, "iloc: extent_count > 1");
             return AVERROR_PATCHWELCOME;
         }
         for (int j = 0; j < extent_count; j++) {
             if (rb_size(pb, &extent_offset, offset_size) < 0 ||
                 rb_size(pb, &extent_length, length_size) < 0)
                 return AVERROR_INVALIDDATA;
+            if (offset_type == 1)
+                c->heif_item[i].is_idat_relative = 1;
             c->heif_item[i].extent_length = extent_length;
             c->heif_item[i].extent_offset = base_offset + extent_offset;
             av_log(c->fc, AV_LOG_TRACE, "iloc: item_idx %d, offset_type %d, "
@@ -7884,7 +7918,7 @@  static int mov_read_iloc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 
 static int mov_read_infe(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 {
-    char item_name[128];
+    AVBPrint item_name;
     int64_t size = atom.size;
     uint32_t item_type;
     int item_id;
@@ -7894,27 +7928,32 @@  static int mov_read_infe(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     avio_rb24(pb);  // flags.
     size -= 4;
 
-    if (version != 2) {
-        av_log(c->fc, AV_LOG_ERROR, "infe: version != 2 not supported\n");
+    if (version < 2) {
+        av_log(c->fc, AV_LOG_ERROR, "infe: version < 2 not supported\n");
         return AVERROR_PATCHWELCOME;
     }
 
-    item_id = avio_rb16(pb);
+    item_id = version > 2 ? avio_rb32(pb) : avio_rb16(pb);
     avio_rb16(pb); // item_protection_index
     item_type = avio_rl32(pb);
     size -= 8;
-    size -= avio_get_str(pb, INT_MAX, item_name, sizeof(item_name));
 
-    av_log(c->fc, AV_LOG_TRACE, "infe: item_id %d, item_type %s, item_name %s\n",
-           item_id, av_fourcc2str(item_type), item_name);
+    av_bprint_init(&item_name, 0, AV_BPRINT_SIZE_UNLIMITED);
+    ret = ff_read_string_to_bprint_overwrite(pb, &item_name, size);
+    if (ret < 0) {
+        av_bprint_finalize(&item_name, NULL);
+        return ret;
+    }
 
-    // Skip all but the primary item until support is added
-    if (item_id != c->primary_item_id)
-        return 0;
+    av_log(c->fc, AV_LOG_TRACE, "infe: item_id %d, item_type %s, item_name %s\n",
+           item_id, av_fourcc2str(item_type), item_name.str);
 
+    size -= ret + 1;
     if (size > 0)
         avio_skip(pb, size);
 
+    if (ret)
+        av_bprint_finalize(&item_name, &c->heif_item[c->cur_item_id].name);
     c->heif_item[c->cur_item_id].item_id = item_id;
     c->heif_item[c->cur_item_id].type    = item_type;
 
@@ -7925,9 +7964,6 @@  static int mov_read_infe(MOVContext *c, AVIOContext *pb, MOVAtom atom)
         if (ret < 0)
             return ret;
         break;
-    default:
-        av_log(c->fc, AV_LOG_TRACE, "infe: ignoring item_type %s\n", av_fourcc2str(item_type));
-        break;
     }
 
     c->cur_item_id++;
@@ -7937,6 +7973,7 @@  static int mov_read_infe(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 
 static int mov_read_iinf(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 {
+    HEIFItem *heif_item;
     int entry_count;
     int version, ret;
 
@@ -7954,13 +7991,14 @@  static int mov_read_iinf(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     avio_rb24(pb);  // flags.
     entry_count = version ? avio_rb32(pb) : avio_rb16(pb);
 
-    if (!c->heif_item) {
-        c->heif_item = av_calloc(entry_count, sizeof(*c->heif_item));
-        if (!c->heif_item)
-            return AVERROR(ENOMEM);
-        c->nb_heif_item = entry_count;
-    }
-
+    heif_item = av_realloc_array(c->heif_item, entry_count, sizeof(*c->heif_item));
+    if (!heif_item)
+        return AVERROR(ENOMEM);
+    c->heif_item = heif_item;
+    if (entry_count > c->nb_heif_item)
+        memset(c->heif_item + c->nb_heif_item, 0,
+               sizeof(*c->heif_item) * (entry_count - c->nb_heif_item));
+    c->nb_heif_item = FFMAX(c->nb_heif_item, entry_count);
     c->cur_item_id = 0;
 
     for (int i = 0; i < entry_count; i++) {
@@ -7977,11 +8015,125 @@  static int mov_read_iinf(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     return 0;
 }
 
+static int mov_read_iref_dimg(MOVContext *c, AVIOContext *pb, int version)
+{
+    HEIFItem *item = NULL;
+    HEIFGrid *grid;
+    int entries, i;
+    int from_item_id = version ? avio_rb32(pb) : avio_rb16(pb);
+
+    for (int i = 0; i < c->nb_heif_grid; i++) {
+        if (c->heif_grid[i].item->item_id == from_item_id) {
+            av_log(c->fc, AV_LOG_ERROR, "More than one 'dimg' box "
+                                        "referencing the same 'grid'\n");
+            return AVERROR_INVALIDDATA;
+        }
+    }
+    for (int i = 0; i < c->nb_heif_item; i++) {
+        if (c->heif_item[i].item_id != from_item_id)
+            continue;
+        item = &c->heif_item[i];
+
+        switch (item->type) {
+        case MKTAG('g','r','i','d'):
+        case MKTAG('i','o','v','l'):
+            break;
+        default:
+            avpriv_report_missing_feature(c->fc, "Derived item of type %s",
+                                          av_fourcc2str(item->type));
+            return 0;
+        }
+        break;
+    }
+    if (!item) {
+        av_log(c->fc, AV_LOG_ERROR, "Missing grid information\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    grid = av_realloc_array(c->heif_grid, c->nb_heif_grid + 1U,
+                            sizeof(*c->heif_grid));
+    if (!grid)
+        return AVERROR(ENOMEM);
+    c->heif_grid = grid;
+    grid = &grid[c->nb_heif_grid++];
+
+    entries = avio_rb16(pb);
+    grid->tile_id_list = av_malloc_array(entries, sizeof(*grid->tile_id_list));
+    grid->tile_idx_list = av_calloc(entries, sizeof(*grid->tile_idx_list));
+    if (!grid->tile_id_list || !grid->tile_idx_list)
+        return AVERROR(ENOMEM);
+    /* 'to' item ids */
+    for (i = 0; i < entries; i++)
+        grid->tile_id_list[i] = version ? avio_rb32(pb) : avio_rb16(pb);
+    grid->nb_tiles = entries;
+    grid->item = item;
+
+    av_log(c->fc, AV_LOG_TRACE, "dimg: from_item_id %d, entries %d\n",
+           from_item_id, entries);
+
+    return 0;
+}
+
+static int mov_read_iref_thmb(MOVContext *c, AVIOContext *pb, int version)
+{
+    int entries;
+    int to_item_id, from_item_id = version ? avio_rb32(pb) : avio_rb16(pb);
+
+    entries = avio_rb16(pb);
+    if (entries > 1) {
+        avpriv_request_sample(c->fc, "thmb in iref referencing several items");
+        return AVERROR_PATCHWELCOME;
+    }
+    /* 'to' item ids */
+    to_item_id = version ? avio_rb32(pb) : avio_rb16(pb);
+
+    if (to_item_id != c->primary_item_id)
+        return 0;
+
+    c->thmb_item_id = from_item_id;
+
+    av_log(c->fc, AV_LOG_TRACE, "thmb: from_item_id %d, entries %d\n",
+           from_item_id, entries);
+
+    return 0;
+}
+
 static int mov_read_iref(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 {
-    avio_rb32(pb);  /* version and flags */
+    int version = avio_r8(pb);
+    avio_rb24(pb); // flags
     atom.size -= 4;
-    return mov_read_default(c, pb, atom);
+
+    if (version > 1) {
+        av_log(c->fc, AV_LOG_WARNING, "Unknown iref box version %d\n", version);
+        return 0;
+    }
+
+    while (atom.size) {
+        uint32_t type, size = avio_rb32(pb);
+        int64_t next = avio_tell(pb);
+
+        if (size < 14 || next < 0 || next > INT64_MAX - size)
+            return AVERROR_INVALIDDATA;
+
+        next += size - 4;
+        type = avio_rl32(pb);
+        switch (type) {
+        case MKTAG('d','i','m','g'):
+            mov_read_iref_dimg(c, pb, version);
+            break;
+        case MKTAG('t','h','m','b'):
+            mov_read_iref_thmb(c, pb, version);
+            break;
+        default:
+            av_log(c->fc, AV_LOG_DEBUG, "Unknown iref type %s size %"PRIu32"\n",
+                   av_fourcc2str(type), size);
+        }
+
+        atom.size -= size;
+        avio_seek(pb, next, SEEK_SET);
+    }
+    return 0;
 }
 
 static int mov_read_ispe(MOVContext *c, AVIOContext *pb, MOVAtom atom)
@@ -8104,10 +8256,6 @@  static int mov_read_iprp(MOVContext *c, AVIOContext *pb, MOVAtom atom)
             av_log(c->fc, AV_LOG_TRACE, "ipma: property_index %d, item_id %d, item_type %s\n",
                    index + 1, item_id, av_fourcc2str(ref->type));
 
-            // Skip properties referencing items other than the primary item until support is added
-            if (item_id != c->primary_item_id)
-                continue;
-
             c->cur_item_id = item_id;
 
             ret = mov_read_default(c, &ref->b.pub,
@@ -8236,6 +8384,7 @@  static const MOVParseTableEntry mov_default_parse_table[] = {
 { MKTAG('p','c','m','C'), mov_read_pcmc }, /* PCM configuration box */
 { MKTAG('p','i','t','m'), mov_read_pitm },
 { MKTAG('e','v','c','C'), mov_read_glbl },
+{ MKTAG('i','d','a','t'), mov_read_idat },
 { MKTAG('i','r','e','f'), mov_read_iref },
 { MKTAG('i','s','p','e'), mov_read_ispe },
 { MKTAG('i','p','r','p'), mov_read_iprp },
@@ -8745,7 +8894,14 @@  static int mov_read_close(AVFormatContext *s)
 
     av_freep(&mov->aes_decrypt);
     av_freep(&mov->chapter_tracks);
+    for (i = 0; i < mov->nb_heif_item; i++)
+        av_freep(&mov->heif_item[i].name);
     av_freep(&mov->heif_item);
+    for (i = 0; i < mov->nb_heif_grid; i++) {
+        av_freep(&mov->heif_grid[i].tile_id_list);
+        av_freep(&mov->heif_grid[i].tile_idx_list);
+    }
+    av_freep(&mov->heif_grid);
 
     return 0;
 }
@@ -8885,6 +9041,229 @@  fail:
     return ret;
 }
 
+static int read_image_grid(AVFormatContext *s, AVStreamGroup *stg,
+                           AVStreamGroupTileGrid *tile_grid, HEIFGrid *grid)
+{
+    MOVContext *c = s->priv_data;
+    HEIFItem *item = grid->item;
+    int64_t offset = 0, pos = avio_tell(s->pb);
+    int x = 0, y = 0, i = 0;
+    int flags, size;
+
+    if (!(s->pb->seekable & AVIO_SEEKABLE_NORMAL)) {
+        av_log(c->fc, AV_LOG_INFO, "grid box with non seekable input\n");
+        return AVERROR_PATCHWELCOME;
+    }
+    if (item->is_idat_relative) {
+        if (!c->idat_offset) {
+            av_log(c->fc, AV_LOG_ERROR, "missing idat box required by the image grid\n");
+            return AVERROR_INVALIDDATA;
+        }
+        offset = c->idat_offset;
+    }
+
+    avio_seek(s->pb, item->extent_offset + offset, SEEK_SET);
+
+    avio_r8(s->pb);    /* version */
+    flags = avio_r8(s->pb);
+
+    item->tile_rows = avio_r8(s->pb) + 1;
+    item->tile_cols = avio_r8(s->pb) + 1;
+    /* actual width and height of output image */
+    tile_grid->width  = (flags & 1) ? avio_rb32(s->pb) : avio_rb16(s->pb);
+    tile_grid->height = (flags & 1) ? avio_rb32(s->pb) : avio_rb16(s->pb);
+
+    av_log(c->fc, AV_LOG_TRACE, "grid: grid_rows %d grid_cols %d output_width %d output_height %d\n",
+           item->tile_rows, item->tile_cols, tile_grid->width, tile_grid->height);
+
+    avio_seek(s->pb, pos, SEEK_SET);
+
+    size = item->tile_rows * item->tile_cols;
+    for (int i = 0; i < item->tile_cols; i++)
+        tile_grid->coded_width  += stg->streams[i]->codecpar->width;
+    for (int i = 0; i < size; i += item->tile_cols)
+        tile_grid->coded_height += stg->streams[i]->codecpar->height;
+
+    tile_grid->offsets = av_calloc(tile_grid->nb_tiles, sizeof(*tile_grid->offsets));
+    if (!tile_grid->offsets)
+        return AVERROR(ENOMEM);
+
+    while (y < tile_grid->coded_height) {
+        int left_col = i;
+
+        while (x < tile_grid->coded_width) {
+            if (i == tile_grid->nb_tiles)
+                return AVERROR(EINVAL);
+
+            tile_grid->offsets[i].horizontal = x;
+            tile_grid->offsets[i].vertical   = y;
+
+            x += stg->streams[i++]->codecpar->width;
+        }
+
+        if (x > tile_grid->coded_width) {
+            avpriv_request_sample(s, "Non uniform HEIF tiles");
+            return AVERROR_PATCHWELCOME;
+        }
+
+        x  = 0;
+        y += stg->streams[left_col]->codecpar->height;
+    }
+
+    if (y > tile_grid->coded_height || i != tile_grid->nb_tiles) {
+        avpriv_request_sample(s, "Non uniform HEIF tiles");
+        return AVERROR_PATCHWELCOME;
+    }
+
+    return 0;
+}
+
+static int read_image_overlay(AVFormatContext *s, AVStreamGroupTileGrid *tile_grid,
+                              HEIFGrid *grid)
+{
+    MOVContext *c = s->priv_data;
+    HEIFItem *item = grid->item;
+    uint16_t canvas_fill_value[4];
+    int64_t offset = 0, pos = avio_tell(s->pb);
+    int ret = 0, flags;
+
+    if (!(s->pb->seekable & AVIO_SEEKABLE_NORMAL)) {
+        av_log(c->fc, AV_LOG_INFO, "iovl box with non seekable input\n");
+        return AVERROR_PATCHWELCOME;
+    }
+    if (item->is_idat_relative) {
+        if (!c->idat_offset) {
+            av_log(c->fc, AV_LOG_ERROR, "missing idat box required by the image overlay\n");
+            return AVERROR_INVALIDDATA;
+        }
+        offset = c->idat_offset;
+    }
+
+    avio_seek(s->pb, item->extent_offset + offset, SEEK_SET);
+
+    avio_r8(s->pb);    /* version */
+    flags = avio_r8(s->pb);
+
+    for (int i = 0; i < 4; i++)
+        canvas_fill_value[i] = avio_rb16(s->pb);
+    av_log(c->fc, AV_LOG_TRACE, "iovl: canvas_fill_value { %u, %u, %u, %u }\n",
+           canvas_fill_value[0], canvas_fill_value[1],
+           canvas_fill_value[2], canvas_fill_value[3]);
+    for (int i = 0; i < 4; i++)
+        tile_grid->background[i] = canvas_fill_value[i];
+
+    /* actual width and height of output image */
+    tile_grid->width        =
+    tile_grid->coded_width  = (flags & 1) ? avio_rb32(s->pb) : avio_rb16(s->pb);
+    tile_grid->height       =
+    tile_grid->coded_height = (flags & 1) ? avio_rb32(s->pb) : avio_rb16(s->pb);
+    av_log(c->fc, AV_LOG_TRACE, "iovl: output_width %d, output_height %d\n",
+           tile_grid->width, tile_grid->height);
+
+    tile_grid->offsets = av_malloc_array(tile_grid->nb_tiles, sizeof(*tile_grid->offsets));
+    if (!tile_grid->offsets) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    for (int i = 0; i < tile_grid->nb_tiles; i++) {
+        tile_grid->offsets[i].idx        = grid->tile_idx_list[i];
+        tile_grid->offsets[i].horizontal = (flags & 1) ? avio_rb32(s->pb) : avio_rb16(s->pb);
+        tile_grid->offsets[i].vertical   = (flags & 1) ? avio_rb32(s->pb) : avio_rb16(s->pb);
+        av_log(c->fc, AV_LOG_TRACE, "iovl: stream_idx[%d] %u, horizontal_offset[%d] %d, vertical_offset[%d] %d\n",
+               i, tile_grid->offsets[i].idx,
+               i, tile_grid->offsets[i].horizontal, i, tile_grid->offsets[i].vertical);
+    }
+
+fail:
+    avio_seek(s->pb, pos, SEEK_SET);
+
+    return ret;
+}
+
+static int mov_parse_tiles(AVFormatContext *s)
+{
+    MOVContext *mov = s->priv_data;
+
+    for (int i = 0; i < mov->nb_heif_grid; i++) {
+        AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_TILE_GRID, NULL);
+        AVStreamGroupTileGrid *tile_grid;
+        HEIFGrid *grid = &mov->heif_grid[i];
+        int err, loop = 1;
+
+        if (!stg)
+            return AVERROR(ENOMEM);
+
+        stg->id = grid->item->item_id;
+        tile_grid = stg->params.tile_grid;
+
+        for (int j = 0; j < grid->nb_tiles; j++) {
+            int tile_id = grid->tile_id_list[j];
+
+            for (int k = 0; k < mov->nb_heif_item; k++) {
+                const HEIFItem *item = &mov->heif_item[k];
+                AVStream *st = item->st;
+
+                if (item->item_id != tile_id)
+                    continue;
+                if (!st) {
+                    av_log(s, AV_LOG_WARNING, "HEIF item id %d from grid id %d doesn't "
+                                              "reference a stream\n",
+                           tile_id, grid->item->item_id);
+                    ff_remove_stream_group(s, stg);
+                    loop = 0;
+                    break;
+                }
+
+                err = avformat_stream_group_add_stream(stg, st);
+                if (err == AVERROR(EEXIST)) {
+                    unsigned int l;
+                    for (l = 0; l < stg->nb_streams; l++)
+                        if (stg->streams[l]->index == st->index)
+                            break;
+                    av_assert0(l < stg->nb_streams);
+                    grid->tile_idx_list[j] = l;
+                    break;
+                } else if (err < 0)
+                    return err;
+
+                grid->tile_idx_list[j] = stg->nb_streams - 1;
+                st->codecpar->width  = item->width;
+                st->codecpar->height = item->height;
+                st->disposition |= AV_DISPOSITION_TILE;
+                break;
+            }
+
+            if (!loop)
+                break;
+        }
+
+        if (!loop)
+            continue;
+
+        tile_grid->nb_tiles = grid->nb_tiles;
+
+        switch (grid->item->type) {
+        case MKTAG('g','r','i','d'):
+            err = read_image_grid(s, stg, tile_grid, grid);
+            break;
+        case MKTAG('i','o','v','l'):
+            err = read_image_overlay(s, tile_grid, grid);
+            break;
+        default:
+            av_assert0(0);
+        }
+        if (err < 0)
+            return err;
+
+
+        if (grid->item->name)
+            av_dict_set(&stg->metadata, "title", grid->item->name, 0);
+    }
+
+    return 0;
+}
+
 static int mov_read_header(AVFormatContext *s)
 {
     MOVContext *mov = s->priv_data;
@@ -8901,6 +9280,8 @@  static int mov_read_header(AVFormatContext *s)
 
     mov->fc = s;
     mov->trak_index = -1;
+    mov->thmb_item_id = -1;
+    mov->primary_item_id = -1;
     /* .mov and .mp4 aren't streamable anyway (only progressive download if moov is before mdat) */
     if (pb->seekable & AVIO_SEEKABLE_NORMAL)
         atom.size = avio_size(pb);
@@ -8923,20 +9304,43 @@  static int mov_read_header(AVFormatContext *s)
     av_log(mov->fc, AV_LOG_TRACE, "on_parse_exit_offset=%"PRId64"\n", avio_tell(pb));
 
     if (mov->found_iloc) {
+        if (mov->nb_heif_grid) {
+            err = mov_parse_tiles(s);
+            if (err < 0)
+                return err;
+        }
+
         for (i = 0; i < mov->nb_heif_item; i++) {
             HEIFItem *item = &mov->heif_item[i];
             MOVStreamContext *sc;
             AVStream *st;
+            int64_t offset = 0;
 
-            if (!item->st)
+            if (!item->st) {
+                if (item->item_id == mov->thmb_item_id) {
+                    av_log(s, AV_LOG_ERROR, "HEIF thumbnail doesn't reference a stream\n");
+                    return AVERROR_INVALIDDATA;
+                }
                 continue;
+            }
+            if (item->is_idat_relative) {
+                if (!mov->idat_offset) {
+                    av_log(s, AV_LOG_ERROR, "Missing idat box for item %d\n", item->item_id);
+                    return AVERROR_INVALIDDATA;
+                }
+                offset = mov->idat_offset;
+            }
 
             st = item->st;
             sc = st->priv_data;
             st->codecpar->width  = item->width;
             st->codecpar->height = item->height;
+
             sc->sample_sizes[0]  = item->extent_length;
-            sc->chunk_offsets[0] = item->extent_offset;
+            sc->chunk_offsets[0] = item->extent_offset + offset;
+
+            if (item->item_id == mov->primary_item_id)
+                st->disposition |= AV_DISPOSITION_DEFAULT;
 
             mov_build_index(mov, st);
         }