[FFmpeg-devel,v2] avformat/movenc: support writing iTunes cover image

Submitted by Timo Teräs on April 14, 2018, 6:32 p.m.

Details

Message ID 20180414183211.12906-1-timo.teras@iki.fi
State Accepted
Commit 9af71b326fda1c6f32a26d465f7740110113e759
Headers show

Commit Message

Timo Teräs April 14, 2018, 6:32 p.m.
Fixes https://trac.ffmpeg.org/ticket/2798

This makes movenc handle AV_DISPOSITION_ATTACHED_PIC and write
the associated pictures in iTunes cover atom. This corresponds
to how 'mov' demuxer parses and exposes the cover images when
reading.

Most of the existing track handling loops properly ignore
these 'virtual streams' as MOVTrack->entry is never incremented
for them. However, additional tests are added as needed to ignore
them.

Tested to produce valid output with:
  ffmpeg -i movie.mp4 -i thumb.jpg -disposition:v:1 attached_pic \
         -map 0 -map 1 -c copy movie-with-cover.mp4

The cover image is also copied correctly with:
  ffmpeg -i movie-with-cover.mp4 -map 0 -c copy out.mp4

AtomicParseley says that the attached_pic stream is properly
not visible in the main tracks of the file.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
---
v2:
- Store the image in MOVTrack->cover_image instead of
  AVStream->attached_pic per review request

 libavformat/movenc.c | 88 +++++++++++++++++++++++++++++++++++++++++---
 libavformat/movenc.h |  1 +
 2 files changed, 84 insertions(+), 5 deletions(-)

Comments

James Almer April 16, 2018, 1:53 p.m.
On 4/14/2018 3:32 PM, Timo Teräs wrote:
> Fixes https://trac.ffmpeg.org/ticket/2798
> 
> This makes movenc handle AV_DISPOSITION_ATTACHED_PIC and write
> the associated pictures in iTunes cover atom. This corresponds
> to how 'mov' demuxer parses and exposes the cover images when
> reading.
> 
> Most of the existing track handling loops properly ignore
> these 'virtual streams' as MOVTrack->entry is never incremented
> for them. However, additional tests are added as needed to ignore
> them.
> 
> Tested to produce valid output with:
>   ffmpeg -i movie.mp4 -i thumb.jpg -disposition:v:1 attached_pic \
>          -map 0 -map 1 -c copy movie-with-cover.mp4
> 
> The cover image is also copied correctly with:
>   ffmpeg -i movie-with-cover.mp4 -map 0 -c copy out.mp4
> 
> AtomicParseley says that the attached_pic stream is properly
> not visible in the main tracks of the file.
> 
> Signed-off-by: Timo Teräs <timo.teras@iki.fi>
> ---
> v2:
> - Store the image in MOVTrack->cover_image instead of
>   AVStream->attached_pic per review request

This is failing when i try to mux a jpg as cover art into m4a (Ipod
muxer). It complains about missing codec tag for mjpeg. Is the covr atom
valid for that format?

Adding an entry to codec_ipod_tags[] may fix it, but that alone would
probably then allow non cover art video tracks using that codec, which i
assume is undesirable.
Timo Teräs April 16, 2018, 2:05 p.m.
On Mon, 16 Apr 2018 10:53:43 -0300
James Almer <jamrial@gmail.com> wrote:

> On 4/14/2018 3:32 PM, Timo Teräs wrote:
> > Fixes https://trac.ffmpeg.org/ticket/2798
> > 
> > This makes movenc handle AV_DISPOSITION_ATTACHED_PIC and write
> > the associated pictures in iTunes cover atom. This corresponds
> > to how 'mov' demuxer parses and exposes the cover images when
> > reading.
> > 
> > Most of the existing track handling loops properly ignore
> > these 'virtual streams' as MOVTrack->entry is never incremented
> > for them. However, additional tests are added as needed to ignore
> > them.
> > 
> > Tested to produce valid output with:
> >   ffmpeg -i movie.mp4 -i thumb.jpg -disposition:v:1 attached_pic \
> >          -map 0 -map 1 -c copy movie-with-cover.mp4
> > 
> > The cover image is also copied correctly with:
> >   ffmpeg -i movie-with-cover.mp4 -map 0 -c copy out.mp4
> > 
> > AtomicParseley says that the attached_pic stream is properly
> > not visible in the main tracks of the file.
> > 
> > Signed-off-by: Timo Teräs <timo.teras@iki.fi>
> > ---
> > v2:
> > - Store the image in MOVTrack->cover_image instead of
> >   AVStream->attached_pic per review request  
> 
> This is failing when i try to mux a jpg as cover art into m4a (Ipod
> muxer). It complains about missing codec tag for mjpeg. Is the covr
> atom valid for that format?

Yes, it's valid for m4a.

> Adding an entry to codec_ipod_tags[] may fix it, but that alone would
> probably then allow non cover art video tracks using that codec,
> which i assume is undesirable.

For audio formats, e.g. mp3, query codec returns MKTAG('A', 'P', 'I',
'C') to indicate that the image works only as attached picture.

But even for video formats, it would be nice if query codec can
properly tell which image formats it does based on disposition. The
codec list for cover images is different from generic video streams.

Any suggestions how to do that properly?

Timo
Rostislav Pehlivanov April 16, 2018, 11:01 p.m.
On 16 April 2018 at 15:05, Timo Teras <timo.teras@iki.fi> wrote:

> On Mon, 16 Apr 2018 10:53:43 -0300
> James Almer <jamrial@gmail.com> wrote:
>
> > On 4/14/2018 3:32 PM, Timo Teräs wrote:
> > > Fixes https://trac.ffmpeg.org/ticket/2798
> > >
> > > This makes movenc handle AV_DISPOSITION_ATTACHED_PIC and write
> > > the associated pictures in iTunes cover atom. This corresponds
> > > to how 'mov' demuxer parses and exposes the cover images when
> > > reading.
> > >
> > > Most of the existing track handling loops properly ignore
> > > these 'virtual streams' as MOVTrack->entry is never incremented
> > > for them. However, additional tests are added as needed to ignore
> > > them.
> > >
> > > Tested to produce valid output with:
> > >   ffmpeg -i movie.mp4 -i thumb.jpg -disposition:v:1 attached_pic \
> > >          -map 0 -map 1 -c copy movie-with-cover.mp4
> > >
> > > The cover image is also copied correctly with:
> > >   ffmpeg -i movie-with-cover.mp4 -map 0 -c copy out.mp4
> > >
> > > AtomicParseley says that the attached_pic stream is properly
> > > not visible in the main tracks of the file.
> > >
> > > Signed-off-by: Timo Teräs <timo.teras@iki.fi>
> > > ---
> > > v2:
> > > - Store the image in MOVTrack->cover_image instead of
> > >   AVStream->attached_pic per review request
> >
> > This is failing when i try to mux a jpg as cover art into m4a (Ipod
> > muxer). It complains about missing codec tag for mjpeg. Is the covr
> > atom valid for that format?
>
> Yes, it's valid for m4a.
>
> > Adding an entry to codec_ipod_tags[] may fix it, but that alone would
> > probably then allow non cover art video tracks using that codec,
> > which i assume is undesirable.
>
> For audio formats, e.g. mp3, query codec returns MKTAG('A', 'P', 'I',
> 'C') to indicate that the image works only as attached picture.
>
> But even for video formats, it would be nice if query codec can
> properly tell which image formats it does based on disposition. The
> codec list for cover images is different from generic video streams.
>
> Any suggestions how to do that properly?
>
> Timo
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

What's the problem with doing what mp3 does?
Timo Teräs April 17, 2018, 5:18 a.m.
On Tue, 17 Apr 2018 00:01:42 +0100
Rostislav Pehlivanov <atomnuker@gmail.com> wrote:

> On 16 April 2018 at 15:05, Timo Teras <timo.teras@iki.fi> wrote:
> 
> > On Mon, 16 Apr 2018 10:53:43 -0300
> > James Almer <jamrial@gmail.com> wrote:
> >  
> > > On 4/14/2018 3:32 PM, Timo Teräs wrote:  
> > > > This makes movenc handle AV_DISPOSITION_ATTACHED_PIC and write
> > > > the associated pictures in iTunes cover atom. This corresponds
> > > > to how 'mov' demuxer parses and exposes the cover images when
> > > > reading.
> > >
> > > This is failing when i try to mux a jpg as cover art into m4a
> > > (Ipod muxer). It complains about missing codec tag for mjpeg. Is
> > > the covr atom valid for that format?  
> >
> > Yes, it's valid for m4a.
> >  
> > > Adding an entry to codec_ipod_tags[] may fix it, but that alone
> > > would probably then allow non cover art video tracks using that
> > > codec, which i assume is undesirable.  
> >
> > For audio formats, e.g. mp3, query codec returns MKTAG('A', 'P',
> > 'I', 'C') to indicate that the image works only as attached picture.
> >
> > But even for video formats, it would be nice if query codec can
> > properly tell which image formats it does based on disposition. The
> > codec list for cover images is different from generic video streams.
> >
> > Any suggestions how to do that properly?
> 
> What's the problem with doing what mp3 does?

Because mp3 files take only audio. It can give universal reply that the
video codecs are valid for cover images only. Mp4/mov in general takes
also video. avformat_query_codec() or AVOutputFormat.query_codec() (if
we'd define it) does not have stream config info or it's disposition.

Ultimately the problem is that currently codec negotiation assumes the
acceptable codecs are same regardless of disposition. When mp3 started
to support this, a hack was added to query_codec() to return special
code to mean "only as cover art". But in mov/mp4 we cannot do that
because the same video codec can be used for a main stream too.

To me it sounds like we need to introduce new version of
avformat_query_codec() that gets the whole AVStream config, or at least
the stream's disposition information so we can return different codec
matches based on disposition. Or modify AVOutputFormat's codec_tag (or
add something new) that can specify the "attached_pic" video codecs
that get's used properly.

Since the above will likely be ABI change, perhaps the existing patch
can be taken as such, and the above be left a follow up fix. The patch
implements stuff correctly for video enabled mp4/mov, so as it is
now it's a big improvement or works correctly on the mp4/mov file
subtypes it happens to support.

Timo

Patch hide | download patch | download mbox

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index d03d7906a1..be5daa50c2 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -142,7 +142,9 @@  static int co64_required(const MOVTrack *track)
 
 static int rtp_hinting_needed(const AVStream *st)
 {
-    /* Add hint tracks for each audio and video stream */
+    /* Add hint tracks for each real audio and video stream */
+    if (st->disposition & AV_DISPOSITION_ATTACHED_PIC)
+        return 0;
     return st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO ||
            st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO;
 }
@@ -3420,6 +3422,51 @@  static int mov_write_int8_metadata(AVFormatContext *s, AVIOContext *pb,
     return size;
 }
 
+static int mov_write_covr(AVIOContext *pb, AVFormatContext *s)
+{
+    MOVMuxContext *mov = s->priv_data;
+    int64_t pos = 0;
+    int i, type;
+
+    for (i = 0; i < s->nb_streams; i++) {
+        MOVTrack *trk = &mov->tracks[i];
+        AVStream *st = s->streams[i];
+
+        if (!(st->disposition & AV_DISPOSITION_ATTACHED_PIC) ||
+            trk->cover_image.size <= 0)
+            continue;
+
+        switch (st->codecpar->codec_id) {
+        case AV_CODEC_ID_MJPEG:
+            type = 0xD;
+            break;
+        case AV_CODEC_ID_PNG:
+            type = 0xE;
+            break;
+        case AV_CODEC_ID_BMP:
+            type = 0x1B;
+            break;
+        default:
+            av_log(s, AV_LOG_ERROR, "unsupported codec_id (0x%x) for cover",
+                   st->codecpar->codec_id);
+            continue;
+        }
+
+        if (!pos) {
+            pos = avio_tell(pb);
+            avio_wb32(pb, 0);
+            ffio_wfourcc(pb, "covr");
+        }
+        avio_wb32(pb, 16 + trk->cover_image.size);
+        ffio_wfourcc(pb, "data");
+        avio_wb32(pb, type);
+        avio_wb32(pb , 0);
+        avio_write(pb, trk->cover_image.data, trk->cover_image.size);
+    }
+
+    return pos ? update_size(pb, pos) : 0;
+}
+
 /* iTunes meta data list */
 static int mov_write_ilst_tag(AVIOContext *pb, MOVMuxContext *mov,
                               AVFormatContext *s)
@@ -3454,6 +3501,7 @@  static int mov_write_ilst_tag(AVIOContext *pb, MOVMuxContext *mov,
     mov_write_int8_metadata  (s, pb, "hdvd",    "hd_video",  1);
     mov_write_int8_metadata  (s, pb, "pgap",    "gapless_playback",1);
     mov_write_int8_metadata  (s, pb, "cpil",    "compilation", 1);
+    mov_write_covr(pb, s);
     mov_write_trkn_tag(pb, mov, s, 0); // track number
     mov_write_trkn_tag(pb, mov, s, 1); // disc number
     mov_write_tmpo_tag(pb, s);
@@ -3951,6 +3999,8 @@  static int mov_write_isml_manifest(AVIOContext *pb, MOVMuxContext *mov, AVFormat
         } else {
             continue;
         }
+        if (st->disposition & AV_DISPOSITION_ATTACHED_PIC)
+            continue;
 
         props = (AVCPBProperties*)av_stream_get_side_data(track->st, AV_PKT_DATA_CPB_PROPERTIES, NULL);
 
@@ -4564,6 +4614,8 @@  static int mov_write_ftyp_tag(AVIOContext *pb, AVFormatContext *s)
 
     for (i = 0; i < s->nb_streams; i++) {
         AVStream *st = s->streams[i];
+        if (st->disposition & AV_DISPOSITION_ATTACHED_PIC)
+            continue;
         if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO)
             has_video = 1;
         if (st->codecpar->codec_id == AV_CODEC_ID_H264)
@@ -4712,6 +4764,8 @@  static int mov_write_identification(AVIOContext *pb, AVFormatContext *s)
         int video_streams_nb = 0, audio_streams_nb = 0, other_streams_nb = 0;
         for (i = 0; i < s->nb_streams; i++) {
             AVStream *st = s->streams[i];
+            if (st->disposition & AV_DISPOSITION_ATTACHED_PIC)
+                continue;
             if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO)
                 video_streams_nb++;
             else if (st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO)
@@ -4901,7 +4955,8 @@  static int mov_flush_fragment(AVFormatContext *s, int force)
         int buf_size, moov_size;
 
         for (i = 0; i < mov->nb_streams; i++)
-            if (!mov->tracks[i].entry)
+            if (!mov->tracks[i].entry &&
+                (i >= s->nb_streams || !(s->streams[i]->disposition & AV_DISPOSITION_ATTACHED_PIC)))
                 break;
         /* Don't write the initial moov unless all tracks have data */
         if (i < mov->nb_streams && !force)
@@ -5480,13 +5535,34 @@  static int mov_write_subtitle_end_packet(AVFormatContext *s,
 
 static int mov_write_packet(AVFormatContext *s, AVPacket *pkt)
 {
+    MOVMuxContext *mov = s->priv_data;
+    MOVTrack *trk;
+    AVStream *st;
+
     if (!pkt) {
         mov_flush_fragment(s, 1);
         return 1;
+    }
+
+    st = s->streams[pkt->stream_index];
+    trk = &mov->tracks[pkt->stream_index];
+
+    if (st->disposition & AV_DISPOSITION_ATTACHED_PIC) {
+        int ret;
+
+        if (st->nb_frames >= 1) {
+            if (st->nb_frames == 1)
+                av_log(s, AV_LOG_WARNING, "Got more than one picture in stream %d,"
+                       " ignoring.\n", pkt->stream_index);
+            return 0;
+        }
+
+        if ((ret = av_packet_ref(&trk->cover_image, pkt)) < 0)
+            return ret;
+
+        return 0;
     } else {
         int i;
-        MOVMuxContext *mov = s->priv_data;
-        MOVTrack *trk = &mov->tracks[pkt->stream_index];
 
         if (!pkt->size)
             return mov_write_single_packet(s, pkt); /* Passthrough. */
@@ -5733,7 +5809,8 @@  static void enable_tracks(AVFormatContext *s)
         AVStream *st = s->streams[i];
 
         if (st->codecpar->codec_type <= AVMEDIA_TYPE_UNKNOWN ||
-            st->codecpar->codec_type >= AVMEDIA_TYPE_NB)
+            st->codecpar->codec_type >= AVMEDIA_TYPE_NB ||
+            st->disposition & AV_DISPOSITION_ATTACHED_PIC)
             continue;
 
         if (first[st->codecpar->codec_type] < 0)
@@ -5776,6 +5853,7 @@  static void mov_free(AVFormatContext *s)
             av_freep(&mov->tracks[i].par);
         av_freep(&mov->tracks[i].cluster);
         av_freep(&mov->tracks[i].frag_info);
+        av_packet_unref(&mov->tracks[i].cover_image);
 
         if (mov->tracks[i].vos_len)
             av_freep(&mov->tracks[i].vos_data);
diff --git a/libavformat/movenc.h b/libavformat/movenc.h
index ca2a9c9722..c9b4072fb9 100644
--- a/libavformat/movenc.h
+++ b/libavformat/movenc.h
@@ -132,6 +132,7 @@  typedef struct MOVTrack {
     uint32_t    default_size;
 
     HintSampleQueue sample_queue;
+    AVPacket cover_image;
 
     AVIOContext *mdat_buf;
     int64_t     data_offset;