[FFmpeg-devel] avformat/mov: parse 3gpp titl from media or track udta

Message ID 20210721172809.22974-1-jeebjp@gmail.com
State New
Series [FFmpeg-devel] avformat/mov: parse 3gpp titl from media or track udta

Checks

Context                  Check    Description
andriy/x86_make          success  Make finished
andriy/x86_make_fate     success  Make fate finished
andriy/PPC64_make        success  Make finished
andriy/PPC64_make_fate   success  Make fate finished

Commit Message

Jan Ekström July 21, 2021, 5:28 p.m. UTC
Seems to be used by HandBrake for track titling, and the
specification actually defines it as "title for the media".

Definition from 3GPP TS 26.244 follows:

Field               Type                Details                             Value
BoxHeader.Size      Unsigned int(32)                                        BOX_SIZE
BoxHeader.Type      Unsigned int(32)                                        'titl'
BoxHeader.Version   Unsigned int(8)                                         0
BoxHeader.Flags     Bit(24)                                                 0
Pad                 Bit(1)                                                  0
Language            Unsigned int(5)[3]  Packed ISO-639-2/T language code
Title               String              Text of title

Semantics:

Language: declares the language code for the following text. See
ISO 639-2/T for the set of three character codes. Each character
is packed as the difference between its ASCII value and 0x60.

The code is confined to being three lower-case letters, so these
values are strictly positive.

Title: null-terminated string in either UTF-8 or UTF-16 characters,
giving a title information. If UTF-16 is used, the string shall
start with the BYTE ORDER MARK (0xFEFF).
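
As a rough illustration of the semantics above, the following sketch unpacks
the 16-bit language field and classifies the title encoding from its first
two bytes. The names titl_unpack_language and titl_detect_encoding are
hypothetical helpers made up for this example; they are not FFmpeg API (the
patch itself relies on ff_mov_lang_to_iso639). Treating a byte-swapped BOM
(0xFFFE) as little-endian UTF-16 is an assumption that goes beyond the quoted
specification text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: unpack the 16-bit 'titl' language field.
 * Layout: 1 pad bit, then three 5-bit values, each (ASCII letter - 0x60).
 * Returns 0 on success, -1 if a value is outside 'a'..'z'. */
static int titl_unpack_language(uint16_t packed, char out[4])
{
    assert(out != NULL);
    packed &= 0x7fff;                         /* drop the pad bit */
    for (int i = 0; i < 3; i++) {
        unsigned v = (packed >> (10 - 5 * i)) & 0x1f;
        if (v < 1 || v > 26)                  /* must map to a lower-case letter */
            return -1;
        out[i] = (char)(v + 0x60);
    }
    out[3] = '\0';
    return 0;
}

enum titl_encoding { TITL_UTF8, TITL_UTF16BE, TITL_UTF16LE };

/* Hypothetical helper: classify the title string by its leading bytes.
 * The bytes 0xFE and 0xFF never occur in valid UTF-8, so a BOM cannot be
 * mistaken for the start of a UTF-8 title. */
static enum titl_encoding titl_detect_encoding(const uint8_t *s, size_t len)
{
    assert(s != NULL);
    if (len >= 2) {
        if (s[0] == 0xfe && s[1] == 0xff)
            return TITL_UTF16BE;
        if (s[0] == 0xff && s[1] == 0xfe)     /* assumption: swapped BOM means LE */
            return TITL_UTF16LE;
    }
    return TITL_UTF8;
}
```

For instance, "eng" packs as (5 << 10) | (14 << 5) | 7 = 0x15C7, and "und"
packs as 0x55C4.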
---
 libavformat/mov.c | 107 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

Comments

Jan Ekström July 21, 2021, 5:30 p.m. UTC | #1
On Wed, Jul 21, 2021 at 8:28 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> Seems to be:
> * Utilized by Handbrake for track titling
> * Actually defined as "title for the media"
>
> Definition from 3GPP TS 26.244 follows:
>
> Field               Type                Details                             Value
> BoxHeader.Size      Unsigned int(32)                                        BOX_SIZE
> BoxHeader.Type      Unsigned int(32)                                        'titl'
> BoxHeader.Version   Unsigned int(8)                                         0
> BoxHeader.Flags     Bit(24)                                                 0
> Pad                 Bit(1)                                                  0
> Language            Unsigned int(5)[3]  Packed ISO-639-2/T language code
> Title               String              Text of title
>
> Semantics:
>
> Language: declares the language code for the following text. See
> ISO 639-2/T for the set of three character codes. Each character
> is packed as the difference between its ASCII value and 0x60.
>
> The code is confined to being three lower-case letters, so these
> values are strictly positive.
>
> Title: null-terminated string in either UTF-8 or UTF-16 characters,
> giving a title information. If UTF-16 is used, the string shall
> start with the BYTE ORDER MARK (0xFEFF).
> ---

A sample for this sort of metadata can be seen with
https://0x0.st/-zjq.m4v , which was posted at
https://github.com/mpv-player/mpv/issues/8488 .

The sample contains both "name" and "titl" boxes:
[udta: User Data Box]
    position = 3991500
    size = 71
    [name]
        position = 3991508
        size = 28
    [titl]
        position = 3991536
        size = 35

...of which, if I read the QTFF documentation correctly, "name" should
not be used for user-facing naming, while "titl" is actually a
user-facing metadata box. Thus I implemented the latter.
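
The offsets in the listing are consistent with a plain 8-byte box header:
3991500 + 8 = 3991508 for "name", 3991508 + 28 = 3991536 for "titl", and
8 + 28 + 35 = 71 for the whole "udta". A minimal sketch of walking such a
container, assuming the box is fully in memory and uses only 32-bit sizes
(struct box_info, rb32 and list_child_boxes are hypothetical names for this
example, not FFmpeg functions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct box_info {
    size_t   offset;   /* offset of the child header inside the parent */
    uint32_t size;     /* total child size, header included */
    char     type[5];  /* four-character code, NUL-terminated */
};

/* Read a big-endian 32-bit value. */
static uint32_t rb32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* List the direct children of the container box starting at buf[0].
 * Sizes 0 ("to end of file") and 1 (64-bit size) are rejected here.
 * Returns the number of children stored in out, or -1 on malformed input. */
static int list_child_boxes(const uint8_t *buf, size_t len,
                            struct box_info *out, int max_out)
{
    assert(buf != NULL && out != NULL);
    if (len < 8)
        return -1;

    uint32_t parent_size = rb32(buf);
    if (parent_size < 8 || parent_size > len)
        return -1;

    size_t pos = 8;                      /* skip the parent's size + type */
    int n = 0;
    while (pos + 8 <= parent_size && n < max_out) {
        uint32_t size = rb32(buf + pos);
        if (size < 8 || size > parent_size - pos)
            return -1;                   /* child would overrun its parent */
        out[n].offset = pos;
        out[n].size   = size;
        memcpy(out[n].type, buf + pos + 4, 4);
        out[n].type[4] = '\0';
        n++;
        pos += size;
    }
    return n;
}
```

Fed a 71-byte "udta" laid out like the sample, this should report "name" at
relative offset 8 and "titl" at relative offset 36.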

Jan

Jan Ekström July 24, 2021, 6:48 p.m. UTC | #2
On Wed, Jul 21, 2021 at 8:30 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> On Wed, Jul 21, 2021 at 8:28 PM Jan Ekström <jeebjp@gmail.com> wrote:
> >
> > Seems to be:
> > * Utilized by Handbrake for track titling
> > * Actually defined as "title for the media"
> >
> > Definition from 3GPP TS 26.244 follows:
> >
> > Field               Type                Details                             Value
> > BoxHeader.Size      Unsigned int(32)                                        BOX_SIZE
> > BoxHeader.Type      Unsigned int(32)                                        'titl'
> > BoxHeader.Version   Unsigned int(8)                                         0
> > BoxHeader.Flags     Bit(24)                                                 0
> > Pad                 Bit(1)                                                  0
> > Language            Unsigned int(5)[3]  Packed ISO-639-2/T language code
> > Title               String              Text of title
> >
> > Semantics:
> >
> > Language: declares the language code for the following text. See
> > ISO 639-2/T for the set of three character codes. Each character
> > is packed as the difference between its ASCII value and 0x60.
> >
> > The code is confined to being three lower-case letters, so these
> > values are strictly positive.
> >
> > Title: null-terminated string in either UTF-8 or UTF-16 characters,
> > giving a title information. If UTF-16 is used, the string shall
> > start with the BYTE ORDER MARK (0xFEFF).
> > ---
>
> A sample for this sort of metadata can be seen with
> https://0x0.st/-zjq.m4v , which was posted at
> https://github.com/mpv-player/mpv/issues/8488 .
>
> The sample contains both "name" and "titl" boxes:
> [udta: User Data Box]
>     position = 3991500
>     size = 71
>     [name]
>         position = 3991508
>         size = 28
>     [titl]
>         position = 3991536
>         size = 35
>
> ...out of which if I read QTFF documentation correctly "name" should
> not be utilized for user-facing naming, and "titl" is actually a
> user-facing metadata box. Thus I implemented the latter.
>
> Jan

Ping.

Includes a link to a testable file, and the specification of the box
is in the commit message :)

Jan
Patch

diff --git a/libavformat/mov.c b/libavformat/mov.c
index 040babed95..9edb3d6596 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -291,6 +291,111 @@  static int mov_metadata_hmmt(MOVContext *c, AVIOContext *pb, unsigned len)
     return 0;
 }
 
+// 3GPP TS 26.244, 8.2 3GPP asset meta data
+static int mov_metadata_titl(MOVContext *c, AVIOContext *pb, unsigned len)
+{
+    AVFormatContext *s = c->fc;
+    int version = -1, ret = AVERROR_BUG;
+    unsigned left_bytes = len, langcode = 0, flags = 100, bom = 0, buf_size = 0;
+    char language[4] = { 0 };
+    AVStream *st = NULL;
+    char *title_buf = NULL;
+    const char key[] = "title";
+
+    // 4 byte FullBox header, 2 byte lang. code, at least 1 byte for string
+    if (len < 4 + 2 + 1) {
+        av_log(s, AV_LOG_ERROR, "3GPP titl box too short!\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    if (s->nb_streams >= 1)
+        st = s->streams[s->nb_streams-1];
+
+    // FullBox header
+    version = avio_r8(pb);
+    flags   = avio_rb24(pb);
+    left_bytes -= 4;
+
+    if (version != 0 || flags != 0) {
+        av_log(s, AV_LOG_ERROR,
+               "Invalid nonzero version (%d) or flags (%x) for 3GPP titl!\n",
+               version, flags);
+        return AVERROR_INVALIDDATA;
+    }
+
+    langcode = avio_rb16(pb) & ~(1 << 15);
+    if ((ret = ff_mov_lang_to_iso639(langcode, language)) < 0) {
+        av_log(s, AV_LOG_ERROR,
+               "Failed to parse 3GPP titl language code %x: %s!\n",
+               langcode, av_err2str(ret));
+        return ret;
+    }
+
+    left_bytes -= 2;
+
+    if (left_bytes <= 1)
+        // no contents (just null)
+        return 0;
+
+    buf_size = left_bytes + 1;
+    if (!(title_buf = av_mallocz(buf_size))) {
+        av_log(s, AV_LOG_ERROR,
+               "Could not allocate buffer of length %u for parsed 3GPP titl "
+               "title string!\n",
+               buf_size);
+        return AVERROR(ENOMEM);
+    }
+
+    bom = avio_rb16(pb);
+    left_bytes -= 2;
+
+    if (bom == 0xfeff)
+        avio_get_str16be(pb, left_bytes, title_buf, buf_size);
+    else if (bom == 0xfffe)
+        avio_get_str16le(pb, left_bytes, title_buf, buf_size);
+    else {
+        AV_WB16(title_buf, bom);
+        if (!left_bytes)
+            title_buf[2] = 0;
+        else
+            avio_get_str(pb, left_bytes, title_buf + 2, buf_size - 2);
+    }
+
+    av_log(s, AV_LOG_TRACE, "%s TitlBox(lang: %s, title: %s)\n",
+           st ? "track" : "media",
+           language, title_buf);
+
+    s->event_flags |= AVFMT_EVENT_FLAG_METADATA_UPDATED;
+
+    if (*language && strcmp(language, "und")) {
+        char lang_key[sizeof(key) + 1 + sizeof(language)] = { 0 };
+        snprintf(lang_key, sizeof(lang_key), "%s-%s", key, language);
+
+        if ((ret = av_dict_set(st ? &st->metadata : &s->metadata,
+                               lang_key, title_buf, 0)) < 0) {
+            av_log(s, AV_LOG_ERROR,
+                   "Failed to set %s metadata key %s to value %s: %s!\n",
+                   st ? "track" : "media",
+                   lang_key, title_buf,
+                   av_err2str(ret));
+            goto cleanup;
+        }
+    }
+
+    ret = av_dict_set(st ? &st->metadata : &s->metadata, key, title_buf, 0);
+    if (ret < 0)
+        av_log(s, AV_LOG_ERROR,
+               "Failed to set %s metadata key %s to value %s: %s!\n",
+               st ? "track" : "media",
+               key, title_buf,
+               av_err2str(ret));
+
+cleanup:
+    av_freep(&title_buf);
+
+    return ret;
+}
+
 static int mov_read_udta_string(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 {
     char tmp_key[AV_FOURCC_MAX_STRING_SIZE] = {0};
@@ -349,6 +454,8 @@  static int mov_read_udta_string(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     case MKTAG( 's','o','s','n'): key = "sort_show";    break;
     case MKTAG( 's','t','i','k'): key = "media_type";
         parse = mov_metadata_int8_no_padding; break;
+    case MKTAG( 't','i','t','l'):
+        return mov_metadata_titl(c, pb, atom.size);
     case MKTAG( 't','r','k','n'): key = "track";
         parse = mov_metadata_track_or_disc_number; break;
     case MKTAG( 't','v','e','n'): key = "episode_id"; break;