diff mbox series

[FFmpeg-devel,2/2] mov: Don't export unknown/unhandled metadata types as if they were UTF8

Message ID 20210401115107.53832-2-martin@martin.st
State Accepted
Commit 13ec6624b3507052d8a0e14cee39b49e3c99450f
Headers show
Series [FFmpeg-devel,1/2] mov: Pick up "com.apple.quicktime.artwork" as cover art | expand

Checks

Context Check Description
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished
andriy/PPC64_make success Make finished
andriy/PPC64_make_fate success Make fate finished

Commit Message

Martin Storsjö April 1, 2021, 11:51 a.m. UTC
They can be other incompatible text encodings (such as UTF-16),
or even binary data.
---
 libavformat/mov.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Martin Storsjö April 15, 2021, 7:33 a.m. UTC | #1
On Thu, 1 Apr 2021, Martin Storsjö wrote:

> They can be other incompatible text encodings (such as UTF-16),
> or even binary data.
> ---
> libavformat/mov.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

Ping

// Martin
Martin Storsjö June 17, 2021, 10:28 a.m. UTC | #2
On Thu, 15 Apr 2021, Martin Storsjö wrote:

> On Thu, 1 Apr 2021, Martin Storsjö wrote:
>
>> They can be other incompatible text encodings (such as UTF-16),
>> or even binary data.
>> ---
>> libavformat/mov.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>
> Ping

Ping (see 
https://patchwork.ffmpeg.org/project/ffmpeg/patch/20210401115107.53832-2-martin@martin.st/ 
for the patch).

// Martin
Derek Buitenhuis July 13, 2021, 1:14 p.m. UTC | #3
On 4/1/2021 12:51 PM, Martin Storsjö wrote:
> +        } else if (data_type > 1 && data_type != 4) {
> +            // data_type can be 0 if not set at all above. data_type 1 means
> +            // UTF8 and 4 means "UTF8 sort". For any other type (UTF16 or e.g.
> +            // a picture), don't return it blindly in a string that is supposed
> +            // to be UTF8 text.
> +            av_log(c->fc, AV_LOG_WARNING, "Skipping unhandled metadata %s of type %d\n", key, data_type);
> +            av_free(str);
> +            return 0;

Should we add any UTF-8 validation on our end too?

- Derek
Martin Storsjö July 14, 2021, 12:32 p.m. UTC | #4
On Tue, 13 Jul 2021, Derek Buitenhuis wrote:

> On 4/1/2021 12:51 PM, Martin Storsjö wrote:
>> +        } else if (data_type > 1 && data_type != 4) {
>> +            // data_type can be 0 if not set at all above. data_type 1 means
>> +            // UTF8 and 4 means "UTF8 sort". For any other type (UTF16 or e.g.
>> +            // a picture), don't return it blindly in a string that is supposed
>> +            // to be UTF8 text.
>> +            av_log(c->fc, AV_LOG_WARNING, "Skipping unhandled metadata %s of type %d\n", key, data_type);
>> +            av_free(str);
>> +            return 0;
>
> Should we add any UTF-8 validation on our end too?

(Ah, here the reply arrived from yesterday)

I guess we could - but I see that as a separate thing to do which could be 
applied everywhere where we export metadata.

Here we have a flag that clearly identifies what kind of data it is 
(although we only recognize some of them), and if a type is set, which 
isn't utf8, we at least should bail out there.

For reference for myself and others, the types are defined here: 
https://developer.apple.com/library/archive/documentation/QuickTime/QTFF/Metadata/Metadata.html#//apple_ref/doc/uid/TP40000939-CH1-SW35

So 0 would mean unset/unknown/whatever, where we keep doing what we did 
before. But for other values, like utf16, various endian integers, bmps, 
whatever, just bail out.

// Martin
Derek Buitenhuis July 14, 2021, 1:22 p.m. UTC | #5
On 7/14/2021 1:32 PM, Martin Storsjö wrote:
> I guess we could - but I see that as a separate thing to do which could be 
> applied everywhere where we export metadata.

Yeah, true... we should probably have it in som upper/middle layer that checks
all string-type exports or something.

Patch should be OK.

- Derek
diff mbox series

Patch

diff --git a/libavformat/mov.c b/libavformat/mov.c
index 162772f499..f539bca1f0 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -509,6 +509,14 @@  retry:
                 av_free(str);
                 return AVERROR_INVALIDDATA;
             }
+        } else if (data_type > 1 && data_type != 4) {
+            // data_type can be 0 if not set at all above. data_type 1 means
+            // UTF8 and 4 means "UTF8 sort". For any other type (UTF16 or e.g.
+            // a picture), don't return it blindly in a string that is supposed
+            // to be UTF8 text.
+            av_log(c->fc, AV_LOG_WARNING, "Skipping unhandled metadata %s of type %d\n", key, data_type);
+            av_free(str);
+            return 0;
         } else {
             int ret = ffio_read_size(pb, str, str_size);
             if (ret < 0) {