[FFmpeg-devel,05/37] avformat/matroskadec: Get rid of cluster size field assumption

Submitted by Andreas Rheinhardt on May 16, 2019, 10:29 p.m.

Details

Message ID 20190516223018.30827-6-andreas.rheinhardt@gmail.com
State Accepted
Commit 36aceb6174a6a1c40014001ff73c4c30012b569d
Headers show

Commit Message

Andreas Rheinhardt May 16, 2019, 10:29 p.m.
The earlier code relied on the length of clusters always being coded on
eight bytes as was the behaviour of libavformat's Matroska muxer until
recently. But given that our own Matroska muxer now (and mkvmerge from
time immemorial) creates files that don't conform to this assumption,
it is high time to get rid of this assumption.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
---
I originally planed for this patch to get merged before my Matroska muxer
patches.
 libavformat/matroskadec.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comments

James Almer June 23, 2019, 4:15 a.m.
On 5/16/2019 7:29 PM, Andreas Rheinhardt wrote:
> The earlier code relied on the length of clusters always being coded on
> eight bytes as was the behaviour of libavformat's Matroska muxer until
> recently. But given that our own Matroska muxer now (and mkvmerge from
> time immemorial) creates files that don't conform to this assumption,
> it is high time to get rid of this assumption.
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
> I originally planed for this patch to get merged before my Matroska muxer
> patches.
>  libavformat/matroskadec.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
> index 4dd933ef74..6fd5537f5a 100644
> --- a/libavformat/matroskadec.c
> +++ b/libavformat/matroskadec.c
> @@ -3711,15 +3711,17 @@ static int webm_clusters_start_with_keyframe(AVFormatContext *s)
>      cluster_pos = s->streams[0]->index_entries[index].pos;
>      before_pos = avio_tell(s->pb);
>      while (1) {
> -        int64_t cluster_id = 0, cluster_length = 0;
> +        uint64_t cluster_id, cluster_length;
> +        int read;
>          AVPacket *pkt;
>          avio_seek(s->pb, cluster_pos, SEEK_SET);
>          // read cluster id and length
> -        ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
> -        ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
> -        if (cluster_id != 0xF43B675) { // done with all clusters
> +        read = ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
> +        if (read < 0 || cluster_id != 0xF43B675) // done with all clusters

This could be changed to use the matroska.h define instead.

> +            break;
> +        read = ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
> +        if (read < 0)
>              break;
> -        }
>          avio_seek(s->pb, cluster_pos, SEEK_SET);
>          matroska->current_id = 0;
>          matroska_clear_queue(matroska);
> @@ -3728,7 +3730,8 @@ static int webm_clusters_start_with_keyframe(AVFormatContext *s)
>              break;
>          }
>          pkt = &matroska->queue->pkt;
> -        cluster_pos += cluster_length + 12; // 12 is the offset of the cluster id and length.
> +        // 4 + read is the length of the cluster id and the cluster length field.
> +        cluster_pos += 4 + read + cluster_length;
>          if (!(pkt->flags & AV_PKT_FLAG_KEY)) {
>              rv = 0;
>              break;

Applied, thanks.
Andreas Rheinhardt June 23, 2019, 4:28 a.m.
James Almer:
> On 5/16/2019 7:29 PM, Andreas Rheinhardt wrote:
>> The earlier code relied on the length of clusters always being coded on
>> eight bytes as was the behaviour of libavformat's Matroska muxer until
>> recently. But given that our own Matroska muxer now (and mkvmerge from
>> time immemorial) creates files that don't conform to this assumption,
>> it is high time to get rid of this assumption.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>> ---
>> I originally planed for this patch to get merged before my Matroska muxer
>> patches.
>>  libavformat/matroskadec.c | 15 +++++++++------
>>  1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
>> index 4dd933ef74..6fd5537f5a 100644
>> --- a/libavformat/matroskadec.c
>> +++ b/libavformat/matroskadec.c
>> @@ -3711,15 +3711,17 @@ static int webm_clusters_start_with_keyframe(AVFormatContext *s)
>>      cluster_pos = s->streams[0]->index_entries[index].pos;
>>      before_pos = avio_tell(s->pb);
>>      while (1) {
>> -        int64_t cluster_id = 0, cluster_length = 0;
>> +        uint64_t cluster_id, cluster_length;
>> +        int read;
>>          AVPacket *pkt;
>>          avio_seek(s->pb, cluster_pos, SEEK_SET);
>>          // read cluster id and length
>> -        ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
>> -        ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
>> -        if (cluster_id != 0xF43B675) { // done with all clusters
>> +        read = ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
>> +        if (read < 0 || cluster_id != 0xF43B675) // done with all clusters
> 
> This could be changed to use the matroska.h define instead.
> 
The defines in matroska.h include the whole EBML-numbers, including
the length descriptor bit (or VINT_MARKER in the parlance of the
current EBML specifications), whereas ebml_read_length returns the
number encoded by the EBML number (i.e. without VINT_MARKER). So one
can't simply replace the number by MATROSKA_ID_CLUSTER and so I didn't
do it.
>> +            break;
>> +        read = ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
>> +        if (read < 0)
>>              break;
>> -        }
>>          avio_seek(s->pb, cluster_pos, SEEK_SET);
>>          matroska->current_id = 0;
>>          matroska_clear_queue(matroska);
>> @@ -3728,7 +3730,8 @@ static int webm_clusters_start_with_keyframe(AVFormatContext *s)
>>              break;
>>          }
>>          pkt = &matroska->queue->pkt;
>> -        cluster_pos += cluster_length + 12; // 12 is the offset of the cluster id and length.
>> +        // 4 + read is the length of the cluster id and the cluster length field.
>> +        cluster_pos += 4 + read + cluster_length;
>>          if (!(pkt->flags & AV_PKT_FLAG_KEY)) {
>>              rv = 0;
>>              break;
> 
> Applied, thanks.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
James Almer June 23, 2019, 12:50 p.m.
On 6/23/2019 1:28 AM, Andreas Rheinhardt wrote:
> James Almer:
>> On 5/16/2019 7:29 PM, Andreas Rheinhardt wrote:
>>> The earlier code relied on the length of clusters always being coded on
>>> eight bytes as was the behaviour of libavformat's Matroska muxer until
>>> recently. But given that our own Matroska muxer now (and mkvmerge from
>>> time immemorial) creates files that don't conform to this assumption,
>>> it is high time to get rid of this assumption.
>>>
>>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>>> ---
>>> I originally planed for this patch to get merged before my Matroska muxer
>>> patches.
>>>  libavformat/matroskadec.c | 15 +++++++++------
>>>  1 file changed, 9 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
>>> index 4dd933ef74..6fd5537f5a 100644
>>> --- a/libavformat/matroskadec.c
>>> +++ b/libavformat/matroskadec.c
>>> @@ -3711,15 +3711,17 @@ static int webm_clusters_start_with_keyframe(AVFormatContext *s)
>>>      cluster_pos = s->streams[0]->index_entries[index].pos;
>>>      before_pos = avio_tell(s->pb);
>>>      while (1) {
>>> -        int64_t cluster_id = 0, cluster_length = 0;
>>> +        uint64_t cluster_id, cluster_length;
>>> +        int read;
>>>          AVPacket *pkt;
>>>          avio_seek(s->pb, cluster_pos, SEEK_SET);
>>>          // read cluster id and length
>>> -        ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
>>> -        ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
>>> -        if (cluster_id != 0xF43B675) { // done with all clusters
>>> +        read = ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
>>> +        if (read < 0 || cluster_id != 0xF43B675) // done with all clusters
>>
>> This could be changed to use the matroska.h define instead.
>>
> The defines in matroska.h include the whole EBML-numbers, including
> the length descriptor bit (or VINT_MARKER in the parlance of the
> current EBML specifications), whereas ebml_read_length returns the
> number encoded by the EBML number (i.e. without VINT_MARKER). So one
> can't simply replace the number by MATROSKA_ID_CLUSTER and so I didn't
> do it.

In another patch from this set you changed a line containing one of
these defines using a 0xFFFFFF mask for this purpose. I'm not sure if
it's better (It admittedly does look a bit ugly), but it makes it clear
to the occasional reader.

It was mostly a nit. It's fine as is.

>>> +            break;
>>> +        read = ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
>>> +        if (read < 0)
>>>              break;
>>> -        }
>>>          avio_seek(s->pb, cluster_pos, SEEK_SET);
>>>          matroska->current_id = 0;
>>>          matroska_clear_queue(matroska);
>>> @@ -3728,7 +3730,8 @@ static int webm_clusters_start_with_keyframe(AVFormatContext *s)
>>>              break;
>>>          }
>>>          pkt = &matroska->queue->pkt;
>>> -        cluster_pos += cluster_length + 12; // 12 is the offset of the cluster id and length.
>>> +        // 4 + read is the length of the cluster id and the cluster length field.
>>> +        cluster_pos += 4 + read + cluster_length;
>>>          if (!(pkt->flags & AV_PKT_FLAG_KEY)) {
>>>              rv = 0;
>>>              break;
>>
>> Applied, thanks.
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>>
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>

Patch hide | download patch | download mbox

diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
index 4dd933ef74..6fd5537f5a 100644
--- a/libavformat/matroskadec.c
+++ b/libavformat/matroskadec.c
@@ -3711,15 +3711,17 @@  static int webm_clusters_start_with_keyframe(AVFormatContext *s)
     cluster_pos = s->streams[0]->index_entries[index].pos;
     before_pos = avio_tell(s->pb);
     while (1) {
-        int64_t cluster_id = 0, cluster_length = 0;
+        uint64_t cluster_id, cluster_length;
+        int read;
         AVPacket *pkt;
         avio_seek(s->pb, cluster_pos, SEEK_SET);
         // read cluster id and length
-        ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
-        ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
-        if (cluster_id != 0xF43B675) { // done with all clusters
+        read = ebml_read_num(matroska, matroska->ctx->pb, 4, &cluster_id);
+        if (read < 0 || cluster_id != 0xF43B675) // done with all clusters
+            break;
+        read = ebml_read_length(matroska, matroska->ctx->pb, &cluster_length);
+        if (read < 0)
             break;
-        }
         avio_seek(s->pb, cluster_pos, SEEK_SET);
         matroska->current_id = 0;
         matroska_clear_queue(matroska);
@@ -3728,7 +3730,8 @@  static int webm_clusters_start_with_keyframe(AVFormatContext *s)
             break;
         }
         pkt = &matroska->queue->pkt;
-        cluster_pos += cluster_length + 12; // 12 is the offset of the cluster id and length.
+        // 4 + read is the length of the cluster id and the cluster length field.
+        cluster_pos += 4 + read + cluster_length;
         if (!(pkt->flags & AV_PKT_FLAG_KEY)) {
             rv = 0;
             break;