
[FFmpeg-devel,5/8] avformat/matroskaenc: Use av_fast_realloc_array for index entries

Message ID DB6PR0101MB22148E756BE238C0567532608F819@DB6PR0101MB2214.eurprd01.prod.exchangelabs.com
State New
Series [FFmpeg-devel,1/8] avutil/mem: Handle fast allocations near UINT_MAX properly

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Andreas Rheinhardt July 5, 2022, 8:26 p.m. UTC
Currently, the Matroska muxer reallocates its array of index entries
each time another entry is added. This is bad for performance,
especially on Windows, where reallocations are slow. Switching to
av_fast_realloc_array() solves this by ensuring that actual
reallocations happen only rarely.

For an (admittedly extreme) example, which consists of looping a video
consisting of a single 4KB keyframe 540000 times, this improved
the time for writing a frame from 23524201 decicycles (516466 runs,
7822 skips) to 225240 decicycles (522122 runs, 2166 skips) on Windows.

(Writing CRC-32 elements was disabled for these tests.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
 libavformat/matroskaenc.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)
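
The commit message relies on av_fast_realloc_array() reallocating only rarely. As a rough illustration of why over-allocation achieves that, here is a minimal sketch. It is not the function from patch 1/8: grow_array(), its growth policy (about 50% plus a constant) and count_reallocations() are hypothetical names and choices made up for this example; only the argument order is modelled on the call in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for av_fast_realloc_array(); the real function is
 * added by patch 1/8 and may use a different growth policy.
 * ptr points to the array pointer. On success the array has room for at
 * least min_nb elements and *nb_allocated holds the new capacity. */
static int grow_array(void *ptr, size_t *nb_allocated,
                      size_t min_nb, size_t max_nb, size_t elsize)
{
    void *array;
    size_t nb, extra;

    assert(elsize > 0);
    if (min_nb <= *nb_allocated)
        return 0;                      /* fast path: no reallocation */
    if (max_nb > SIZE_MAX / elsize)
        max_nb = SIZE_MAX / elsize;    /* keep nb * elsize from overflowing */
    if (min_nb > max_nb)
        return -ERANGE;

    /* Over-allocate by about 50% plus a constant, clamped to max_nb. */
    extra = min_nb / 2 + 16;
    nb    = extra > max_nb - min_nb ? max_nb : min_nb + extra;

    memcpy(&array, ptr, sizeof(array));
    array = realloc(array, nb * elsize);
    if (!array)
        return -ENOMEM;                /* the old array is still valid */
    memcpy(ptr, &array, sizeof(array));
    *nb_allocated = nb;
    return 0;
}

/* Appends n ints one at a time and returns how often the capacity changed,
 * or -1 on allocation failure. */
static long count_reallocations(size_t n)
{
    int *entries = NULL;
    size_t nb_entries = 0, allocated = 0;
    long reallocations = 0;

    while (nb_entries < n) {
        size_t before = allocated;
        if (grow_array(&entries, &allocated, nb_entries + 1,
                       SIZE_MAX, sizeof(*entries)) < 0) {
            free(entries);
            return -1;
        }
        reallocations += allocated != before;
        entries[nb_entries] = (int)nb_entries;
        nb_entries++;
    }
    free(entries);
    return reallocations;
}
```

Under this assumed policy, count_reallocations(100000) reports a few dozen capacity changes at most, whereas growing by exactly one element per call would reallocate on every append.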

Comments

Tomas Härdin July 6, 2022, 3:03 p.m. UTC | #1
On Tue, 2022-07-05 at 22:26 +0200, Andreas Rheinhardt wrote:
> 
> -    entries = av_realloc_array(entries, cues->num_entries + 1, sizeof(mkv_cuepoint));
> -    if (!entries)
> -        return AVERROR(ENOMEM);
> -    cues->entries = entries;
> +    ret = av_fast_realloc_array(&cues->entries, &cues->allocated_entries,
> +                                cues->num_entries + 1,
> +                                MAX_SUPPORTED_EBML_LENGTH / MIN_CUETRACKPOS_SIZE,

Looks fine since MAX_SUPPORTED_EBML_LENGTH <= INT_MAX. Even SIZE_MAX /
MIN_CUETRACKPOS_SIZE would work. Maybe we could switch
MAX_SUPPORTED_EBML_LENGTH to

 #define MAX_SUPPORTED_EBML_LENGTH FFMIN(MAX_EBML_LENGTH, SIZE_MAX)

?

/Tomas
Andreas Rheinhardt July 6, 2022, 3:10 p.m. UTC | #2
Tomas Härdin:
> On Tue, 2022-07-05 at 22:26 +0200, Andreas Rheinhardt wrote:
>>
>> -    entries = av_realloc_array(entries, cues->num_entries + 1, sizeof(mkv_cuepoint));
>> -    if (!entries)
>> -        return AVERROR(ENOMEM);
>> -    cues->entries = entries;
>> +    ret = av_fast_realloc_array(&cues->entries, &cues->allocated_entries,
>> +                                cues->num_entries + 1,
>> +                                MAX_SUPPORTED_EBML_LENGTH / MIN_CUETRACKPOS_SIZE,
> 
> Looks fine since MAX_SUPPORTED_EBML_LENGTH <= INT_MAX. Even SIZE_MAX /
> MIN_CUETRACKPOS_SIZE would work. Maybe we could switch
> MAX_SUPPORTED_EBML_LENGTH to
> 
>  #define MAX_SUPPORTED_EBML_LENGTH FFMIN(MAX_EBML_LENGTH, SIZE_MAX)
> 
> ?
> 

To quote the comment for MAX_SUPPORTED_EBML_LENGTH:
"/* The dynamic buffer API we rely upon has a limit of INT_MAX;
 * and so has avio_write(). */"
And I don't get why MAX_SUPPORTED_EBML_LENGTH <= INT_MAX is even
relevant here. (Do you worry that MAX_SUPPORTED_EBML_LENGTH /
MIN_CUETRACKPOS_SIZE might not be representable in a size_t? Thinking
about this, defining it as FFMIN3(MAX_EBML_LENGTH, INT_MAX, SIZE_MAX) is
better.)

- Andreas
Tomas Härdin July 6, 2022, 3:21 p.m. UTC | #3
On Wed, 2022-07-06 at 17:10 +0200, Andreas Rheinhardt wrote:
> Tomas Härdin:
> > On Tue, 2022-07-05 at 22:26 +0200, Andreas Rheinhardt wrote:
> > > 
> > > -    entries = av_realloc_array(entries, cues->num_entries + 1, sizeof(mkv_cuepoint));
> > > -    if (!entries)
> > > -        return AVERROR(ENOMEM);
> > > -    cues->entries = entries;
> > > +    ret = av_fast_realloc_array(&cues->entries, &cues->allocated_entries,
> > > +                                cues->num_entries + 1,
> > > +                                MAX_SUPPORTED_EBML_LENGTH / MIN_CUETRACKPOS_SIZE,
> > 
> > Looks fine since MAX_SUPPORTED_EBML_LENGTH <= INT_MAX. Even SIZE_MAX /
> > MIN_CUETRACKPOS_SIZE would work. Maybe we could switch
> > MAX_SUPPORTED_EBML_LENGTH to
> > 
> >  #define MAX_SUPPORTED_EBML_LENGTH FFMIN(MAX_EBML_LENGTH, SIZE_MAX)
> > 
> > ?
> > 
> 
> To quote the comment for MAX_SUPPORTED_EBML_LENGTH:
> "/* The dynamic buffer API we rely upon has a limit of INT_MAX;
>  * and so has avio_write(). */"
> And I don't get why MAX_SUPPORTED_EBML_LENGTH <= INT_MAX is even
> relevant here. (Do you worry that MAX_SUPPORTED_EBML_LENGTH /
> MIN_CUETRACKPOS_SIZE might not be representable in a size_t? Thinking
> about this, defining it as FFMIN3(MAX_EBML_LENGTH, INT_MAX, SIZE_MAX) is
> better.)

INT_MAX <= SIZE_MAX on all platforms I am aware of. I just thought we
might want to support absolutely gargantuan .mkv files. Leaving it
as-is is fine.

/Tomas
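
To make the bound discussed in this thread concrete, the following sketch works out the entry cap passed as the max_nb argument. It assumes MAX_EBML_LENGTH is ((1ULL << 56) - 2), which is not shown in this patch and should be checked against matroskaenc.c; max_cue_entries() is a hypothetical helper, and the clamping mirrors the FFMIN3(MAX_EBML_LENGTH, INT_MAX, SIZE_MAX) variant mentioned above rather than code from the tree.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: value taken from memory of matroskaenc.c, not from this patch. */
#define MAX_EBML_LENGTH ((1ULL << 56) - 2)

/* From the patch: 1 + 1 bytes of CueTrackPositions overhead plus three
 * minimal child elements of 1 + 1 + 1 bytes each, i.e. 11 bytes. */
#define MIN_CUETRACKPOS_SIZE (1 + 1 + 3 * (1 + 1 + 1))

/* Hypothetical helper: the largest number of cue entries that could fit into
 * one EBML element of the maximum supported length. */
static size_t max_cue_entries(void)
{
    unsigned long long limit = MAX_EBML_LENGTH;

    if (limit > INT_MAX)
        limit = INT_MAX;               /* dynamic buffer / avio_write() limit */
    if (limit > SIZE_MAX)
        limit = SIZE_MAX;              /* must be representable as a size_t */
    assert(limit <= INT_MAX);          /* sanity check on the clamping above */

    return (size_t)(limit / MIN_CUETRACKPOS_SIZE);
}
```

On a platform with a 32-bit int, this yields INT_MAX / 11, so the cap is bounded by INT_MAX either way, consistent with the conclusion above that the current definition can stay.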

Patch

diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
index 1256bdfe36..7c7de612de 100644
--- a/libavformat/matroskaenc.c
+++ b/libavformat/matroskaenc.c
@@ -168,7 +168,8 @@  typedef struct mkv_cuepoint {
 
 typedef struct mkv_cues {
     mkv_cuepoint   *entries;
-    int             num_entries;
+    size_t          num_entries;
+    size_t          allocated_entries;
 } mkv_cues;
 
 struct MatroskaMuxContext;
@@ -257,6 +258,10 @@  typedef struct MatroskaMuxContext {
 /** 4 * (1-byte EBML ID, 1-byte EBML size, 8-byte uint max) */
 #define MAX_CUETRACKPOS_SIZE 40
 
+/** Minimal size of CueTrack, CueClusterPosition and CueRelativePosition,
+ *  and 1 + 1 bytes for the overhead of CueTrackPositions itself. */
+#define MIN_CUETRACKPOS_SIZE (1 + 1 + 3 * (1 + 1 + 1))
+
 /** 2 + 1 Simpletag header, 2 + 1 + 8 Name "DURATION", 23B for TagString */
 #define DURATION_SIMPLETAG_SIZE (2 + 1 + (2 + 1 + 8) + 23)
 
@@ -914,16 +919,20 @@  static int mkv_add_cuepoint(MatroskaMuxContext *mkv, int stream, int64_t ts,
                             int64_t cluster_pos, int64_t relative_pos, int64_t duration)
 {
     mkv_cues *cues = &mkv->cues;
-    mkv_cuepoint *entries = cues->entries;
-    unsigned idx = cues->num_entries;
+    mkv_cuepoint *entries;
+    size_t idx = cues->num_entries;
+    int ret;
 
     if (ts < 0)
         return 0;
 
-    entries = av_realloc_array(entries, cues->num_entries + 1, sizeof(mkv_cuepoint));
-    if (!entries)
-        return AVERROR(ENOMEM);
-    cues->entries = entries;
+    ret = av_fast_realloc_array(&cues->entries, &cues->allocated_entries,
+                                cues->num_entries + 1,
+                                MAX_SUPPORTED_EBML_LENGTH / MIN_CUETRACKPOS_SIZE,
+                                sizeof(*cues->entries));
+    if (ret < 0)
+        return ret;
+    entries = cues->entries;
 
     /* Make sure the cues entries are sorted by pts. */
     while (idx > 0 && entries[idx - 1].pts > ts)