
[FFmpeg-devel,2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer

Message ID 20230321123729.74124-2-martin@martin.st
State New
Headers show
Series [FFmpeg-devel,1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Martin Storsjö March 21, 2023, 12:37 p.m. UTC
Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
worth of data into the buffer, slowly filling the buffer until it
is full.

Previously, when the buffer was full, fill_buffer would start over
from the start, effectively discarding all the previously buffered
data.

For files that are read linearly, the previous behaviour was fine.

For files that exhibit some amount of nonlinear read patterns,
especially mov files (where ff_configure_buffers_for_index
increases the buffer size to accommodate the nonlinear reading!)
we would mostly be able to seek within the buffer - but whenever
we've hit the maximum buffer size, we'd discard most of the buffer
and start over with a very small buffer, so the next seek backwards
would end up outside of the buffer.

Keep one fourth of the buffered data, moving it to the start of
the buffer, freeing the rest to be refilled with future data.

For mov files with nonlinear read patterns, this almost entirely
avoids doing seeks on the lower IO level, where we previously would
end up doing seeks occasionally.

Signed-off-by: Martin Storsjö <martin@martin.st>
---
I'm open to suggestions on whether 1/4 of the buffer is a reasonable
amount to keep. It does of course incur some amount of overhead
for well behaved linear files, but is a decent improvement for
nonlinear mov files.

Alternatively, we could trigger this behaviour only after we've
observed a couple of backward seeks?
---
 libavformat/aviobuf.c | 46 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 40 insertions(+), 6 deletions(-)

Comments

Marton Balint March 21, 2023, 7:29 p.m. UTC | #1
On Tue, 21 Mar 2023, Martin Storsjö wrote:

> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
> worth of data into the buffer, slowly filling the buffer until it
> is full.
>
> Previously, when the buffer was full, fill_buffer would start over
> from the start, effectively discarding all the previously buffered
> data.
>
> For files that are read linearly, the previous behaviour was fine.
>
> For files that exhibit some amount of nonlinear read patterns,
> especially mov files (where ff_configure_buffers_for_index
> increases the buffer size to accomodate for the nonlinear reading!)
> we would mostly be able to seek within the buffer - but whenever
> we've hit the maximum buffer size, we'd discard most of the buffer
> and start over with a very small buffer, so the next seek backwards
> would end up outside of the buffer.
>
> Keep one fourth of the buffered data, moving it to the start of
> the buffer, freeing the rest to be refilled with future data.
>
> For mov files with nonlinear read patterns, this almost entirely
> avoids doing seeks on the lower IO level, where we previously would
> end up doing seeks occasionally.

Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
that a seekback will happen? An unconditional memmove of even a fourth of
all the data does not seem like a good idea.

Regards,
Marton

> [...]
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Martin Storsjö March 21, 2023, 8:24 p.m. UTC | #2
On Tue, 21 Mar 2023, Marton Balint wrote:

> [...]
>
> Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
> that a seekback will happen? Unconditional memmove of even fourth of all 
> data does not seem like a good idea.

Right, it's probably not ideal to do this unconditionally.

However, it's not that the demuxer really knows that a seekback _will_ 
happen - unless we make it inspect the next couple of index entries. And I 
don't think we should make the demuxer pre-analyze the upcoming access 
locations, but rather keep optimizations like this in a separate layer. That 
way, it works as expected as long as the seeks are short enough to fall 
within the expected tolerance, and falls back gracefully to regular seeking 
for the accesses that are weirder than that.

If we'd use ffio_ensure_seekback(), we'd make it mandatory for the aviobuf 
layer to cache the data for any insane accesses.

Some stats on the file I'm dealing with: The file is >2 GB, and is not 
exactly interleaved in the order the mov demuxer reads it, but roughly - 
when demuxing, the mov demuxer mostly jumps back/forward within a maybe 
~2 MB range. But at the start and end of the file, there are a couple of 
samples that are way out of order, causing it to do seeks from one end of 
the file to the other and back. So in that case, if we'd use 
ffio_ensure_seekback(), we'd end up allocating a 2 GB seekback buffer.

Currently, ff_configure_buffers_for_index() correctly measures that it 
needs a large buffer to avoid seeks in this file. (The function finds a 
huge >2 GB pos_delta when inspecting all sample combinations in the file, 
but setting it to the maximum of 16 MB already helps a whole lot, see 
patch 1/2.)

So maybe we could have ff_configure_buffers_for_index set some more flags 
to opt into behaviour like this?

// Martin
Anton Khirnov March 24, 2023, 11:11 a.m. UTC | #3
Quoting Martin Storsjö (2023-03-21 21:24:25)
> [...]
> 
> Right, it's probably not ideal to do this unconditionally.
> 
> However, it's not that the demuxer really knows that a seekback _will_ 
> happen - unless we make it inspect the next couple index entries. And I 
> don't think we should make the demuxer pre-analyze the next access 
> locations, but keep optimization like this on the separate layer. That 
> way, it works as expected as long as the seeks are short enough within the 
> expected tolerance, and falls back graciously on regular seeking for the 
> accesses that are weirder than that.

I suppose changing the buffer into a ring buffer so you don't have to
move the data is not feasible?
Martin Storsjö March 24, 2023, 11:25 a.m. UTC | #4
On Fri, 24 Mar 2023, Anton Khirnov wrote:

> [...]
>
> I suppose changing the buffer into a ring buffer so you don't have to
> move the data is not feasible?

Something like that would probably be ideal, yes - so we'd have a 
constantly sliding window of data available behind the current position.

I think that would be more work than I'm able to invest in the issue at 
the moment, though. (That doesn't mean I think everyone should suffer the 
overhead of this patch in this form, but I'm more interested in looking at 
heuristic based solutions for triggering this case rather than a full 
rewrite.)

// Martin
Anton Khirnov March 24, 2023, 11:55 a.m. UTC | #5
Quoting Martin Storsjö (2023-03-24 12:25:37)
> [...]
> >
> > I suppose changing the buffer into a ring buffer so you don't have to
> > move the data is not feasible?
> 
> Something like that would probably be ideal, yes - so we'd have a 
> constantly sliding window of data available behind the current position.
> 
> I think that would be more work than I'm able to invest in the issue at 
> the moment, though. (That doesn't mean I think everyone should suffer the 
> overhead of this patch in this form, but I'm more interested in looking at 
> heuristic based solutions for triggering this case rather than a full 
> rewrite.)

As a (hopefully) temporary heuristic, triggering this after observing a
few backward seeks shorter than the buffer size sounds reasonable to me.
Marton Balint March 24, 2023, 8:45 p.m. UTC | #6
On Fri, 24 Mar 2023, Anton Khirnov wrote:

> [...]
>
> As a (hopefully) temporary heuristic, triggering this after observing a
> few backward seeks under buffer size sounds reasonable to me.

I am uneasy about complicating an already complicated and 
hard-to-follow AVIO layer with heuristics which activate on magic 
behaviour. And we all know how long temporary solutions last :)

I guess we could add some new parameter to AVIOContext and enable this 
data-shifting behaviour explicitly when you reconfigure the buffer size 
for index in the MOV demuxer. But is it worth it? How significant is the 
"improvement" this patch provides over the previous one in the series?

Thanks,
Marton
Martin Storsjö March 24, 2023, 9:05 p.m. UTC | #7
On Fri, 24 Mar 2023, Marton Balint wrote:

> I am uneasy about complicating an already complicated and 
> hard-to-follow AVIO layer with heuristics which activate on magic 
> behaviour. And we all know how long temporary solutions last :)
>
> I guess we could add some new parameter to AVIOContext end enable this 
> data-shifting behaviour explicitly when you reconfigure the buffer size 
> for index in the MOV demuxer. But is it worth it? How significant is the 
> "improvement" this patch provides over the previous one in the series?

With the 2.6 GB, 40 minute mov file I'm looking at, originally, due to the 
issue fixed in patch 1/2, the buffer size was never increased from the 
original 32 KB, so when reading the file linearly, we would do many tens 
of thousands of seek requests, giving absolutely abysmal performance. (I 
saw a server side log number saying 120 000 requests.)

With patch 1/2 applied, while reading the bulk of the file, it does ~170 
seeks. So nothing terrible, but it still feels unnecessarily inefficient 
to do >4 seeks per minute due to the fact that the aviobuf layer is 
throwing away good data that it already had buffered.

In this case, it used a buffer size of 16 MB, and calculating 2.6 GB / 16 
MB ends up very near 170. So every time the 16 MB aviobuf buffer gets full 
and aviobuf clears it, we end up doing a seek backwards.

With patch 2/2 applied, we no longer do any seeks while reading the bulk 
of the file (at the start/end of the file there are still a bunch of 
scattered seeks though).

// Martin
Marton Balint March 24, 2023, 9:35 p.m. UTC | #8
On Fri, 24 Mar 2023, Martin Storsjö wrote:

> [...]

Thanks for the details. Patch/1 already made the significant improvement, 
so yeah, I am not sure about Patch/2 knowing it is not the "right" way.

Regards,
Marton


Martin Storsjö March 24, 2023, 9:41 p.m. UTC | #9
On Fri, 24 Mar 2023, Marton Balint wrote:

> [...]
>
> Thanks for the details. Patch/1 already made the significant improvement,

Yeah - can I get someone to approve that one too? :-)

> so yeah, I am not sure about Patch/2 knowing it is not the "right" way.

Yeah, I'm a bit on the fence myself. On one hand, it's been a pet peeve of 
mine for years, that we throw away the buffered data regularly like this, 
but the implementation maybe isn't the best. So for the practical 
performance issue it's probably not essential - it's just an annoying wart 
that is left :-)

// Martin

Patch

diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
index 4ad734a3c3..dfc3e77016 100644
--- a/libavformat/aviobuf.c
+++ b/libavformat/aviobuf.c
@@ -534,8 +534,7 @@  static void fill_buffer(AVIOContext *s)
     FFIOContext *const ctx = (FFIOContext *)s;
     int max_buffer_size = s->max_packet_size ?
                           s->max_packet_size : IO_BUFFER_SIZE;
-    uint8_t *dst        = s->buf_end - s->buffer + max_buffer_size <= s->buffer_size ?
-                          s->buf_end : s->buffer;
+    uint8_t *dst        = s->buf_end;
     int len             = s->buffer_size - (dst - s->buffer);
 
     /* can't fill the buffer without read_packet, just set EOF if appropriate */
@@ -546,11 +545,46 @@  static void fill_buffer(AVIOContext *s)
     if (s->eof_reached)
         return;
 
-    if (s->update_checksum && dst == s->buffer) {
-        if (s->buf_end > s->checksum_ptr)
+    if (len < max_buffer_size && s->buffer_size > max_buffer_size) {
+        /* If the buffer is almost full and we're not trying to read
+           one whole buffer worth of data at once, keep some amount of
+           the currently buffered data, but move it to the start of the
+           buffer, to allow filling the buffer with more data. */
+        int keep = (s->buf_end - s->buffer)/4;
+        int shift = s->buf_end - keep - s->buffer;
+
+        if (s->update_checksum && s->checksum_ptr - s->buffer < shift) {
+            /* Checksum up to the buffer + shift position (the data
+               that we're shifting out of the buffer). */
             s->checksum = s->update_checksum(s->checksum, s->checksum_ptr,
-                                             s->buf_end - s->checksum_ptr);
-        s->checksum_ptr = s->buffer;
+                                             s->buffer + shift - s->checksum_ptr);
+        }
+
+        memmove(s->buffer, s->buf_end - keep, keep);
+        s->buf_end -= shift;
+        s->buf_ptr -= shift;
+        if (s->update_checksum) {
+            if (s->checksum_ptr - s->buffer < shift)
+                s->checksum_ptr = s->buffer;
+            else
+                s->checksum_ptr -= shift;
+        }
+
+        dst = s->buf_end;
+        len = s->buffer_size - (dst - s->buffer);
+    } else if (len < max_buffer_size) {
+        /* If the buffer is full so we can't fit a whole write of max_buffer_size,
+           just restart the pointers from the start of the buffer. */
+        dst = s->buffer;
+        len = s->buffer_size;
+
+        if (s->update_checksum) {
+            /* Checksum all data that gets shifted out of the buffer. */
+            if (s->buf_end > s->checksum_ptr)
+                s->checksum = s->update_checksum(s->checksum, s->checksum_ptr,
+                                                 s->buf_end - s->checksum_ptr);
+            s->checksum_ptr = s->buffer;
+        }
     }
 
     /* make buffer smaller in case it ended up large after probing */