[FFmpeg-devel] avformat/dashenc : Fix streaming mode support for webm output

Submitted by Oliver Collyer via ffmpeg-devel on April 8, 2019, 10:18 a.m.

Details

Message ID 20190408101824.72791-1-kjeyapal@akamai.com
State New
Headers show

Commit Message

Oliver Collyer via ffmpeg-devel April 8, 2019, 10:18 a.m.
---
 libavformat/dashenc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Oliver Collyer via ffmpeg-devel April 8, 2019, 2:07 p.m.
Karthick J via ffmpeg-devel:
> ---
>  libavformat/dashenc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/libavformat/dashenc.c b/libavformat/dashenc.c
> index f8d71166d4..9dd520787f 100644
> --- a/libavformat/dashenc.c
> +++ b/libavformat/dashenc.c
> @@ -1750,10 +1750,10 @@ static int dash_write_packet(AVFormatContext *s, AVPacket *pkt)
>      }
>  
>      //write out the data immediately in streaming mode
> -    if (c->streaming && os->segment_type == SEGMENT_TYPE_MP4) {
> +    if (c->streaming) {
>          int len = 0;
>          uint8_t *buf = NULL;
> -        if (!os->written_len)
> +        if (!os->written_len && os->segment_type == SEGMENT_TYPE_MP4)
>              write_styp(os->ctx->pb);
>          avio_flush(os->ctx->pb);
>          len = avio_get_dyn_buf (os->ctx->pb, &buf);
> 
Did you test this and did it really improve latency? Because it
shouldn't. The matroska/webm muxer always uses its own internal
dynamic buffer for writing the current cluster, i.e. most of your
ff_write_chained() calls that write frames don't generate output at
all. The only difference that your patch seems to make is flushing
os->ctx->pb, which prevents the Matroska muxer from updating the
clusters' length fields (writing clusters as unknown-length might
worsen compability with some clients that don't support this).

It might be that you write and send the cluster header (the cluster
EBML ID + the cluster length field (this is not the real cluster
length, but a length denoting "unknown length"; currently, this length
field will be overwritten with the real size lateron) a bit earlier,
but you don't send the actual packets earlier.

If you want to decrease latency, you basically have two options:
a) You can explicitly flush the muxer (send a NULL packet). You
probably want to use/adapt flush_dynbuf() for this.
b) You can generally lower the cluster size and cluster time limits of
the used Matroska muxer. Currently large limits are used for this. The
Matroska muxer uses way lower default values in case the output isn't
seekable (viewed from the Matroska muxer, the output is seekable).

- Andreas

PS: The above description of the Matroska muxer is based upon current
git head. If [1] gets accepted, there will be two changes:
1. At first only the EBML ID and no length field at all will be written.
2. When the cluster is finally written, the length field will be
written correctly and on the lowest number of bytes possible
(currently it is always written on eight bytes).

[1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2019-April/242041.html
Oliver Collyer via ffmpeg-devel April 10, 2019, 6:47 a.m.
On 4/8/19 7:37 PM, Andreas Rheinhardt via ffmpeg-devel wrote:
> Karthick J via ffmpeg-devel:

>> ---

>>  libavformat/dashenc.c | 4 ++--

>>  1 file changed, 2 insertions(+), 2 deletions(-)

>>

>> diff --git a/libavformat/dashenc.c b/libavformat/dashenc.c

>> index f8d71166d4..9dd520787f 100644

>> --- a/libavformat/dashenc.c

>> +++ b/libavformat/dashenc.c

>> @@ -1750,10 +1750,10 @@ static int dash_write_packet(AVFormatContext *s, AVPacket *pkt)

>>      }

>>  

>>      //write out the data immediately in streaming mode

>> -    if (c->streaming && os->segment_type == SEGMENT_TYPE_MP4) {

>> +    if (c->streaming) {

>>          int len = 0;

>>          uint8_t *buf = NULL;

>> -        if (!os->written_len)

>> +        if (!os->written_len && os->segment_type == SEGMENT_TYPE_MP4)

>>              write_styp(os->ctx->pb);

>>          avio_flush(os->ctx->pb);

>>          len = avio_get_dyn_buf (os->ctx->pb, &buf);

>>

> Did you test this and did it really improve latency? Because it

> shouldn't. The matroska/webm muxer always uses its own internal

> dynamic buffer for writing the current cluster, i.e. most of your

> ff_write_chained() calls that write frames don't generate output at

> all. The only difference that your patch seems to make is flushing

> os->ctx->pb, which prevents the Matroska muxer from updating the

> clusters' length fields (writing clusters as unknown-length might

> worsen compability with some clients that don't support this).

>

> It might be that you write and send the cluster header (the cluster

> EBML ID + the cluster length field (this is not the real cluster

> length, but a length denoting "unknown length"; currently, this length

> field will be overwritten with the real size lateron) a bit earlier,

> but you don't send the actual packets earlier.

Thanks for your detailed reply. You are exactly right. 
Only the header gets written with the flush. Rest of the data arrives only after segment end.
>

> If you want to decrease latency, you basically have two options:

> a) You can explicitly flush the muxer (send a NULL packet). You

> probably want to use/adapt flush_dynbuf() for this.

> b) You can generally lower the cluster size and cluster time limits of

> the used Matroska muxer. Currently large limits are used for this. The

> Matroska muxer uses way lower default values in case the output isn't

> seekable (viewed from the Matroska muxer, the output is seekable).

Thanks for providing helpful suggestions.
I tried option b). It didn't work. Matroska muxer ignores cluster time and size limits for dash video outputs.
I want to try option a), maybe at a later time. For time being I have disabled streaming for webm format, and called it out explicitly.
http://ffmpeg.org/pipermail/ffmpeg-devel/2019-April/242429.html

>

> - Andreas

>

> PS: The above description of the Matroska muxer is based upon current

> git head. If [1] gets accepted, there will be two changes:

> 1. At first only the EBML ID and no length field at all will be written.

> 2. When the cluster is finally written, the length field will be

> written correctly and on the lowest number of bytes possible

> (currently it is always written on eight bytes).

Sorry for the dumb question. I don't have much knowledge on webm/matroska format.
If these changes get accepted, will it make it easier for me to implement low latency in dashenc for webm.
>

> [1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2019-April/242041.html

> _______________________________________________

> ffmpeg-devel mailing list

> ffmpeg-devel@ffmpeg.org

> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

>

> To unsubscribe, visit link above, or email

> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Oliver Collyer via ffmpeg-devel April 10, 2019, 2:05 p.m.
Jeyapal, Karthick:
> 
> On 4/8/19 7:37 PM, Andreas Rheinhardt via ffmpeg-devel wrote:
>> Karthick J via ffmpeg-devel:
>>> ---
>>>  libavformat/dashenc.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/libavformat/dashenc.c b/libavformat/dashenc.c
>>> index f8d71166d4..9dd520787f 100644
>>> --- a/libavformat/dashenc.c
>>> +++ b/libavformat/dashenc.c
>>> @@ -1750,10 +1750,10 @@ static int dash_write_packet(AVFormatContext *s, AVPacket *pkt)
>>>      }
>>>  
>>>      //write out the data immediately in streaming mode
>>> -    if (c->streaming && os->segment_type == SEGMENT_TYPE_MP4) {
>>> +    if (c->streaming) {
>>>          int len = 0;
>>>          uint8_t *buf = NULL;
>>> -        if (!os->written_len)
>>> +        if (!os->written_len && os->segment_type == SEGMENT_TYPE_MP4)
>>>              write_styp(os->ctx->pb);
>>>          avio_flush(os->ctx->pb);
>>>          len = avio_get_dyn_buf (os->ctx->pb, &buf);
>>>
>> Did you test this and did it really improve latency? Because it
>> shouldn't. The matroska/webm muxer always uses its own internal
>> dynamic buffer for writing the current cluster, i.e. most of your
>> ff_write_chained() calls that write frames don't generate output at
>> all. The only difference that your patch seems to make is flushing
>> os->ctx->pb, which prevents the Matroska muxer from updating the
>> clusters' length fields (writing clusters as unknown-length might
>> worsen compability with some clients that don't support this).
>>
>> It might be that you write and send the cluster header (the cluster
>> EBML ID + the cluster length field (this is not the real cluster
>> length, but a length denoting "unknown length"; currently, this length
>> field will be overwritten with the real size lateron) a bit earlier,
>> but you don't send the actual packets earlier.
> Thanks for your detailed reply. You are exactly right. 
> Only the header gets written with the flush. Rest of the data arrives only after segment end.
>>
>> If you want to decrease latency, you basically have two options:
>> a) You can explicitly flush the muxer (send a NULL packet). You
>> probably want to use/adapt flush_dynbuf() for this.
>> b) You can generally lower the cluster size and cluster time limits of
>> the used Matroska muxer. Currently large limits are used for this. The
>> Matroska muxer uses way lower default values in case the output isn't
>> seekable (viewed from the Matroska muxer, the output is seekable).
> Thanks for providing helpful suggestions.
> I tried option b). It didn't work. Matroska muxer ignores cluster time and size limits for dash video outputs.

That's true. The code contains the following comment:
"WebM DASH specification states that the first block of every cluster
has to be a key frame. So for DASH video, we only create a cluster on
seeing key frames."
But reading the WebM dash specifications (I am no expert on dash), I
have to disagree with this: Media Subsegments are defined as one or
more clusters. And while the first block in a subsegment has to be a
key frame, there is no such requirement for the first block of a
cluster in general. Maybe it has been written with an older version of
the WebM Dash specifications in mind?*

If you agree with me, I'll change this.

The Matroska muxer will then write smaller clusters and write data
more often, but not all of this data is actually immediately useable
for you: Currently, it contains the last cluster (that was actually
finished) plus the EBML ID + the size field (denoting "unknown size")
for the next cluster. If my patches get merged, it will just be data
of the last cluster + the EBML ID of the next one. But thinking about
this, it would probably be advantageouos if the Matroska muxer would
only output complete clusters, without the next EBML ID included. Then
the dash encoder could test whether new output has been written and
send this immediately. (This has the disadvantage that the dash muxer
would rely on undocumented behaviour of the Matroska muxer.)

Btw: The Matroska muxer used to write AVIO data marker (currently
broken, but can be fixed). Maybe this could be used as a side-channel
to tell users (like the dash muxer) that the output currently consists
of a complete cluster? (For this, the data marker stuff would need to
be modified: It is currently only used with custom IO.) Then you would
not need to rely on undocumented behaviour, but could test the current
status.

> I want to try option a), maybe at a later time.

Explicitly flushing yourself would avoid the problem with emitting the
ID prematurily and also avoid relying on undocumented behaviour. In
this case, you should still use big cluster size and time limits, so
that no unnecessary clusters are created.

> For time being I have disabled streaming for webm format, and called it out explicitly.
> http://ffmpeg.org/pipermail/ffmpeg-devel/2019-April/242429.html
> 
>>
>> - Andreas
>>
>> PS: The above description of the Matroska muxer is based upon current
>> git head. If [1] gets accepted, there will be two changes:
>> 1. At first only the EBML ID and no length field at all will be written.
>> 2. When the cluster is finally written, the length field will be
>> written correctly and on the lowest number of bytes possible
>> (currently it is always written on eight bytes).
> Sorry for the dumb question. I don't have much knowledge on webm/matroska format.
> If these changes get accepted, will it make it easier for me to implement low latency in dashenc for webm.

These changes will reduce the container overhead penalty for using
more clusters, but they will not affect latency.

- Andreas

*: The demuxer for WebM Dash manifests seems to be based upon the same
thinking: webm_clusters_start_with_keyframe checks whether all
clusters start with keyframes, not whether clusters that are
referenced in the cues (the index) start with keyframes.

Patch hide | download patch | download mbox

diff --git a/libavformat/dashenc.c b/libavformat/dashenc.c
index f8d71166d4..9dd520787f 100644
--- a/libavformat/dashenc.c
+++ b/libavformat/dashenc.c
@@ -1750,10 +1750,10 @@  static int dash_write_packet(AVFormatContext *s, AVPacket *pkt)
     }
 
     //write out the data immediately in streaming mode
-    if (c->streaming && os->segment_type == SEGMENT_TYPE_MP4) {
+    if (c->streaming) {
         int len = 0;
         uint8_t *buf = NULL;
-        if (!os->written_len)
+        if (!os->written_len && os->segment_type == SEGMENT_TYPE_MP4)
             write_styp(os->ctx->pb);
         avio_flush(os->ctx->pb);
         len = avio_get_dyn_buf (os->ctx->pb, &buf);