
[FFmpeg-devel,RFC] avcodec/avutil: Add timeline side data

Message ID 1522179841-34881-2-git-send-email-derek.buitenhuis@gmail.com

Commit Message

Derek Buitenhuis March 27, 2018, 7:44 p.m. UTC
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
---
 libavcodec/avcodec.h |   8 +++
 libavutil/timeline.h | 160 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 168 insertions(+)
 create mode 100644 libavutil/timeline.h


Comments

Alexander Strasser March 27, 2018, 10:46 p.m. UTC | #1
Hi Derek,

I am happy to see someone trying to extend FFmpeg to support these kinds
of features in a deeper way. Good luck on this journey!

Below I am mostly commenting on typos and rather minor things.

So for now you should mostly ignore my comments; it was just easier to
write them down while reading the patch. Taking immediate action at this
early stage, when things will probably still be dropped or modified, could
result in unneeded extra work.


On 2018-03-27 20:44 +0100, Derek Buitenhuis wrote:
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
>  libavcodec/avcodec.h |   8 +++
>  libavutil/timeline.h | 160 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 168 insertions(+)
>  create mode 100644 libavutil/timeline.h
> 
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index 50c34db..6f54495 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -1358,6 +1358,14 @@ enum AVPacketSideDataType {
>      AV_PKT_DATA_ENCRYPTION_INFO,
>  
>      /**
> +     * This side data contains timeline entries for a given stream. This type
> +     * will only apear as stream side data.
> +     *
> +     * The format is not part of the ABI, use av_timeline_* method to access.
> +     */
> +    AV_PKT_DATA_TIMELINE,
> +
> +    /**
>       * The number of side data types.
>       * This is not part of the public API/ABI in the sense that it may
>       * change when new side data types are added.
> diff --git a/libavutil/timeline.h b/libavutil/timeline.h
> new file mode 100644
> index 0000000..f1f3e1b
> --- /dev/null
> +++ b/libavutil/timeline.h
> @@ -0,0 +1,160 @@
> +/*
> + * Copyright (C) 2018 Derek Buitenhuis
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_TIMELINE_H
> +#define AVUTIL_TIMELINE_H
> +
> +#include <stddef.h>
> +#include <stdint.h>
> +
> +#include "rational.h"
> +
> +typedef struct AVTimelineEntry {
> +    /**
> +     * The start time of the given timeline entry, in stream timebase units.
> +     * If this value is AV_NOPTS_VALUE, you must display black for the duration
> +     * of the entry. For audio, silence must be played.
> +     */
> +    int64_t start;
> +
> +    /**
> +     * The duration of the given timeline entry, in steam timebase units.
                                                       ^^^^^
stream


> +     */
> +    int64_t duration;
> +
> +    /**
> +     * The rate at which this entry should be played back. The value is a multipier
                                                                             ^^^^^^^^^
multiplier


> +     * of the stream's rate, for example: 1.2 means play back this entry at 1.2x speed.
> +     * If this value is 0, then the first sample (located at 'start') must be displayed
> +     * for the duration of the entry.
> +     */
> +    AVRational media_rate;

Wouldn't it be better to call it rate_mult or rate_multiplier or rate_factor
or something in that fashion?

Nit: As this member is of type AVRational it might be better to word
it more precisely like "if the numerator is 0".
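E.g. the 1.2x case from above would then be media_rate = (AVRational){ 6, 5 },
and the freeze-frame case media_rate.num == 0.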


> +} AVTimelineEntry;
> +
> +/**
> + * Describes a timeline for a stream in terms of edits/entries. Each entry must be
> + * played back in order, according to the information in each. Each stream may have
> + * multiple timelines which need to be correlated between different streams.
> + */
> +typedef struct AVTimeline {
> +    /**
> +     * The ID of a given timeline. Since each stream may have multiple timelines
> +     * defined, this value is used to correlated different streams' timelines
                                         ^^^^^^^^^^
correlate?


> +     * which should be used together. For example, if one has two streams,
> +     * one video, and one audio, with two timelines each, the timelines
> +     * with matching IDs should be used in conjuction, to assure everything
                                              ^^^^^^^^^^
conjunction


> +     * is in sync and matches. The concept is similar to that of EditionUID
> +     * in Matroska.
> +     */
> +    uint32_t id;
> +
> +    /**
> +     * An in-order array of entries for the given timeline.
> +     * Each entry contains information on which samples to display for a
> +     * particular edit.
> +     */
> +    AVTimelineEntry *entries;
> +
> +    /**
> +     * Number of entries in the timeline.
> +     */
> +    size_t entry_count;

I think usually we prefix these with nb_, e.g. nb_entries.
This comment also applies to all _count-suffixed names
appearing later in this patch.


> +} AVTimeline;
> +
> +typedef struct AVTimelineList {

Would it be better to name it AVTimelineArray?


> +    /**
> +     * An array of timelines associated with the stream.
> +     */
> +    AVTimeline *timelines;
> +
> +    /**
> +     * Then number of timelines associated with the stream.
          ^^^^
Should probably just be dropped


> +     */
> +    size_t timeline_count;
> +} AVTimelineList;
> +
> +/**
> + * Allocates an AVTimeline strcture with the requested number of entires.
                              ^^^^^^^^
structure

> + *
> + * @param entry_count The number of entries in the timeline.
> + *
> + * @return The new AVTimeline structure, or NULL on error.
> + */
> +AVTimeline *av_timeline_alloc(size_t entry_count);

The allocated entries will be uninitialized?
Either way it's probably worth documenting it.


> +
> +
> +/**
> + * Frees an AVTimeline structure and its members.
> + *
> + * @param timeline The AVTimeline structure to free.
> + */
> +void av_timeline_free(AVTimeline *timeline);

Is passing a NULL pointer OK?
Would some other signature and semantics similar to av_freep be better?
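
I.e. something like (just a sketch, name made up):

    void av_timeline_freep(AVTimeline **timeline);

which would free *timeline and set it to NULL, so passing a pointer to a
NULL pointer stays a no-op.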


> +
> +/**
> + * Allocates an AVTimeline strcture with room for the request number of timelines.
                              ^^^^^^^^                   ^^^^^^^
structure, requested

Should it say AVTimelineList instead of AVTimeline?


> + *
> + * @param timeline_count The number of entries in the timeline.

  The number of _time lines_ that can be stored in the list

or

  The number of AVTimeline objects/instances that can be stored in the array


> + *
> + * @return The new AVTimelineList structure, or NULL on error.
> + */
> +AVTimelineList *av_timeline_list_alloc(size_t timeline_count);
> +
> +
> +/**
> + * Frees an AVTimelineList structure and its timelines, and their entries.
> + *
> + * @param timeline_list The AVTimelineList structure to free.
> + */
> +void av_timeline_list_free(AVTimeline *timeline_list);
> +
> +/**
> + * Allocates a new AVTimelineList structure and copies the data from an
> + * existing structure, allocating new members.
> + *
> + * @param timeline_list The existing AVTimelineList structure to copy data from.
> + *
> + * @return The new AVTimelineList structure, or NULL on error.
> + */
> +AVTimelineList *av_timeline_list_clone(const AVTimelineList *timeline_list);
> +
> +/**
> + * Creates a copy of the AVTimelineList that is contained in the given side
> + * data. The resulting struct should be passed to av_timeline_list_free()
> + * when done.
> + *
> + * @param side_data The side data array.
> + * @param side_data_size The size of the side data array.
> + *
> + * @return The new AVTimelineList structure, or NULL on error.
> + */
> +AVTimelineList *av_timeline_list_get_side_data(const uint8_t *side_data, size_t side_data_size);

The name av_timeline_list_get_side_data seems a bit odd to me.

Maybe it is in line with other functions, in which case it might be just
fine. To me something like

  av_timeline_list_create_from_side_data

or similar would be clearer.
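
Either way, from the application side I would expect usage roughly like this
(untested sketch, assuming the side data is attached to the stream and
retrieved with av_stream_get_side_data()):

    int size;
    uint8_t *sd = av_stream_get_side_data(stream, AV_PKT_DATA_TIMELINE, &size);
    if (sd) {
        AVTimelineList *list = av_timeline_list_get_side_data(sd, size);
        if (list) {
            /* use list->timelines[0 .. list->timeline_count - 1] */
            av_timeline_list_free(list);
        }
    }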


> +
> +/**
> + * Allocates and initializes side data that holds a copy of the given timeline
> + * list. The resulting pointer should be either freed using av_free() or given
> + * to av_packet_add_side_data().
> + *
> + * @param timeline_list The AVTimelineList to put into side data.
> + * @param side_data_size A pointer to where the size can be filled in.
> + *
> + * @return The new side-data pointer, or NULL.
> + */
> +uint8_t *av_timeline_list_add_side_data(const AVTimelineList *timeline_list, size_t *side_data_size);

As above, this function name doesn't sound very intuitive to me. I didn't
check whether we already have similar naming patterns around, so it might
be consistent; I don't know.

> +
> +#endif /* AVUTIL_TIMELINE_H */


Looks quite OK as an initial sketch for an API. Though, as already
mentioned, I have only just read your proposal and do not have deep
knowledge of the topic itself. In my experience, the more subtle details
tend to pop up when actually implementing the thing, and they usually
leak into the API or expose its deficiencies.


  Alexander
Derek Buitenhuis March 27, 2018, 11:33 p.m. UTC | #2
On 3/27/2018 11:46 PM, Alexander Strasser wrote:
>> +     * of the stream's rate, for example: 1.2 means play back this entry at 1.2x speed.
>> +     * If this value is 0, then the first sample (located at 'start') must be displayed
>> +     * for the duration of the entry.
>> +     */
>> +    AVRational media_rate;
> 
> Wouldn't it be better to call it rate_mult or rate_multiplier or rate_factor
> or something in that fashion?

The name is taken directly from the ISOBMFF and MOV spec, which is where this
field will mostly come from. I figured changing it might be needless renaming.

>> +    /**
>> +     * Number of entries in the timeline.
>> +     */
>> +    size_t entry_count;
> 
> I think usually we prefix these with nb_, e.g. nb_entries.
> This comment also applies to all _count-suffixed names
> appearing later in this patch.

I modeled this after the recently pushed encryption side data. That is, I wanted
to be consistent with other side data APIs. It uses the _count suffix rather than
an nb_ prefix. I can use whichever people prefer.


>> +} AVTimeline;
>> +
>> +typedef struct AVTimelineList {
> 
> Would it be better to name it AVTimelineArray?

I'd prefer to use whichever is the convention in FFmpeg's codebase.


>> + *
>> + * @param entry_count The number of entries in the timeline.
>> + *
>> + * @return The new AVTimeline structure, or NULL on error.
>> + */
>> +AVTimeline *av_timeline_alloc(size_t entry_count);
> 
> The allocated entries will be uninitialized?
> Either way it's probably worth documenting it.

OK.

>> +/**
>> + * Frees an AVTimeline structure and its members.
>> + *
>> + * @param timeline The AVTimeline structure to free.
>> + */
>> +void av_timeline_free(AVTimeline *timeline);
> 
> Is passing a NULL pointer OK?
> Would some other signature and semantics similar to av_freep be better?

This is another thing where I tried to match how other side data APIs,
like encryption, were designed, and this is how they did it.


>> +
>> +/**
>> + * Allocates an AVTimeline strcture with room for the request number of timelines.
>                               ^^^^^^^^                   ^^^^^^^
> Should it say AVTimelineList instead of AVTimeline?

Yeah.

>> + *
>> + * @param timeline_count The number of entries in the timeline.
> 
>   The number of _time lines_ that can be stored in the list


Yeah, copy-paste fail. Whoops.

>> +AVTimelineList *av_timeline_list_get_side_data(const uint8_t *side_data, size_t side_data_size);
> 
> The name av_timeline_list_get_side_data seems a bit odd to me.
> 
> Maybe it is in line with other functions, in which case it might be just
> fine. To me something like
> 
>   av_timeline_list_create_from_side_data
> 
> or similar would be clearer.

I would rather keep this name, since it is consistent with how we namespace
and name the other side data APIs.

>> +uint8_t *av_timeline_list_add_side_data(const AVTimelineList *timeline_list, size_t *side_data_size);
> 
> As above, this function name doesn't sound very intuitive to me. I didn't
> check whether we already have similar naming patterns around, so it might
> be consistent; I don't know.

Ditto.

Cheers,
- Derek
Michael Niedermayer March 28, 2018, 6:12 p.m. UTC | #3
On Tue, Mar 27, 2018 at 08:44:01PM +0100, Derek Buitenhuis wrote:
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
>  libavcodec/avcodec.h |   8 +++
>  libavutil/timeline.h | 160 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 168 insertions(+)
>  create mode 100644 libavutil/timeline.h
> 
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index 50c34db..6f54495 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -1358,6 +1358,14 @@ enum AVPacketSideDataType {
>      AV_PKT_DATA_ENCRYPTION_INFO,
>  
>      /**
> +     * This side data contains timeline entries for a given stream. This type
> +     * will only apear as stream side data.
> +     *
> +     * The format is not part of the ABI, use av_timeline_* method to access.
> +     */
> +    AV_PKT_DATA_TIMELINE,
> +
> +    /**
>       * The number of side data types.
>       * This is not part of the public API/ABI in the sense that it may
>       * change when new side data types are added.
> diff --git a/libavutil/timeline.h b/libavutil/timeline.h
> new file mode 100644
> index 0000000..f1f3e1b
> --- /dev/null
> +++ b/libavutil/timeline.h
> @@ -0,0 +1,160 @@
> +/*
> + * Copyright (C) 2018 Derek Buitenhuis
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_TIMELINE_H
> +#define AVUTIL_TIMELINE_H
> +
> +#include <stddef.h>
> +#include <stdint.h>
> +
> +#include "rational.h"
> +
> +typedef struct AVTimelineEntry {
> +    /**
> +     * The start time of the given timeline entry, in stream timebase units.
> +     * If this value is AV_NOPTS_VALUE, you must display black for the duration
> +     * of the entry. For audio, silence must be played.
> +     */
> +    int64_t start;
> +
> +    /**
> +     * The duration of the given timeline entry, in steam timebase units.
> +     */
> +    int64_t duration;
> +
> +    /**
> +     * The rate at which this entry should be played back. The value is a multipier
> +     * of the stream's rate, for example: 1.2 means play back this entry at 1.2x speed.
> +     * If this value is 0, then the first sample (located at 'start') must be displayed
> +     * for the duration of the entry.
> +     */
> +    AVRational media_rate;
> +} AVTimelineEntry;
> +
> +/**
> + * Describes a timeline for a stream in terms of edits/entries. Each entry must be
> + * played back in order, according to the information in each. Each stream may have
> + * multiple timelines which need to be correlated between different streams.
> + */
> +typedef struct AVTimeline {
> +    /**
> +     * The ID of a given timeline. Since each stream may have multiple timelines
> +     * defined, this value is used to correlated different streams' timelines
> +     * which should be used together. For example, if one has two streams,
> +     * one video, and one audio, with two timelines each, the timelines
> +     * with matching IDs should be used in conjuction, to assure everything
> +     * is in sync and matches. The concept is similar to that of EditionUID
> +     * in Matroska.
> +     */
> +    uint32_t id;
> +
> +    /**
> +     * An in-order array of entries for the given timeline.
> +     * Each entry contains information on which samples to display for a
> +     * particular edit.
> +     */
> +    AVTimelineEntry *entries;

This is problematic as it's non-extensible (unless I am missing something).
Consider that a field is added to AVTimelineEntry: the entries array would
have a larger element size, and that would require all user apps to be rebuilt.

Also, if you want to support QuickTime more fully, there's more needed.
QT can, for example, switch even between codecs mid-stream, IIRC;
not sure we want to support this.

thx

[...]
Jan Ekström March 28, 2018, 6:35 p.m. UTC | #4
On Wed, Mar 28, 2018 at 9:12 PM, Michael Niedermayer
<michael@niedermayer.cc> wrote:
>> +    /**
>> +     * An in-order array of entries for the given timeline.
>> +     * Each entry contains information on which samples to display for a
>> +     * particular edit.
>> +     */
>> +    AVTimelineEntry *entries;
>
> This is problematic as it's non-extensible (unless I am missing something).
> Consider that a field is added to AVTimelineEntry: the entries array would
> have a larger element size, and that would require all user apps to be rebuilt.
>

So you would prefer some sort of iterator? Or a size entry, like with
some of the entries in AVEncryptionInfo? I have a feeling this was
based on the following, added into AVEncryptionInfo:
+    AVSubsampleEncryptionInfo *subsamples;
+    uint32_t subsample_count;

Which would now seem to be similarly non-extensible?

> Also, if you want to support QuickTime more fully, there's more needed.
> QT can, for example, switch even between codecs mid-stream, IIRC;
> not sure we want to support this.
>

Thankfully, that has little to do with virtual timelines (edit lists)
as far as I can tell?

Jan
Derek Buitenhuis March 28, 2018, 6:42 p.m. UTC | #5
On 3/28/2018 7:12 PM, Michael Niedermayer wrote:
> This is problematic as it's non-extensible (unless I am missing something).
> Consider that a field is added to AVTimelineEntry: the entries array would
> have a larger element size, and that would require all user apps to be rebuilt.

Do you have an actual suggestion, though? AFAICT, this is exactly how the recently
pushed encryption support works, with AVSubsampleEncryptionInfo entries in the
AVEncryptionInfo struct. Is this now not acceptable, even though we pushed
that a few days ago after months of review?

I'm all ears for a solution...

> 
> Also, if you want to support QuickTime more fully, there's more needed.
> QT can, for example, switch even between codecs mid-stream, IIRC;
> not sure we want to support this.

Codec switching is not related to timelines at all. It's entirely separate...

- Derek
Nicolas George March 28, 2018, 6:47 p.m. UTC | #6
Derek Buitenhuis (2018-03-28):
> Do you have an actual suggestion, though?

Maybe not using a badly designed API that requires everything to be
stored in a serialized uint8_t array.

> AFAICT, this is exactly how the recently
> pushed encryption support works, with AVSubsampleEncryptionInfo entries in the
> AVEncryptionInfo struct. Is this now not acceptable, even though we pushed
> that a few days ago after months of review?

It just means that these months of review failed to see that flaw. Let
us hope we will not need to extend AVSubsampleEncryptionInfo.

Regards,
wm4 March 28, 2018, 6:49 p.m. UTC | #7
On Wed, 28 Mar 2018 19:42:21 +0100
Derek Buitenhuis <derek.buitenhuis@gmail.com> wrote:

> On 3/28/2018 7:12 PM, Michael Niedermayer wrote:
> > This is problematic as it's non-extensible (unless I am missing something).
> > Consider that a field is added to AVTimelineEntry: the entries array would
> > have a larger element size, and that would require all user apps to be rebuilt.
> 
> Do you have an actual suggestion, though? AFAICT, this is exactly how the recently
> pushed encryption support works, with AVSubsampleEncryptionInfo entries in the
> AVEncryptionInfo struct. Is this now not acceptable, even though we pushed
> that a few days ago after months of review?
> 
> I'm all ears for a solution...

You just need to change * to **. Then the size of the struct doesn't
matter anymore for array access and can be excluded from the ABI.
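
I.e., roughly (sketch):

    /* now: sizeof(AVTimelineEntry) is baked into user binaries, because
     * indexing entries[i] needs the element size */
    AVTimelineEntry *entries;

    /* instead: an array of pointers; entries[i] is just a pointer, so new
     * fields can be appended to AVTimelineEntry without breaking the ABI */
    AVTimelineEntry **entries;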

> > 
> > Also, if you want to support QuickTime more fully, there's more needed.
> > QT can, for example, switch even between codecs mid-stream, IIRC;
> > not sure we want to support this.
> 
> Codec switching is not related to timelines at all. It's entirely separate...
> 
> - Derek
Derek Buitenhuis March 28, 2018, 6:51 p.m. UTC | #8
On 3/28/2018 7:47 PM, Nicolas George wrote:
> Derek Buitenhuis (2018-03-28):
>> Do you have an actual suggestion, though?
> 
> Maybe not using a badly designed API that requires everything to be
> stored in a serialized uint8_t array.

Isn't this the same as saying "don't use the existing side data APIs at all"?

Do you expect me to design a whole new side data API first? 

>> AFAICT, this is exactly how the recently
>> pushed encryption support works, with AVSubsampleEncryptionInfo entries in the
>> AVEncryptionInfo struct. Is this now not acceptable, even though we pushed
>> that a few days ago after months of review?
> 
> It just means that these months of review failed to see that flaw. Let
> us hope we will not need to extend AVSubsampleEncryptionInfo.

I guess the only consistent thing about FFmpeg APIs is that they are
inconsistent.

I am still open to a different proposed solution.

- Derek
Derek Buitenhuis March 28, 2018, 6:53 p.m. UTC | #9
On 3/28/2018 7:49 PM, wm4 wrote:
> You just need to change * to **. Then the size of the struct doesn't
> matter anymore for array access and can be excluded from the ABI.

To be thorough: Does anyone else have opinions on this approach?

- Derek
James Almer March 28, 2018, 6:56 p.m. UTC | #10
On 3/28/2018 3:53 PM, Derek Buitenhuis wrote:
> On 3/28/2018 7:49 PM, wm4 wrote:
>> You just need to change * to **. Then the size of the struct doesn't
>> matter anymore for array access and can be excluded from the ABI.
> 
> To be thorough: Does anyone else have opinions on this approach?

It's what's being done in AVFrame side data, so I guess it's acceptable?
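
For reference, AVFrame stores its side data the same way:

    AVFrameSideData **side_data;
    int              nb_side_data;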
James Almer March 28, 2018, 6:57 p.m. UTC | #11
On 3/28/2018 3:47 PM, Nicolas George wrote:
> Derek Buitenhuis (2018-03-28):
>> Do you have an actual suggestion, though?
> 
> Maybe not using a badly designed API that requires everything to be
> stored in a serialized uint8_t array.
> 
>> AFAICT, this is exactly how the recently
>> pushed encryption support works, with AVSubsampleEncryptionInfo entries in the
>> AVEncryptionInfo struct. Is this now not acceptable, even though we pushed
>> that a few days ago after months of review?
> 
> It just means that these months of review failed to see that flaw. Let
> us hope we will not need to extend AVSubsampleEncryptionInfo.

We're still very much in time to address it in AVSubsampleEncryptionInfo; it
was pushed only a few days ago.
There's no need to let it stay as-is if a better solution is available.
Michael Niedermayer March 29, 2018, 1:06 a.m. UTC | #12
On Wed, Mar 28, 2018 at 07:53:31PM +0100, Derek Buitenhuis wrote:
> On 3/28/2018 7:49 PM, wm4 wrote:
> > You just need to change * to **. Then the size of the struct doesn't
> > matter anymore for array access and can be excluded from the ABI.
> 
> To be thorough: Does anyone else have opinions on this approach?

It's how AVStreams live in AVFormatContext too,
so to me it seems that's the obvious and consistent way to go.
I do not know if there are any unforeseen issues with that ...
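
For reference, AVFormatContext has:

    AVStream **streams;
    unsigned int nb_streams;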

thx

[...]
Michael Niedermayer March 29, 2018, 1:13 a.m. UTC | #13
On Wed, Mar 28, 2018 at 09:35:12PM +0300, Jan Ekström wrote:
> On Wed, Mar 28, 2018 at 9:12 PM, Michael Niedermayer
> <michael@niedermayer.cc> wrote:
> >> +    /**
> >> +     * An in-order array of entries for the given timeline.
> >> +     * Each entry contains information on which samples to display for a
> >> +     * particular edit.
> >> +     */
> >> +    AVTimelineEntry *entries;
> >
> > This is problematic as it's non-extensible (unless I am missing something).
> > Consider that a field is added to AVTimelineEntry: the entries array would
> > have a larger element size, and that would require all user apps to be rebuilt.
> >
> 
> So you would prefer some sort of iterator? Or a size entry, like with
> some of the entries in AVEncryptionInfo? I have a feeling this was
> based on the following, added into AVEncryptionInfo:
> +    AVSubsampleEncryptionInfo *subsamples;
> +    uint32_t subsample_count;
> 
> Which would now seem to be similarly non-extensible?
> 
> > Also, if you want to support QuickTime more fully, there's more needed.
> > QT can, for example, switch even between codecs mid-stream, IIRC;
> > not sure we want to support this.
> >
> 
> Thankfully, that has little to do with virtual timelines (edit lists)
> as far as I can tell?

Well, no unless we want a single unified API that represents information
about timespans.
We can use completely unrelated systems to handle the virtual timeline,
the codec and related changes, chapters, ...


[...]
Derek Buitenhuis March 29, 2018, 1:43 p.m. UTC | #14
On 3/29/2018 2:06 AM, Michael Niedermayer wrote:
> It's how AVStreams live in AVFormatContext too,
> so to me it seems that's the obvious and consistent way to go.
> I do not know if there are any unforeseen issues with that ...

Sounds like the best way to go about it then.

Thanks,
- Derek
Derek Buitenhuis March 29, 2018, 1:53 p.m. UTC | #15
On 3/29/2018 2:13 AM, Michael Niedermayer wrote:
> Well, no unless we want a single unified API that represents information
> about timespans.
> We can use completely unrelated systems to handle the virtual timeline,
> the codec and related changes, chapters, ...

Personal opinion time: from a design and use perspective, a single unified
API to handle all of those things and their edge cases sounds like a
nightmare to use and write.

- Derek
Derek Buitenhuis March 29, 2018, 1:55 p.m. UTC | #16
On 3/27/2018 8:44 PM, Derek Buitenhuis wrote:
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
>  libavcodec/avcodec.h |   8 +++
>  libavutil/timeline.h | 160 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 168 insertions(+)
>  create mode 100644 libavutil/timeline.h

A few notes from Steve on IRC:

* The ID should be 64bit, or Matroska UUIDs cannot be cleanly mapped 1:1.
* There should be a field to indicate which timeline is the default 'on'
  timeline (e.g. the default/on edition in Matroska).
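
In struct terms, that would be something like (hypothetical sketch, field
names made up):

    uint64_t id;      /* wide enough to map Matroska EditionUIDs 1:1 */
    int      primary; /* nonzero if this is the default 'on' timeline */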

- Derek
Michael Niedermayer March 29, 2018, 6:49 p.m. UTC | #17
On Thu, Mar 29, 2018 at 02:53:52PM +0100, Derek Buitenhuis wrote:
> On 3/29/2018 2:13 AM, Michael Niedermayer wrote:
> > Well, no unless we want a single unified API that represents information
> > about timespans.
> > We can use completely unrelated systems to handle the virtual timeline,
> > the codec and related changes, chapters, ...
> 
> Personal opinion time: from a design and use perspective, a single unified
> API to handle all of those things and their edge cases sounds like a
> nightmare to use and write.

That is not what I meant.

What I meant is to align the APIs that handle timespan-related information
and not have totally different interfaces for each.
For example, they might share the same struct in some cases to deliver data.
They do have in common that they all import/export data for a sequence of timespans.

Also, the function interfaces could be aligned with each other.

They also may have in common that some formats store the data interleaved
in the stream and some store it in a main header.
If this is so, the case of collecting all this info from the whole
duration of a file to store it in the output file header at the start
may be a situation common to multiple of these.
A similar API may allow simplifying whichever way we handle this.

I am sure there are more advantages of having APIs which deal with
similar types of data be similar ...

thx

[...]
Michael Niedermayer March 29, 2018, 6:55 p.m. UTC | #18
On Thu, Mar 29, 2018 at 08:49:21PM +0200, Michael Niedermayer wrote:
> On Thu, Mar 29, 2018 at 02:53:52PM +0100, Derek Buitenhuis wrote:
> > On 3/29/2018 2:13 AM, Michael Niedermayer wrote:
> > > Well, no unless we want a single unified API that represents information
> > > about timespans.
> > > We can use completely unrelated systems to handle the virtual timeline,
> > > the codec and related changes, chapters, ...
> > 
> > Personal opinion time: from a design and use perspective, a single unified
> > API to handle all of those things and their edge cases sounds like a
> > nightmare to use and write.
> 
> That is not what I meant.
> 
> What I meant is to align the APIs that handle timespan-related information
> and not have totally different interfaces for each.
> For example, they might share the same struct in some cases to deliver data.
> They do have in common that they all import/export data for a sequence of timespans.
> 
> Also, the function interfaces could be aligned with each other.
> 
> They also may have in common that some formats store the data interleaved
> in the stream and some store it in a main header.
> If this is so, the case of collecting all this info from the whole
> duration of a file to store it in the output file header at the start
> may be a situation common to multiple of these.
> A similar API may allow simplifying whichever way we handle this.
> 
> I am sure there are more advantages of having APIs which deal with
> similar types of data be similar ...

An example of an API that serves a similar kind of information but is
different is AVChapter.
I think we should minimize the number of different public structs that
describe timespans, when that is easily and cleanly possible.

thx


[...]
wm4 March 29, 2018, 7:08 p.m. UTC | #19
On Thu, 29 Mar 2018 20:55:52 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> On Thu, Mar 29, 2018 at 08:49:21PM +0200, Michael Niedermayer wrote:
> > On Thu, Mar 29, 2018 at 02:53:52PM +0100, Derek Buitenhuis wrote:  
> > > On 3/29/2018 2:13 AM, Michael Niedermayer wrote:  
> > > > Well, no unless we want a single unified API that represents information
> > > > about timespans.
> > > > We can use completely unrelated systems to handle the virtual timeline,
> > > > the codec and related changes, chapters, ...  
> > > 
> > > Personal opinion time: from a design and use perspective, a single unified
> > > API to handle all of those things and their edge cases sounds like a
> > > nightmare to use and write.
> > 
> > That is not what I meant.
> > 
> > What I meant is to align the APIs that handle timespan-related information
> > and not have totally different interfaces for each.
> > For example, they might share the same struct in some cases to deliver data.
> > They do have in common that they all import/export data for a sequence of timespans.
> > 
> > Also, the function interfaces could be aligned with each other.
> > 
> > They also may have in common that some formats store the data interleaved
> > in the stream and some store it in a main header.
> > If this is so, the case of collecting all this info from the whole
> > duration of a file to store it in the output file header at the start
> > may be a situation common to multiple of these.
> > A similar API may allow simplifying whichever way we handle this.
> > 
> > I am sure there are more advantages of having APIs which deal with
> > similar types of data be similar ...
> 
> An example of an API that serves a similar kind of information but is
> different is AVChapter.
> I think we should minimize the number of different public structs that
> describe timespans, when that is easily and cleanly possible.

Like how?

This sounds like a bikeshed smoke bomb to stop all progress again.
Michael Niedermayer March 30, 2018, 10:20 a.m. UTC | #20
On Thu, Mar 29, 2018 at 09:08:52PM +0200, wm4 wrote:
> On Thu, 29 Mar 2018 20:55:52 +0200
> Michael Niedermayer <michael@niedermayer.cc> wrote:
> 
> > On Thu, Mar 29, 2018 at 08:49:21PM +0200, Michael Niedermayer wrote:
> > > On Thu, Mar 29, 2018 at 02:53:52PM +0100, Derek Buitenhuis wrote:  
> > > > On 3/29/2018 2:13 AM, Michael Niedermayer wrote:  
> > > > > Well, no unless we want a single unified API that represents information
> > > > > about timespans.
> > > > > We can use completely unrelated systems to handle the virtual timeline,
> > > > > the codec and related changes, chapters, ...  
> > > > 
> > > > Personal opinion time: from a design and use perspective, a single unified
> > > > API to handle all of those things and their edge cases sounds like a
> > > > nightmare to use and write.
> > > 
> > > That is not what I meant.
> > > 
> > > What I meant is to align the APIs that handle timespan-related information
> > > and not have totally different interfaces for each.
> > > For example, they might share the same struct in some cases to deliver data.
> > > They do have in common that they all import/export data for a sequence of timespans.
> > > 
> > > Also, the function interfaces could be aligned with each other.
> > > 
> > > They also may have in common that some formats store the data interleaved
> > > in the stream and some store it in a main header.
> > > If this is so, the case of collecting all this info from the whole
> > > duration of a file to store it in the output file header at the start
> > > may be a situation common to multiple of these.
> > > A similar API may allow simplifying whichever way we handle this.
> > > 
> > > I am sure there are more advantages of having APIs which deal with
> > > similar types of data be similar ...
> > 
> > An example of an API that serves a similar kind of information but is
> > different is AVChapter.
> > I think we should minimize the number of different public structs that
> > describe timespans, when that is easily and cleanly possible.
> 

> Like how?

Wait, you cannot think of a single way to store timespan data
except by using a completely separate and different API for each case where
we need to store timespan data? Not even similarity in the order of
arguments of functions?
Because that's ultimately the only thing I suggested we should not do.


> 
> This sounds like a bikeshed smoke bomb to stop all progress again.

Why do you try to incite hostilities and discord?
I mean, Derek (who wrote the patch) has not even had time to reply to my
comment.
I am very interested in what his opinion is. Does he agree? Does he
think it's impossible, or too hard, or irrelevant? Or does he see something
I didn't think of at all?

Let's wait for Derek before arguing over it.

thx

[...]
wm4 March 30, 2018, 1:13 p.m. UTC | #21
On Fri, 30 Mar 2018 12:20:38 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> On Thu, Mar 29, 2018 at 09:08:52PM +0200, wm4 wrote:
> > On Thu, 29 Mar 2018 20:55:52 +0200
> > Michael Niedermayer <michael@niedermayer.cc> wrote:
> >   
> > > On Thu, Mar 29, 2018 at 08:49:21PM +0200, Michael Niedermayer wrote:  
> > > > On Thu, Mar 29, 2018 at 02:53:52PM +0100, Derek Buitenhuis wrote:    
> > > > > On 3/29/2018 2:13 AM, Michael Niedermayer wrote:    
> > > > > > Well, no unless we want a single unified API that represents information
> > > > > > about timespans.
> > > > > > We can use completely unrelated systems to handle the virtual timeline,
> > > > > > the codec and related changes, chapters, ...    
> > > > > 
> > > > > Personal opinion time: from a design and use perspective, a single unified
> > > > > API to handle all of those things and their edge cases sounds like a
> > > > > nightmare to use and write.
> > > > 
> > > > That is not what I meant.
> > > > 
> > > > What I meant is to align the APIs that handle timespan-related information
> > > > and not have totally different interfaces for each.
> > > > For example, they might share the same struct in some cases to deliver data.
> > > > They do have in common that they all import/export data for a sequence of timespans.
> > > > 
> > > > Also, the function interfaces could be aligned with each other.
> > > > 
> > > > They also may have in common that some formats store the data interleaved
> > > > in the stream and some store it in a main header.
> > > > If this is so, the case of collecting all this info from the whole
> > > > duration of a file to store it in the output file header at the start
> > > > may be a situation common to multiple of these.
> > > > A similar API may allow simplifying whichever way we handle this.
> > > > 
> > > > I am sure there are more advantages of having APIs which deal with
> > > > similar types of data be similar ...
> > > 
> > > An example of an API that serves a similar kind of information but is
> > > different is AVChapter.
> > > I think we should minimize the number of different public structs that
> > > describe timespans, when that is easily and cleanly possible.
> >   
> 
> > Like how?  
> 
> Wait, you cannot think of a single way to store timespan data
> except by using a completely separate and different API for each case where
> we need to store timespan data? Not even similarity in the order of
> arguments of functions?
> Because that's ultimately the only thing I suggested we should not do.

I can think of a thousand things. I could bikeshed proposals all day,
believe me.

But making very vague suggestions that can't be acted upon and that are
out of this world anyway is not a good way to make progress. You're
just lengthening the discussion by forcing Derek to pull concrete ideas
out of your nose.

If you have an idea that you think is much better, just make a concrete
proposal.

> 
> > 
> > This sounds like a bikeshed smoke bomb to stop all progress again.  
> 
> Why do you try to incite hostilities and discord?

Well, why do you seemingly put brakes on this patch, with nothing
real to say, just vague proposals about generalizing everything to a
generic thing?

> I mean, Derek (who wrote the patch) has not even had time to reply to my
> comment.
> I am very interested in what his opinion is. Does he agree? Does he
> think it's impossible, or too hard, or irrelevant? Or does he see something
> I didn't think of at all?
> 
> Let's wait for Derek before arguing over it.
Derek Buitenhuis March 31, 2018, 3:25 a.m. UTC | #22
On 3/30/2018 11:20 AM, Michael Niedermayer wrote:
> I mean, Derek (who wrote the patch) has not even had time to reply to my
> comment.
> I am very interested in what his opinion is. Does he agree? Does he
> think it's impossible, or too hard, or irrelevant? Or does he see something
> I didn't think of at all?
> 
> Let's wait for Derek before arguing over it.

Been on planes the last 1.5 days - will respond properly tomorrow.

- Derek
Derek Buitenhuis March 31, 2018, 2:39 p.m. UTC | #23
On 3/29/2018 7:55 PM, Michael Niedermayer wrote:
>> That is not what I meant.
>>
>> What I meant is to align the APIs that handle timespan-related information
>> and not have totally different interfaces for each.
>> For example, they might share the same struct in some cases to deliver data.
>> They do have in common that they all import/export data for a sequence of timespans.

[...]

>> They also may have in common that some formats store the data interleaved
>> in the stream and some store it in a main header.

Not sure I follow what this has to do with timelines? There is no format that
exists that stores timeline data interleaved, as far as I know - it is a
purely theoretical scenario.

>> If this is so, the case of collecting all this info from the whole
>> duration of a file to store it in the output file header at the start
>> may be a situation common to multiple of these.
>> A similar API may allow simplifying whichever way we handle this.

[...]

>> I am sure there are more advantages of having APIs which deal with
>> similar types of data be similar ...
> 
> An example of an API that serves a similar kind of information but is
> different is AVChapter.
> I think we should minimize the number of different public structs that
> describe timespans, when that is easily and cleanly possible.

What do you have in mind with regards to 'sharing'? I do not particularly
agree that chapters and edits should share stuff because they both happen to have
start and end times. They're quite different things, even if some formats have
conflated them (design flaw IMO).

- Derek
Michael Niedermayer March 31, 2018, 11:44 p.m. UTC | #24
On Sat, Mar 31, 2018 at 03:39:58PM +0100, Derek Buitenhuis wrote:
> On 3/29/2018 7:55 PM, Michael Niedermayer wrote:
> >> That is not what I meant.
> >>
> >> What I meant is to align the APIs that handle timespan-related information
> >> and not have totally different interfaces for each.
> >> For example, they might share the same struct in some cases to deliver data.
> >> They do have in common that they all import/export data for a sequence of timespans.
> 
> [...]
> 
> >> They also may have in common that some formats store the data interleaved
> >> in the stream and some store it in a main header.
> 
> Not sure I follow what this has to do with timelines? There is no format that
> exists that stores timeline data interleaved, as far as I know - it is a
> purely theoretical scenario.

That surprises me. But if this case never occurs (and never will), that's a good
thing. Fewer cases to consider.



> 
> >> If this is so, the case of collecting all this info from the whole
> >> duration of a file to store it in the output file header at the start
> >> may be a situation common to multiple of these.
> >> A similar API may allow simplifying whichever way we handle this.
> 
> [...]
> 

> >> I am sure there are more advantages of having APIs which deal with
> >> similar types of data be similar ...
> > 
> > An example of an API that serves a similar kind of information but is
> > different is AVChapter.
> > I think we should minimize the number of different public structs that
> > describe timespans, when that is easily and cleanly possible.
> 
> What do you have in mind with regards to 'sharing'? I do not particularly
> agree that chapters and edits should share stuff because they both happen to have
> start and end times. They're quite different things, even if some formats have
> conflated them (design flaw IMO).

I did not think deeply about this, but the obvious three options are: either
give each its own struct and align the APIs used to access them;
to show this with just one function:

AVTimelineList         *av_timeline_list_alloc(size_t timeline_count);
AVChapterList          *av_chapterlist_alloc(size_t chapter_count);
AVTimelineMetadataList *av_timeline_metadata_list_alloc(size_t metadata_count);
AVTimelineCodecList    *av_timeline_codec_list_alloc(size_t codec_count);

or have some more common struct, maybe similar to side data (it is, after all,
some kind of side/meta data that is specific to timespans);

or put it all in AVFormatContext as normally allocated arrays, like chapters are
currently.

What I find a bit ugly is that each API uses the style popular at the time
at which it was added, and so far they are each completely different even though
they are semantically handling the same "unit" of data, that is, information
about "timespans".

We have AVChapters, which are a public field in AVFormatContext.
There is AV_PKT_DATA_STRINGS_METADATA, which is binary-specified side data, and
there is the newly proposed timeline, which is opaque side data with accessor functions.

It's not the inherent differences of the data that I want to push into a square hole,
but all the common parts that could be the same and are very different.
If two things don't fit together, I am certainly not intending to suggest putting them
together. Nor that two things that have reason to be behind different APIs should
not be.

I mainly wanted to point out that our APIs for these things are quite
different, and that this should be thought about when adding a new API.
It's not a specific cleanup or change I have in mind; more so, I wanted to
call attention to the inconsistency that there is between the APIs,
so it is considered when adding a new API.
If you determine that there's nothing that should be changed from what you
intended previously, that's perfectly fine with me.

Thanks

[...]
Derek Buitenhuis April 1, 2018, 2:39 p.m. UTC | #25
On 4/1/2018 12:44 AM, Michael Niedermayer wrote:
>> Not sure I follow what this has to do with timelines? There is no format that
>> exists that stores timeline data interleaved, as far as I know - it is a
>> purely theoretical scenario.
> 
> That surprises me. But if this case never occurs (and never will), that's a good
> thing. Fewer cases to consider.

At least as far as I'm aware, yes.

>> What do you have in mind with regards to 'sharing'? I do not particularly
>> agree that chapters and edits should share stuff because they both happen to have
>> start and end times. They're quite different things, even if some formats have
>> conflated them (design flaw IMO).
> 
> I did not think deeply about this, but the obvious three options are: either
> give each its own struct and align the APIs used to access them;
> to show this with just one function:
> 
> AVTimelineList         *av_timeline_list_alloc(size_t timeline_count);
> AVChapterList          *av_chapterlist_alloc(size_t chapter_count);
> AVTimelineMetadataList *av_timeline_metadata_list_alloc(size_t metadata_count);
> AVTimelineCodecList    *av_timeline_codec_list_alloc(size_t codec_count);
> 
> or have some more common struct, maybe similar to side data (it is, after all,
> some kind of side/meta data that is specific to timespans);
> 
> or put it all in AVFormatContext as normally allocated arrays, like chapters are
> currently.
> 
> What I find a bit ugly is that each API uses the style popular at the time
> at which it was added, and so far they are each completely different even though
> they are semantically handling the same "unit" of data, that is, information
> about "timespans".
> 
> We have AVChapters, which are a public field in AVFormatContext.
> There is AV_PKT_DATA_STRINGS_METADATA, which is binary-specified side data, and
> there is the newly proposed timeline, which is opaque side data with accessor functions.
> 
> It's not the inherent differences of the data that I want to push into a square hole,
> but all the common parts that could be the same and are very different.
> If two things don't fit together, I am certainly not intending to suggest putting them
> together. Nor that two things that have reason to be behind different APIs should
> not be.
> 
> I mainly wanted to point out that our APIs for these things are quite
> different, and that this should be thought about when adding a new API.
> It's not a specific cleanup or change I have in mind; more so, I wanted to
> call attention to the inconsistency that there is between the APIs,
> so it is considered when adding a new API.
> If you determine that there's nothing that should be changed from what you
> intended previously, that's perfectly fine with me.


I agree that the APIs are annoyingly different, but that they should remain
separate APIs. The suggestion of aligning them (same names, args, etc.) seems the
most reasonable to me - consistency.

I don't really think they can share some common structs, but they can certainly
be aligned.

However, some of the stuff you listed doesn't make sense to me:

* Codec changes are not part of timeline data; they're entirely separate things
  in every format I'm aware of. In MPEG-TS it just changes, in MOV/MP4 it is done
  via multiple 'stsd' atoms/boxes, which don't really carry any time related
  data - there is another atom/box that maps samples to a particular 'stsd',
  and that's it.
* What is AVTimelineMetadataList meant to be?

Lastly, what is the intent here? Is it to align all of these before timeline
support, or to deprecate and align after? Right now the API is aligned with
other side data APIs. I need to know which scenario this is before I go
ahead and implement this.

- Derek
Michael Niedermayer April 1, 2018, 9:38 p.m. UTC | #26
On Sun, Apr 01, 2018 at 03:39:07PM +0100, Derek Buitenhuis wrote:
> On 4/1/2018 12:44 AM, Michael Niedermayer wrote:
> >> Not sure I follow what this has to do with timelines? There is no format that
> >> exists that stores timeline data interleaved, as far as I know - it is a
> >> purely theoretical scenario.
> > 
> > That surprises me. But if this case never occurs (and never will), that's a good
> > thing. Fewer cases to consider.
> 
> At least as far as I'm aware, yes.
> 
> >> What do you have in mind with regards to 'sharing'? I do not particularly
> >> agree that chapters and edits should share stuff because they both happen to have
> >> start and end times. They're quite different things, even if some formats have
> >> conflated them (design flaw IMO).
> > 
> > I did not think deeply about this, but the obvious three options are: either
> > give each its own struct and align the APIs used to access them;
> > to show this with just one function:
> > 
> > AVTimelineList         *av_timeline_list_alloc(size_t timeline_count);
> > AVChapterList          *av_chapterlist_alloc(size_t chapter_count);
> > AVTimelineMetadataList *av_timeline_metadata_list_alloc(size_t metadata_count);
> > AVTimelineCodecList    *av_timeline_codec_list_alloc(size_t codec_count);
> > 
> > or have some more common struct, maybe similar to side data (it is, after all,
> > some kind of side/meta data that is specific to timespans);
> > 
> > or put it all in AVFormatContext as normally allocated arrays, like chapters are
> > currently.
> > 
> > What I find a bit ugly is that each API uses the style popular at the time
> > at which it was added, and so far they are each completely different even though
> > they are semantically handling the same "unit" of data, that is, information
> > about "timespans".
> > 
> > We have AVChapters, which are a public field in AVFormatContext.
> > There is AV_PKT_DATA_STRINGS_METADATA, which is binary-specified side data, and
> > there is the newly proposed timeline, which is opaque side data with accessor functions.
> > 
> > It's not the inherent differences of the data that I want to push into a square hole,
> > but all the common parts that could be the same and are very different.
> > If two things don't fit together, I am certainly not intending to suggest putting them
> > together. Nor that two things that have reason to be behind different APIs should
> > not be.
> > 
> > I mainly wanted to point out that our APIs for these things are quite
> > different, and that this should be thought about when adding a new API.
> > It's not a specific cleanup or change I have in mind; more so, I wanted to
> > call attention to the inconsistency that there is between the APIs,
> > so it is considered when adding a new API.
> > If you determine that there's nothing that should be changed from what you
> > intended previously, that's perfectly fine with me.
> 
> 
> I agree that the APIs are annoyingly different, but that they should remain
> separate APIs. The suggestion of aligning them (same names, args, etc.) seems the
> most reasonable to me - consistency.
> 
> I don't really think they can share some common structs, but they can certainly
> be aligned.
> 

> However, some of the stuff you listed doesn't make sense to me:
> 
> * Codec changes are not part of timeline data; they're entirely separate things
>   in every format I'm aware of. In MPEG-TS it just changes, in MOV/MP4 it is done
>   via multiple 'stsd' atoms/boxes, which don't really carry any time related
>   data - there is another atom/box that maps samples to a particular 'stsd',
>   and that's it.

Yes, but in an abstract way each timespan has one codec;
it's the same kind of "data structure" even if it's not stored this way in
formats.



> * What is AVTimelineMetadataList meant to be?

what we have in AV_PKT_DATA_STRINGS_METADATA


> 
> Lastly, what is the intent here? Is it to align all of these before timeline
> support, or to deprecate and align after? Right now the API is aligned with
> other side data APIs. I need to know which scenario this is before I go
> ahead and implement this.

Part of the intent is to have an easier-to-understand API where things follow
a similar design.
Another part is "compatibility". Ideally a format like mpegts that just
changes codecs should fill whatever data structure we use for codec changes
in the future. And simply passing these structures over to a mov/mp4 muxer
would make it write stsd accordingly.
It would be suboptimal, for example, if these two used different ways to
export this codec change data, at least unless we also provide code to
convert.
I didn't check now what exactly is stored in it currently, but
with AV_PKT_DATA_STRINGS_METADATA and the chapter array (which also has
metadata in it) we possibly already have something that goes a bit in that direction.

It's absolutely not my intent to suggest that other APIs would need to
be changed first. It is much more to bring the currently inconsistent APIs to
people's attention, so it is given a thought when a new API is added.
Deprecating APIs and replacing them also has a cost for our users ...

Thanks

[...]
Derek Buitenhuis April 2, 2018, 3:46 p.m. UTC | #27
On 4/1/2018 10:38 PM, Michael Niedermayer wrote:
>> I agree that the APIs are annoyingly different, but that they should remain
>> separate APIs. The suggestion of aligning them (same names, args, etc.) seems the
>> most reasonable to me - consistency.
>>
>> I don't really think they can share some common structs, but they can certainly
>> be aligned.
>>
> 
>> However, some of the stuff you listed doesn't make sense to me:
>>
>> * Codec changes are not part of timeline data; they're entirely separate things
>>   in every format I'm aware of. In MPEG-TS it just changes, in MOV/MP4 it is done
>>   via multiple 'stsd' atoms/boxes, which don't really carry any time related
>>   data - there is another atom/box that maps samples to a particular 'stsd',
>>   and that's it.
> 
> Yes, but in an abstract way each timespan has one codec;
> it's the same kind of "data structure" even if it's not stored this way in
> formats.

I disagree with this. Codec changes shouldn't be represented as timespans.
They may happen in-band, with a new PMT, new stsd, etc. Short of indexing
and possibly decoding the entire file, you won't have a timespan at all
to output. I also think it's just a fundamentally wrong way to go about
supporting codec or format changes.

>> * What is AVTimelineMetadataList meant to be?
> 
> what we have in AV_PKT_DATA_STRINGS_METADATA

That is very vague. Could you elaborate on this? A lot of different and not
necessarily time-related data can be stored in this generic type. I know
e.g. WebM DASH cue points are, but that's the only one I think is even vaguely
related. And cue points are not timespans, but a singular timestamp.

>> Lastly, what is the intent here? Is it to align all of these before timeline
>> support, or to deprecate and align after? Right now the API is aligned with
>> other side data APIs. I need to know which scenario this is before I go
>> ahead and implement this.
> 
> Part of the intent is to have an easier-to-understand API where things follow
> a similar design.

I don't think anybody will disagree with this. But I need a clear path forward,
and not a vague 'we should be consistent', which could mean many things.

> Another part is "compatibility". Ideally a format like mpegts that just
> changes codecs should fill whatever data structure we use for codec changes
> in the future. And simply passing these structures over to a mov/mp4 muxer
> would make it write stsd accordingly.

This can be accomplished with the current packet side data, I think. The code
just hasn't been written to do so yet. Possibly extending or replacing the
'new extradata' type is the best solution IMO. It's not timeline/timespan data,
and certainly can't be provided up-front, but only at packet level.
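
For illustration, a minimal sketch of how a caller can already react to that
per-packet signal today; av_packet_get_side_data() and
AV_PKT_DATA_NEW_EXTRADATA are existing lavc APIs, while reinit_decoder() is a
hypothetical application helper:

#include <libavcodec/avcodec.h>

/* Sketch: detect an in-band parameter change via the existing
 * AV_PKT_DATA_NEW_EXTRADATA packet side data. */
static int handle_packet(AVCodecContext *dec, AVPacket *pkt)
{
    int size = 0;
    uint8_t *extradata =
        av_packet_get_side_data(pkt, AV_PKT_DATA_NEW_EXTRADATA, &size);

    if (extradata) {
        /* Hypothetical helper: flush and reopen the decoder with the
         * new extradata before feeding it this packet. */
        reinit_decoder(dec, extradata, size);
    }

    return avcodec_send_packet(dec, pkt);
}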

> It would be suboptimal, for example, if these two used different ways to
> export this codec change data, at least unless we also provide code to
> convert.

Can't disagree with that, but I think this problem is entirely orthogonal to
timeline data. There is nothing that, IMO, should be shared.

> I didn't check just now what exactly is stored in it currently, but with
> AV_PKT_DATA_STRINGS_METADATA and the chapter array (which also has metadata
> in it) we possibly already have something that goes a bit in that direction.

Ideally we could deprecate the chapter array in favour of an actual struct, which
aligns with how we currently do side data (encryption, this RFC, etc.). I don't
quite think there is much overlap between timeline edits and chapters, and
conflating the two, in my opinion, ends up making it more annoying for users.
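
For reference, the chapter array in question is the plain-struct API (copied
here from libavformat/avformat.h as of this writing):

typedef struct AVChapter {
    int id;                 ///< unique ID to identify the chapter
    AVRational time_base;   ///< time base in which the start/end timestamps are specified
    int64_t start, end;     ///< chapter start/end time in time_base units
    AVDictionary *metadata;
} AVChapter;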

> It's absolutely not my intent to suggest that other APIs would need to
> be changed first. It is much more to bring the current inconsistent APIs to
> people's attention so it is given a thought when a new API is added.
> Deprecating APIs and replacing them also has a cost for our users ...

OK. What is my path forward here then? Make a good API and then work on fixing
the other older worse APIs like chapters? Use the old chapter-style API for timelines
for the sake of consistency?

I get the worry about keeping a large API consistent, but I cannot continue the
RFC, or even write an implementation, until I know exactly what I need to do to
move it forward; an actionable path to take. You have very valid concerns, but I
don't know what I'm actually supposed to take action on.

- Derek
Michael Niedermayer April 2, 2018, 10:45 p.m. UTC | #28
Hi

On Mon, Apr 02, 2018 at 04:46:18PM +0100, Derek Buitenhuis wrote:
> On 4/1/2018 10:38 PM, Michael Niedermayer wrote:
> >> I agree that the APIs are annoyingly different, but that they should remain
> >> separate APIs. The suggestion of aligned (same names, args, etc.) seems the
> >> most reasonable to me - consistency.
> >>
> >> I don't really think they can share some common structs, but they can certainly
> >> be aligned.
> >>
> > 
> >> However, some of the stuff you listed doesn't make sense to me:
> >>
> >> * Codec changes are not part of timeline data; they're entirely separate things
> >>   in every format I'm aware of. In MPEG-TS it just changes, in MOV/MP4 it is done
> >>   via multiple 'stsd' atoms/boxes, which don't really carry any time related
> >>   data - there is another atom/box that maps samples to a particular 'stsd',
> >>   and that's it.
> > 
> > Yes, but in an abstract way each timespan has one codec.
> > It's the same kind of "data structure" even if it's not stored this way in
> > formats.
> 
> I disagree with this. Codec changes shouldn't be represented as timespans.

I think we misunderstand each other here on the terminology.

If we have a list of information for ranges of time, that is, a set of
(start, end, info) triplets, then that is what I meant when I used the term
timespans here. We can use any other term if timespans have a different
meaning for you.

For example, if the codec is not always the same for a stream, then that is
pretty much this kind of information by definition.
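
Something like this, as a rough sketch of the concept only (the names here
are made up for illustration and are not a proposed API):

/* Sketch of the generic "timespan" concept: a list of
 * (start, end, info) triplets. Hypothetical names, not a proposed API. */
typedef struct TimeSpan {
    int64_t start;  /* start of the range, in stream timebase units */
    int64_t end;    /* end of the range, in stream timebase units */
    void   *info;   /* per-range payload, e.g. codec parameters */
} TimeSpan;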


> They may happen in-band, with a new PMT, new stsd, etc. Short of indexing
> and possibly decoding the entire file, you won't have a timespan at all
> to output. I also think it's just a fundamentally wrong way to go about
> supporting codec or format changes.
> 
> >> * What is AVTimelineMetadataList meant to be?
> > 
> > what we have in AV_PKT_DATA_STRINGS_METADATA
> 
> That is very vague. Could you elaborate on this? A lot of different and not
> necessarily time-related data can be stored in this generic type. I know
> e.g. WebM DASH cue points are, but that's the only one I think is even vaguely
> related. And cue points are not timespans, but a singular timestamp.
> 
> >> Lastly, what is the intent here? Is it to align all of these before timeline
> >> support, or to deprecate and align after? Right now the API is aligned with
> >> other side data APIs. I need to know which scenario this is before I go
> >> ahead and implement this.
> > 
> > Part of the intent is to have an easier to understand API where things follow
> > a similar design.
> 
> I don't think anybody will disagree with this. But I need a clear path forward,
> and not a vague 'we should be consistent', which could mean many things.
>
> > Another part is "compatibility". Ideally a format like mpegts that just
> > changes codecs should fill whatever data structure we use for codec changes
> > in the future. And simply passing these structures over to a mov/mp4 muxer
> > would make it write stsd accordingly.
> 
> This can be accomplished with the current packet side data, I think. The code
> just hasn't been written to do so yet. Possibly extending or replacing the
> 'new extradata' type is the best solution IMO. It's not timeline/timespan data,
> and certainly can't be provided up-front, but only at packet level.
> 
> > It would be suboptimal, for example, if these two used different ways to
> > export this codec change data, at least unless we also provide code to
> > convert.
> 
> Can't disagree with that, but I think this problem is entirely orthogonal to
> timeline data. There is nothing that, IMO, should be shared.
> 
> > I didn't check just now what exactly is stored in it currently, but with
> > AV_PKT_DATA_STRINGS_METADATA and the chapter array (which also has metadata
> > in it) we possibly already have something that goes a bit in that direction.
> 
> Ideally we could deprecate the chapter array in favour of an actual struct, which
> aligns with how we currently do side data (encryption, this RFC, etc.). I don't
> quite think there is much overlap between timeline edits and chapters, and
> conflating the two, in my opinion, ends up making it more annoying for users.
> 
> > It's absolutely not my intent to suggest that other APIs would need to
> > be changed first. It is much more to bring the current inconsistent APIs to
> > people's attention so it is given a thought when a new API is added.
> > Deprecating APIs and replacing them also has a cost for our users ...
> 
> OK. What is my path forward here then? Make a good API and then work on fixing
> the other older worse APIs like chapters? Use the old chapter-style API for timelines
> for the sake of consistency?
> 
> I get the worry about keeping a large API consistent, but I cannot continue the
> RFC, or even write an implementation, until I know exactly what I need to do to
> move it forward; an actionable path to take. You have very valid concerns, but I
> don't know what I'm actually supposed to take action on.

ok

Let's see why every API we have is different.
It's because people add new APIs that are fundamentally different from existing
APIs. I think it's because each has been implemented by a different developer
who believed his choice was better than all prior ones.

I would be surprised if, in 10 years, people see the choices made today
fundamentally differently from how we see the choices people made 10 years ago.

So in this sense my question is: what advantage does the opaque side data based
system have over a system simply using a C struct, the way AVChapter works?

About actual action and moving forward:
If you want a side data based system, there is very little to do. Just take a look
at the API and existing APIs, and if you see nothing that can be made more similar
and aligned, then that's fine; nothing to change ATM.
In the future we may want to update existing APIs to use side data, unless by then
something else is more trendy.

If OTOH you prefer a C struct based one, I think this patch can be simplified a lot,
as a lot of abstraction would not be needed; we also would not need to update
AVChapter to be consistent, and users would not be required to update their code
to a new AVChapter API.

Thanks

[...]
wm4 April 3, 2018, 3:15 a.m. UTC | #29
On Tue, 3 Apr 2018 00:45:17 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> So in this sense my question is: what advantage does the opaque side data based
> system have over a system simply using a C struct, the way AVChapter works?

The _only_ reason the side data is opaque is because side data can't
contain pointers, which again is because side data must be a flat byte
array for legacy reasons. (Yes, we should change this, but probably
can't now.)

So I think all of what you've been writing can be reduced to: should this
really be side data, or would it be better to just make it a new
AVStream field?
Derek Buitenhuis April 3, 2018, 4:06 p.m. UTC | #30
On 4/3/2018 4:15 AM, wm4 wrote:
> So I think all of what you've been writing can be reduced to: should this
> really be side data, or would it be better to just make it a new
> AVStream field?

I don't really have an opinion on this one way or another.

Do others?

- Derek
wm4 April 3, 2018, 4:26 p.m. UTC | #31
On Tue, 3 Apr 2018 17:06:03 +0100
Derek Buitenhuis <derek.buitenhuis@gmail.com> wrote:

> On 4/3/2018 4:15 AM, wm4 wrote:
> > So I think all of what you've been writing can be reduced to: should this
> > really be side data, or would it be better to just make it a new
> > AVStream field?
> 
> I don't really have an opinion on this one way or another.
> 
> Do others?

Uh no idea. One angle:

The idea with AVStream side data is that it can be attached as AVPacket
side data. So if an API user is fine with treating all kinds of side
data per AVPacket, it can call av_format_inject_global_side_data() and
ignore AVStream side data. Basically, you could ask the question
whether this kind of side data makes sense as per-AVPacket, and if not,
then it should be AVStream side data. Maybe.
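
Roughly, the two consumption paths look like this (a sketch only;
AV_PKT_DATA_TIMELINE is the type proposed in this RFC, the rest are existing
lavf APIs):

#include <libavformat/avformat.h>

/* Sketch: the two ways an API user can consume stream-level side data. */
static void consume(AVFormatContext *fmt, AVStream *st)
{
    /* Path 1: query it on the AVStream directly. */
    int size = 0;
    uint8_t *sd = av_stream_get_side_data(st, AV_PKT_DATA_TIMELINE, &size);
    if (sd) {
        /* parse with the av_timeline_* helpers */
    }

    /* Path 2: have lavf repeat global side data on the first packet of
     * each stream, and then only ever look at AVPacket side data. */
    av_format_inject_global_side_data(fmt);
}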

Another angle is that the API is better because side data would not add
yet another AVStream field. It makes the API more discoverable: the API
user can easily detect an unsupported side data type and handle it as
soon as he has a sample, instead of having to go over every AVStream
field during the initial implementation and wondering what the hell it
does (and things going wrong when it's not handled explicitly). I probably
didn't word that too well, but I hope you get what I mean.
Derek Buitenhuis April 5, 2018, 3:26 p.m. UTC | #32
On 4/3/2018 5:26 PM, wm4 wrote:
> Uh no idea. One angle:
> 
> The idea with AVStream side data is that it can be attached as AVPacket
> side data. So if an API user is fine with treating all kinds of side
> data per AVPacket, it can call av_format_inject_global_side_data() and
> ignore AVStream side data. Basically, you could ask the question
> whether this kind of side data makes sense as per-AVPacket, and if not,
> then it should be AVStream side data. Maybe.

It doesn't make sense as per-packet side-data. 

> Another angle is that the API is better because side data would not add
> yet another AVStream field. It makes the API more discoverable: the API
> user can easily detect an unsupported side data type and handle it as
> soon as he has a sample, instead of having to go over every AVStream
> field during the initial implementation and wondering what the hell it
> does (and things going wrong when it's not handled explicitly). I probably
> didn't word that too well, but I hope you get what I mean.

I got what you mean. I still don't have an opinion one way or another.
It's a trade-off between Yet Another AVStream Field vs a more complex
API and implementation as side data. I may be slightly erring towards
an AVStream field.

I would be interested to hear others' opinions on this if they have
a stronger one.

- Derek
Nicolas George April 5, 2018, 6:35 p.m. UTC | #33
Derek Buitenhuis (2018-04-05):
> I got what you mean. I still don't have an opinion one way or another.
> It's a trade-off between Yet Another AVStream Field vs a more complex
> API and implementation as side data. I may be slightly erring towards
> an AVStream field.
> 
> I would be interested to hear others' opinions on this if they have
> a stronger one.

A completely braindead API versus using a data structure the way it is
supposed to be used, by adding fields when necessary? You call this a
trade-off?

Regards,
Derek Buitenhuis April 5, 2018, 6:39 p.m. UTC | #34
On 4/5/2018 7:35 PM, Nicolas George wrote:
> A completely braindead API versus using a data structure the way it is
> supposed to be used, by adding fields when necessary? You call this a
> trade-off?

You could have voiced your opinion without being a jackass about it. Cut
it out.

- Derek
Derek Buitenhuis April 5, 2018, 6:51 p.m. UTC | #35
Hi,

If you wish to address me, please do so publicly in the future. Nobody
benefits from this stuff happening in private. CCing ffmpeg-devel where
this belongs.

On 4/5/2018 7:44 PM, Nicolas George wrote:
> This remark would have more weight if you also made it to people who
> actually deserve it, like the person you were just discussing with, who
> has attacked and insulted, in the past, not only me but also
> Michael, Carl Eugen and others, including in this very thread.

I don't condone it from anyone; I merely replied to a direct reply to myself.
I don't necessarily read every mail on the list, specifically for this reason
(petty fighting), so I don't know what history you guys all have and who
"actually" deserves to be called out. Everyone who acts like a butt deserves
to be called out, myself included (though I don't think I have this time).

> I guess it is easier to stand up to people who try to be decent but do
> not mince their words than to bullies.

What? That is entirely subjective. There are LOTS of times you, and all you
listed, have been bullies, IMO. Cut out the holier-than-thou stuff, please.
This list has a serious jerk problem on all "sides", which is why I largely
avoid it unless I think I have something of value to contribute. I've tried to
keep my replies here professional. Please do not accuse me of some kind of
conspiracy or favoritism.

- Derek
Rostislav Pehlivanov April 5, 2018, 7:14 p.m. UTC | #36
On 5 April 2018 at 16:26, Derek Buitenhuis <derek.buitenhuis@gmail.com>
wrote:

> On 4/3/2018 5:26 PM, wm4 wrote:
> > Uh no idea. One angle:
> >
> > The idea with AVStream side data is that it can be attached as AVPacket
> > side data. So if an API user is fine with treating all kinds of side
> > data per AVPacket, it can call av_format_inject_global_side_data() and
> > ignore AVStream side data. Basically, you could ask the question
> > whether this kind of side data makes sense as per-AVPacket, and if not,
> > then it should be AVStream side data. Maybe.
>
> It doesn't make sense as per-packet side-data.
>
> > Another angle is that the API is better because side data would not add
> > yet another AVStream field. It makes the API more discoverable: the API
> > user can easily detect an unsupported side data type and handle it as
> > soon as he has a sample, instead of having to go over every AVStream
> > field during the initial implementation and wondering what the hell it
> > does (and things going wrong when it's not handled explicitly). I probably
> > didn't word that too well, but I hope you get what I mean.
>
> I got what you mean. I still don't have an opinion one way or another.
> It's a trade-off between Yet Another AVStream Field vs a more complex
> API and implementation as side data. I may be slightly erring towards
> an AVStream field.
>
> I would be interested to hear others' opinions on this if they have
> a stronger one.
>
>
I think it makes sense to have it in AVStream rather than as side data.
I don't have an opinion on whether codec switches should be indicated. It
would be nice for them to be a separate segment if that's possible and
reliable, since it would allow users to init everything they need at startup.
If not, then they can just handle them as they come, like a stream, and reinit
if they have to. After all, this API only concerns complete files rather than
streams.
wm4 April 5, 2018, 7:15 p.m. UTC | #37
On Thu, 5 Apr 2018 19:51:49 +0100
Derek Buitenhuis <derek.buitenhuis@gmail.com> wrote:

> Hi,
> 
> If you wish to address me, please do so publicly in the future. Nobody
> benefits from this stuff happening in private. CCing ffmpeg-devel where
> this belongs.
> 
> On 4/5/2018 7:44 PM, Nicolas George wrote:
> > This remark would have more weight if you also made it to people who
> > actually deserve it, like the person you were just discussing with, who
> > has attacked and insulted, in the past, not only me but also
> > Michael, Carl Eugen and others, including in this very thread.  

It would be better if the person who wrote this weren't the same one who
called me a "troll" in the past, who consistently ignores my posts in the most
disrespectful manner, who is generally without any civility when someone
disagrees with him, and who listed at least one person who aggressively
attacked me in the past as someone who is supposedly a "victim" of mine.

I'm really sick of this savage behavior and I suggest we actually
enforce the CoC.

(I'm ready to accept that Michael's behavior in the past was without
any actual ill intent.)
Derek Buitenhuis April 5, 2018, 7:27 p.m. UTC | #38
On 4/5/2018 8:14 PM, Rostislav Pehlivanov wrote:
> I think it makes sense to have it in AVStream rather than as side data.
> I don't have an opinion on whether codec switches should be indicated. It
> would be nice for them to be a separate segment if that's possible and
> reliable, since it would allow users to init everything they need at startup.
> If not, then they can just handle them as they come, like a stream, and reinit
> if they have to.

The consensus so far seems to be AVStream members, so I will switch to that
approach.

For codec changes: not all formats indicate them up front, such as MPEG-TS,
which we may be streaming, so I think they have to be handled as they come
anyway.

> After all this API only concerns complete files rather than
> streams.

'Streaming' should actually be possible in the same sense you can stream a
progressive MP4 or Matroska file, since this data is located in the same place
as the index, which will be upfront (or at least, all in one place).

- Derek

Patch

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 50c34db..6f54495 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -1358,6 +1358,14 @@  enum AVPacketSideDataType {
     AV_PKT_DATA_ENCRYPTION_INFO,
 
     /**
+     * This side data contains timeline entries for a given stream. This type
+     * will only appear as stream side data.
+     *
+     * The format is not part of the ABI; use the av_timeline_* methods to access it.
+     */
+    AV_PKT_DATA_TIMELINE,
+
+    /**
      * The number of side data types.
      * This is not part of the public API/ABI in the sense that it may
      * change when new side data types are added.
diff --git a/libavutil/timeline.h b/libavutil/timeline.h
new file mode 100644
index 0000000..f1f3e1b
--- /dev/null
+++ b/libavutil/timeline.h
@@ -0,0 +1,160 @@ 
+/*
+ * Copyright (C) 2018 Derek Buitenhuis
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_TIMELINE_H
+#define AVUTIL_TIMELINE_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include "rational.h"
+
+typedef struct AVTimelineEntry {
+    /**
+     * The start time of the given timeline entry, in stream timebase units.
+     * If this value is AV_NOPTS_VALUE, you must display black for the duration
+     * of the entry. For audio, silence must be played.
+     */
+    int64_t start;
+
+    /**
+     * The duration of the given timeline entry, in stream timebase units.
+     */
+    int64_t duration;
+
+    /**
+     * The rate at which this entry should be played back. The value is a multiplier
+     * of the stream's rate, for example: 1.2 means play back this entry at 1.2x speed.
+     * If this value is 0, then the first sample (located at 'start') must be displayed
+     * for the duration of the entry.
+     */
+    AVRational media_rate;
+} AVTimelineEntry;
+
+/**
+ * Describes a timeline for a stream in terms of edits/entries. Each entry must be
+ * played back in order, according to the information in each. Each stream may have
+ * multiple timelines which need to be correlated between different streams.
+ */
+typedef struct AVTimeline {
+    /**
+     * The ID of a given timeline. Since each stream may have multiple timelines
+     * defined, this value is used to correlate different streams' timelines
+     * which should be used together. For example, if one has two streams,
+     * one video, and one audio, with two timelines each, the timelines
+     * with matching IDs should be used in conjunction, to ensure everything
+     * is in sync and matches. The concept is similar to that of EditionUID
+     * in Matroska.
+     */
+    uint32_t id;
+
+    /**
+     * An in-order array of entries for the given timeline.
+     * Each entry contains information on which samples to display for a
+     * particular edit.
+     */
+    AVTimelineEntry *entries;
+
+    /**
+     * Number of entries in the timeline.
+     */
+    size_t entry_count;
+} AVTimeline;
+
+typedef struct AVTimelineList {
+    /**
+     * An array of timelines associated with the stream.
+     */
+    AVTimeline *timelines;
+
+    /**
+     * The number of timelines associated with the stream.
+     */
+    size_t timeline_count;
+} AVTimelineList;
+
+/**
+ * Allocates an AVTimeline structure with the requested number of entries.
+ *
+ * @param entry_count The number of entries in the timeline.
+ *
+ * @return The new AVTimeline structure, or NULL on error.
+ */
+AVTimeline *av_timeline_alloc(size_t entry_count);
+
+
+/**
+ * Frees an AVTimeline structure and its members.
+ *
+ * @param timeline The AVTimeline structure to free.
+ */
+void av_timeline_free(AVTimeline *timeline);
+
+/**
+ * Allocates an AVTimelineList structure with room for the requested number of timelines.
+ *
+ * @param timeline_count The number of timelines in the list.
+ *
+ * @return The new AVTimelineList structure, or NULL on error.
+ */
+AVTimelineList *av_timeline_list_alloc(size_t timeline_count);
+
+
+/**
+ * Frees an AVTimelineList structure and its timelines, and their entries.
+ *
+ * @param timeline_list The AVTimelineList structure to free.
+ */
+void av_timeline_list_free(AVTimelineList *timeline_list);
+
+/**
+ * Allocates a new AVTimelineList structure and copies the data from an
+ * existing structure, allocating new members.
+ *
+ * @param timeline_list The existing AVTimelineList structure to copy data from.
+ *
+ * @return The new AVTimelineList structure, or NULL on error.
+ */
+AVTimelineList *av_timeline_list_clone(const AVTimelineList *timeline_list);
+
+/**
+ * Creates a copy of the AVTimelineList that is contained in the given side
+ * data. The resulting struct should be passed to av_timeline_list_free()
+ * when done.
+ *
+ * @param side_data The side data array.
+ * @param side_data_size The size of the side data array.
+ *
+ * @return The new AVTimelineList structure, or NULL on error.
+ */
+AVTimelineList *av_timeline_list_get_side_data(const uint8_t *side_data, size_t side_data_size);
+
+/**
+ * Allocates and initializes side data that holds a copy of the given timeline
+ * list. The resulting pointer should be either freed using av_free() or given
+ * to av_packet_add_side_data().
+ *
+ * @param timeline_list The AVTimelineList to put into side data.
+ * @param side_data_size A pointer to where the size can be filled in.
+ *
+ * @return The new side-data pointer, or NULL.
+ */
+uint8_t *av_timeline_list_add_side_data(const AVTimelineList *timeline_list, size_t *side_data_size);
+
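
For completeness, a usage sketch of the proposed API exactly as declared
above; this assumes the stream side data route (the side data vs. AVStream
field question is still open in the discussion above), with
av_stream_get_side_data() as the existing lavf accessor:

/* Sketch: read and apply the proposed timeline side data for a stream 'st'. */
int size = 0;
uint8_t *sd = av_stream_get_side_data(st, AV_PKT_DATA_TIMELINE, &size);
if (sd) {
    AVTimelineList *list = av_timeline_list_get_side_data(sd, size);
    if (list) {
        for (size_t i = 0; i < list->timeline_count; i++) {
            const AVTimeline *tl = &list->timelines[i];
            for (size_t j = 0; j < tl->entry_count; j++) {
                const AVTimelineEntry *e = &tl->entries[j];
                /* Play e->duration timebase units starting at e->start,
                 * at e->media_rate times normal speed; a start of
                 * AV_NOPTS_VALUE means black video / silent audio. */
            }
        }
        av_timeline_list_free(list);
    }
}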