diff mbox

[FFmpeg-devel,1/3] spherical: Add tiled equirectangular type and projection-specific properties

Message ID 20170210232530.GS5776@nb4
State Not Applicable
Headers show

Commit Message

Michael Niedermayer Feb. 10, 2017, 11:25 p.m. UTC
On Fri, Feb 10, 2017 at 04:11:43PM -0500, Vittorio Giovara wrote:
> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
> ---
> This should help not losing details over muxing and allows
> callers to get additional information in a clean manner.
> 
> Please keep me in CC.
> Vittorio
> 
>  doc/APIchanges        |  5 +++++
>  ffprobe.c             | 11 ++++++++--
>  libavformat/dump.c    | 10 +++++++++
>  libavutil/spherical.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  libavutil/version.h   |  2 +-
>  5 files changed, 81 insertions(+), 3 deletions(-)

breaks fate

Test matroska-spherical-mono failed. Look at tests/data/fate/matroska-spherical-mono.err for details.
make: *** [fate-matroska-spherical-mono] Error 1


[...]

Comments

Vittorio Giovara Feb. 14, 2017, 8:52 p.m. UTC | #1
On Fri, Feb 10, 2017 at 6:25 PM, Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Fri, Feb 10, 2017 at 04:11:43PM -0500, Vittorio Giovara wrote:
>> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
>> ---
>> This should help not losing details over muxing and allows
>> callers to get additional information in a clean manner.
>>
>> Please keep me in CC.
>> Vittorio
>>
>>  doc/APIchanges        |  5 +++++
>>  ffprobe.c             | 11 ++++++++--
>>  libavformat/dump.c    | 10 +++++++++
>>  libavutil/spherical.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  libavutil/version.h   |  2 +-
>>  5 files changed, 81 insertions(+), 3 deletions(-)
>
> breaks fate
>
> --- ./tests/ref/fate/matroska-spherical-mono    2017-02-10 23:43:51.993432371 +0100
> +++ tests/data/fate/matroska-spherical-mono     2017-02-11 00:24:10.297483318 +0100
> @@ -7,7 +7,7 @@
>  [/SIDE_DATA]
>  [SIDE_DATA]
>  side_data_type=Spherical Mapping
> -side_data_size=16
> +side_data_size=56
>  projection=equirectangular
>  yaw=45
>  pitch=30
> Test matroska-spherical-mono failed. Look at tests/data/fate/matroska-spherical-mono.err for details.
> make: *** [fate-matroska-spherical-mono] Error 1

Ah I didn't notice, it is fixed in the next commit, but I'll amend this one too.


I didn't see any comment/discussion, should I assume it is ok?
Please CC, thank you.
James Almer Feb. 14, 2017, 11:54 p.m. UTC | #2
On 2/14/2017 5:52 PM, Vittorio Giovara wrote:
> On Fri, Feb 10, 2017 at 6:25 PM, Michael Niedermayer
> <michael@niedermayer.cc> wrote:
>> On Fri, Feb 10, 2017 at 04:11:43PM -0500, Vittorio Giovara wrote:
>>> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
>>> ---
>>> This should help not losing details over muxing and allows
>>> callers to get additional information in a clean manner.
>>>
>>> Please keep me in CC.
>>> Vittorio
>>>
>>>  doc/APIchanges        |  5 +++++
>>>  ffprobe.c             | 11 ++++++++--
>>>  libavformat/dump.c    | 10 +++++++++
>>>  libavutil/spherical.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  libavutil/version.h   |  2 +-
>>>  5 files changed, 81 insertions(+), 3 deletions(-)
>>
>> breaks fate
>>
>> --- ./tests/ref/fate/matroska-spherical-mono    2017-02-10 23:43:51.993432371 +0100
>> +++ tests/data/fate/matroska-spherical-mono     2017-02-11 00:24:10.297483318 +0100
>> @@ -7,7 +7,7 @@
>>  [/SIDE_DATA]
>>  [SIDE_DATA]
>>  side_data_type=Spherical Mapping
>> -side_data_size=16
>> +side_data_size=56
>>  projection=equirectangular
>>  yaw=45
>>  pitch=30
>> Test matroska-spherical-mono failed. Look at tests/data/fate/matroska-spherical-mono.err for details.
>> make: *** [fate-matroska-spherical-mono] Error 1
> 
> Ah I didn't notice, it is fixed in the next commit, but I'll amend this one too.
> 
> 
> I didn't see any comment/discussion, should I assume it is ok?
> Please CC, thank you.

These are a lot of projection specific fields. It worries me as the
spec may change in the future with new fields added or existing
fields changing purpose. Not to mention the Mesh projection, which
has like fifty specific fields of its own.

Wouldn't it be a better idea to export the binary data of the
equi/cbmp/mesh boxes into an extradata-like field in the
AVSphericalMapping struct, and let the downstream application parse
it instead?

That way you can skip adding the extra Equirectangular enum, and
also add the Mesh one.
Vittorio Giovara Feb. 15, 2017, 2:20 a.m. UTC | #3
On Tue, Feb 14, 2017 at 6:54 PM, James Almer <jamrial@gmail.com> wrote:
> On 2/14/2017 5:52 PM, Vittorio Giovara wrote:
>> On Fri, Feb 10, 2017 at 6:25 PM, Michael Niedermayer
>> <michael@niedermayer.cc> wrote:
>>> On Fri, Feb 10, 2017 at 04:11:43PM -0500, Vittorio Giovara wrote:
>>>> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
>>>> ---
>>>> This should help not losing details over muxing and allows
>>>> callers to get additional information in a clean manner.
>>>>
>>>> Please keep me in CC.
>>>> Vittorio
>>>>
>>>>  doc/APIchanges        |  5 +++++
>>>>  ffprobe.c             | 11 ++++++++--
>>>>  libavformat/dump.c    | 10 +++++++++
>>>>  libavutil/spherical.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  libavutil/version.h   |  2 +-
>>>>  5 files changed, 81 insertions(+), 3 deletions(-)
>>>
>>> breaks fate
>>>
>>> --- ./tests/ref/fate/matroska-spherical-mono    2017-02-10 23:43:51.993432371 +0100
>>> +++ tests/data/fate/matroska-spherical-mono     2017-02-11 00:24:10.297483318 +0100
>>> @@ -7,7 +7,7 @@
>>>  [/SIDE_DATA]
>>>  [SIDE_DATA]
>>>  side_data_type=Spherical Mapping
>>> -side_data_size=16
>>> +side_data_size=56
>>>  projection=equirectangular
>>>  yaw=45
>>>  pitch=30
>>> Test matroska-spherical-mono failed. Look at tests/data/fate/matroska-spherical-mono.err for details.
>>> make: *** [fate-matroska-spherical-mono] Error 1
>>
>> Ah I didn't notice, it is fixed in the next commit, but I'll amend this one too.
>>
>>
>> I didn't see any comment/discussion, should I assume it is ok?
>> Please CC, thank you.
>
> These are a lot of projection specific fields. It worries me as the
> spec may change in the future with new fields added or existing
> fields changing purpose. Not to mention the Mesh projection, which
> has like fifty specific fields of its own.

Even if the spec change (which at this point would be a terrible
terrible thing to do) there are now files in the wild and software
that have adopted this draft, so we would have to support this anyway.

> Wouldn't it be a better idea to export the binary data of the
> equi/cbmp/mesh boxes into an extradata-like field in the
> AVSphericalMapping struct, and let the downstream application parse
> it instead?

No I don't think so, lavf is an abstraction layer and one of its tasks
is to provide a (simple?) unified entry layer. and letting the user
parse binary data is IMO bad design and very fragile. Also it's not
impossible that another standard for tagging spherical metadata
emerges in the future: the current API could very easily wrap it,
while exporting the binary entry would be too specification-specific
and it would be tied too heavily on the google draft.
James Almer Feb. 15, 2017, 1:52 p.m. UTC | #4
On 2/14/2017 11:20 PM, Vittorio Giovara wrote:
> On Tue, Feb 14, 2017 at 6:54 PM, James Almer <jamrial@gmail.com> wrote:
>> On 2/14/2017 5:52 PM, Vittorio Giovara wrote:
>>> On Fri, Feb 10, 2017 at 6:25 PM, Michael Niedermayer
>>> <michael@niedermayer.cc> wrote:
>>>> On Fri, Feb 10, 2017 at 04:11:43PM -0500, Vittorio Giovara wrote:
>>>>> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
>>>>> ---
>>>>> This should help not losing details over muxing and allows
>>>>> callers to get additional information in a clean manner.
>>>>>
>>>>> Please keep me in CC.
>>>>> Vittorio
>>>>>
>>>>>  doc/APIchanges        |  5 +++++
>>>>>  ffprobe.c             | 11 ++++++++--
>>>>>  libavformat/dump.c    | 10 +++++++++
>>>>>  libavutil/spherical.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  libavutil/version.h   |  2 +-
>>>>>  5 files changed, 81 insertions(+), 3 deletions(-)
>>>>
>>>> breaks fate
>>>>
>>>> --- ./tests/ref/fate/matroska-spherical-mono    2017-02-10 23:43:51.993432371 +0100
>>>> +++ tests/data/fate/matroska-spherical-mono     2017-02-11 00:24:10.297483318 +0100
>>>> @@ -7,7 +7,7 @@
>>>>  [/SIDE_DATA]
>>>>  [SIDE_DATA]
>>>>  side_data_type=Spherical Mapping
>>>> -side_data_size=16
>>>> +side_data_size=56
>>>>  projection=equirectangular
>>>>  yaw=45
>>>>  pitch=30
>>>> Test matroska-spherical-mono failed. Look at tests/data/fate/matroska-spherical-mono.err for details.
>>>> make: *** [fate-matroska-spherical-mono] Error 1
>>>
>>> Ah I didn't notice, it is fixed in the next commit, but I'll amend this one too.
>>>
>>>
>>> I didn't see any comment/discussion, should I assume it is ok?
>>> Please CC, thank you.
>>
>> These are a lot of projection specific fields. It worries me as the
>> spec may change in the future with new fields added or existing
>> fields changing purpose. Not to mention the Mesh projection, which
>> has like fifty specific fields of its own.
> 
> Even if the spec change (which at this point would be a terrible
> terrible thing to do) there are now files in the wild and software
> that have adopted this draft, so we would have to support this anyway.

If the spec changes, it will be the contents of the equi/cbmp/mesh.
By exporting them raw as extradata, said changes in the spec would
require no changes to our implementation.

> 
>> Wouldn't it be a better idea to export the binary data of the
>> equi/cbmp/mesh boxes into an extradata-like field in the
>> AVSphericalMapping struct, and let the downstream application parse
>> it instead?
> 
> No I don't think so, lavf is an abstraction layer and one of its tasks
> is to provide a (simple?) unified entry layer. and letting the user
> parse binary data is IMO bad design and very fragile. Also it's not
> impossible that another standard for tagging spherical metadata
> emerges in the future: the current API could very easily wrap it,
> while exporting the binary entry would be too specification-specific
> and it would be tied too heavily on the google draft.

AVSphericalMapping is already pretty tied to the google draft, but
i guess you're right, it's at least generic enough for now.

Wait for Aaron's opinion before addressing reviews and pushing. He
sent a different patchset himself and it wouldn't be nice to push
yours without at least giving him a chance to comment.
Vittorio Giovara Feb. 15, 2017, 2:23 p.m. UTC | #5
On Wed, Feb 15, 2017 at 8:52 AM, James Almer <jamrial@gmail.com> wrote:
> On 2/14/2017 11:20 PM, Vittorio Giovara wrote:
>> On Tue, Feb 14, 2017 at 6:54 PM, James Almer <jamrial@gmail.com> wrote:
>>> On 2/14/2017 5:52 PM, Vittorio Giovara wrote:
>>>> On Fri, Feb 10, 2017 at 6:25 PM, Michael Niedermayer
>>>> <michael@niedermayer.cc> wrote:
>>>>> On Fri, Feb 10, 2017 at 04:11:43PM -0500, Vittorio Giovara wrote:
>>>>>> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
>>>>>> ---
>>>>>> This should help not losing details over muxing and allows
>>>>>> callers to get additional information in a clean manner.
>>>>>>
>>>>>> Please keep me in CC.
>>>>>> Vittorio
>>>>>>
>>>>>>  doc/APIchanges        |  5 +++++
>>>>>>  ffprobe.c             | 11 ++++++++--
>>>>>>  libavformat/dump.c    | 10 +++++++++
>>>>>>  libavutil/spherical.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>  libavutil/version.h   |  2 +-
>>>>>>  5 files changed, 81 insertions(+), 3 deletions(-)
>>>>>
>>>>> breaks fate
>>>>>
>>>>> --- ./tests/ref/fate/matroska-spherical-mono    2017-02-10 23:43:51.993432371 +0100
>>>>> +++ tests/data/fate/matroska-spherical-mono     2017-02-11 00:24:10.297483318 +0100
>>>>> @@ -7,7 +7,7 @@
>>>>>  [/SIDE_DATA]
>>>>>  [SIDE_DATA]
>>>>>  side_data_type=Spherical Mapping
>>>>> -side_data_size=16
>>>>> +side_data_size=56
>>>>>  projection=equirectangular
>>>>>  yaw=45
>>>>>  pitch=30
>>>>> Test matroska-spherical-mono failed. Look at tests/data/fate/matroska-spherical-mono.err for details.
>>>>> make: *** [fate-matroska-spherical-mono] Error 1
>>>>
>>>> Ah I didn't notice, it is fixed in the next commit, but I'll amend this one too.
>>>>
>>>>
>>>> I didn't see any comment/discussion, should I assume it is ok?
>>>> Please CC, thank you.
>>>
>>> These are a lot of projection specific fields. It worries me as the
>>> spec may change in the future with new fields added or existing
>>> fields changing purpose. Not to mention the Mesh projection, which
>>> has like fifty specific fields of its own.
>>
>> Even if the spec change (which at this point would be a terrible
>> terrible thing to do) there are now files in the wild and software
>> that have adopted this draft, so we would have to support this anyway.
>
> If the spec changes, it will be the contents of the equi/cbmp/mesh.
> By exporting them raw as extradata, said changes in the spec would
> require no changes to our implementation.

If the spec changes in a non-backward compatible way, the API is the
least of our problems :-)
Nicolas George Feb. 15, 2017, 2:47 p.m. UTC | #6
Le septidi 27 pluviôse, an CCXXV, James Almer a écrit :
> If the spec changes, it will be the contents of the equi/cbmp/mesh.
> By exporting them raw as extradata, said changes in the spec would
> require no changes to our implementation.

By this reasoning, we could as well replace libavformat by open() and
close() and leave to applications the task of dealing with the binary
data.

Now, I realize this has the feel of a slippery-slope fallacy but I
really think it is valid.

The gist of it is that the first question should be: is it (abstracting
the projection properties) within the scope of this library?

If the answer is yes, then we do it, whatever it requires. And the API
will be as complicated as it needs to be.

And of course, if it is not trivial, we will probably get it wrong the
first time. Hopefully, we will get it more or less right eventually.

This process may look a bit unprofessional, but it is really the only
one that work. The cost of working with Libre software in a Bazaar
development model is that we can not hide our mistakes behind NDAs. But
the benefit of not having corporate executive breathing down our necks
is that we do not need to hide our mistakes.

Fortunately, the feature we are discussing here is rather peripheral to
the functionalities of our libraries (it was central, we would not be
having this discussion), so if we get it wrong, the only people who will
get hurt are the people using it, i.e. the people who should be helping
getting it right.

That is all I have to say; I do not know the specifics of the projection
properties to judge the merits of the current proposed API.

Regards,
Aaron Colwell Feb. 15, 2017, 4:48 p.m. UTC | #7
On Wed, Feb 15, 2017 at 5:52 AM James Almer <jamrial@gmail.com> wrote:

> On 2/14/2017 11:20 PM, Vittorio Giovara wrote:
> > On Tue, Feb 14, 2017 at 6:54 PM, James Almer <jamrial@gmail.com> wrote:
> >> On 2/14/2017 5:52 PM, Vittorio Giovara wrote:
> >>> On Fri, Feb 10, 2017 at 6:25 PM, Michael Niedermayer
> >>> <michael@niedermayer.cc> wrote:
> >>>> On Fri, Feb 10, 2017 at 04:11:43PM -0500, Vittorio Giovara wrote:
> >>>>> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
> >>>>> ---
> >>>>> This should help not losing details over muxing and allows
> >>>>> callers to get additional information in a clean manner.
> >>>>>
> >>>>> Please keep me in CC.
> >>>>> Vittorio
> >>>>>
> >>>>>  doc/APIchanges        |  5 +++++
> >>>>>  ffprobe.c             | 11 ++++++++--
> >>>>>  libavformat/dump.c    | 10 +++++++++
> >>>>>  libavutil/spherical.h | 56
> +++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>  libavutil/version.h   |  2 +-
> >>>>>  5 files changed, 81 insertions(+), 3 deletions(-)
> >>>>
> >>>> breaks fate
> >>>>
> >>>> --- ./tests/ref/fate/matroska-spherical-mono    2017-02-10
> 23:43:51.993432371 +0100
> >>>> +++ tests/data/fate/matroska-spherical-mono     2017-02-11
> 00:24:10.297483318 +0100
> >>>> @@ -7,7 +7,7 @@
> >>>>  [/SIDE_DATA]
> >>>>  [SIDE_DATA]
> >>>>  side_data_type=Spherical Mapping
> >>>> -side_data_size=16
> >>>> +side_data_size=56
> >>>>  projection=equirectangular
> >>>>  yaw=45
> >>>>  pitch=30
> >>>> Test matroska-spherical-mono failed. Look at
> tests/data/fate/matroska-spherical-mono.err for details.
> >>>> make: *** [fate-matroska-spherical-mono] Error 1
> >>>
> >>> Ah I didn't notice, it is fixed in the next commit, but I'll amend
> this one too.
> >>>
> >>>
> >>> I didn't see any comment/discussion, should I assume it is ok?
> >>> Please CC, thank you.
> >>
> >> These are a lot of projection specific fields. It worries me as the
> >> spec may change in the future with new fields added or existing
> >> fields changing purpose. Not to mention the Mesh projection, which
> >> has like fifty specific fields of its own.
> >
> > Even if the spec change (which at this point would be a terrible
> > terrible thing to do) there are now files in the wild and software
> > that have adopted this draft, so we would have to support this anyway.
>
> If the spec changes, it will be the contents of the equi/cbmp/mesh.
> By exporting them raw as extradata, said changes in the spec would
> require no changes to our implementation.
>

This is one of the main reasons I was suggesting this path. I think of
these extradata fields much like the extra data that codecs have. It really
is only important to the code that needs to render a specific projection.
For transcoding, you mainly just need to convey the information in a
lossless manner from demuxer to muxer.

I anticipate the spec will change in the future. My plan is that no change
will break what is currently specified in the spec right now, but I
anticipate some changes will be made. Having a solution that can gracefully
handle this would be nice.



> >
> >> Wouldn't it be a better idea to export the binary data of the
> >> equi/cbmp/mesh boxes into an extradata-like field in the
> >> AVSphericalMapping struct, and let the downstream application parse
> >> it instead?
> >
> > No I don't think so, lavf is an abstraction layer and one of its tasks
> > is to provide a (simple?) unified entry layer. and letting the user
> > parse binary data is IMO bad design and very fragile. Also it's not
> > impossible that another standard for tagging spherical metadata
> > emerges in the future: the current API could very easily wrap it,
> > while exporting the binary entry would be too specification-specific
> > and it would be tied too heavily on the google draft.
>
>
I agree with Vittorio that having some form of abstraction is a good thing
and having binary data in places can be problematic. It feels like we could
find some middle ground here by providing helper functions that parse the
binary data into projection specific structs and back just like codecs code
tends to do. I feel like this provides a reasonable balance between having
a common set of fields where things actually have common semantics like
projection_type, yaw/pitch/roll, & extra_data while also providing a way to
get access to projection specific information in a simple way.

At the end of the day players really just need to care about a rendering
mesh so in some sense it would be nice to have that be the abstraction for
the player use case. That is basically what we have done in our internal
player implementations. That could easily be handled by helper functions,
but would be a bad representation for AVSphericalMapping because it would
make transcoding/transmuxing painful.


> AVSphericalMapping is already pretty tied to the google draft, but
> i guess you're right, it's at least generic enough for now.
>
>
I too feel like this path is tying the API pretty closely to the Google
draft's semantics. This seems ok for things like signalling projection type
and yaw/pitch/roll projection rotation, but it seems a little risky for
projection parameterization. All you have to do is look at MPEGs OMAF
drafts to see that there are potentially many more projections to
potentially have to deal with and their parameters may not be useful
outside of each particular projection. That is why I'm wary of the "union
of all projection config fields" type of approach.



> Wait for Aaron's opinion before addressing reviews and pushing. He
> sent a different patchset himself and it wouldn't be nice to push
> yours without at least giving him a chance to comment.
>
>
Thank you. I really appreciate this. I hope it is clear I'm really trying
to find some middle ground here. My team has been working on this stuff
inside Google for several years and I'm just trying to bring some of that
experience and learning into the open here. I defer to you because this is
your project and I am relatively sparse contributor.

Aaron
Vittorio Giovara Feb. 15, 2017, 6:43 p.m. UTC | #8
Hi Aaron,

On Wed, Feb 15, 2017 at 11:48 AM, Aaron Colwell <acolwell@google.com> wrote:
>> If the spec changes, it will be the contents of the equi/cbmp/mesh.
>> By exporting them raw as extradata, said changes in the spec would
>> require no changes to our implementation.
>
>
> This is one of the main reasons I was suggesting this path. I think of these
> extradata fields much like the extra data that codecs have. It really is
> only important to the code that needs to render a specific projection. For
> transcoding, you mainly just need to convey the information in a lossless
> manner from demuxer to muxer.

I understand that, but "transcoding" is not the only task performed here.

Yes, I agree that for your single usecase of converting mp4 metadata
to mkv metadata your solution a feasible one, but the code here also
has
1. convey information to the users
2. fill in side data and frame data (remember this is a pkt side data
that is exported to frames since spherical is a frame property, it
doesn't matter that currently the spec is container level only for
now)
3. offer it to any muxer for additional parsing

You can't expect to fill in binary data and have all three places
behave correctly, and handle multiple theoretical version while at it.
You need an abstraction like the one provided by the API in this
patch.

> I anticipate the spec will change in the future. My plan is that no change
> will break what is currently specified in the spec right now, but I
> anticipate some changes will be made. Having a solution that can gracefully
> handle this would be nice.

A *binary* solution is certainly not nice, especially if you have
multiple version.
Besides, you should really make an effort to change the spec as little
as possible now that it is public, as more and more software will
depend on it and you can't realistically expect that everyone will be
always up-to-date (ffmpeg included).

>> >> Wouldn't it be a better idea to export the binary data of the
>> >> equi/cbmp/mesh boxes into an extradata-like field in the
>> >> AVSphericalMapping struct, and let the downstream application parse
>> >> it instead?
>> >
>> > No I don't think so, lavf is an abstraction layer and one of its tasks
>> > is to provide a (simple?) unified entry layer. and letting the user
>> > parse binary data is IMO bad design and very fragile. Also it's not
>> > impossible that another standard for tagging spherical metadata
>> > emerges in the future: the current API could very easily wrap it,
>> > while exporting the binary entry would be too specification-specific
>> > and it would be tied too heavily on the google draft.
>>
>
> I agree with Vittorio that having some form of abstraction is a good thing
> and having binary data in places can be problematic. It feels like we could
> find some middle ground here by providing helper functions that parse the
> binary data into projection specific structs and back just like codecs code
> tends to do. I feel like this provides a reasonable balance between having a
> common set of fields where things actually have common semantics like
> projection_type, yaw/pitch/roll, & extra_data while also providing a way to
> get access to projection specific information in a simple way.

I don't feel like this has anything to do with the current patchset.
This set only allows for properly conveying information in a clean
matten to the user (and a muxer if you will can be consider as a plain
user). Having a single parsing routine is IMO a useless optimization,
and can be discussed in another thread.

> At the end of the day players really just need to care about a rendering
> mesh so in some sense it would be nice to have that be the abstraction for
> the player use case. That is basically what we have done in our internal
> player implementations. That could easily be handled by helper functions,
> but would be a bad representation for AVSphericalMapping because it would
> make transcoding/transmuxing painful.

Again, a "player" is not your single user here, you have probing tool
that need to provide data understandable by human people. And please
stop saying transcoding/transmuxing is painful, on the opposite it's
incredibly simple, you just need to read AVSphericalMapping (exactly
as any other user) and write down this information. If you think that
new (non-binary) fields should be added to further simplify muxing,
please do tell.

>> Wait for Aaron's opinion before addressing reviews and pushing. He
>> sent a different patchset himself and it wouldn't be nice to push
>> yours without at least giving him a chance to comment.
>>
>
> Thank you. I really appreciate this. I hope it is clear I'm really trying to
> find some middle ground here. My team has been working on this stuff inside
> Google for several years and I'm just trying to bring some of that
> experience and learning into the open here. I defer to you because this is
> your project and I am relatively sparse contributor.

Please don't get me wrong, it's really commendable that Google is
opening up more and more, after using open source for so many years.
On the other hand, do understand that Google is not the only ffmpeg
user in the world, and that it needs to respect additional, incredibly
different, and possibly boring use cases all around that might matter
to other users.
Aaron Colwell Feb. 15, 2017, 7:54 p.m. UTC | #9
Hi Vittorio,

I could reply to all your further comments and clarify some
miscommunication, but I'm just tired of arguing about this. As long as you
are convinced that you can round trip the information defined in the spec,
then I am fine with whatever you folks choose to do. I would rather make
some progress than nothing at all. I also appreciate your willingness to
create alternative patches and advocate for other use cases.

This may not be a blocker for this patch, but one issue with converting the
bounds to pixels like you do here is that resizing a video could result in
incorrect metadata being generated when muxing. If you keep the bounds in
the 0.0-1.0 fixed point space this problem doesn't happen since it is a
resolution independent representation. If you still want to use pixels here
then the resizing filter will need to become aware of AVSphericalMapping
and adjust it accordingly. I think cubemap padding in pixels may have a
similar issue. This is no way intended to be a blocking comment. It is just
meant to raise awareness on something you might not have considered.

Thank you for your patience with me.

Aaron

On Wed, Feb 15, 2017 at 10:43 AM Vittorio Giovara <
vittorio.giovara@gmail.com> wrote:

> Hi Aaron,
>
> On Wed, Feb 15, 2017 at 11:48 AM, Aaron Colwell <acolwell@google.com>
> wrote:
> >> If the spec changes, it will be the contents of the equi/cbmp/mesh.
> >> By exporting them raw as extradata, said changes in the spec would
> >> require no changes to our implementation.
> >
> >
> > This is one of the main reasons I was suggesting this path. I think of
> these
> > extradata fields much like the extra data that codecs have. It really is
> > only important to the code that needs to render a specific projection.
> For
> > transcoding, you mainly just need to convey the information in a lossless
> > manner from demuxer to muxer.
>
> I understand that, but "transcoding" is not the only task performed here.
>
> Yes, I agree that for your single usecase of converting mp4 metadata
> to mkv metadata your solution a feasible one, but the code here also
> has
> 1. convey information to the users
> 2. fill in side data and frame data (remember this is a pkt side data
> that is exported to frames since spherical is a frame property, it
> doesn't matter that currently the spec is container level only for
> now)
> 3. offer it to any muxer for additional parsing
>
> You can't expect to fill in binary data and have all three places
> behave correctly, and handle multiple theoretical version while at it.
> You need an abstraction like the one provided by the API in this
> patch.
>
> > I anticipate the spec will change in the future. My plan is that no
> change
> > will break what is currently specified in the spec right now, but I
> > anticipate some changes will be made. Having a solution that can
> gracefully
> > handle this would be nice.
>
> A *binary* solution is certainly not nice, especially if you have
> multiple version.
> Besides, you should really make an effort to change the spec as little
> as possible now that it is public, as more and more software will
> depend on it and you can't realistically expect that everyone will be
> always up-to-date (ffmpeg included).
>
> >> >> Wouldn't it be a better idea to export the binary data of the
> >> >> equi/cbmp/mesh boxes into an extradata-like field in the
> >> >> AVSphericalMapping struct, and let the downstream application parse
> >> >> it instead?
> >> >
> >> > No I don't think so, lavf is an abstraction layer and one of its tasks
> >> > is to provide a (simple?) unified entry layer. and letting the user
> >> > parse binary data is IMO bad design and very fragile. Also it's not
> >> > impossible that another standard for tagging spherical metadata
> >> > emerges in the future: the current API could very easily wrap it,
> >> > while exporting the binary entry would be too specification-specific
> >> > and it would be tied too heavily on the google draft.
> >>
> >
> > I agree with Vittorio that having some form of abstraction is a good
> thing
> > and having binary data in places can be problematic. It feels like we
> could
> > find some middle ground here by providing helper functions that parse the
> > binary data into projection specific structs and back just like codecs
> code
> > tends to do. I feel like this provides a reasonable balance between
> having a
> > common set of fields where things actually have common semantics like
> > projection_type, yaw/pitch/roll, & extra_data while also providing a way
> to
> > get access to projection specific information in a simple way.
>
> I don't feel like this has anything to do with the current patchset.
> This set only allows for properly conveying information in a clean
> matten to the user (and a muxer if you will can be consider as a plain
> user). Having a single parsing routine is IMO a useless optimization,
> and can be discussed in another thread.
>
> > At the end of the day players really just need to care about a rendering
> > mesh so in some sense it would be nice to have that be the abstraction
> for
> > the player use case. That is basically what we have done in our internal
> > player implementations. That could easily be handled by helper functions,
> > but would be a bad representation for AVSphericalMapping because it would
> > make transcoding/transmuxing painful.
>
> Again, a "player" is not your single user here, you have probing tool
> that need to provide data understandable by human people. And please
> stop saying transcoding/transmuxing is painful, on the opposite it's
> incredibly simple, you just need to read AVSphericalMapping (exactly
> as any other user) and write down this information. If you think that
> new (non-binary) fields should be added to further simplify muxing,
> please do tell.
>
> >> Wait for Aaron's opinion before addressing reviews and pushing. He
> >> sent a different patchset himself and it wouldn't be nice to push
> >> yours without at least giving him a chance to comment.
> >>
> >
> > Thank you. I really appreciate this. I hope it is clear I'm really
> trying to
> > find some middle ground here. My team has been working on this stuff
> inside
> > Google for several years and I'm just trying to bring some of that
> > experience and learning into the open here. I defer to you because this
> is
> > your project and I am relatively sparse contributor.
>
> Please don't get me wrong, it's really commendable that Google is
> opening up more and more, after using open source for so many years.
> On the other hand, do understand that Google is not the only ffmpeg
> user in the world, and that it needs to respect additional, incredibly
> different, and possibly boring use cases all around that might matter
> to other users.
> --
> Vittorio
>
Vittorio Giovara Feb. 16, 2017, 7:09 p.m. UTC | #10
On Wed, Feb 15, 2017 at 2:54 PM, Aaron Colwell <acolwell@google.com> wrote:
> Hi Vittorio,
>
> This may not be a blocker for this patch, but one issue with converting the
> bounds to pixels like you do here is that resizing a video could result in
> incorrect metadata being generated when muxing. If you keep the bounds in
> the 0.0-1.0 fixed point space this problem doesn't happen since it is a
> resolution independent representation. If you still want to use pixels here
> then the resizing filter will need to become aware of AVSphericalMapping and
> adjust it accordingly. I think cubemap padding in pixels may have a similar
> issue. This is no way intended to be a blocking comment. It is just meant to
> raise awareness on something you might not have considered.

Hi Aaron,
this is an interesting point, but leaves up open questions. Do you
foresee actual cases in which this is going to be a problem? It's not
the first time we have fixed point fields exported, but it's also true
that the conversion from a 0.32 system is not very common and could
leave users puzzled. Regarding padding, is the specification going to
be updated to fix this potential issue?
Thank you
Aaron Colwell Feb. 16, 2017, 8:41 p.m. UTC | #11
On Thu, Feb 16, 2017 at 11:09 AM Vittorio Giovara <
vittorio.giovara@gmail.com> wrote:

> On Wed, Feb 15, 2017 at 2:54 PM, Aaron Colwell <acolwell@google.com>
> wrote:
> > Hi Vittorio,
> >
> > This may not be a blocker for this patch, but one issue with converting
> the
> > bounds to pixels like you do here is that resizing a video could result
> in
> > incorrect metadata being generated when muxing. If you keep the bounds in
> > the 0.0-1.0 fixed point space this problem doesn't happen since it is a
> > resolution independent representation. If you still want to use pixels
> here
> > then the resizing filter will need to become aware of AVSphericalMapping
> and
> > adjust it accordingly. I think cubemap padding in pixels may have a
> similar
> > issue. This is no way intended to be a blocking comment. It is just
> meant to
> > raise awareness on something you might not have considered.
>
> Hi Aaron,
> this is an interesting point, but leaves up open questions. Do you
> foresee actual cases in which this is going to be a problem? It's not
> the first time we have fixed point fields exported, but it's also true
> that the conversion from a 0.32 system is not very common and could
> leave users puzzled. Regarding padding, is the specification going to
> be updated to fix this potential issue?
> Thank you
> --
> Vittorio
>

Hi Vittorio,

After sleeping on this, I think what you have will be fine. Resizing a
cubemap w/ padding is just going to have to be handled in a special way
because fractional pixel padding isn't going to work right on the GPU. You
really only want to waste a few pixels on padding, not a constant fraction
of the whole frame. Given that we have to handle cubemaps in a special way
for resizing, then my thoughts about resizing equirect aren't really a big
deal. I don't expect any spec changes will be needed for this.

I suppose this just a long way of saying "we will still need a followup for
resizing spherical videos properly, but this patch LGTM."

Thanks again for working on this.
Aaron
Vittorio Giovara Feb. 21, 2017, 9:26 p.m. UTC | #12
On Thu, Feb 16, 2017 at 3:41 PM, Aaron Colwell <acolwell@google.com> wrote:
> After sleeping on this, I think what you have will be fine. Resizing a
> cubemap w/ padding is just going to have to be handled in a special way
> because fractional pixel padding isn't going to work right on the GPU. You
> really only want to waste a few pixels on padding, not a constant fraction
> of the whole frame. Given that we have to handle cubemaps in a special way
> for resizing, then my thoughts about resizing equirect aren't really a big
> deal. I don't expect any spec changes will be needed for this.

Hey Aaron, thanks for the feedback, I actually had some time to think
about this (yay holidays), and I'm starting to agree with you: leaving
the values for the bounds unchanged is probably better. I think I'll
just add a convenience function to simplify conversion, and I'll send
an updated set soon.
Cheers
diff mbox

Patch

--- ./tests/ref/fate/matroska-spherical-mono    2017-02-10 23:43:51.993432371 +0100
+++ tests/data/fate/matroska-spherical-mono     2017-02-11 00:24:10.297483318 +0100
@@ -7,7 +7,7 @@ 
 [/SIDE_DATA]
 [SIDE_DATA]
 side_data_type=Spherical Mapping
-side_data_size=16
+side_data_size=56
 projection=equirectangular
 yaw=45
 pitch=30