mbox series

[FFmpeg-devel,v6,0/4] Initial implementation of TTML encoding/muxing

Message ID 20210302090040.10484-1-jeebjp@gmail.com
Headers show
Series Initial implementation of TTML encoding/muxing | expand

Message

Jan Ekström March 2, 2021, 9 a.m. UTC
I've intentionally kept this initial version simple (no styling etc) to focus
on the basics. As this goes through review, additional features can be added
(I had initial PoC for styling implemented some time around previous VDD), and
there is another patch set in my queue which would then add support for muxing
TTML into MP4.

Changes from the fifth version:
  - Fixed a stupid mistake in tools/ffescape.
  - Reworked the err_recognition commit based on comments from James and Anton.

Jan

Jan Ekström (3):
  ffprobe: switch to av_bprint_escape for XML escaping
  avcodec: enable usage of err_recognition for encoders
  {avcodec,avformat}: add TTML encoder and muxer

Stefano Sabatini (1):
  avutil/{avstring,bprint}: add XML escaping from ffprobe to avutil

 Changelog                  |   1 +
 doc/APIchanges             |   3 +
 doc/general_contents.texi  |   1 +
 fftools/ffprobe.c          |  32 ++----
 libavcodec/Makefile        |   1 +
 libavcodec/allcodecs.c     |   1 +
 libavcodec/avcodec.h       |   2 +-
 libavcodec/options_table.h |  18 ++--
 libavcodec/ttmlenc.c       | 210 +++++++++++++++++++++++++++++++++++++
 libavcodec/ttmlenc.h       |  28 +++++
 libavcodec/version.h       |   2 +-
 libavformat/Makefile       |   1 +
 libavformat/allformats.c   |   1 +
 libavformat/ttmlenc.c      | 174 ++++++++++++++++++++++++++++++
 libavformat/version.h      |   2 +-
 libavutil/avstring.h       |  14 +++
 libavutil/bprint.c         |  29 +++++
 libavutil/version.h        |   2 +-
 tests/fate/subtitles.mak   |   3 +
 tests/ref/fate/ffprobe_xml |   2 +-
 tests/ref/fate/sub-ttmlenc | 122 +++++++++++++++++++++
 tools/ffescape.c           |   7 +-
 22 files changed, 619 insertions(+), 37 deletions(-)
 create mode 100644 libavcodec/ttmlenc.c
 create mode 100644 libavcodec/ttmlenc.h
 create mode 100644 libavformat/ttmlenc.c
 create mode 100644 tests/ref/fate/sub-ttmlenc

Comments

Jan Ekström March 3, 2021, 8:09 p.m. UTC | #1
On Tue, Mar 2, 2021 at 11:00 AM Jan Ekström <jeebjp@gmail.com> wrote:
>
> I've intentionally kept this initial version simple (no styling etc) to focus
> on the basics. As this goes through review, additional features can be added
> (I had initial PoC for styling implemented some time around previous VDD), and
> there is another patch set in my queue which would then add support for muxing
> TTML into MP4.
>
> Changes from the fifth version:
>   - Fixed a stupid mistake in tools/ffescape.
>   - Reworked the err_recognition commit based on comments from James and Anton.
>
> Jan
>
> Jan Ekström (3):
>   ffprobe: switch to av_bprint_escape for XML escaping
>   avcodec: enable usage of err_recognition for encoders
>   {avcodec,avformat}: add TTML encoder and muxer
>
> Stefano Sabatini (1):
>   avutil/{avstring,bprint}: add XML escaping from ffprobe to avutil
>

So:

1. the XML escaping stuff has been OK'd. (Nicolas, Anton)
2. The err_recognition change has been OK'd (Anton, James on IRC)
3. Multiple nice review passes have been had with the actual encoder
and muxer, and the requests have been applied.
4. The document generated in the FATE test actually passes validation
in https://github.com/skynav/ttt .

Thus, I will be pulling this set in Soon (probably tomorrow). Thus if
anyone still wishes to have a say, I recommend doing it earlier than
later. Then I will be moving onto the TTML-in-MP4 work, which I have
been able to refactor onto relatively working state earlier this week
:) .

Jan
Moritz Barsnick March 4, 2021, 9:57 a.m. UTC | #2
On Wed, Mar 03, 2021 at 22:09:12 +0200, Jan Ekström wrote:
> Thus, I will be pulling this set in Soon (probably tomorrow). Thus if
> anyone still wishes to have a say, I recommend doing it earlier than
> later.

> + * Copyright (c) 2020 24i

Is 24i a company? Just wondering.

Also, I'm not sure whether it's usually advised to split libavformat
and libavcodec changes into two separate commits.

Cheers,
Moritz
Jan Ekström March 4, 2021, 1:52 p.m. UTC | #3
On Thu, Mar 4, 2021 at 11:57 AM Moritz Barsnick <barsnick@gmx.net> wrote:
>
> On Wed, Mar 03, 2021 at 22:09:12 +0200, Jan Ekström wrote:
> > Thus, I will be pulling this set in Soon (probably tomorrow). Thus if
> > anyone still wishes to have a say, I recommend doing it earlier than
> > later.
>
> > + * Copyright (c) 2020 24i
>
> Is 24i a company? Just wondering.
>

Yes. In the case of this patch set this is the case.

> Also, I'm not sure whether it's usually advised to split libavformat
> and libavcodec changes into two separate commits.

Yes, in general if the feature is usable by itself it could be split.
In this case the tests can only be run after both parts are there, and
thus there is only limited usefulness, as currently reading a TTML
document from MXF is supported, and I have not seen such samples out
in the wild.

That said, if we really want to go in pedantic we could split this as follows:

1. Add the muxer, only for the remux use case (from MXF, which we
cannot test since there is no samples).
2. Add the encoder and modify the muxer to be able to handle the
paragraph-based packets.

Alternatively, we can pull in the encoder first, then the muxer. But
the encoder by itself can only be tested through the API or so.

I... just am not so sure in this case splitting the TTML writer and
the TTML encoder from each other makes sense.

Jan
Moritz Barsnick March 4, 2021, 2:43 p.m. UTC | #4
On Thu, Mar 04, 2021 at 15:52:13 +0200, Jan Ekström wrote:
> I... just am not so sure in this case splitting the TTML writer and
> the TTML encoder from each other makes sense.

I agree. In very many cases, muxer and encoder, or demuxer and decoder
are only useful together.

Yet I have observed that commits to libavformat and libavcodec are
usually separated. Check the addition of many other such combinations.
(I think usually the codec comes first, because the muxer/demuxer uses
the codec definition.)

Moritz
Jan Ekström March 4, 2021, 2:52 p.m. UTC | #5
On Thu, Mar 4, 2021, 16:43 Moritz Barsnick <barsnick@gmx.net> wrote:

> On Thu, Mar 04, 2021 at 15:52:13 +0200, Jan Ekström wrote:
> > I... just am not so sure in this case splitting the TTML writer and
> > the TTML encoder from each other makes sense.
>
> I agree. In very many cases, muxer and encoder, or demuxer and decoder
> are only useful together.
>
> Yet I have observed that commits to libavformat and libavcodec are
> usually separated. Check the addition of many other such combinations.
> (I think usually the codec comes first, because the muxer/demuxer uses
> the codec definition.
>

Yes, I have looked at things, and I after I go for a walk will specifically
check previous subtitle additions. Meanwhile, please take a look at the
alternatives that I've raised and please tell me of either of those makes
a) sense b) it easier to revert things in case that is wanted.

The codec definition is there already, as demuxing of ttml packets from mxf
was already implemented. That - as I did note in my previous message would
be why if you really wanted the muxer could be brought in first as well.

Jan
Jan Ekström March 4, 2021, 4:49 p.m. UTC | #6
On Thu, Mar 4, 2021 at 4:52 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> On Thu, Mar 4, 2021, 16:43 Moritz Barsnick <barsnick@gmx.net> wrote:
>>
>> On Thu, Mar 04, 2021 at 15:52:13 +0200, Jan Ekström wrote:
>> > I... just am not so sure in this case splitting the TTML writer and
>> > the TTML encoder from each other makes sense.
>>
>> I agree. In very many cases, muxer and encoder, or demuxer and decoder
>> are only useful together.
>>
>> Yet I have observed that commits to libavformat and libavcodec are
>> usually separated. Check the addition of many other such combinations.
>> (I think usually the codec comes first, because the muxer/demuxer uses
>> the codec definition.
>
>
> Yes, I have looked at things, and I after I go for a walk will specifically check previous subtitle additions. Meanwhile, please take a look at the alternatives that I've raised and please tell me of either of those makes a) sense b) it easier to revert things in case that is wanted.
>
> The codec definition is there already, as demuxing of ttml packets from mxf was already implemented. That - as I did note in my previous message would be why if you really wanted the muxer could be brought in first as well.

I have asked Anton for an opinion on this because my first reaction
might have been a bit "asdf unclear comments from someone after the
patch set has been in review for months and it's something that's been
part of the patch set from the beginning.", and thus might have
required a bit of re-calibration.

And it was a combo of:

1. ENOCARE (as in, either would be acceptable)
2. If I really wanted an opinion on this, split.

Thus I will split it, encoder first, then muxer. Since the muxer
utilizes a helper from the encoder's header and I'm too lazy at this
point to rewrite things into multiple bits. I do not find it required
in this case, but if it gets things forward...

Jan