diff mbox

[FFmpeg-devel] lavf: add ffprobe demuxer

Message ID 20160917162559.GL10784@barisone
State Superseded
Headers show

Commit Message

Stefano Sabatini Sept. 17, 2016, 4:25 p.m. UTC
On date Sunday 2016-09-04 23:25:56 +0200, Michael Niedermayer encoded:
> On Sun, Sep 04, 2016 at 06:24:37PM +0200, Stefano Sabatini wrote:
> > From: Nicolas George <george@nsup.org>
> > 
> > With several modifications and documentation by Stefano Sabatini
> > <stefasab@gmail.com>.
> > 
> > Signed-off-by: Nicolas George <george@nsup.org>
> > ---
> >  doc/ffprobe-format.texi  | 130 ++++++++++++++++
> >  libavformat/Makefile     |   1 +
> >  libavformat/allformats.c |   2 +
> >  libavformat/ffprobedec.c | 397 +++++++++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 530 insertions(+)
> >  create mode 100644 doc/ffprobe-format.texi
> >  create mode 100644 libavformat/ffprobedec.c
> 
> this seems not to apply cleanly
> 
> can i pick this from some git repo ? (should work better than patch
> with unavailable ancestors)

Updated, this should apply cleanly on master.

Comments

Paul B Mahol Sept. 17, 2016, 4:42 p.m. UTC | #1
On 9/17/16, Stefano Sabatini <stefasab@gmail.com> wrote:
> On date Sunday 2016-09-04 23:25:56 +0200, Michael Niedermayer encoded:
>> On Sun, Sep 04, 2016 at 06:24:37PM +0200, Stefano Sabatini wrote:
>> > From: Nicolas George <george@nsup.org>
>> >
>> > With several modifications and documentation by Stefano Sabatini
>> > <stefasab@gmail.com>.
>> >
>> > Signed-off-by: Nicolas George <george@nsup.org>
>> > ---
>> >  doc/ffprobe-format.texi  | 130 ++++++++++++++++
>> >  libavformat/Makefile     |   1 +
>> >  libavformat/allformats.c |   2 +
>> >  libavformat/ffprobedec.c | 397
>> > +++++++++++++++++++++++++++++++++++++++++++++++
>> >  4 files changed, 530 insertions(+)
>> >  create mode 100644 doc/ffprobe-format.texi
>> >  create mode 100644 libavformat/ffprobedec.c
>>
>> this seems not to apply cleanly
>>
>> can i pick this from some git repo ? (should work better than patch
>> with unavailable ancestors)
>
> Updated, this should apply cleanly on master.

Why we need this hack?
Nicolas George Sept. 17, 2016, 4:53 p.m. UTC | #2
Le jour du Génie, an CCXXIV, Paul B Mahol a écrit :
> Why we need this hack?

I think I already gave an answer.

Regards,
Stefano Sabatini Sept. 18, 2016, 1:28 p.m. UTC | #3
On date Saturday 2016-09-17 18:42:35 +0200, Paul B Mahol encoded:
> On 9/17/16, Stefano Sabatini <stefasab@gmail.com> wrote:
> > On date Sunday 2016-09-04 23:25:56 +0200, Michael Niedermayer encoded:
> >> On Sun, Sep 04, 2016 at 06:24:37PM +0200, Stefano Sabatini wrote:
> >> > From: Nicolas George <george@nsup.org>
> >> >
> >> > With several modifications and documentation by Stefano Sabatini
> >> > <stefasab@gmail.com>.
> >> >
> >> > Signed-off-by: Nicolas George <george@nsup.org>
> >> > ---
> >> >  doc/ffprobe-format.texi  | 130 ++++++++++++++++
> >> >  libavformat/Makefile     |   1 +
> >> >  libavformat/allformats.c |   2 +
> >> >  libavformat/ffprobedec.c | 397
> >> > +++++++++++++++++++++++++++++++++++++++++++++++
> >> >  4 files changed, 530 insertions(+)
> >> >  create mode 100644 doc/ffprobe-format.texi
> >> >  create mode 100644 libavformat/ffprobedec.c
> >>
> >> this seems not to apply cleanly
> >>
> >> can i pick this from some git repo ? (should work better than patch
> >> with unavailable ancestors)
> >
> > Updated, this should apply cleanly on master.
> 
> Why we need this hack?

My use case: I need to inject a metadata stream using the ff*
tools. Metadata must be specified in a textual format. I already
designed and implemented the fftextdata format, but that was regarded
too limited.

Example:
ffmpeg -i input.mp4 -copyts -i data.ffprobe -map 0:v -map 0:a -map 1:0 -codec:d copy input+data.ts

With this format we allow to cover my use case, and it provides a more
generic format for such purposes (since you can specify multiple
streams). Also, it's inspired by the ffprobe output, so it can be used
together with ffprobe (via simple text editing) to generate files
which can be edited and feed to ffmpeg.

It would be possible to use any binary format for injecting the
metadata, but this approach is not very scriptable. Alternatively a
custom program to convert a custom textual format to any binary format
we support, but this would imply an additional step I want to avoid.

That said, if someone want/can propose an alternative path to this I'm
very open to suggestions.
Stefano Sabatini Sept. 22, 2016, 4:50 p.m. UTC | #4
On date Sunday 2016-09-18 15:28:45 +0200, Stefano Sabatini encoded:
> On date Saturday 2016-09-17 18:42:35 +0200, Paul B Mahol encoded:
> > On 9/17/16, Stefano Sabatini <stefasab@gmail.com> wrote:
> > > On date Sunday 2016-09-04 23:25:56 +0200, Michael Niedermayer encoded:
> > >> On Sun, Sep 04, 2016 at 06:24:37PM +0200, Stefano Sabatini wrote:
> > >> > From: Nicolas George <george@nsup.org>
> > >> >
> > >> > With several modifications and documentation by Stefano Sabatini
> > >> > <stefasab@gmail.com>.
> > >> >
> > >> > Signed-off-by: Nicolas George <george@nsup.org>
> > >> > ---
> > >> >  doc/ffprobe-format.texi  | 130 ++++++++++++++++
> > >> >  libavformat/Makefile     |   1 +
> > >> >  libavformat/allformats.c |   2 +
> > >> >  libavformat/ffprobedec.c | 397
> > >> > +++++++++++++++++++++++++++++++++++++++++++++++
> > >> >  4 files changed, 530 insertions(+)
> > >> >  create mode 100644 doc/ffprobe-format.texi
> > >> >  create mode 100644 libavformat/ffprobedec.c
> > >>
> > >> this seems not to apply cleanly
> > >>
> > >> can i pick this from some git repo ? (should work better than patch
> > >> with unavailable ancestors)
> > >
> > > Updated, this should apply cleanly on master.
> > 
> > Why we need this hack?
> 
> My use case: I need to inject a metadata stream using the ff*
> tools. Metadata must be specified in a textual format. I already
> designed and implemented the fftextdata format, but that was regarded
> too limited.
> 
> Example:
> ffmpeg -i input.mp4 -copyts -i data.ffprobe -map 0:v -map 0:a -map 1:0 -codec:d copy input+data.ts
> 
> With this format we allow to cover my use case, and it provides a more
> generic format for such purposes (since you can specify multiple
> streams). Also, it's inspired by the ffprobe output, so it can be used
> together with ffprobe (via simple text editing) to generate files
> which can be edited and feed to ffmpeg.
> 
> It would be possible to use any binary format for injecting the
> metadata, but this approach is not very scriptable. Alternatively a
> custom program to convert a custom textual format to any binary format
> we support, but this would imply an additional step I want to avoid.
> 
> That said, if someone want/can propose an alternative path to this I'm
> very open to suggestions.

Ping. I'd like to commit this if there are no objections. Also,
possibly add a muxer to generate output directly readable by the
demuxer (this way we don't need ffprobe for generating the output, and
we avoid the ordering issue).
wm4 Sept. 23, 2016, 7:34 a.m. UTC | #5
On Thu, 22 Sep 2016 18:50:27 +0200
Stefano Sabatini <stefasab@gmail.com> wrote:

> On date Sunday 2016-09-18 15:28:45 +0200, Stefano Sabatini encoded:
> > On date Saturday 2016-09-17 18:42:35 +0200, Paul B Mahol encoded:  
> > > On 9/17/16, Stefano Sabatini <stefasab@gmail.com> wrote:  
> > > > On date Sunday 2016-09-04 23:25:56 +0200, Michael Niedermayer encoded:  
> > > >> On Sun, Sep 04, 2016 at 06:24:37PM +0200, Stefano Sabatini wrote:  
> > > >> > From: Nicolas George <george@nsup.org>
> > > >> >
> > > >> > With several modifications and documentation by Stefano Sabatini
> > > >> > <stefasab@gmail.com>.
> > > >> >
> > > >> > Signed-off-by: Nicolas George <george@nsup.org>
> > > >> > ---
> > > >> >  doc/ffprobe-format.texi  | 130 ++++++++++++++++
> > > >> >  libavformat/Makefile     |   1 +
> > > >> >  libavformat/allformats.c |   2 +
> > > >> >  libavformat/ffprobedec.c | 397
> > > >> > +++++++++++++++++++++++++++++++++++++++++++++++
> > > >> >  4 files changed, 530 insertions(+)
> > > >> >  create mode 100644 doc/ffprobe-format.texi
> > > >> >  create mode 100644 libavformat/ffprobedec.c  
> > > >>
> > > >> this seems not to apply cleanly
> > > >>
> > > >> can i pick this from some git repo ? (should work better than patch
> > > >> with unavailable ancestors)  
> > > >
> > > > Updated, this should apply cleanly on master.  
> > > 
> > > Why we need this hack?  
> > 
> > My use case: I need to inject a metadata stream using the ff*
> > tools. Metadata must be specified in a textual format. I already
> > designed and implemented the fftextdata format, but that was regarded
> > too limited.
> > 
> > Example:
> > ffmpeg -i input.mp4 -copyts -i data.ffprobe -map 0:v -map 0:a -map 1:0 -codec:d copy input+data.ts
> > 
> > With this format we allow to cover my use case, and it provides a more
> > generic format for such purposes (since you can specify multiple
> > streams). Also, it's inspired by the ffprobe output, so it can be used
> > together with ffprobe (via simple text editing) to generate files
> > which can be edited and feed to ffmpeg.
> > 
> > It would be possible to use any binary format for injecting the
> > metadata, but this approach is not very scriptable. Alternatively a
> > custom program to convert a custom textual format to any binary format
> > we support, but this would imply an additional step I want to avoid.
> > 
> > That said, if someone want/can propose an alternative path to this I'm
> > very open to suggestions.  
> 
> Ping. I'd like to commit this if there are no objections. Also,
> possibly add a muxer to generate output directly readable by the
> demuxer (this way we don't need ffprobe for generating the output, and
> we avoid the ordering issue).

I'm sorry to jump in as a nay-sayer, but it looks like a quite fishy
concept and I still don't get why it's supposed to be needed. The
documentation also doesn't say what this is supposed to be useful for.

It's also potentially a security issue (lots of text parsing, and can
construct inputs which might be tricky for API users).

I probably can't stop you from pushing this, but at least it shouldn't
be enabled by default.
Stefano Sabatini Sept. 23, 2016, 5:46 p.m. UTC | #6
On date Friday 2016-09-23 09:34:19 +0200, wm4 encoded:
> On Thu, 22 Sep 2016 18:50:27 +0200
> Stefano Sabatini <stefasab@gmail.com> wrote:
[...]
> > Ping. I'd like to commit this if there are no objections. Also,
> > possibly add a muxer to generate output directly readable by the
> > demuxer (this way we don't need ffprobe for generating the output, and
> > we avoid the ordering issue).
> 
> I'm sorry to jump in as a nay-sayer, but it looks like a quite fishy
> concept and I still don't get why it's supposed to be needed. The
> documentation also doesn't say what this is supposed to be useful for.

I'll extend the documentation to add some possibly useful use cases.

My use case: I need to build a data stream with scripting/manual
editing. Since I don't want to have to rely on a binary format (which
is not ideal for that use case) I needed a format simple enough to be
written and analysed without special tools, but with a simple text
editor.

Also, if coupled with a muxer it allows to create an output, tweak it
(e.g. to change the timestamps) and feed it to FFmpeg, which might be
useful from time to time to test/debug specific features.

The alternative might be to build a tool for converting to
custom-format to a "binary" format accepted by FFmpeg, but I want
to avoid the need for an additional tool.

This format is my second attempt after the fftextdata format, which
was regarded too limited. This one is based on the ffprobe default
output (although it's simplified).

> It's also potentially a security issue (lots of text parsing, and can
> construct inputs which might be tricky for API users).
> 
> I probably can't stop you from pushing this, but at least it shouldn't
> be enabled by default.

I understand the security concerns, and I have no objections against
disabling this by default if developers prefer this way.
Carl Eugen Hoyos Sept. 23, 2016, 8:27 p.m. UTC | #7
2016-09-23 19:46 GMT+02:00 Stefano Sabatini <stefasab@gmail.com>:
> On date Friday 2016-09-23 09:34:19 +0200, wm4 encoded:
>> at least it shouldn't be enabled by default.
>
> I understand the security concerns, and I have no objections
> against disabling this by default if developers prefer this way.

Disabling features in configure by default is a bad idea,
a demuxer that has to be forced by the user is ok imo.

Carl Eugen
wm4 Sept. 24, 2016, 1:21 p.m. UTC | #8
On Fri, 23 Sep 2016 19:46:16 +0200
Stefano Sabatini <stefasab@gmail.com> wrote:

> On date Friday 2016-09-23 09:34:19 +0200, wm4 encoded:
> > On Thu, 22 Sep 2016 18:50:27 +0200
> > Stefano Sabatini <stefasab@gmail.com> wrote:  
> [...]
> > > Ping. I'd like to commit this if there are no objections. Also,
> > > possibly add a muxer to generate output directly readable by the
> > > demuxer (this way we don't need ffprobe for generating the output, and
> > > we avoid the ordering issue).  
> > 
> > I'm sorry to jump in as a nay-sayer, but it looks like a quite fishy
> > concept and I still don't get why it's supposed to be needed. The
> > documentation also doesn't say what this is supposed to be useful for.  
> 
> I'll extend the documentation to add some possibly useful use cases.
> 
> My use case: I need to build a data stream with scripting/manual
> editing. Since I don't want to have to rely on a binary format (which
> is not ideal for that use case) I needed a format simple enough to be
> written and analysed without special tools, but with a simple text
> editor.

I still don't get it. Your description makes me think of something like
EDL, but that doesn't seem to have anything to do with it.

> Also, if coupled with a muxer it allows to create an output, tweak it
> (e.g. to change the timestamps) and feed it to FFmpeg, which might be
> useful from time to time to test/debug specific features.
> 
> The alternative might be to build a tool for converting to
> custom-format to a "binary" format accepted by FFmpeg, but I want
> to avoid the need for an additional tool.
> 
> This format is my second attempt after the fftextdata format, which
> was regarded too limited. This one is based on the ffprobe default
> output (although it's simplified).
> 
> > It's also potentially a security issue (lots of text parsing, and can
> > construct inputs which might be tricky for API users).
> > 
> > I probably can't stop you from pushing this, but at least it shouldn't
> > be enabled by default.  
> 
> I understand the security concerns, and I have no objections against
> disabling this by default if developers prefer this way.
wm4 Sept. 24, 2016, 1:21 p.m. UTC | #9
On Fri, 23 Sep 2016 22:27:00 +0200
Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:

> 2016-09-23 19:46 GMT+02:00 Stefano Sabatini <stefasab@gmail.com>:
> > On date Friday 2016-09-23 09:34:19 +0200, wm4 encoded:  
> >> at least it shouldn't be enabled by default.  
> >
> > I understand the security concerns, and I have no objections
> > against disabling this by default if developers prefer this way.  
> 
> Disabling features in configure by default is a bad idea,
> a demuxer that has to be forced by the user is ok imo.

Yet it's ok if external libs need to be enabled manually by the user?
Carl Eugen Hoyos Sept. 24, 2016, 1:25 p.m. UTC | #10
2016-09-24 15:21 GMT+02:00 wm4 <nfxjfg@googlemail.com>:
> On Fri, 23 Sep 2016 22:27:00 +0200
> Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>
>> 2016-09-23 19:46 GMT+02:00 Stefano Sabatini <stefasab@gmail.com>:
>> > On date Friday 2016-09-23 09:34:19 +0200, wm4 encoded:
>> >> at least it shouldn't be enabled by default.
>> >
>> > I understand the security concerns, and I have no objections
>> > against disabling this by default if developers prefer this way.
>>
>> Disabling features in configure by default is a bad idea,
>> a demuxer that has to be forced by the user is ok imo.
>
> Yet it's ok if external libs need to be enabled manually by the user?

It's neither ok nor not ok but how FFmpeg does it for non-system
libraries.

And I am against changing our defaults unless they are
unreasonable or even buggy: We would brake every single
build script when switching to auto-detection of external
libraries.

Carl Eugen
Stefano Sabatini Sept. 25, 2016, 5:32 p.m. UTC | #11
On date Saturday 2016-09-24 15:21:11 +0200, wm4 encoded:
> On Fri, 23 Sep 2016 19:46:16 +0200
> Stefano Sabatini <stefasab@gmail.com> wrote:
> 
> > On date Friday 2016-09-23 09:34:19 +0200, wm4 encoded:
> > > On Thu, 22 Sep 2016 18:50:27 +0200
> > > Stefano Sabatini <stefasab@gmail.com> wrote:  
> > [...]
> > > > Ping. I'd like to commit this if there are no objections. Also,
> > > > possibly add a muxer to generate output directly readable by the
> > > > demuxer (this way we don't need ffprobe for generating the output, and
> > > > we avoid the ordering issue).  
> > > 
> > > I'm sorry to jump in as a nay-sayer, but it looks like a quite fishy
> > > concept and I still don't get why it's supposed to be needed. The
> > > documentation also doesn't say what this is supposed to be useful for.  
> > 
> > I'll extend the documentation to add some possibly useful use cases.
> > 
> > My use case: I need to build a data stream with scripting/manual
> > editing. Since I don't want to have to rely on a binary format (which
> > is not ideal for that use case) I needed a format simple enough to be
> > written and analysed without special tools, but with a simple text
> > editor.
> 
> I still don't get it. Your description makes me think of something like
> EDL, but that doesn't seem to have anything to do with it.

EDL as "Edit Decision List"? No I don't think this is related at
all.

Suppose you want to inject a data stream, you have the data and you
know where to insert it given its PTS. With this muxer you can
handcraft a textual file in the ffprobe_default format and use ffmpeg
to inject the new data stream.
wm4 Sept. 26, 2016, 4:55 p.m. UTC | #12
On Sun, 25 Sep 2016 19:32:37 +0200
Stefano Sabatini <stefasab@gmail.com> wrote:

> On date Saturday 2016-09-24 15:21:11 +0200, wm4 encoded:
> > On Fri, 23 Sep 2016 19:46:16 +0200
> > Stefano Sabatini <stefasab@gmail.com> wrote:
> >   
> > > On date Friday 2016-09-23 09:34:19 +0200, wm4 encoded:  
> > > > On Thu, 22 Sep 2016 18:50:27 +0200
> > > > Stefano Sabatini <stefasab@gmail.com> wrote:    
> > > [...]  
> > > > > Ping. I'd like to commit this if there are no objections. Also,
> > > > > possibly add a muxer to generate output directly readable by the
> > > > > demuxer (this way we don't need ffprobe for generating the output, and
> > > > > we avoid the ordering issue).    
> > > > 
> > > > I'm sorry to jump in as a nay-sayer, but it looks like a quite fishy
> > > > concept and I still don't get why it's supposed to be needed. The
> > > > documentation also doesn't say what this is supposed to be useful for.    
> > > 
> > > I'll extend the documentation to add some possibly useful use cases.
> > > 
> > > My use case: I need to build a data stream with scripting/manual
> > > editing. Since I don't want to have to rely on a binary format (which
> > > is not ideal for that use case) I needed a format simple enough to be
> > > written and analysed without special tools, but with a simple text
> > > editor.  
> > 
> > I still don't get it. Your description makes me think of something like
> > EDL, but that doesn't seem to have anything to do with it.  
> 
> EDL as "Edit Decision List"? No I don't think this is related at
> all.
> 
> Suppose you want to inject a data stream, you have the data and you
> know where to insert it given its PTS. With this muxer you can
> handcraft a textual file in the ffprobe_default format and use ffmpeg
> to inject the new data stream.

What's a "data stream"? Inject into what? And why?
Stefano Sabatini Sept. 28, 2016, 11:21 p.m. UTC | #13
On date Monday 2016-09-26 18:55:47 +0200, wm4 encoded:
> On Sun, 25 Sep 2016 19:32:37 +0200
> Stefano Sabatini <stefasab@gmail.com> wrote:
[...]
> > > > My use case: I need to build a data stream with scripting/manual
> > > > editing. Since I don't want to have to rely on a binary format (which
> > > > is not ideal for that use case) I needed a format simple enough to be
> > > > written and analysed without special tools, but with a simple text
> > > > editor.  
> > > 
> > > I still don't get it. Your description makes me think of something like
> > > EDL, but that doesn't seem to have anything to do with it.  
> > 
> > EDL as "Edit Decision List"? No I don't think this is related at
> > all.
> > 
> > Suppose you want to inject a data stream, you have the data and you
> > know where to insert it given its PTS. With this muxer you can
> > handcraft a textual file in the ffprobe_default format and use ffmpeg
> > to inject the new data stream.
> 

> What's a "data stream"?

A stream consisting of generic data, that is a stream with type
AV_MEDIA_TYPE_DATA.

> Inject into what?

Interleave the stream with other media streams (audio, video,
etc.). This can be done via remuxing (no need for transcoding).

> And why?

Data stream could be used to insert custom information for
post-processing (e.g. they might contain advertising information).
wm4 Sept. 29, 2016, 7:55 p.m. UTC | #14
On Thu, 29 Sep 2016 01:21:01 +0200
Stefano Sabatini <stefasab@gmail.com> wrote:

> On date Monday 2016-09-26 18:55:47 +0200, wm4 encoded:
> > On Sun, 25 Sep 2016 19:32:37 +0200
> > Stefano Sabatini <stefasab@gmail.com> wrote:  
> [...]
> > > > > My use case: I need to build a data stream with scripting/manual
> > > > > editing. Since I don't want to have to rely on a binary format (which
> > > > > is not ideal for that use case) I needed a format simple enough to be
> > > > > written and analysed without special tools, but with a simple text
> > > > > editor.    
> > > > 
> > > > I still don't get it. Your description makes me think of something like
> > > > EDL, but that doesn't seem to have anything to do with it.    
> > > 
> > > EDL as "Edit Decision List"? No I don't think this is related at
> > > all.
> > > 
> > > Suppose you want to inject a data stream, you have the data and you
> > > know where to insert it given its PTS. With this muxer you can
> > > handcraft a textual file in the ffprobe_default format and use ffmpeg
> > > to inject the new data stream.  
> >   
> 
> > What's a "data stream"?  
> 
> A stream consisting of generic data, that is a stream with type
> AV_MEDIA_TYPE_DATA.
> 
> > Inject into what?  
> 
> Interleave the stream with other media streams (audio, video,
> etc.). This can be done via remuxing (no need for transcoding).
> 
> > And why?  
> 
> Data stream could be used to insert custom information for
> post-processing (e.g. they might contain advertising information).

This seems like a rather special use case. Why does it have a demuxer,
and can't be in your own C code using libavcodec/libavformat?

In addition, I think using the ffprobe "format" is an overcomplication,
and will justify adding even more stuff to the demuxer, until it's a
similarily complex mess like the ffm demuxer/muxer.
Josh Dekker Sept. 29, 2016, 8:02 p.m. UTC | #15
On 29/09/2016 00:21, Stefano Sabatini wrote:
> On date Monday 2016-09-26 18:55:47 +0200, wm4 encoded:
>> On Sun, 25 Sep 2016 19:32:37 +0200
>> Stefano Sabatini <stefasab@gmail.com> wrote:
> [...]
>>>>> My use case: I need to build a data stream with scripting/manual
>>>>> editing. Since I don't want to have to rely on a binary format (which
>>>>> is not ideal for that use case) I needed a format simple enough to be
>>>>> written and analysed without special tools, but with a simple text
>>>>> editor.
>>>>
>>>> I still don't get it. Your description makes me think of something like
>>>> EDL, but that doesn't seem to have anything to do with it.
>>>
>>> EDL as "Edit Decision List"? No I don't think this is related at
>>> all.
>>>
>>> Suppose you want to inject a data stream, you have the data and you
>>> know where to insert it given its PTS. With this muxer you can
>>> handcraft a textual file in the ffprobe_default format and use ffmpeg
>>> to inject the new data stream.
>>
>
>> What's a "data stream"?
>
> A stream consisting of generic data, that is a stream with type
> AV_MEDIA_TYPE_DATA.
>
>> Inject into what?
>
> Interleave the stream with other media streams (audio, video,
> etc.). This can be done via remuxing (no need for transcoding).
>
>> And why?
>
> Data stream could be used to insert custom information for
> post-processing (e.g. they might contain advertising information).
>

Are you sure this cannot be done from outside of libav*? It really seems 
like something which wouldn't actually be very useful for anyone else. 
It's also just another hack in for a non-standardised, exclusive format. 
Lavf is really not a good fit for this and it should be done outside of 
libav*.

--
Josh
Nicolas George Sept. 29, 2016, 8:09 p.m. UTC | #16
L'octidi 8 vendémiaire, an CCXXV, wm4 a écrit :
> This seems like a rather special use case. Why does it have a demuxer,
> and can't be in your own C code using libavcodec/libavformat?

Of course, it can. It just takes more effort overall. You do not believe
that Stefano and I both started working on a similar project on a whim, do
you? We did it because we felt it would gain time in the long run. Once the
demuxer is there, it is ready for all uses.

I very often have samples triggering bugs that I want to tweak to see what
is causing the bug exactly: timestamps, order of packets, even the payload.
Editing them with a powerful text editor or a perl one-liner beats anything
in terms of efficiency to do so.

> In addition, I think using the ffprobe "format" is an overcomplication,
> and will justify adding even more stuff to the demuxer, until it's a
> similarily complex mess like the ffm demuxer/muxer.

This looks like a "slippery slope" fallacy.

Since it is clearly useful almost only for developers doing debugging, I
would not mind having no probe function or even disabled by default, making
it a non-issue security-wise. Beyond that, I have yet to see an actual
argument against it.

Regards,
Stefano Sabatini Oct. 18, 2016, 10:31 a.m. UTC | #17
On date Thursday 2016-09-29 21:02:14 +0100, Josh de Kock encoded:
[...]
> Are you sure this cannot be done from outside of libav*? It really
> seems like something which wouldn't actually be very useful for
> anyone else.

It can be done outside of libav* (reading from a custom format, and
then converting to one of the supported output formats), but this
would imply another processing step.

> It's also just another hack in for a non-standardised,
> exclusive format. Lavf is really not a good fit for this and it
> should be done outside of libav*.

See my previous replies. I think there are both advantages and
disadvantages for integrating this format, and it should be probably
disabled by default for security concerns.
diff mbox

Patch

From bd026e67a4876cff4e50edfb7f0958c87eacbef0 Mon Sep 17 00:00:00 2001
From: Nicolas George <george@nsup.org>
Date: Sat, 11 Jan 2014 19:42:41 +0100
Subject: [PATCH] lavf: add ffprobe demuxer

With several modifications and documentation by Stefano Sabatini
<stefasab@gmail.com>.

Signed-off-by: Nicolas George <george@nsup.org>
---
 doc/ffprobe-format.texi  | 130 ++++++++++++++++
 libavformat/Makefile     |   1 +
 libavformat/allformats.c |   1 +
 libavformat/ffprobedec.c | 397 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 529 insertions(+)
 create mode 100644 doc/ffprobe-format.texi
 create mode 100644 libavformat/ffprobedec.c

diff --git a/doc/ffprobe-format.texi b/doc/ffprobe-format.texi
new file mode 100644
index 0000000..be7af15
--- /dev/null
+++ b/doc/ffprobe-format.texi
@@ -0,0 +1,130 @@ 
+@chapter ffprobe demuxer format
+@c man begin FFPROBE DEMUXER FORMAT
+
+The ffprobe demuxer format is inspired by the output generated by the
+ffprobe default format.
+
+It consists of several sections (FORMAT, STREAM, PACKET). Each section
+starts with a single line in the format @samp{[SECTION]} and ends with
+the corresponding line @samp{[/SECTION]}.
+
+where @samp{SECTION} is the name of the section.
+
+A well-formed file consists of an initial @samp{FORMAT} section,
+followed by several @samp{STREAM} sections (one per stream), and
+several @samp{PACKET} sections.
+
+Each section contains a sequence of lines in the form
+@sample{key=value}. In the case of data the key and the value must
+stay on different lines.
+
+Unrecognized values are discarded (this allows the demuxer to accept
+the output generated by @program{ffprobe} without further
+modifications.
+
+The following sections document the fields accepted by each section.
+
+@section FORMAT
+
+@table
+@item nb_streams
+the number of supported streams
+@end table
+
+@section STREAM
+@table
+@item index
+the index number
+@item codec_name
+the codec name (the name must be an accepted FFmpeg codec name
+@item time_base
+Specify the time_base as @samp{NUM/DEN}, this is the time base used to
+specify the timestamps in the corresponding packets
+@end table
+
+@section PACKET
+
+@table
+@item stream_index
+the stream index of the stream (must have been specified in a stream
+section). If not specified the first stream is assumed.
+
+@item pts
+the PTS expressed in the corresponding time base stream units
+
+@item pts_time
+the PTS expressed as an absolute time, using the FFmpeg time syntax
+
+@item dts
+the DTS expressed in the corresponding time base stream units
+
+@item pts_time
+the DTS expressed as an absolute time, using the FFmpeg time syntax
+
+@item duration
+the duration expressed in the corresponding time base stream units
+
+@item duration_tim
+the packet duration expressed as an absolute time, using the FFmpeg time syntax
+
+@item flags
+If the value is equal to "K" mark the packet as a key packet
+
+@item data
+specifies the data as a sequence of hexadecimal values (two
+hexadecimal values specify a byte). The data specification starts on
+the following line and can span over several lines, and must be
+terminated by an empty line. Spaces within each line are discarded.
+@end table
+
+@section Examples
+
+A ffprobe demuxer file might look like this:
+@example
+[FORMAT]
+nb_streams=1
+format_name=ffprobe_default
+[/FORMAT]
+[STREAM]
+index=0
+codec_name=timed_id3
+time_base=1/1000000
+[/STREAM]
+[PACKET]
+stream_index=0
+pts_time=0
+dts_time=0
+duration=N/A
+flags=K
+data=
+f000 0000 0167 42c0 1ed9 81b1 fefc 0440
+57ff d7d1 dfff e11a 3d7e 6cbd 0000 0001
+ce8c 4d9d 108e 25e9 fe
+
+[/PACKET]
+[PACKET]
+stream_index=0
+pts_time=1.00
+duration=N/A
+flags=K
+data=
+f000 0000 0141 9a38 3be5 ffff ffff fffa
+ebc1 3782 7d5e 21e9 ffff abf2 ea4f ed66
+0000 0001 ce8c 4d9d 108e 25e9 fe
+
+[/PACKET]
+@end example
+
+The output generated by the @program{ffprobe} default format is valid
+if the following options are also specified:
+@example
+-show_packets -show_data -show_compact_data -show_format -show_streams -show_packets -show_headers_first
+@end example
+
+Thus, to generate the corresponding output from an input file you can
+do:
+@example
+ffprobe INPUT.ts -show_packets -show_data -show_compact_data -show_format -show_streams -show_packets -show_headers_first > OUTPUT.ffprobe
+@end example
+
+@c man end FFPROBE DEMUXER FORMAT
diff --git a/libavformat/Makefile b/libavformat/Makefile
index 5d827d31..04467be 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -162,6 +162,7 @@  OBJS-$(CONFIG_FFM_DEMUXER)               += ffmdec.o
 OBJS-$(CONFIG_FFM_MUXER)                 += ffmenc.o
 OBJS-$(CONFIG_FFMETADATA_DEMUXER)        += ffmetadec.o
 OBJS-$(CONFIG_FFMETADATA_MUXER)          += ffmetaenc.o
+OBJS-$(CONFIG_FFPROBE_DEFAULT_DEMUXER)   += ffprobedec.o
 OBJS-$(CONFIG_FIFO_MUXER)                += fifo.o
 OBJS-$(CONFIG_FILMSTRIP_DEMUXER)         += filmstripdec.o
 OBJS-$(CONFIG_FILMSTRIP_MUXER)           += filmstripenc.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index 6a216ef..8ef1822 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -123,6 +123,7 @@  void av_register_all(void)
     REGISTER_MUXER   (F4V,              f4v);
     REGISTER_MUXDEMUX(FFM,              ffm);
     REGISTER_MUXDEMUX(FFMETADATA,       ffmetadata);
+    REGISTER_DEMUXER (FFPROBE_DEFAULT,  ffprobe_default);
     REGISTER_MUXER   (FIFO,             fifo);
     REGISTER_MUXDEMUX(FILMSTRIP,        filmstrip);
     REGISTER_MUXDEMUX(FLAC,             flac);
diff --git a/libavformat/ffprobedec.c b/libavformat/ffprobedec.c
new file mode 100644
index 0000000..3a631c9
--- /dev/null
+++ b/libavformat/ffprobedec.c
@@ -0,0 +1,397 @@ 
+/*
+ * Copyright (c) 2013 Nicolas George
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
+#include "libavutil/bprint.h"
+#include "libavutil/parseutils.h"
+#include "avformat.h"
+#include "internal.h"
+
+enum SectionType {
+    SEC_NONE = 0,
+    SEC_FORMAT,
+    SEC_STREAM,
+    SEC_PACKET,
+};
+
+const char *const section_names[] = {
+    [SEC_NONE]   = "NONE",
+    [SEC_FORMAT] = "FORMAT",
+    [SEC_STREAM] = "STREAM",
+    [SEC_PACKET] = "PACKET",
+};
+
+typedef struct {
+    AVClass *class;
+    enum SectionType section;
+    AVBPrint data;
+    int packet_nb;
+} FFprobeContext;
+
+static int ffprobe_probe(AVProbeData *probe)
+{
+    unsigned score;
+
+    if (!av_strstart(probe->buf, "[FORMAT]\n", NULL))
+        return 0;
+    score = !!strstr(probe->buf, "\nnb_streams=") +
+            !!strstr(probe->buf, "\nnb_programs=") +
+            !!strstr(probe->buf, "\nformat_name=") +
+            !!strstr(probe->buf, "\nstart_time=") +
+            !!strstr(probe->buf, "\nsize=");
+    return score >= 3 ? AVPROBE_SCORE_MAX : AVPROBE_SCORE_MAX / 2;
+}
+
+static int ffprobe_read_close(AVFormatContext *avf)
+{
+    FFprobeContext *ffp = avf->priv_data;
+
+    av_bprint_finalize(&ffp->data, NULL);
+    return 0;
+}
+
+/**
+ * Read a section start line ("[SECTION]").
+ * Update FFprobeContext.section.
+ * @return  SectionType (>0) for success,
+ *          SEC_NONE if no section start,
+ *          <0 for error
+ */
+static int read_section_start(AVFormatContext *avf)
+{
+    FFprobeContext *ffp = avf->priv_data;
+    uint8_t buf[4096];
+    const char *rest;
+    int i, ret;
+
+    ret = ff_get_line(avf->pb, buf, sizeof(buf));
+    if (ret == 0 && avio_feof(avf->pb))
+        ret = AVERROR_EOF;
+    if (ret <= 0)
+        return ret;
+    if (*buf != '[')
+        return 0;
+    for (i = 1; i < FF_ARRAY_ELEMS(section_names); i++) {
+        if (av_strstart(buf + 1, section_names[i], &rest) &&
+            !strcmp(rest, "]\n")) {
+            ffp->section = i;
+            return i;
+        }
+    }
+    return SEC_NONE;
+}
+
+/**
+ * Read a line from within a section.
+ * @return  >0 for success, 0 if end of section, <0 for error
+ */
+static int read_section_line(AVFormatContext *avf, uint8_t *buf, size_t size)
+{
+    FFprobeContext *ffp = avf->priv_data;
+    const char *rest;
+    int ret;
+    size_t l;
+
+    if ((ret = ff_get_line(avf->pb, buf, size)) <= 0)
+        return ret;
+    if (av_strstart(buf, "[/", &rest)) {
+        ffp->section = 0;
+        return 0;
+    }
+    if ((l = strlen(buf)) > 0 && buf[l - 1] == '\n')
+        buf[--l] = 0;
+    return 1;
+}
+
+/**
+ * Read hexadecimal data
+ * Store it in FFprobeContext.data.
+ * @return  >=0 for success, <0 for error
+ */
+static int read_data(AVFormatContext *avf, int *size)
+{
+    FFprobeContext *ffp = avf->priv_data;
+    uint8_t buf[4096], *cur;
+    int ret, pos, val;
+
+    if (ffp->data.len)
+        return AVERROR_INVALIDDATA;
+    while ((ret = read_section_line(avf, buf, sizeof(buf)))) {
+        if (ret < 0)
+            return ret;
+        if (!buf[0])
+            break;
+        cur = buf;
+        pos = 0;
+        cur += pos;
+        while (1) {
+            while (*cur == ' ')
+                cur++;
+            if (!*cur)
+                break;
+            if ((unsigned)(*cur - '0') >= 10 &&
+                (unsigned)(*cur - 'a') >=  6 &&
+                (unsigned)(*cur - 'A') >=  6) {
+                av_log(avf, AV_LOG_ERROR,
+                       "Invalid character %c in packet number %d data\n", *cur, ffp->packet_nb);
+                return AVERROR_INVALIDDATA;
+            }
+            pos = 0;
+            if (sscanf(cur, " %02x%n", &val, &pos) < 1 || !pos) {
+                av_log(avf, AV_LOG_ERROR,
+                       "Could not parse value in packet number %d data\n", ffp->packet_nb);
+                return AVERROR_INVALIDDATA;
+            }
+            cur += pos;
+            av_bprint_chars(&ffp->data, val, 1);
+        }
+    }
+
+    if (size)
+        *size = ffp->data.len;
+
+    return av_bprint_is_complete(&ffp->data) ? 0 : AVERROR(ENOMEM);
+}
+
+static int read_section_format(AVFormatContext *avf)
+{
+    uint8_t buf[4096];
+    int ret, val;
+
+    while ((ret = read_section_line(avf, buf, sizeof(buf)))) {
+        if (ret < 0)
+            return ret;
+        if (sscanf(buf, "nb_streams=%d", &val) >= 1) {
+            while (avf->nb_streams < val)
+                if (!avformat_new_stream(avf, NULL))
+                    return AVERROR(ENOMEM);
+        }
+        /* TODO programs */
+        /* TODO start_time duration bit_rate */
+        /* TODO tags */
+    }
+    return SEC_FORMAT;
+}
+
+static int read_section_stream(AVFormatContext *avf)
+{
+    FFprobeContext *ffp = avf->priv_data;
+    uint8_t buf[4096];
+    int ret, index, val1, val2;
+    AVStream *st = NULL;
+    const char *val;
+
+    av_bprint_clear(&ffp->data);
+    while ((ret = read_section_line(avf, buf, sizeof(buf)))) {
+        if (ret < 0)
+            return ret;
+        if (!st) {
+            if (sscanf(buf, "index=%d", &index) >= 1) {
+                if (index == avf->nb_streams) {
+                    if (!avformat_new_stream(avf, NULL))
+                        return AVERROR(ENOMEM);
+                }
+                if ((unsigned)index >= avf->nb_streams) {
+                    av_log(avf, AV_LOG_ERROR, "Invalid stream index: %d\n",
+                           index);
+                    return AVERROR_INVALIDDATA;
+                }
+                st = avf->streams[index];
+            } else {
+                av_log(avf, AV_LOG_ERROR, "Stream without index\n");
+                return AVERROR_INVALIDDATA;
+            }
+        }
+        if (av_strstart(buf, "codec_name=", &val)) {
+            const AVCodecDescriptor *desc = avcodec_descriptor_get_by_name(val);
+            if (desc) {
+                st->codecpar->codec_id   = desc->id;
+                st->codecpar->codec_type = desc->type;
+            }
+            if (!desc) {
+                av_log(avf, AV_LOG_WARNING, "Cannot recognize codec name '%s'", val);
+            }
+        } else if (!strcmp(buf, "extradata=")) {
+            if ((ret = read_data(avf, NULL)) < 0)
+                return ret;
+            if (ffp->data.len) {
+                if ((ret = ff_alloc_extradata(st->codecpar, ffp->data.len)) < 0)
+                    return ret;
+                memcpy(st->codecpar->extradata, ffp->data.str, ffp->data.len);
+            }
+        } else if (sscanf(buf, "time_base=%d/%d", &val1, &val2) >= 2) {
+            st->time_base.num = val1;
+            st->time_base.den = val2;
+        }
+    }
+    return SEC_STREAM;
+}
+
+static int read_section_packet(AVFormatContext *avf, AVPacket *pkt)
+{
+    FFprobeContext *ffp = avf->priv_data;
+    uint8_t buf[4096];
+    int ret;
+    AVPacket p;
+    char flags;
+
+    av_init_packet(&p);
+    p.pos = avio_tell(avf->pb);
+    p.stream_index = -1;
+    p.size = -1;
+    av_bprint_clear(&ffp->data);
+    while ((ret = read_section_line(avf, buf, sizeof(buf)))) {
+        int has_time = 0;
+        int64_t pts, dts, duration;
+        char timebuf[1024];
+
+        if (ret < 0)
+            return ret;
+        if (sscanf(buf, "stream_index=%d", &p.stream_index)) {
+            if ((unsigned)p.stream_index >= avf->nb_streams) {
+                av_log(avf, AV_LOG_ERROR, "Invalid stream number %d specified in packet number %d\n",
+                       p.stream_index, ffp->packet_nb);
+                return AVERROR_INVALIDDATA;
+            }
+        }
+
+#define PARSE_TIME(name_, is_duration)                                  \
+        sscanf(buf, #name_ "=%"SCNi64, &p.name_);                       \
+        has_time = sscanf(buf, #name_ "_time=%s", timebuf);             \
+        if (has_time) {                                                 \
+            int stream_index = p.stream_index == -1 ? 0 : p.stream_index; \
+                                                                        \
+            if (!strcmp(timebuf, "N/A")) {                              \
+                p.name_ = is_duration ? 0 : AV_NOPTS_VALUE;             \
+            } else {                                                    \
+                ret = av_parse_time(&name_, timebuf, 1);                \
+                if (ret < 0) {                                          \
+                    av_log(avf, AV_LOG_ERROR, "Invalid " #name_ " time specification '%s' for data packet\n", \
+                           timebuf);                                    \
+                    return ret;                                         \
+                }                                                       \
+                p.name_ = av_rescale_q(name_, AV_TIME_BASE_Q, avf->streams[stream_index]->time_base); \
+            }                                                           \
+        }                                                               \
+
+        PARSE_TIME(pts, 0);
+        PARSE_TIME(dts, 0);
+        PARSE_TIME(duration, 1);
+
+        if (sscanf(buf, "flags=%c", &flags) >= 1)
+            p.flags = flags == 'K' ? AV_PKT_FLAG_KEY : 0;
+        if (!strcmp(buf, "data="))
+            if ((ret = read_data(avf, &p.size)) < 0)
+                return ret;
+    }
+    if (p.size < 0 || (unsigned)p.stream_index >= avf->nb_streams)
+        return SEC_NONE;
+    if ((ret = av_new_packet(pkt, p.size)) < 0)
+        return ret;
+    p.data = pkt->data;
+    p.buf  = pkt->buf;
+    *pkt = p;
+    if (ffp->data.len) {
+        ffp->data.len = FFMIN(ffp->data.len, pkt->size);
+        memcpy(pkt->data, ffp->data.str, ffp->data.len);
+    }
+    return SEC_PACKET;
+}
+
+static int read_section(AVFormatContext *avf, AVPacket *pkt)
+{
+    FFprobeContext *ffp = avf->priv_data;
+    int ret, section;
+
+    while (!ffp->section)
+        if ((ret = read_section_start(avf)) < 0)
+            return ret;
+    switch (section = ffp->section) {
+    case SEC_FORMAT:
+        return read_section_format(avf);
+    case SEC_STREAM:
+        return read_section_stream(avf);
+    case SEC_PACKET:
+        ret = read_section_packet(avf, pkt);
+        ffp->packet_nb++;
+        return ret;
+    default:
+        av_assert0(!"reached");
+        return AVERROR_BUG;
+    }
+}
+
+static int ffprobe_read_header(AVFormatContext *avf)
+{
+    FFprobeContext *ffp = avf->priv_data;
+    int ret;
+
+    av_bprint_init(&ffp->data, 0, AV_BPRINT_SIZE_UNLIMITED);
+    if ((ret = read_section_start(avf)) < 0)
+        return ret;
+    if (ret != SEC_FORMAT) {
+        av_log(avf, AV_LOG_INFO, "Using noheader mode\n");
+        avf->ctx_flags |= AVFMTCTX_NOHEADER;
+        return 0;
+    }
+    if ((ret = read_section_format(avf)) < 0)
+        return ret;
+
+    /* read stream information */
+    while (1) {
+        ret = read_section_start(avf);
+        if (ret != SEC_STREAM)
+            break;
+        if ((ret = read_section_stream(avf)) < 0)
+            return ret;
+    }
+
+    return 0;
+}
+
+static int ffprobe_read_packet(AVFormatContext *avf, AVPacket *pkt)
+{
+    int ret;
+
+    while (1) {
+        if ((ret = read_section(avf, pkt)) < 0)
+            return ret;
+        if (ret == SEC_PACKET)
+            return 0;
+    }
+}
+
+static const AVClass ffprobe_default_class = {
+    .class_name = "ffprobe_default demuxer",
+    .item_name  = av_default_item_name,
+    .version    = LIBAVUTIL_VERSION_INT,
+};
+
+AVInputFormat ff_ffprobe_default_demuxer = {
+    .name           = "ffprobe_default",
+    .long_name      = NULL_IF_CONFIG_SMALL("FFprobe output (default writer)"),
+    .priv_data_size = sizeof(FFprobeContext),
+    .read_probe     = ffprobe_probe,
+    .read_header    = ffprobe_read_header,
+    .read_packet    = ffprobe_read_packet,
+    .read_close     = ffprobe_read_close,
+    .priv_class     = &ffprobe_default_class,
+};
-- 
1.9.1