[FFmpeg-devel,0/4] avdevice/dshow: implement capabilities API

Message ID 20210603224552.1218-1-dcnieho@gmail.com

Message

Diederick C. Niehorster June 3, 2021, 10:45 p.m. UTC
** Resending as it seems they didn't all make it..**

Undeprecating the avdevice capabilities API and implementing it for the
dshow device. Much needed. Together with the other patches i sent, a
dshow device can now be properly used programmatically by programs using
ffmpeg under the hood.

Diederick Niehorster (4):
  avdevice/avdevice: Revert "Deprecate AVDevice Capabilities API"
  avdevice/avdevice: clean up avdevice_capabilities_create
  avdevice/dshow: implement capabilities API
  examples: adding device_get_capabilities example

 configure                              |   2 +
 doc/APIchanges                         |   4 +
 doc/examples/.gitignore                |   1 +
 doc/examples/Makefile                  |  47 +--
 doc/examples/Makefile.example          |   1 +
 doc/examples/device_get_capabilities.c | 151 ++++++++
 libavdevice/avdevice.c                 |  74 +++-
 libavdevice/avdevice.h                 |   5 -
 libavdevice/dshow.c                    | 498 +++++++++++++++++++++++--
 libavdevice/version.h                  |   5 +-
 libavformat/avformat.h                 |  21 ++
 11 files changed, 735 insertions(+), 74 deletions(-)
 create mode 100644 doc/examples/device_get_capabilities.c

Comments

Anton Khirnov June 5, 2021, 2:36 p.m. UTC | #1
Quoting Diederick Niehorster (2021-06-04 00:45:48)
> ** Resending as it seems they didn't all make it..**
> 
> Undeprecating the avdevice capabilities API and implementing it for the
> dshow device. Much needed. Together with the other patches i sent, a
> dshow device can now be properly used programmatically by programs using
> ffmpeg under the hood.

Sorry to rain on your parade, but I don't think we should go ahead with
this before deciding what is to be done with libavdevice. The last
discussion about it died without being resolved, but the issues are
still present.
Diederick C. Niehorster June 5, 2021, 2:51 p.m. UTC | #2
On Sat, Jun 5, 2021 at 4:36 PM Anton Khirnov <anton@khirnov.net> wrote:
> Sorry to rain on your parade, but I don't think we should go ahead with
> this before deciding what is to be done with libavdevice. The last
> discussion about it died without being resolved, but the issues are
> still present.

I understand. I realize I'm new here: Is there a timeline for such a
discussion and could the discussion benefit from a real usage example
such as this patch series? I guess a big change in libavdevice should
at least offer the functionality the current implementation offers (or
theoretically offers :p). I really like the current design where an
avdevice can just be used through the avformat interface. Add
avdevice_register_all(); and Bob's your uncle!

To therefore argue a counterpoint, especially if the timeline is long:
accepting an implementation of the device capabilities API for an
avdevice will help steer that discussion, ensuring this capability is
not forgotten, and also resolves a real usage need (well, just mine)
now instead of in some possibly distant future.

In the meantime, I'll fix up my patch with Andreas' suggestions, along
with proposed changes for cleaning up the public avdevice capabilities
API if undeprecated, so it's all ready to go if the decision goes in
favor.

All the best,
Dee
Nicolas George June 7, 2021, 5:02 p.m. UTC | #3
Anton Khirnov (12021-06-05):
> Sorry to rain on your parade, but I don't think we should go ahead with
> this before deciding what is to be done with libavdevice. The last
> discussion about it died without being resolved, but the issues are
> still present.

So, now we have somebody who wants to work on libavdevice, who uses it
for application development, and who tells us that its API is quite
fine.

That's enough to make a decision: libavdevice is generally fine as it is
for the foreseeable future and we can go ahead discussing the merits of
this code.

Regards,
Anton Khirnov June 9, 2021, 6:18 p.m. UTC | #4
Quoting Diederick C. Niehorster (2021-06-05 16:51:32)
> On Sat, Jun 5, 2021 at 4:36 PM Anton Khirnov <anton@khirnov.net> wrote:
> > Sorry to rain on your parade, but I don't think we should go ahead with
> > this before deciding what is to be done with libavdevice. The last
> > discussion about it died without being resolved, but the issues are
> > still present.
> 
> I understand. I realize I'm new here: Is there a timeline for such a
> discussion and could the discussion benefit from a real usage example
> such as this patch series? I guess a big change in libavdevice should
> at least offer the functionality the current implementation offers (or
> theoretically offers :p). I really like the current design where an
> avdevice can just be used through the avformat interface. Add
> avdevice_register_all(); and Bob's your uncle!

There is no timeline, it depends on someone sitting down and doing the
work. The options proposed so far were
1) merging libavdevice into libavformat
2) making libavdevice into an independent library with an independent
   API
3) moving libavdevice functionality into ffmpeg.c

I volunteered to do 1), but was stopped by some issues and Mark
volunteering to do 2). But Mark did not progress far on that apparently,
so now we are stuck. Maybe we should do a technical committee vote on
it.
Diederick C. Niehorster June 9, 2021, 7:49 p.m. UTC | #5
On Wed, Jun 9, 2021 at 8:18 PM Anton Khirnov <anton@khirnov.net> wrote:
>
> Quoting Diederick C. Niehorster (2021-06-05 16:51:32)
> > On Sat, Jun 5, 2021 at 4:36 PM Anton Khirnov <anton@khirnov.net> wrote:
> > > Sorry to rain on your parade, but I don't think we should go ahead with
> > > this before deciding what is to be done with libavdevice. The last
> > > discussion about it died without being resolved, but the issues are
> > > still present.
> >
> > I understand. I realize I'm new here: Is there a timeline for such a
> > discussion and could the discussion benefit from a real usage example
> > such as this patch series? I guess a big change in libavdevice should
> > at least offer the functionality the current implementation offers (or
> > theoretically offers :p). I really like the current design where an
> > avdevice can just be used through the avformat interface. Add
> > avdevice_register_all(); and Bob's your uncle!
>
> There is no timeline, it depends on someone sitting down and doing the
> work. The options proposed so far were
> 1) merging libavdevice into libavformat
> 2) making libavdevice into an independent library with an independent
>    API
> 3) moving libavdevice functionality into ffmpeg.c

Thanks for providing the explored options. What problem is there in
the way things currently are that these would be solving?

> I volunteered to do 1), but was stopped by some issues and Mark
> volunteering to do 2). But Mark did not progress far on that apparently,
> so now we are stuck. Maybe we should do a technical committee vote on
> it.

While not being familiar with the alternatives, as said, from my
perspective what's great about the current setup is that avdevices act
just like formats, making them easy to integrate. And devices that
actually implement functions like get_device_list, control_message and
create_device_capabilities are also directly explorable and
controllable, as an independent API would presumably allow. Best of
both worlds in my book.

Could you point me to previous discussions regarding options 1 and 2,
if they are available somewhere to read, so I can have a more informed
opinion (in case that would carry any weight)?

Thanks and all the best,
Dee
Anton Khirnov June 9, 2021, 8:33 p.m. UTC | #6
Quoting Diederick C. Niehorster (2021-06-09 21:49:28)
> On Wed, Jun 9, 2021 at 8:18 PM Anton Khirnov <anton@khirnov.net> wrote:
> >
> > Quoting Diederick C. Niehorster (2021-06-05 16:51:32)
> > > On Sat, Jun 5, 2021 at 4:36 PM Anton Khirnov <anton@khirnov.net> wrote:
> > > > Sorry to rain on your parade, but I don't think we should go ahead with
> > > > this before deciding what is to be done with libavdevice. The last
> > > > discussion about it died without being resolved, but the issues are
> > > > still present.
> > >
> > > I understand. I realize I'm new here: Is there a timeline for such a
> > > discussion and could the discussion benefit from a real usage example
> > > such as this patch series? I guess a big change in libavdevice should
> > > at least offer the functionality the current implementation offers (or
> > > theoretically offers :p). I really like the current design where an
> > > avdevice can just be used through the avformat interface. Add
> > > avdevice_register_all(); and Bob's your uncle!
> >
> > There is no timeline, it depends on someone sitting down and doing the
> > work. The options proposed so far were
> > 1) merging libavdevice into libavformat
> > 2) making libavdevice into an independent library with an independent
> >    API
> > 3) moving libavdevice functionality into ffmpeg.c
> 
> Thanks for providing the explored options. What problem is there in
> the way things currently are that these would be solving?

The problem is that libavdevice is a separate library from libavformat,
but fundamentally depends on accessing libavformat internals.

> 
> > I volunteered to do 1), but was stopped by some issues and Mark
> > volunteering to do 2). But Mark did not progress far on that apparently,
> > so now we are stuck. Maybe we should do a technical committee vote on
> > it.
> 
> While not being familiar with the alternative, as said, from my
> perspective whats great about the current setup is that avdevices act
> just like formats, making them easy to integrate. And for devices that
> actually implement functions like get_device_list, control_message and
> create_device_capabilities, they are also directly explorable and
> controllable like an independent API would presumably allow. Best of
> both worlds in my book.
> 
> Could you point me to previous discussions regarding options 1 and 2,
> if they are available somewhere to read, so i can have a more informed
> opinion (in case that would carry any weight)?

Look through the threads
- libavdevice: Add KMS/DRM output device
  started by Nicolas Caramelli on 16 Jan 2021
- avdevice/avdevice: Deprecate AVDevice Capabilities API
  started by Andreas Rheinhardt, on 24 Jan 2021
  a good overview is
  http://lists.ffmpeg.org/pipermail/ffmpeg-devel/2021-January/275158.html
Diederick C. Niehorster June 10, 2021, 1:29 p.m. UTC | #7
Let me respond on two levels.

Before exploring the design space of a separation of libavdevice and
libavformat below, I think it is important to first comment on the
current state (and whether the AVDevice Capabilities part of my patch
series should be blocked by this discussion).

Importantly, I would suppose that any reorganization of libavdevice
and libavformat and redesign of the libavdevice API must aim to offer
at least the same functionality as the current API. That is, an
avdevice should be able to be queried for what devices it offers
(get_device_list), should for each device provide information about
what formats it accepts/can provide
(create_device_capabilities/free_device_capabilities) and should be
able to be controlled through the API (control_message). Perhaps these
take different forms, but the same functionality should be offered.

As such, having the AVDevice Capabilities API implemented for one of
the devices should help, not hamper, redesign efforts because it shows
how this API would actually be used in practice. Fundamental changes
such as a new avdevice API will be backwards incompatible no matter
what, so having one more bit of important functionality
(create_device_capabilities/free_device_capabilities) implemented
does not raise the threshold for initiating such a redesign effort.
Instead, it ensures that all of the current API functionality is
considered during the redesign and nothing is forgotten. I thus argue
that it's a good thing to bring back the AVDevice Capabilities API,
since it helps, not hinders, the redesign effort. And let's not forget
that it offers users of the current API (me at least) functionality
they need now, not at some indeterminate point in the future.

On Wed, Jun 9, 2021 at 10:33 PM Anton Khirnov <anton@khirnov.net> wrote:
> Look through the threads
> [...]

Thanks for the pointers!

> The problem is that libavdevice is a separate library from libavformat,
> but fundamentally depends on accessing libavformat internals.

Ah ok, so this is in the first instance about cleanup/separation, not
necessarily about adding new functionality (I do see Mark's list of
opportunities that a new API offers, copied below). I see Nicolas argue
that this entanglement of internals is not a problem in practice, and I
suppose there is a certain amount of taste involved here. Nothing
wrong with that. For me personally, it is a little funky to have to
add/change things in AVFormat when changing the AVDevice API, and it
may be good, for the longer term, to look at disentangling them. I will
get back to that below, in response to some quotes of Mark's messages
from last January.

Mark's (non-exhaustive) list of opportunities a libavdevice API
redesign offers (numbered by me):
On 20/01/2021 12:41, Mark Thompson wrote:
> 1. Handle frames as well as packets.
>    1a. Including hardware frames - DRM objects from KMS/V4L2, D3D
>        surfaces from Windows desktop duplication (which doesn't
>        currently exist but should).
> 2. Clear core option set - currently almost everything is set by
>    inconsistent private options; things like pixel/sample format,
>    frame/sample rate, geometry and hardware device should be common
>    options to all.
> 3. Asynchronicity - a big annoyance in current recording scenarios
>    with the ffmpeg utility is that both audio and video capture block,
>    and do so on the same thread which results in skipped frames.
> 4. Capability probing - the existing method of options which log
>    the capabilities are not very useful for API users.

1 and 3 I cannot speak to, but 4 is indeed what I ran into: the
current state of most avdevices is not useful at all for an API user
like me when it comes to capability probing. That is not a reason to
get rid of the whole API, though, but a reason to wonder why it wasn't
implemented. While nobody apparently bothered to do it before me, I
think more people than just me will actually use it. Currently I'd
have to issue device-specific options on a not-yet-opened device,
listen to the log output, parse it, etc. But the current API already
solves this, if only it were implemented. A clear core option set would
be nice indeed. And the AVDevice Capabilities API actually offers a
start at that, since it lists a bunch of options that should be
relevant to query (and set) for each device in the form of
ff_device_capabilities (in my patchset), or av_device_capabilities
before Andreas' patch removing it in January. I don't think it's
complete, but it's a good starting point.

Mark Thompson (2021-01-25):
> * Many of those are using it via the ffmpeg utility, but not all.

Indeed, I am an (aspiring) API user, of the dshow device specifically,
and possibly v4l2 later (but my project is Windows-only right now).
I am currently hampered by some APIs not being implemented for dshow,
hence my patch set.

> * The libavdevice API is the libavformat API because it was originally
> split out from libavformat, and it has the nice property that devices
> and files end up being interchangable in some contexts.

I can't underline enough how nice this is. My situation is simple:
devices such as webcams (but plenty of others too) may deliver video in
various formats, including encoded ones. I would have to decode those to
use them, so the output provided by the devices has to go through
much the same pipeline as data from video files. I already had code
for reading in video files, so the changes needed to also support
webcams were absolutely minimal. However, I needed some APIs implemented
to really round things off and make things both convenient (already the
case) and flexible (my patch set).

> * The libavdevice API, being the libavformat API for files, is not
> particularly well-suited in other contexts, because devices may not
> have the same properties as files.

Yeah, not every field in the AVFormatxxx structs is relevant for an
AVDevice. And some are a bit funkily named (like url to stuff the
device name of my webcam into). But are there specific fields one
would wish to provide for an avdevice that are currently not
available?

> * Some odd things like the completely-unused capabilities API and the
> almost-never-used message API are hacked on top of that to try to
> avoid some libavformat issues, but are not actually useful to anyone
> (hence the lack of use).

They certainly are useful! As are the avdevices themselves. I
was surprised that these APIs are not/hardly implemented. My patch set
makes my webcam much more useful: I am now able to pause and
restart capture (so a buffer does not fill up when I am not
interested in the output!), and to discover what devices the user
has attached and what formats these expose, so I can make a proper UI
(like e.g. OBS Studio has). And making this UI is minimal effort, as I
would not first have to learn how to work with DirectShow or add
yet another dependency to my application (again, ffmpeg would be
needed anyway, as I'd need to decode incoming video). It makes ffmpeg
a tool that allows you to move fast, something you can really build
upon, without losing out on device-specific config/access.

> * To implement devices as AVInputFormat/AVOutputFormat instances,
> libavdevice currently needs access to the internals of libavformat.
> * Many developers want to get rid of that dependency on libavformat
> internals, because it creates a corresponding ugliness on the
> libavformat side which has to leave those parts exposed in an
> ABI-constrained way.

What specific internals does libavdevice depend on? Is it only the
various function pointers in AVInputFormat and AVOutputFormat that
are specific to devices rather than common to all formats? Or is there
more? I also understand that avdevices need to implement some of the
other function pointers to be functional (e.g. read_header, read_packet
and read_close), but that seems unavoidable if we want avdevices to be
usable where avformats are (and again: that's a huge plus in my view).
I also understand that the AVDevice API being exposed in libavformat
makes it harder to evolve the AVDevice API.

Let me make an observation though: if we do not want to lose the
possibility to use avdevices as drop-in replacements for AVFormats, some
kind of component that has access to the internals of both seems
unavoidable. To me, the logical way to keep AVDevices interchangeable
with AVFormats while separating out the AVDevice API would be to
provide some kind of generic avdevice wrapper/adapter format that
translates between the AVFormat and AVDevice APIs. This wrapper
would presumably be an AVFormat, but for it to work it would need
access to AVDevice internals (if only to remap function pointers). If
it lives in the avdevice library instead, it would need access to
AVFormat internals. So some entanglement involving internals is
unavoidable, and a bullet that has to be bitten. Agreed?
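
To make the adapter idea concrete, here is a minimal, self-contained C
sketch. It assumes nothing about FFmpeg's real internals: ToyDevice,
ToyFormat, toy_adapt_device and toy_camera are hypothetical names
invented purely for illustration. What it tries to show is that the
adapter is the one place that has to know both interfaces, because it
remaps the format-side function pointers onto the device-side ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone device API (not an FFmpeg type). */
typedef struct ToyDevice {
    const char *name;
    int (*start)(struct ToyDevice *dev);
    int (*grab)(struct ToyDevice *dev, char *buf, size_t size);
    int started;
    int frames_left;
} ToyDevice;

/* Hypothetical format-style API that applications already code against. */
typedef struct ToyFormat {
    int (*read_header)(struct ToyFormat *fmt);
    int (*read_packet)(struct ToyFormat *fmt, char *buf, size_t size);
    void *priv; /* the adapter stores the wrapped device here */
} ToyFormat;

/* Adapter callbacks: translate format-side calls into device-side calls. */
static int adapter_read_header(ToyFormat *fmt)
{
    ToyDevice *dev = fmt->priv;
    return dev->start(dev);
}

static int adapter_read_packet(ToyFormat *fmt, char *buf, size_t size)
{
    ToyDevice *dev = fmt->priv;
    return dev->grab(dev, buf, size);
}

/* The only function that needs to see the layout of both structs. */
ToyFormat toy_adapt_device(ToyDevice *dev)
{
    assert(dev != NULL);
    ToyFormat fmt = { adapter_read_header, adapter_read_packet, dev };
    return fmt;
}

/* A fake camera that delivers a fixed number of dummy frames. */
static int cam_start(ToyDevice *dev)
{
    dev->started = 1;
    return 0;
}

static int cam_grab(ToyDevice *dev, char *buf, size_t size)
{
    static const char payload[] = "frame";
    if (!dev->started || dev->frames_left <= 0 || size < sizeof payload)
        return -1;
    dev->frames_left--;
    memcpy(buf, payload, sizeof payload);
    return (int)(sizeof payload - 1); /* bytes written, excluding the NUL */
}

ToyDevice toy_camera(int frames)
{
    ToyDevice dev = { "toy_camera", cam_start, cam_grab, 0, frames };
    return dev;
}
```

An application would call toy_adapt_device() once and from then on
only use read_header/read_packet, exactly as it would for a file.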

Anyway, out of Mark's options I'd vote for a separate new AVDevice
API, and an adapter component to expose/plug in AVDevices as formats.
This general adapter can expose the generic options (and
device-specific options as child options), handle any threading as
needed, map device names to the url field, etc. The workflow could then
be something like (rough proposal to get this started):
AVDeviceContext* dev_ctx = avdevice_alloc_context();
AVInputDevice* dev_inp_ctx = av_find_input_device("dshow");
// or av_input_device_next(AV_DEVICE_VIDEO) or
// av_device_next(AV_DEVICE_VIDEO | AV_DEVICE_INPUT) for any
avdevice_open_input(dev_ctx, dev_inp_ctx, options);
// or:
AVDeviceContext* dev_ctx = avdevice_alloc_input_context(
    AVInputDevice* device, const char* dev_name); // e.g. dev_name="dshow"
avdevice_open_input(dev_ctx, NULL, options);
// and similar for output.
// to start capture, discovers stream parameters if not yet known
avdevice_start();
// to just discover stream parameters without starting
avdevice_probe(); // after open
// NB: need to provide a way for devices to provide multiple streams
// (e.g. dshow can provide video and audio simultaneously). Should
// AVDeviceContexts have AVStreams? Then you introduce a bunch of extra
// entanglement again...

// then
AVFormatContext* fmt_ctx = avformat_adapt_avdevice(dev_ctx);
// and use format like usual (except its already opened!)

What this does not offer is av_find_input_format being able to find
devices (some user code may depend on that!), which is a nice part of
the current situation as well. Code like
AVFormatContext* fmt_ctx = NULL;
AVInputFormat* fmt = av_find_input_format("something");
avformat_open_input(&fmt_ctx, "url", fmt, &opts);
works for devices as well: you just need to call
avdevice_register_all() first, use a device name like "dshow" in
av_find_input_format, and use a special url such as "video=Integrated
Webcam". Without such functionality you'd need a bunch of special
cases in your app to allow users to use devices as well. Perhaps this
can also still be provided as is currently the case. We should then
also implement an avformat_get_avdevice() function to get the avdevice
from the avdevice adapter format.

As seen above, and argued earlier, complete separation appears to me
impossible without losing most of the benefits of having avdevices in
the first place, and their current ease of use. But a happy middle
ground allowing an advanced+flexible libavdevice API and a cleaned up
libavformat API does seem possible. There is a sweet spot there.

All that said, let's not stop work on the current avdevice component
(my patch set) while figuring out the way forward.

Cheers,
Dee
Nicolas George June 11, 2021, 12:14 p.m. UTC | #8
Anton Khirnov (12021-06-09):
> > > There is no timeline, it depends on someone sitting down and doing the
> > > work. The options proposed so far were
> > > 1) merging libavdevice into libavformat
> > > 2) making libavdevice into an independent library with an independent
> > >    API
> > > 3) moving libavdevice functionality into ffmpeg.c
> > Thanks for providing the explored options. What problem is there in
> > the way things currently are that these would be solving?
> The problem is that libavdevice is a separate library from libavformat,
> but fundamentally depends on accessing libavformat internals.

Point 3 is just completely unacceptable, as there are applications using
libavdevice. Please do not mention it again unless you have new strong
arguments about it.

As for point 2, I explained at the time (but for some reason you
neglected to mention it) that it was a false good idea. Of course,
libavdevice is not the same thing as libavformat, and therefore it's
obvious they should have different APIs to reflect their specificity.

But obvious does not mean true. If we have learned something from
object-oriented programming, it's that similar APIs should be merged,
not split, so that objects that share common properties can be
transparently used with the same code path.

If somebody were to make a separate API for libavdevice, the only result
would be that all code would ceaselessly go through compatibility
wrappers. A pure waste of time.

Here too, if you ever mention it again, please take this argument into
consideration.

As for point 1, it would be nice to be able to use ff_* functions rather
than the avpriv_* hacks. But unless you realize that it should not apply
to libavformat/libavdevice but to all the libraries in ffmpeg, let me
tell you it is not an efficient use of your time.

Still, if you want to use your time inefficiently and turn
libavformat+libavdevice into a single linking unit, probably named
libavformat, then I have no objection since it is a small step in the
right direction.

OTOH, please do not touch the directory structure, as it would be
annoying and bring no benefit.

Regards,
Nicolas George June 11, 2021, 1:16 p.m. UTC | #9
Diederick C. Niehorster (12021-06-10):
> Let me respond on two levels.
> 
> Before exploring the design space of a separation of libavdevice and
> libavformat below, I think it is important to first comment on the
> current state (and whether the AVDevice Capabilities part of my patch
> series should be blocked by this discussion).
> 
> Importantly, I would suppose that any reorganization of libavdevice
> and libavformat and redesign of the libavdevice API must aim to offer
> at least the same functionality as the current API, that is, an
> avdevice should be able to be queried for what devices it offers
> (get_device_list), should for each device provide information about
> what formats it accepts/can provide
> (create_device_capabilities/free_device_capabilities) and should be
> able to be controlled through the API (control_message). Perhaps these
> take different forms, but same functionality should be offered. As
> such, having AVDevice Capabilities API implemented for one of the
> devices should help, not hamper, redesign efforts because it shows how
> this API would actually be used in practice. Fundamental changes such
> as a new avdevice API will be backwards incompatible no matter what,
> so having one more bit of important functionality
> (create_device_capabilities/free_device_capabilities) implemented
> doesn't create a larger threshold to initiating such a redesign
> effort. Instead, it forces that all the current API functionality is
> thought out as well during the redesign effort and nothing is forgotten. I
> thus argue that its a good thing to bring back the AVDevice Capabilities
> API, since it helps, not hinders the redesign effort. And lets not
> forget it offers users of the current API functionality (me at least)
> they need now, not at some indeterminate timepoint in the future.

I mostly agree with all that. A good API merges similar things. We
should use the object-oriented approach: base APIs for everything that
handles frames-or-packets, so that generic (data copy, timestamps
update, metadata manipulation, etc.) operations can be performed with a
single code path, and then specialized derived APIs for more specific
components.

Input devices are demuxers with a few extra methods; output devices are
muxers with a few extra methods. We already have the beginning of a
class/interface hierarchy:

	formats
	  |
	  +----	muxers
	  |	  |
	  |	  +----	output devices
	  |
	  +----	demuxers
	   	  |
	   	  +----	input devices
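
As a rough illustration of that hierarchy, the following self-contained
C sketch models an input device as a demuxer with a few extra methods,
using the usual "base struct as first member" idiom. All names here
(ToyDemuxer, ToyInputDevice, toy_drain, toy_webcam) are hypothetical
and do not correspond to actual FFmpeg types; the sketch only shows how
generic code can run unchanged on the derived type.

```c
#include <assert.h>
#include <stddef.h>

/* Base "class": anything that produces packets (hypothetical type). */
typedef struct ToyDemuxer {
    const char *name;
    /* Returns a packet id >= 0, or -1 when no more packets are available. */
    int (*read_packet)(struct ToyDemuxer *self);
} ToyDemuxer;

/* Derived "class": a demuxer plus device-only methods.
 * The base must be the first member so the pointers are convertible. */
typedef struct ToyInputDevice {
    ToyDemuxer base;
    int (*list_sources)(struct ToyInputDevice *self); /* number of sources */
    int next_id;
} ToyInputDevice;

/* Generic code path: works on files and devices alike.
 * Returns how many packets were read, up to max_packets. */
int toy_drain(ToyDemuxer *d, int max_packets)
{
    assert(d != NULL && d->read_packet != NULL);
    int n = 0;
    while (n < max_packets && d->read_packet(d) >= 0)
        n++;
    return n;
}

/* A fake webcam that delivers three packets and then stops. */
static int webcam_read_packet(ToyDemuxer *self)
{
    ToyInputDevice *dev = (ToyInputDevice *)self; /* valid: base is first */
    return dev->next_id < 3 ? dev->next_id++ : -1;
}

static int webcam_list_sources(ToyInputDevice *self)
{
    (void)self;
    return 1;
}

ToyInputDevice toy_webcam(void)
{
    ToyInputDevice dev = {
        { "toy_webcam", webcam_read_packet },
        webcam_list_sources,
        0
    };
    return dev;
}
```

Code that only knows about ToyDemuxer calls toy_drain(&cam.base, n),
while device-aware code can additionally call cam.list_sources(&cam).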

Also, IIRC, we already have at least one protocol that does endpoint
discovery. On one hand, protocols are a separate API even from muxers
and demuxers. On the other hand, endpoint discovery is a very
device-like feature.

I take it as a sign that we should include protocols in the discussion.

> Ah ok, so this is at first instance about cleanup/separation, not
> necessarily about adding new functionality (I do see Mark's list of
> opportunities that a new API offer, copied below). I see Nicolas argue
> this entanglement of internals is not a problem in practice, and i

Almost true. We have a huge problem about the entanglement of the
libraries, but the libavformat-libavdevice aspect is a tiny part of it.
The problem is that we have eight separate libraries that depend on each
other and are developed simultaneously. Furthermore, people will always
use these libraries all at once; at worst they will not use a few of the
smaller ones. Therefore, this split brings no benefit, but it forces us
to worry about mutual compatibility of different versions of our
libraries.

Unfortunately, each time I tried to bring this issue to the discussion,
people objected with arguments that betrayed a limited reflection on the
big picture and a few misconceptions about how precisely linking works,
both static and dynamic.

> suppose there is a certain amount of taste involved here. Nothing
> wrong with that. I guess for me personally that it is a little funky
> to have to add/change things in AVFormat when changing the AVDevice
> API

We frequently have to change things in libavutil to implement things in
libavcodec, or in libavcodec to implement things in libavformat or
libavdevice. These are several libraries, but a single project, a single
development intent.

> 1 and 3 i cannot speak to, but 4 is indeed what i ran into: the
> current state of most avdevices is not useful at all for an API user
> like me when it comes to capability probing (not a reason though to
> get rid of the whole API, but to wonder why it wasn't implemented.
> while nobody apparently bothered to do it before me, i think there
> will be more than just me who will actually use it). Currently I'd
> have to issue device specific options on a not-yet opened device,
> listen to the log output, parse it, etc. But the current API already
> solves this, if only it was implemented. A clear core option set would
> be nice indeed. And the AVDevice Capabilities API actually offers a
> start at that, since it lists a bunch of options that should be
> relevant to query (and set) for each device in the form of
> ff_device_capabilities (in my patchset), or av_device_capabilities
> before Andreas' patch removing it in January. I don't think its
> complete, but its a good starting point.

I agree. Thought has been given to designing this API. The efforts
dried up before the functional parts were implemented, but the design is
sound, and a good starting point to resume work from.

> Indeed, i am an (aspiring) API user, of the dshow device specifically,
> and possibly v4l2 later (but my project is Windows-only right now).
> Currently hampered by lack of some API not being implemented for
> dshow, hence my patch set.

And thank you for it.

I want to add that in my mind, one of the goalposts for putting
libavdevice into shape is to allow re-implementing ffplay using it. (And
possibly a symmetrical interactive recording tool, ffrecord or
something.)

> > * The libavdevice API is the libavformat API because it was originally
> > split out from libavformat, and it has the nice property that devices
> > and files end up being interchangable in some contexts.
> I can't underline enough how nice this is. My situation is simple:

I can't emphasize enough how important this is. I want to say that
people who don't see how nice this feature is, how fundamental it is in
the design of libavdevice's API, are just too incompetent about it to
participate meaningfully in the discussion yet.

> > * The libavdevice API, being the libavformat API for files, is not
> > particularly well-suited in other contexts, because devices may not
> > have the same properties as files.
> Yeah, not every field in the AVFormatxxx structs is relevant for an
> AVDevice. And some are a bit funkily named (like url to stuff the
> device name of my webcam into). But are there specific fields one
> would wish to provide for an avdevice that are currently not
> available?

I think the problem emphasized here is not really about fields, more
about the working of the API: files are read on demand, while devices
operate continuously, and that makes a big difference.

But really, we already have this difference with network streams,
especially those that do not have flow control, for example those in
multicast. These network streams have aspects of protocols, but also
aspects of devices.

And the answer is NOT to separate libavio from libavformat: protocols
and formats mesh with each other, see the example of RTP.

> Let me make an observation though: if we would not want to lose the
> possibility to use avdevices drop-in in the place of AVFormats, some
> kind of component that has access to internals of both seems
> unavoidable.

Indeed.

And you can add: unless somebody intends to rework the code for all our
current devices to adapt it to the new API, we would also need a
compatibility wrapper for them.

In practice, that would look like this:

	application
	 → libavformat API
	    → libavdevice compatibility wrapper
	       → libavdevice API
	          → wrapper for old-style device
	             → actual device

While the useful code is just:

	application
	 → libavformat/device API
	    → actual device

That's just an insane idea, a waste of time.

> Anyway, out of Mark's options i'd vote for a separate new AVDevice
> API, and an adapter component to expose/plug in AVDevices as formats.

I do not agree. I think we need to think this globally: shape our
existing APIs into a coherent object-oriented hierarchy of
classes/interfaces. This is not limited to formats and devices, we
should include protocols in the discussion, and probably codecs and
filters too.

And to handle the fact that devices and network streams are
asynchronous, the main API needs to be asynchronous itself.

Which brings me to my project to redesign libavformat around an event
loop with callbacks.

I have moderately started working on it, by writing the documentation
for the low-level single-thread event loop API. Then I need to write the
documentation for the high-level multi-thread scheduler API. Then I can
get to coding.
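
For readers unfamiliar with the pattern, a single-thread event loop with callbacks can be sketched as below. This is only a minimal illustration, not the API proposed in this thread, whose design is not shown here. All names (EventLoop, loop_post, loop_run, on_packet) are hypothetical.

```c
#include <assert.h>

#define MAX_EVENTS 16 /* hypothetical fixed queue size, for simplicity */

/* A callback receives user state plus one integer of event data. */
typedef void (*EventCallback)(void *opaque, int data);

typedef struct Event {
    EventCallback cb;
    void *opaque;
    int data;
} Event;

typedef struct EventLoop {
    Event queue[MAX_EVENTS]; /* circular buffer of pending events */
    int head;                /* index of the oldest pending event */
    int count;               /* number of pending events */
} EventLoop;

void loop_init(EventLoop *l)
{
    l->head = l->count = 0;
}

/* Queue a callback for later dispatch. Returns 0, or -1 if the queue is full. */
int loop_post(EventLoop *l, EventCallback cb, void *opaque, int data)
{
    Event *e;
    assert(l && cb);
    if (l->count == MAX_EVENTS)
        return -1;
    e = &l->queue[(l->head + l->count) % MAX_EVENTS];
    e->cb = cb;
    e->opaque = opaque;
    e->data = data;
    l->count++;
    return 0;
}

/* Dispatch events in FIFO order until none are pending. Callbacks may
 * post further events. Returns the number of events dispatched. */
int loop_run(EventLoop *l)
{
    int n = 0;
    while (l->count > 0) {
        Event e = l->queue[l->head];
        l->head = (l->head + 1) % MAX_EVENTS;
        l->count--;
        e.cb(e.opaque, e.data);
        n++;
    }
    return n;
}

/* Example callback: adds a packet size to a running total. */
void on_packet(void *opaque, int size)
{
    *(int *)opaque += size;
}
```

In this style a device or network source would post an event whenever data arrives, instead of the application blocking in a read call.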

>	   Without such functionality you'd need a bunch of special
> cases in your app to allow users to use devices as well.

Exactly. And that means some applications that were capable of using
some devices would lose that ability. We do not want that.

> All that said, let's not stop work on the current avdevice component
> (my patch set) while figuring out the way forward.

You are absolutely right on this last point.

Regards,

Anton Khirnov June 11, 2021, 3:24 p.m. UTC | #10
Quoting Diederick C. Niehorster (2021-06-10 15:29:57)
> > The problem is that libavdevice is a separate library from libavformat,
> > but fundamentally depends on accessing libavformat internals.
> 
> Ah ok, so this is at first instance about cleanup/separation, not
> necessarily about adding new functionality

It is also about new functionality, since one of the main stated
advantages of libavdevice is that it can be used transparently by all
programs that use libavformat. New libavdevice-specific APIs
go against this.

> (I do see Mark's list of
> opportunities that a new API offers, copied below). I see Nicolas argue
> this entanglement of internals is not a problem in practice, and i
> suppose there is a certain amount of taste involved here.

Do note that Nicolas' position on library separation is quite unorthodox
--- I am not aware of anyone else supporting it, several people strongly
disagree with it. It also disagrees with our current practice.

> Nothing wrong with that. I guess for me personally that it is a little
> funky to have to add/change things in AVFormat when changing the
> AVDevice API, and that it may be good in the longer term to look at
> disentangling them. I will get back to that below, in response to some
> quotes of Mark's messages last January.
> 
> Mark's (non-exhaustive) list of opportunities a libavdevice API
> redesign offers (numbered by me):
> On 20/01/2021 12:41, Mark Thompson wrote:
>  > 1. Handle frames as well as packets.
>  >    1a. Including hardware frames - DRM objects from KMS/V4L2, D3D
> surfaces from Windows desktop duplication (which doesn't currently
> exist but should).
>  > 2. Clear core option set - currently almost everything is set by
> inconsistent private options; things like pixel/sample format,
> frame/sample rate, geometry and hardware device should be common
> options to all.
>  > 3. Asynchronicity - a big annoyance in current recording scenarios
> with the ffmpeg utility is that both audio and video capture block,
> and do so on the same thread which results in skipped frames.
>  > 4. Capability probing - the existing method of options which log
> the capabilities are not very useful for API users.
> 
> 1 and 3 i cannot speak to, but 4 is indeed what i ran into: the
> current state of most avdevices is not useful at all for an API user
> like me when it comes to capability probing (not a reason though to
> get rid of the whole API, but to wonder why it wasn't implemented.
> While nobody apparently bothered to do it before me, i think there
> will be more than just me who will actually use it). Currently I'd
> have to issue device specific options on a not-yet opened device,
> listen to the log output, parse it, etc. But the current API already
> solves this, if only it was implemented. A clear core option set would
> be nice indeed. And the AVDevice Capabilities API actually offers a
> start at that, since it lists a bunch of options that should be
> relevant to query (and set) for each device in the form of
> ff_device_capabilities (in my patchset), or av_device_capabilities
> before Andreas' patch removing it in January. I don't think it's
> complete, but it's a good starting point.
> 
> Mark Thompson (2021-01-25):
> > * Many of those are using it via the ffmpeg utility, but not all.
> 
> Indeed, i am an (aspiring) API user, of the dshow device specifically,
> and possibly v4l2 later (but my project is Windows-only right now).
> Currently hampered by some API not being implemented for
> dshow, hence my patch set.
> 
> > * The libavdevice API is the libavformat API because it was originally
> > split out from libavformat, and it has the nice property that devices
> > and files end up being interchangeable in some contexts.
> 
> I can't underline enough how nice this is. My situation is simple:
> devices such as webcams (but plenty others) may deliver video in
> various formats, including encoded. I would have to decode those to
> use them, output provided by the devices would thus have to go through
> much the same pipeline as data from video files. I already had code
> for reading in video files, so changes to also support webcams were
> absolutely minimal. However, i needed some APIs implemented to really
> round things off, make things both convenient (already the case) and
> flexible (my patch set).

I see a contradiction here. On one hand you're saying that the
usefulness of lavd comes from it having the same API as lavf. But then
you want to add a whole bunch of libavdevice-specific APIs. So any
program that wants to use them has to be specifically programmed for
libavdevice anyway. And libavformat API is saddled with extra frameworks
that are of no use to "normal" (de)muxing.

At that point, why insist on accessing lavd through the lavf API? You
can have a lavd-specific API that can cleanly export everything that is
specific to capture devices, without ugly hacks that are there
currently.

> 
> > * The libavdevice API, being the libavformat API for files, is not
> > particularly well-suited in other contexts, because devices may not
> > have the same properties as files.
> 
> Yeah, not every field in the AVFormatxxx structs is relevant for an
> AVDevice. And some are a bit funkily named (like url to stuff the
> device name of my webcam into). But are there specific fields one
> would wish to provide for an avdevice that are currently not
> available?

One thing mentioned by Mark that you cite above is that lavf is designed
around working with encoded data in the form of AVPackets, whereas for
some devices handled by lavd it would be better to use decoded frames
(or even hw surfaces) wrapped in an AVFrame.

> > * To implement devices as AVInputFormat/AVOutputFormat instances,
> > libavdevice currently needs access to the internals of libavformat.
> > * Many developers want to get rid of that dependency on libavformat
> > internals, because it creates a corresponding ugliness on the
> > libavformat side which has to leave those parts exposed in an
> > ABI-constrained way.
> 
> What specific internals does libavdevice depend on? Is it only the
> various function pointers in AVInputFormat and AVOutputFormat which
> are specific to devices, not all formats? Or is there more? I also
> understand that avdevices need to implement some of the other function
> pointers to be functional (e.g. read_header, read_packet and
> read_close), but that seems unavoidable if we'd want avdevices to be
> usable where avformats are (and again: that's a huge plus in my view).
> I also understand that the AVDevice API being exposed in the
> libavformat makes it harder to evolve the AVDevice API.

The function pointers, various private APIs, contents of
AVFormatInternal, etc. This is a pretty big deal, since it restricts
what we can do to libavformat internals without breaking ABI.

Anton Khirnov June 11, 2021, 3:38 p.m. UTC | #11
Quoting Nicolas George (2021-06-11 14:14:57)
> Anton Khirnov (12021-06-09):
> > > > There is no timeline, it depends on someone sitting down and doing the
> > > > work. The options proposed so far were
> > > > 1) merging libavdevice into libavformat
> > > > 2) making libavdevice into an independent library with an independent
> > > >    API
> > > > 3) moving libavdevice functionality into ffmpeg.c
> > > Thanks for providing the explored options. What problem is there in
> > > the way things currently are that these would be solving?
> > The problem is that libavdevice is a separate library from libavformat,
> > but fundamentally depends on accessing libavformat internals.
> 
> Point 3 is just completely unacceptable, as there are applications using
> libavdevice. Please do not mention it again unless you have new strong
> arguments about it.

WTF?

It is not your business to police what I can or cannot talk about. All
valid options can be considered and discussed, regardless of how you
personally feel about them.

And it would be really really REALLY nice if you finally learned to
distinguish between your personal opinions, official project policy, and
objective truth. Other people may have opinions and preferences that
disagree with yours, that does not make them necessarily wrong.

Nicolas George June 11, 2021, 3:41 p.m. UTC | #12
Anton Khirnov (12021-06-11):
> And it would be really really REALLY nice if you finally learned to
> distinguish between your personal opinions, official project policy, and
> objective truth.

Have you missed the part where I ask you to give arguments?

Regards,

Diederick C. Niehorster June 12, 2021, 11:50 a.m. UTC | #13
Nicolas, I agree with what you said, and you obviously have given this
more thought than me. Your and Anton's replies lay out two by now
fairly clear, differing views; I hope more people will chime in.
I will only highlight some things below.

On Fri, Jun 11, 2021 at 3:17 PM Nicolas George <george@nsup.org> wrote:
> Input devices are demuxers with a few extra methods; output devices are
> muxers with a few extra methods. We already have the beginning of a
> class/interface hierarchy:
>
>         formats
>           |
>           +---- muxers
>           |       |
>           |       +---- output devices
>           |
>           +---- demuxers
>                   |
>                   +---- input devices

Exactly, this is what the class hierarchy in my program using ffmpeg
also looks like.

>
> I agree. Thought has been given to designing this API, the efforts have
> dried up before implementing the functional parts, but the design is
> sound, and a good starting point to work again.

Yes, so let's use it to solve a real (my) problem now. Doing so does
not hamper a large redesign/reorganization effort later.

> I think the problem emphasized here is not really about fields, more
> about the working of the API: files are read on demand, while devices
> operate continuously; that makes a big difference.
>
> But really, we already have this difference with network streams,
> especially those that do not have flow control, for example those in
> multicast. These network streams have aspects of protocols, but also
> aspects of devices.
>
> And the answer is NOT to separate libavio from libavformat: protocols
> and formats mesh with each other, see the example of RTP.

:) I have been wondering why protocols are in formats, and not in a
separate libavprotocol or libavtransport or so. But i understand
indeed that some of these, like devices, must be intertwined.

> In practice, that would look like this:
>
>         application
>          → libavformat API
>             → libavdevice compatibility wrapper
>                → libavdevice API
>                   → wrapper for old-style device
>                      → actual device
>
> While the useful code is just:
>
>         application
>          → libavformat/device API
>             → actual device
>
> That's just an insane idea, a waste of time.

Hmm, fair enough!

>
> > Anyway, out of Mark's options i'd vote for a separate new AVDevice
> > API, and an adapter component to expose/plug in AVDevices as formats.
>
> I do not agree. I think we need to think this globally: shape our
> existing APIs into a coherent object-oriented hierarchy of
> classes/interfaces. This is not limited to formats and devices, we
> should include protocols in the discussion, and probably codecs and
> filters too.

This is an interesting point. It would force a discussion about which
"classes" should sensibly have access to internals of other classes
(i.e. akin to public/protected/private), and thereby make completely
explicit how each component is linked to the others. It seems to me
that physical organization both into a folder structure and into
different dlls is rather secondary. As you wrote elsewhere, a lot of
the libraries depend on each other, and that makes sense.

> And to handle the fact that devices and network streams are
> asynchronous, the main API needs to be asynchronous itself.
>
> Which brings me to my project to redesign libavformat around an event
> loop with callbacks.
>
> I have moderately started working on it, by writing the documentation
> for the low-level single-thread event loop API. Then I need to write the
> documentation for the high-level multi-thread scheduler API. Then I can
> get to coding.

Looking forward to it, sounds useful!

Cheers,
Dee

Diederick C. Niehorster June 12, 2021, 11:53 a.m. UTC | #14
Reorganized a bit for easier replying.

Also, while i think this is an important discussion, i do not see why
it should stop de-deprecation of a good API. Deprecating the device
capabilities API cleaned up avformat a bit, but various other function
pointers are left. A redesign would clean them all up at once, if
that's the direction taken. As such, having the device capabilities API
does not create a hindrance to a redesign. As i argued, having a use
example of important functionality helps, not hinders the redesign
effort. And lets not forget it offers users of the current API (me at
least) functionality they need now, not at some indeterminate
timepoint in the future. So, let's not stop work on the current
avdevice component (my patch set) while figuring out the way forward.
Let's keep the below discussion separate; they are two separate points
to decide on I think. What is your view on this Anton / how should
this issue be decided?

On Fri, Jun 11, 2021 at 5:24 PM Anton Khirnov <anton@khirnov.net> wrote:
>
> It is also about new functionality, since one of the main stated
> advantages of libavdevice is that it can be used transparently by all
> programs that use libavformat. New libavdevice-specific APIs
> go against this.
>
> I see a contradiction here. On one hand you're saying that the
> usefulness of lavd comes from it having the same API as lavf. But then
> you want to add a whole bunch of libavdevice-specific APIs. So any
> program that wants to use them has to be specifically programmed for
> libavdevice anyway. And libavformat API is saddled with extra frameworks
> that are of no use to "normal" (de)muxing.
>
> At that point, why insist on accessing lavd through the lavf API? You
> can have a lavd-specific API that can cleanly export everything that is
> specific to capture devices, without ugly hacks that are there
> currently.

I do not think there is a contradiction. Note also that i did not say
i "want to add a whole bunch of libavdevice-specific APIs." For (not
unimportant) convenience, as an API user, i want to use some of the
already existing API to enhance my use of avdevices.

Quoting a bit from Nicolas' reply allows me to make the point that
there is no contradiction:

On Fri, Jun 11, 2021 at 3:17 PM Nicolas George <george@nsup.org> wrote:
> Input devices are demuxers with a few extra methods; output devices are
> muxers with a few extra methods. We already have the beginning of a
> class/interface hierarchy:
>
>         formats
>           |
>           +---- muxers
>           |       |
>           |       +---- output devices
>           |
>           +---- demuxers
>                   |
>                   +---- input devices

Code in my (C++) program looks just like that. I have a
sizeable class with code for accessing ffmpeg demuxers,
originally written to be able to read files. Then i realized access to
webcams would be important too. It was very simply added by deriving
my webcam class from the general ffmpeg input class and adding a
few convenience functions to deal with device discovery and
proper URL formatting (so users don't have to prepend "video=" for
dshow). I.e., almost all of the code is shared between my class for
general avformats and the avdevice class.
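
The "input device is a demuxer with a few extra methods" relationship can be sketched in plain C with struct embedding. This is an illustrative sketch only: the types and functions (Demuxer, InputDevice, webcam_init, drain) are hypothetical and are neither FFmpeg structs nor the C++ classes described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical base "class": anything packets can be read from. */
typedef struct Demuxer {
    const char *name;
    /* Returns the packet length, or -1 at end of stream. */
    int (*read_packet)(struct Demuxer *self, char *buf, int size);
} Demuxer;

/* Hypothetical derived "class". The base is the first member, so a
 * pointer to InputDevice may be converted to a pointer to Demuxer. */
typedef struct InputDevice {
    Demuxer base;
    /* Extra method that only devices offer. */
    int (*list_formats)(struct InputDevice *self, char *out, int size);
    int frames_left; /* stand-in for a live source that eventually stops */
} InputDevice;

static int webcam_read_packet(Demuxer *self, char *buf, int size)
{
    InputDevice *dev = (InputDevice *)self; /* valid: base is first member */
    if (dev->frames_left <= 0)
        return -1;
    dev->frames_left--;
    return snprintf(buf, size, "frame from %s", self->name);
}

static int webcam_list_formats(InputDevice *self, char *out, int size)
{
    (void)self;
    /* Made-up capability string, for illustration only. */
    return snprintf(out, size, "640x480@30,1280x720@30");
}

void webcam_init(InputDevice *dev, int frames)
{
    dev->base.name = "webcam";
    dev->base.read_packet = webcam_read_packet;
    dev->list_formats = webcam_list_formats;
    dev->frames_left = frames;
}

/* Generic code written against Demuxer only. It works unchanged for
 * files and devices. Returns the number of packets read. */
int drain(Demuxer *d)
{
    char buf[64];
    int n = 0;
    assert(d && d->read_packet);
    while (d->read_packet(d, buf, sizeof(buf)) >= 0)
        n++;
    return n;
}
```

Code that only knows about Demuxer (here, drain) is shared by files and devices, while device-aware code can additionally call list_formats.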

But, for more advanced use than just streaming in packets/frames, I
need to gain a bit of control over the avdevice. These are convenience
things, even if not unimportant, and appropriate APIs are already
available. That's what i implemented for dshow. Right now i needed
another DirectShow library to do device discovery and given a device,
format discovery. The ffmpeg API is (was) already there, just needed
to be implemented. I also needed a way to pause/resume (just stopping
reading packets like you can do when the input is a file does not work
for a realtime source, where it leads to a full packet list buffer,
and a kinda nasty restart as first you get a bunch of irrelevant old
frames, and may lose a few of the first ones you do want as the buffer
was still full). These bits of additional API thus help make using
avdevices better, but even without them, i (and i assume many other
happy users out there) already have a very workable solution.
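
The stale-buffer effect described above can be reproduced with a small bounded queue. This is a sketch under assumptions: the capacity, the names (FrameQueue, queue_push, queue_pop) and the policy of dropping new frames when the queue is full are chosen for illustration and are not taken from the dshow code.

```c
#include <assert.h>

#define QUEUE_CAP 4 /* hypothetical capacity; real buffers are far larger */

typedef struct FrameQueue {
    int frames[QUEUE_CAP]; /* frame numbers stand in for packets */
    int head;              /* index of the oldest queued frame */
    int count;             /* number of queued frames */
    int dropped;           /* frames lost because the queue was full */
} FrameQueue;

void queue_init(FrameQueue *q)
{
    q->head = q->count = q->dropped = 0;
}

/* Called on the capture side for every frame the device produces.
 * Assumed policy: when the queue is full, the new frame is dropped. */
void queue_push(FrameQueue *q, int frame)
{
    assert(q);
    if (q->count == QUEUE_CAP) {
        q->dropped++;
        return;
    }
    q->frames[(q->head + q->count) % QUEUE_CAP] = frame;
    q->count++;
}

/* Called by the reader. Returns the oldest frame, or -1 if empty. */
int queue_pop(FrameQueue *q)
{
    int frame;
    assert(q);
    if (q->count == 0)
        return -1;
    frame = q->frames[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return frame;
}
```

If the reader stops popping while the device produces frames 0 to 9, the queue holds frames 0 to 3 and frames 4 to 9 are dropped. On resuming, the reader first receives the four stale frames, and the most recent ones are gone.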

> > (I do see Mark's list of
> > opportunities that a new API offers, copied below). I see Nicolas argue
> > this entanglement of internals is not a problem in practice, and i
> > suppose there is a certain amount of taste involved here.
>
> Do note that Nicolas' position on library separation is quite unorthodox
> --- I am not aware of anyone else supporting it, several people strongly
> disagree with it. It also disagrees with our current practice.
> [...]
>
> The function pointers, various private APIs, contents of
> AVFormatInternal, etc. This is a pretty big deal, since it restricts
> what we can do to libavformat internals without breaking ABI.

This may be above my pay grade, but: ABI here would be ABI between the
various dlls, not user-facing right? (They are internals after all,
and hidden behind opaque pointers). I assume the intention is that a
user uses a set of ffmpeg dlls from the same build, not mix and match?
Are there such issues in that case?
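
As background for the "opaque pointers" remark, the pattern can be sketched as follows. This is a single-file illustration with hypothetical names (Ctx, CtxInternal, ctx_alloc and friends), not FFmpeg's actual structs. In a real library the second half would live in a private .c file.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What a public header would expose (hypothetical names) ---- */

typedef struct CtxInternal CtxInternal; /* incomplete type: layout hidden */

typedef struct Ctx {
    int width;             /* public field: its offset is part of the ABI */
    CtxInternal *internal; /* opaque: users only ever hold the pointer */
} Ctx;

Ctx *ctx_alloc(void);
void ctx_free(Ctx *c);
void ctx_read(Ctx *c);
int ctx_packets_read(const Ctx *c);

/* ---- What only the library's own source file would contain ---- */

struct CtxInternal {
    /* Fields here can be added or reordered freely, as long as no
     * other separately built library reads this struct directly. */
    int packets_read;
};

Ctx *ctx_alloc(void)
{
    Ctx *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->internal = calloc(1, sizeof(*c->internal));
    if (!c->internal) {
        free(c);
        return NULL;
    }
    return c;
}

void ctx_free(Ctx *c)
{
    if (c) {
        free(c->internal);
        free(c);
    }
}

/* Pretend to read one packet: only the hidden state changes. */
void ctx_read(Ctx *c)
{
    assert(c && c->internal);
    c->internal->packets_read++;
}

int ctx_packets_read(const Ctx *c)
{
    assert(c && c->internal);
    return c->internal->packets_read;
}
```

The freedom to change CtxInternal holds only while every user goes through the accessor functions. Once a second library dereferences the internal struct itself, its layout effectively becomes part of the ABI between the two libraries.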

I understand the beauty of clean design and generic interfaces.
Less/no entanglement allows moving faster and more flexibly with
individual components. But i also think that there is a limit to how
separated different components can become. As i wrote in my previous
message:
1. lavd being accessible through the lavf interface is super
convenient and important.
2. if we do not want to lose the ability to use avdevices as drop-in
replacements for AVFormats while separating avdevices completely
from avformats, some kind of adapter/wrapper component that
has access to the internals of both is unavoidable.

So a goal, to my personal taste, could/should certainly be to minimize
use of internal APIs across library boundaries. But not eliminate it
completely at the cost of functionality.

I read somewhere on the list that avpriv_ cross-library solutions are
considered a hack. Why? Mixing them up with functions that are actually
internal to a single library is perhaps what makes things appear messy. Moving
declarations for functions that are not meant to be used by users, but
are part of a cross-library internal API into their own header may
clean things up and make it even clearer where links between the
libraries exist.

When it comes to avdevice, having some data on the situation may be
good. I had a look at how some avdevices (those i have available in my
build environment here on windows/msvc without external libs) depend
on internals of other libraries.
1. #include "libavformat/internal.h": dshow, vfwcap, gdigrab and lavfi
all include this only for the avpriv_set_pts_info() function
2. lavfi furthermore includes libavformat/avio_internal.h and
libavutil/internal.h. I am not sure why; commenting them out does not
lead to any build warnings or errors.

There may be other devices with closer entanglement, but it seems
minimal for these four devices: one avpriv_ function and the function
pointers in the AVFormat structs. grepping for avpriv_ in the whole
avdevice library, it is indeed mostly avpriv_set_pts_info() and
avpriv_open() that are used.

I note that the internal field of AVFormatContext does not appear to be
accessed in any of the avdevices at all. Just grep the avdevice
directory for "internal": not a single use.

>
> One thing mentioned by Mark that you cite above is that lavf is designed
> around working with encoded data in the form of AVPackets, whereas for
> some devices handled by lavd it would be better to use decoded frames
> (or even hw surfaces) wrapped in an AVFrame.

Understood.

Cheers,
Dee