Message ID | 20240422155836.385333-2-ffmpeg-devel@pileofstuff.org |
---|---|
State | New |
Headers | show |
Series | all: Link to "context" from all contexts with documentation | expand |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On date Monday 2024-04-22 16:56:48 +0100, Andrew Sayers wrote: > Derived from detailed explanations kindly provided by Stefano Sabatini: > https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html > --- > doc/context.md | 276 +++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 276 insertions(+) > create mode 100644 doc/context.md > > diff --git a/doc/context.md b/doc/context.md > new file mode 100644 > index 0000000000..73caacf54f > --- /dev/null > +++ b/doc/context.md > @@ -0,0 +1,276 @@ > +# Context-oriented programming > + > +Like many C projects, FFmpeg has adopted the subset of object-oriented techniques > +that help solve its problems. Object-like structures are called "contexts", > +and this document provides a general introduction to how they work. > + > +Object-oriented programming usually focusses on *access* as a primary concern. > +For example, members of an object are visibly designated "private", "constant" > +etc. to control how they are accessed. *Reflection* is a secondary concern, > +where it is provided at all. For example, C++ has no built-in way to get a > +string containing the name of a variable. > + > +Reflection is extremely important for FFmpeg, because user-facing options are > +implemented by reflecting state at runtime. Limiting access is a secondary > +concern, mainly important for ensuring implementation details can change > +between versions. > + > +An object-oriented programmer learning FFmpeg should be careful not to fixate on > +FFmpeg's access control features, nor to overlook its reflection capabilities. > +Both are present, but have been given the level of focus appropriate for the task. > + > +## Example: modify text then print it > + > +The example below shows a context structure that receives input strings, > +modifies them in some context-dependant way, then prints them to a specified > +filehandle. > + > +```c > +/** > + * Type information, accessible at runtime. > + * > + * Useful when configuring objects. > + */ > +enum ModifyThenPrintDialect { > + MODIFY_THEN_PRINT_PLAIN_TEXT = 0, > + MODIFY_THEN_PRINT_REGEX = 1, > + MODIFY_THEN_PRINT_REGEX_PCRE = 2 > +}; > + > +/** > + * User-facing information about types > + * > + * Useful for describing contexts to the user. > + */ > +static const char* ModifyThenPrintDialectName[] = { > + "plain text", > + "regular expression", > + "Perl-compatible regular expression" > +}; > + > +/** > + * Context for functions that modify strings before printing them > + */ > +struct ModifyThenPrintContext { > + > + /** > + * Information about the type of this particular instance > + * > + * Object-oriented programs would probably represent this example > + * as a sub-class, but some instance-specific functionality > + * behaves more like a mixin. > + */ > + enum ModifyThenPrintDialect dialect; > + > + /** > + * Internal context > + * > + * Object-oriented programs would put private members in here, > + * but could also use it to store a virtual function table > + * or other "invisible" information. > + */ > + void* priv_data; style: type *name; here and below > + > + /** > + * User-configurable options > + * > + * Best set through an API, but can be set directly if necessary > + * > + * Data from users needs to be validated before it's set, and the API > + * might e.g. want to update some internal buffer when these change. > + * Setting this directly is always less robust than using an API, > + * but might be worthwhile if you're willing to spend the time checking > + * for edge cases, and don't mind your code misbehaving if future > + * versions change the API but not the structure. > + * > + * Object-oriented programs would likely make these protected members, > + * initialised in a constructor and accessed with getters and setters. > + * Making them user-configurable would be left to the programmer. > + */ > + char *replace_this; > + char *with_this; > + > + /** > + * Programmer-configurable variable > + * > + * Object-oriented programs would represent this as a public member. > + */ > + FILE *out; > + > +}; > + > +/** > + * Allocate and initialize a ModifyThenPrintContext > + * > + * This creates a new pointer, then fills in some sensible defaults. > + * > + * We can reasonably assume this function will initialise `priv_data` > + * with a dialect-specific object, but shouldn't make any assumptions > + * about what that object is. > + * > + * Object-oriented programs would likely represent this as an > + * allocator function and a constructor. > + */ > +int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx, > + enum ModifyThenPrintDialect dialect, > + FILE *out); > + > +/** > + * Uninitialize and deallocate a ModifyThenPrintContext > + * > + * This does any work required by the internal context (e.g. deallocating > + * `priv_data`), then deallocates the main context itself. > + * > + * Object-oriented programs would likely represent this as a > + * destructor and a deallocator. > + */ > +int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx); > + > +/** > + * Configure a ModifyThenPrintContext > + * > + * This checks the arguments are valid in the context's dialect, > + * then updates the options as necessary > + * > + * Object-oriented programs would likely represent this as a > + * collection of setter functions. > + */ > +int ModifyThenPrintContext_configure(struct ModifyThenPrintContext *ctx, > + char* replace_this, > + char* with_this); > + > +/** > + * Print the contents of a ModifyThenPrintContext to a filehandle > + * > + * Provides human-readable information about keys and values. > + * > + * Object-oriented programs would likely represent this with some kind of > + * `serialise` operation. > + */ > +int ModifyThenPrintContext_dump(struct ModifyThenPrintContext **ctx, > + FILE *dump_fh); > + > +/** > + * Print a single message > + * > + * Object-oriented programs would likely represent this with a member function. > + */ > +int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx, > + char *msg); > + > +/** > + * How this context might be used in practice > + */ > +int print_hello_world() > +{ > + > + int ret = 0; > + > + struct ModifyThenPrintContext *ctx; > + > + if ( ModifyThenPrintContext_alloc_context( &ctx, MODIFY_THEN_PRINT_REGEX, stdout ) < 0 ) { > + ret = -1; > + goto EXIT_WITHOUT_CLEANUP; > + } > + > + if ( ModifyThenPrintContext_configure(ctx, "Hi|Hullo", "Hello") < 0 ) { > + ret = -1; > + goto FINISH; > + } > + > + if ( ModifyThenPrintContext_print( ctx, "Hi, world!\n" ) < 0 ) { > + ret = -1; > + goto FINISH; > + } > + > + if ( ModifyThenPrintContext_print( ctx, "Hullo, world!\n" ) < 0 ) { > + ret = -1; > + goto FINISH; > + } > + > + FINISH: > + if ( ModifyThenPrintContext_free( ctx ) ) { > + ret = -1; > + goto EXIT_WITHOUT_CLEANUP; > + } > + > + EXIT_WITHOUT_CLEANUP: > + return ret; > + > +} > +``` I don't have a strong opinion, but I'd probably focus on providing a typical example of a common API (check doc/examples). Also I see here there is a strong focus on OOP, this might be counter-productive in case the reader is not familiar with OOP terminology. OTOH the content might be useful for readers coming from an OOP background and terminology. I wonder if this content might be isolated in a dedicated section, so that non-OOP readers can simply skip it. But this is not a strong objection, and can be possibly reworked in a later step. > + > +## FFmpeg context structures > + > +The example above is a generic example of a context structure and its > +associated functions. Some FFmpeg contexts are no more complex than > +that example, just as some objects are just key/value stores with some > +functions attached. But here are some examples to show the variety of > +contexts available in FFmpeg. > + > +AVHashContext presents a generic API for hashing data. @ref hash.h > +"Its associated functions" show how to create, use and destroy a hash. > +The exact algorithm is specified by passing a string to av_hash_alloc(), > +and the list of strings can be retrieved from av_hash_names(). > +Algorithm-specific internal context is stored in AVHashContext::ctx. > + > +AVCodecContext is a complex member representing data about an encoding or > +decoding session. @ref avcodec.h "Its associated functions" provide an > +abstract interface to encode and decode data. Like most widely-used contexts, > +its first member is an AVClass pointer containing instance-specific information. > +That means it's an *AVClass context structure* (discussed in more detail below). > + > +X264Context contains the internal context for an AVCodecContext that uses > +the x264 library. @ref libx264.c "Its associated functions" provide a concrete > +implementation for interacting with libx264. This class is not directly > +accessible through FFmpeg's public interface, so unlike AVCodecContext it can > +change significantly between releases. Note: still, we are not supposed to break compatibilty, for example by changing the private option names, or this will break both scripts using the ff* tools and applications using the libav* libraries, since they refer to those option names. In other words, the options names provide the public API to interact with a specific codec. > +## Reflection with AVClass and AVOptions > + > +An *AVClass context structure* is a context whose first member is an AVClass > +that reflects its contents. An *@ref avoptions "AVOptions"-enabled struct* > +is a structure which reflects its contents using the @ref avoptions "AVOptions" > +API. "reflects its contents" is a bit too strong, we want only to be able to set and query options on such structures. In some cases the AVOptions API is abused also to get "properties" from the structures (see for example the "location" option in libavformation/http.c). But the whole content in general is not available through AVOptions, only a specific subset which is supposed to configure the structure. > These terms have become synonymous in modern usage, but you might notice > +some distinction when looking at code written after AVClass was developed but > +before @ref avoptions "AVOptions" was added. > + > +At first glance, an object-oriented programmer might be tempted to look at AVClass > +as a base class that contexts inherit from. But unlike a base class, an AVClass > +doesn't imply any theoretical relationship between objects, and contexts of the > +same type will often have different AVClass values. It's even theoretically possible > +for a single AVClass to be shared between contexts of different types. A better > +analogy would be to [C++ concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B)) > +or [Java interfaces](https://en.wikipedia.org/wiki/Interface_(Java)). > + > +To understand how AVClass and @ref avoptions "AVOptions" work, > +consider the requirements for a `libx264` encoder: > + > +- it has to support common encoder options like "bitrate" > +- it has to support encoder-specific options like "profile" > + - the exact options could change quickly if a legal ruling forces a change of backend but this will be a compatiblity break for most applications and scripts (you will notice in this case because people will complain aloud about the breakage), so we need to be careful when changing options following the usual deprecation dance > +- it has to provide useful feedback to users about unsupported options > + > +Common encoder options like "bitrate" need to be stored in AVCodecContext itself > +to avoid duplicating code, while encoder-specific options like "profile" have to > +be stored in the X264Context instance stored in AVCodecContext::priv_data. > +But both need to be presented to users (along with help text, default values etc.) > +in a way that makes it easy to build user interfaces to get and set those options. nit: s/easy/possible/ > + > +A context's AVClass member contains a list of AVOption objects, describing > +the user-configurable members of the class. It also contains functions to > +iterate through a tree of AVClass objects representing internal context either > +for the current object or for any class that could potentially exist in the > +current version of FFmpeg. > + > +The @ref avoptions "AVOptions" API accepts any AVClass context structure, > +looks through its AVOption data, and uses that to examine, introspect, and modify > +the structure. Because the API doesn't contain any type-specific information, > +it can be used to create a general user interface that adapts automatically > +when e.g. a new version of FFmpeg adds a new configuration option. > + > +## Summary > + > +FFmpeg uses "contexts" in ways that are often resemble object-oriented programming. > +But it focuses less on encapsulation within arbitrarily complex systems, > +and more on providing reflectivity to make pleasant user interfaces. nit++: s/pleasant//, here the point is to make them possible, we don't know if they are pleasant :-) [...] Note: I'll be probably offline for one week or so.
On Mon, Apr 22, 2024 at 07:05:12PM +0200, Stefano Sabatini wrote: > On date Monday 2024-04-22 16:56:48 +0100, Andrew Sayers wrote: > > Derived from detailed explanations kindly provided by Stefano Sabatini: > > https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html > > --- > > doc/context.md | 276 +++++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 276 insertions(+) > > create mode 100644 doc/context.md > > > > diff --git a/doc/context.md b/doc/context.md > > new file mode 100644 > > index 0000000000..73caacf54f > > --- /dev/null > > +++ b/doc/context.md > > @@ -0,0 +1,276 @@ > > +# Context-oriented programming > > + > > +Like many C projects, FFmpeg has adopted the subset of object-oriented techniques > > +that help solve its problems. Object-like structures are called "contexts", > > +and this document provides a general introduction to how they work. > > + > > +Object-oriented programming usually focusses on *access* as a primary concern. > > +For example, members of an object are visibly designated "private", "constant" > > +etc. to control how they are accessed. *Reflection* is a secondary concern, > > +where it is provided at all. For example, C++ has no built-in way to get a > > +string containing the name of a variable. > > + > > +Reflection is extremely important for FFmpeg, because user-facing options are > > +implemented by reflecting state at runtime. Limiting access is a secondary > > +concern, mainly important for ensuring implementation details can change > > +between versions. > > + > > +An object-oriented programmer learning FFmpeg should be careful not to fixate on > > +FFmpeg's access control features, nor to overlook its reflection capabilities. > > +Both are present, but have been given the level of focus appropriate for the task. > > + > > +## Example: modify text then print it > > + > > +The example below shows a context structure that receives input strings, > > +modifies them in some context-dependant way, then prints them to a specified > > +filehandle. > > + > > +```c > > +/** > > + * Type information, accessible at runtime. > > + * > > + * Useful when configuring objects. > > + */ > > +enum ModifyThenPrintDialect { > > + MODIFY_THEN_PRINT_PLAIN_TEXT = 0, > > + MODIFY_THEN_PRINT_REGEX = 1, > > + MODIFY_THEN_PRINT_REGEX_PCRE = 2 > > +}; > > + > > +/** > > + * User-facing information about types > > + * > > + * Useful for describing contexts to the user. > > + */ > > +static const char* ModifyThenPrintDialectName[] = { > > + "plain text", > > + "regular expression", > > + "Perl-compatible regular expression" > > +}; > > + > > +/** > > + * Context for functions that modify strings before printing them > > + */ > > +struct ModifyThenPrintContext { > > + > > + /** > > + * Information about the type of this particular instance > > + * > > + * Object-oriented programs would probably represent this example > > + * as a sub-class, but some instance-specific functionality > > + * behaves more like a mixin. > > + */ > > + enum ModifyThenPrintDialect dialect; > > + > > + /** > > + * Internal context > > + * > > + * Object-oriented programs would put private members in here, > > + * but could also use it to store a virtual function table > > + * or other "invisible" information. > > + */ > > > + void* priv_data; > > style: > type *name; > > here and below > > > > + > > + /** > > + * User-configurable options > > + * > > + * Best set through an API, but can be set directly if necessary > > + * > > + * Data from users needs to be validated before it's set, and the API > > + * might e.g. want to update some internal buffer when these change. > > + * Setting this directly is always less robust than using an API, > > + * but might be worthwhile if you're willing to spend the time checking > > + * for edge cases, and don't mind your code misbehaving if future > > + * versions change the API but not the structure. > > + * > > + * Object-oriented programs would likely make these protected members, > > + * initialised in a constructor and accessed with getters and setters. > > + * Making them user-configurable would be left to the programmer. > > + */ > > + char *replace_this; > > + char *with_this; > > + > > + /** > > + * Programmer-configurable variable > > + * > > + * Object-oriented programs would represent this as a public member. > > + */ > > + FILE *out; > > + > > +}; > > + > > +/** > > + * Allocate and initialize a ModifyThenPrintContext > > + * > > + * This creates a new pointer, then fills in some sensible defaults. > > + * > > + * We can reasonably assume this function will initialise `priv_data` > > + * with a dialect-specific object, but shouldn't make any assumptions > > + * about what that object is. > > + * > > + * Object-oriented programs would likely represent this as an > > + * allocator function and a constructor. > > + */ > > +int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx, > > + enum ModifyThenPrintDialect dialect, > > + FILE *out); > > + > > +/** > > + * Uninitialize and deallocate a ModifyThenPrintContext > > + * > > + * This does any work required by the internal context (e.g. deallocating > > + * `priv_data`), then deallocates the main context itself. > > + * > > + * Object-oriented programs would likely represent this as a > > + * destructor and a deallocator. > > + */ > > +int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx); > > + > > +/** > > + * Configure a ModifyThenPrintContext > > + * > > + * This checks the arguments are valid in the context's dialect, > > + * then updates the options as necessary > > + * > > + * Object-oriented programs would likely represent this as a > > + * collection of setter functions. > > + */ > > +int ModifyThenPrintContext_configure(struct ModifyThenPrintContext *ctx, > > + char* replace_this, > > + char* with_this); > > + > > +/** > > + * Print the contents of a ModifyThenPrintContext to a filehandle > > + * > > + * Provides human-readable information about keys and values. > > + * > > + * Object-oriented programs would likely represent this with some kind of > > + * `serialise` operation. > > + */ > > +int ModifyThenPrintContext_dump(struct ModifyThenPrintContext **ctx, > > + FILE *dump_fh); > > + > > +/** > > + * Print a single message > > + * > > + * Object-oriented programs would likely represent this with a member function. > > + */ > > +int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx, > > + char *msg); > > + > > +/** > > + * How this context might be used in practice > > + */ > > +int print_hello_world() > > +{ > > + > > + int ret = 0; > > + > > + struct ModifyThenPrintContext *ctx; > > + > > + if ( ModifyThenPrintContext_alloc_context( &ctx, MODIFY_THEN_PRINT_REGEX, stdout ) < 0 ) { > > + ret = -1; > > + goto EXIT_WITHOUT_CLEANUP; > > + } > > + > > + if ( ModifyThenPrintContext_configure(ctx, "Hi|Hullo", "Hello") < 0 ) { > > + ret = -1; > > + goto FINISH; > > + } > > + > > + if ( ModifyThenPrintContext_print( ctx, "Hi, world!\n" ) < 0 ) { > > + ret = -1; > > + goto FINISH; > > + } > > + > > + if ( ModifyThenPrintContext_print( ctx, "Hullo, world!\n" ) < 0 ) { > > + ret = -1; > > + goto FINISH; > > + } > > + > > + FINISH: > > + if ( ModifyThenPrintContext_free( ctx ) ) { > > + ret = -1; > > + goto EXIT_WITHOUT_CLEANUP; > > + } > > + > > + EXIT_WITHOUT_CLEANUP: > > + return ret; > > + > > +} > > +``` > > I don't have a strong opinion, but I'd probably focus on providing a > typical example of a common API (check doc/examples). Also I see here > there is a strong focus on OOP, this might be counter-productive in > case the reader is not familiar with OOP terminology. > > OTOH the content might be useful for readers coming from an OOP > background and terminology. I wonder if this content might be isolated > in a dedicated section, so that non-OOP readers can simply skip it. > > But this is not a strong objection, and can be possibly reworked in a > later step. > This is really a document for FFmpeg newbies, so we need to assume as little prior knowledge as possible. After a few days to think it over, I think we should avoid assuming... Knowledge of object-oriented programming. For example, this should be useful to a research mathematician with a project that involves codec algorithms. So the next draft should feel less like "FFmpeg for OOP devs" and more like "FFmpeg for newbies (with some optional OOP background reading)". Knowing that programming doesn't *have* to be object-oriented. OOP has become so ubiquitous nowadays, there are plenty of programmers who will insist everything is OOP if you just speak loudly and slowly. This is a harder problem in some ways, because someone who doesn't understand can always re-read until they do, while someone who jumps to the wrong conclusion will just keep reading and try to make things fit their assumption (e.g. my earlier messages in this thread!). So the "how it differs from OOP" stuff needs to stay fairly prominent. Knowing anything about FFmpeg (or multimedia in general). I like the idea of tweaking `doc/examples` to better introduce FFmpeg fundamentals, but explaining "context" is a steep enough learning curve on its own - introducing concepts like "stream" and "codec" at the same time seems like too much. > > + > > +## FFmpeg context structures > > + > > +The example above is a generic example of a context structure and its > > +associated functions. Some FFmpeg contexts are no more complex than > > +that example, just as some objects are just key/value stores with some > > +functions attached. But here are some examples to show the variety of > > +contexts available in FFmpeg. > > + > > +AVHashContext presents a generic API for hashing data. @ref hash.h > > +"Its associated functions" show how to create, use and destroy a hash. > > +The exact algorithm is specified by passing a string to av_hash_alloc(), > > +and the list of strings can be retrieved from av_hash_names(). > > +Algorithm-specific internal context is stored in AVHashContext::ctx. > > + > > +AVCodecContext is a complex member representing data about an encoding or > > +decoding session. @ref avcodec.h "Its associated functions" provide an > > +abstract interface to encode and decode data. Like most widely-used contexts, > > +its first member is an AVClass pointer containing instance-specific information. > > +That means it's an *AVClass context structure* (discussed in more detail below). > > + > > +X264Context contains the internal context for an AVCodecContext that uses > > +the x264 library. @ref libx264.c "Its associated functions" provide a concrete > > > +implementation for interacting with libx264. This class is not directly > > +accessible through FFmpeg's public interface, so unlike AVCodecContext it can > > +change significantly between releases. > > Note: still, we are not supposed to break compatibilty, for example by > changing the private option names, or this will break both scripts > using the ff* tools and applications using the libav* libraries, since > they refer to those option names. In other words, the options names > provide the public API to interact with a specific codec. > > > +## Reflection with AVClass and AVOptions > > + > > +An *AVClass context structure* is a context whose first member is an AVClass > > +that reflects its contents. An *@ref avoptions "AVOptions"-enabled struct* > > +is a structure which reflects its contents using the @ref avoptions "AVOptions" > > +API. > > "reflects its contents" is a bit too strong, we want only to be able > to set and query options on such structures. In some cases the > AVOptions API is abused also to get "properties" from the structures > (see for example the "location" option in libavformation/http.c). But > the whole content in general is not available through AVOptions, only > a specific subset which is supposed to configure the structure. > > > These terms have become synonymous in modern usage, but you might notice > > +some distinction when looking at code written after AVClass was developed but > > +before @ref avoptions "AVOptions" was added. > > + > > +At first glance, an object-oriented programmer might be tempted to look at AVClass > > +as a base class that contexts inherit from. But unlike a base class, an AVClass > > +doesn't imply any theoretical relationship between objects, and contexts of the > > +same type will often have different AVClass values. It's even theoretically possible > > +for a single AVClass to be shared between contexts of different types. A better > > +analogy would be to [C++ concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B)) > > +or [Java interfaces](https://en.wikipedia.org/wiki/Interface_(Java)). > > + > > +To understand how AVClass and @ref avoptions "AVOptions" work, > > +consider the requirements for a `libx264` encoder: > > + > > +- it has to support common encoder options like "bitrate" > > > +- it has to support encoder-specific options like "profile" > > > + - the exact options could change quickly if a legal ruling forces a change of backend > > but this will be a compatiblity break for most applications and > scripts (you will notice in this case because people will complain > aloud about the breakage), so we need to be careful when changing > options following the usual deprecation dance > > > +- it has to provide useful feedback to users about unsupported options > > + > > +Common encoder options like "bitrate" need to be stored in AVCodecContext itself > > +to avoid duplicating code, while encoder-specific options like "profile" have to > > +be stored in the X264Context instance stored in AVCodecContext::priv_data. > > > +But both need to be presented to users (along with help text, default values etc.) > > +in a way that makes it easy to build user interfaces to get and set those options. > > nit: s/easy/possible/ > > > + > > +A context's AVClass member contains a list of AVOption objects, describing > > +the user-configurable members of the class. It also contains functions to > > +iterate through a tree of AVClass objects representing internal context either > > +for the current object or for any class that could potentially exist in the > > +current version of FFmpeg. > > + > > +The @ref avoptions "AVOptions" API accepts any AVClass context structure, > > +looks through its AVOption data, and uses that to examine, introspect, and modify > > +the structure. Because the API doesn't contain any type-specific information, > > +it can be used to create a general user interface that adapts automatically > > +when e.g. a new version of FFmpeg adds a new configuration option. > > + > > +## Summary > > + > > > +FFmpeg uses "contexts" in ways that are often resemble object-oriented programming. > > +But it focuses less on encapsulation within arbitrarily complex systems, > > +and more on providing reflectivity to make pleasant user interfaces. > > nit++: s/pleasant//, here the point is to make them possible, we don't > know if they are pleasant :-) Comments above should be reflected in a patch I'll send after writing this. The patch also adds more links from the code, because I noticed doxygen was replacing all "Context for foo" comments with links to some private struct that happened to be called "Context". I've also gone through the code looking for edge cases we haven't covered. Here are some questions trying to prompt an "oh yeah I forgot to mention that"-type answer. Anything where the answer is more like "that should probably be rewritten to be clearer", let me know and I'll avoid confusing newbies with it. av_ambient_viewing_environment_create_side_data() takes an AVFrame as its first argument, and returns a new AVAmbientViewingEnvironment. What is the context object for that function - AVFrame or AVAmbientViewingEnvironment? av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an `AVBitStreamFilter *` as its first argument, but I don't think you'd say the argument provided "context" for the function. So would I be right in saying `AVBitStreamFilter *` is not a context, despite looking like one? av_buffersink_*() all take a `const AVFilterContext *` argument. What does the difference between av_buffersink prefix and AVFilter type mean? av_channel_description_bprint() takes a `struct AVBPrint *` as its first argument, then `enum AVChannel`. Is the context AVBPrint, AVChannel, or both? Does it make sense for a function to have two contexts? Related to the previous question, does `av_cmp_q()` count as a function with two contexts? Or no contexts? Finally, a general question - functions of the form "avfoo" seem like they are more consistent than "av_foo". Does the underscore mean anything?
On Mon, Apr 29, 2024 at 10:10:35AM +0100, Andrew Sayers wrote: > > I've also gone through the code looking for edge cases we haven't covered. > Here are some questions trying to prompt an "oh yeah I forgot to mention > that"-type answer. Anything where the answer is more like "that should > probably be rewritten to be clearer", let me know and I'll avoid confusing > newbies with it. > > av_ambient_viewing_environment_create_side_data() takes an AVFrame as its > first argument, and returns a new AVAmbientViewingEnvironment. What is the > context object for that function - AVFrame or AVAmbientViewingEnvironment? > > av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an > `AVBitStreamFilter *` as its first argument, but I don't think you'd say > the argument provided "context" for the function. So would I be right in > saying `AVBitStreamFilter *` is not a context, despite looking like one? > > av_buffersink_*() all take a `const AVFilterContext *` argument. > What does the difference between av_buffersink prefix and AVFilter type mean? > > av_channel_description_bprint() takes a `struct AVBPrint *` as its first > argument, then `enum AVChannel`. Is the context AVBPrint, AVChannel, > or both? Does it make sense for a function to have two contexts? > > Related to the previous question, does `av_cmp_q()` count as a function > with two contexts? Or no contexts? > > Finally, a general question - functions of the form "avfoo" seem like they > are more consistent than "av_foo". Does the underscore mean anything? One extra question I hadn't thought to ask before - when, if at all, would an ordinary user be expected to access the contents of AVClass directly? Or AVOption for that matter? For example, it's not clear to me how people are supposed to use AVClass.category - is this an important use case we haven't covered? Either way, happy to propose an AVClass patch to make it clearer. It's my turn to be off for a few days, so will pick up any responses next week.
On date Monday 2024-04-29 10:10:35 +0100, Andrew Sayers wrote: > On Mon, Apr 22, 2024 at 07:05:12PM +0200, Stefano Sabatini wrote: [...] > > I don't have a strong opinion, but I'd probably focus on providing a > > typical example of a common API (check doc/examples). Also I see here > > there is a strong focus on OOP, this might be counter-productive in > > case the reader is not familiar with OOP terminology. > > > > OTOH the content might be useful for readers coming from an OOP > > background and terminology. I wonder if this content might be isolated > > in a dedicated section, so that non-OOP readers can simply skip it. > > > > But this is not a strong objection, and can be possibly reworked in a > > later step. > > > > This is really a document for FFmpeg newbies, so we need to assume as > little prior knowledge as possible. After a few days to think it > over, I think we should avoid assuming... > > Knowledge of object-oriented programming. For example, this should be > useful to a research mathematician with a project that involves codec > algorithms. So the next draft should feel less like "FFmpeg for OOP > devs" and more like "FFmpeg for newbies (with some optional OOP > background reading)". > > Knowing that programming doesn't *have* to be object-oriented. > OOP has become so ubiquitous nowadays, there are plenty of programmers > who will insist everything is OOP if you just speak loudly and slowly. > This is a harder problem in some ways, because someone who doesn't > understand can always re-read until they do, while someone who jumps > to the wrong conclusion will just keep reading and try to make things > fit their assumption (e.g. my earlier messages in this thread!). > So the "how it differs from OOP" stuff needs to stay fairly prominent. > > Knowing anything about FFmpeg (or multimedia in general). I like the > idea of tweaking `doc/examples` to better introduce FFmpeg > fundamentals, but explaining "context" is a steep enough learning > curve on its own - introducing concepts like "stream" and "codec" at > the same time seems like too much. But even if you show the API that does not mean you need to explain it entirely, you only need to highligth the structural relationships: // create an instance context, whatever it is c = avcodec_alloc_context3(codec); if (!c) { fprintf(stderr, "Could not allocate video codec context\n"); exit(1); } // set context parametres directly c->bit_rate = 400000; /* resolution must be a multiple of two */ c->width = 352; c->height = 288; /* frames per second */ c->time_base = (AVRational){1, 25}; c->framerate = (AVRational){25, 1}; // use av_opt API to set the options? ... // open the codec context provided a codec ret = avcodec_open2(c, codec, NULL); if (ret < 0) { fprintf(stderr, "Could not open codec: %s\n", av_err2str(ret)); exit(1); } You might even replace avcodec_ with fooblargh_ and get the same effect, with the addition that with avcodec_ you are already familiarizing the user with the concrete API rather than with an hypotetical one. [...] > I've also gone through the code looking for edge cases we haven't covered. > Here are some questions trying to prompt an "oh yeah I forgot to mention > that"-type answer. Anything where the answer is more like "that should > probably be rewritten to be clearer", let me know and I'll avoid confusing > newbies with it. > > av_ambient_viewing_environment_create_side_data() takes an AVFrame as its > first argument, and returns a new AVAmbientViewingEnvironment. What is the > context object for that function - AVFrame or AVAmbientViewingEnvironment? But this should be clear from the doxy: /** * Allocate and add an AVAmbientViewingEnvironment structure to an existing * AVFrame as side data. * * @return the newly allocated struct, or NULL on failure */ AVAmbientViewingEnvironment *av_ambient_viewing_environment_create_side_data(AVFrame *frame); Also, you are assuming that all the function should have a context. That's not the case, as you don't always need to keep track of a "context" when performing operations. > > av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an > `AVBitStreamFilter *` as its first argument, but I don't think you'd say > the argument provided "context" for the function. So would I be right in > saying `AVBitStreamFilter *` is not a context, despite looking like one? This was finally dropped so this is even missing. But again, it seems you are assuming that all the functions should take a context, which is not the case. In that case you had: av_register_bistream_filter(filter) which was registering the filter in the program global state. > av_buffersink_*() all take a `const AVFilterContext *` argument. > What does the difference between av_buffersink prefix and AVFilter type mean? None, probabily it should have been named avfilter_buffersink since this is a libavfilter API, seel below. > av_channel_description_bprint() takes a `struct AVBPrint *` as its first > argument, then `enum AVChannel`. Is the context AVBPrint, AVChannel, > or both? Does it make sense for a function to have two contexts? Again, this should be clear from the doxy: /** * Get a human readable string describing a given channel. * * @param buf pre-allocated buffer where to put the generated string * @param buf_size size in bytes of the buffer. * @param channel the AVChannel whose description to get * @return amount of bytes needed to hold the output string, or a negative AVERROR * on failure. If the returned value is bigger than buf_size, then the * string was truncated. */ int av_channel_description(char *buf, size_t buf_size, enum AVChannel channel); /** * bprint variant of av_channel_description(). * * @note the string will be appended to the bprint buffer. */ void av_channel_description_bprint(struct AVBPrint *bp, enum AVChannel channel_id); > Related to the previous question, does `av_cmp_q()` count as a function > with two contexts? Or no contexts? And again, it looks like you are overgeneralizing and require that all the functions take a context. In general, the C language is procedural and so is the FFmpeg API. The fact that we are assuming some constructs which might mimic or resemble OOP does not mean that all the API was designed in that way. > > Finally, a general question - functions of the form "avfoo" seem like they > are more consistent than "av_foo". Does the underscore mean anything? This is due to the fact that despite the effort, implementing a consistent API is difficult, also due to different reviewers/contributors picking different conventions. In some case we prefer av_ (most of libavutil) for the generic API, while libavcodec/libavformat/libavfilter tend to use avLIB_, but there might be exceptions.
I'm still travelling, so the following thoughts might be a bit half-formed. But I wanted to get some feedback before sitting down for a proper think. On Sun, May 05, 2024 at 09:29:10AM +0200, Stefano Sabatini wrote: > On date Monday 2024-04-29 10:10:35 +0100, Andrew Sayers wrote: > > On Mon, Apr 22, 2024 at 07:05:12PM +0200, Stefano Sabatini wrote: > [...] > > > I don't have a strong opinion, but I'd probably focus on providing a > > > typical example of a common API (check doc/examples). Also I see here > > > there is a strong focus on OOP, this might be counter-productive in > > > case the reader is not familiar with OOP terminology. > > > > > > OTOH the content might be useful for readers coming from an OOP > > > background and terminology. I wonder if this content might be isolated > > > in a dedicated section, so that non-OOP readers can simply skip it. > > > > > > But this is not a strong objection, and can be possibly reworked in a > > > later step. > > > > > > > This is really a document for FFmpeg newbies, so we need to assume as > > little prior knowledge as possible. After a few days to think it > > over, I think we should avoid assuming... > > > > Knowledge of object-oriented programming. For example, this should be > > useful to a research mathematician with a project that involves codec > > algorithms. So the next draft should feel less like "FFmpeg for OOP > > devs" and more like "FFmpeg for newbies (with some optional OOP > > background reading)". > > > > Knowing that programming doesn't *have* to be object-oriented. > > OOP has become so ubiquitous nowadays, there are plenty of programmers > > who will insist everything is OOP if you just speak loudly and slowly. > > This is a harder problem in some ways, because someone who doesn't > > understand can always re-read until they do, while someone who jumps > > to the wrong conclusion will just keep reading and try to make things > > fit their assumption (e.g. my earlier messages in this thread!). > > So the "how it differs from OOP" stuff needs to stay fairly prominent. > > > > > Knowing anything about FFmpeg (or multimedia in general). I like the > > idea of tweaking `doc/examples` to better introduce FFmpeg > > fundamentals, but explaining "context" is a steep enough learning > > curve on its own - introducing concepts like "stream" and "codec" at > > the same time seems like too much. > > But even if you show the API that does not mean you need to explain > it entirely, you only need to highligth the structural relationships: > > // create an instance context, whatever it is > c = avcodec_alloc_context3(codec); > if (!c) { > fprintf(stderr, "Could not allocate video codec context\n"); > exit(1); > } > > // set context parametres directly > c->bit_rate = 400000; > /* resolution must be a multiple of two */ > c->width = 352; > c->height = 288; > /* frames per second */ > c->time_base = (AVRational){1, 25}; > c->framerate = (AVRational){25, 1}; > > // use av_opt API to set the options? > ... > > // open the codec context provided a codec > ret = avcodec_open2(c, codec, NULL); > if (ret < 0) { > fprintf(stderr, "Could not open codec: %s\n", av_err2str(ret)); > exit(1); > } > > You might even replace avcodec_ with fooblargh_ and get the same > effect, with the addition that with avcodec_ you are already > familiarizing the user with the concrete API rather than with an > hypotetical one. > > [...] > > > I've also gone through the code looking for edge cases we haven't covered. > > Here are some questions trying to prompt an "oh yeah I forgot to mention > > that"-type answer. Anything where the answer is more like "that should > > probably be rewritten to be clearer", let me know and I'll avoid confusing > > newbies with it. > > > > > av_ambient_viewing_environment_create_side_data() takes an AVFrame as its > > first argument, and returns a new AVAmbientViewingEnvironment. What is the > > context object for that function - AVFrame or AVAmbientViewingEnvironment? > > But this should be clear from the doxy: > > /** > * Allocate and add an AVAmbientViewingEnvironment structure to an existing > * AVFrame as side data. > * > * @return the newly allocated struct, or NULL on failure > */ > AVAmbientViewingEnvironment *av_ambient_viewing_environment_create_side_data(AVFrame *frame); I'm afraid it's not clear, at least to me. I think you're saying the AVFrame is the context because a "create" function can't have a context any more than a C++ "new" can be called as a member. But the function's prefix points to the other conclusion, and neither signal is clear enough on its own. My current thinking is to propose separate patches renaming arguments to `ctx` whenever I find functions I can't parse. That's not as good as a simple rule like "the first argument is always the context", but better than adding a paragraph or two about how to read the docs. > Also, you are assuming that all the function should have a > context. That's not the case, as you don't always need to keep track > of a "context" when performing operations. It sounds like there's a subtle distinction to be made here: when a struct is named "FooContext" or a variable is named `ctx`, the writer has signalled that "context" is the (only) correct interpretation. OTOH, when something acts like a context but isn't explicitly labelled as such, it may or may not have been intended as a context, and treating it as one may or may not be helpful - it's more "if it helps" than an objective fact. That means reasonable people can disagree about edge cases - for example, it's genuinely unclear whether av_register_bitstream_filter()'s argument is a context, and TBH if I weren't writing this doc I would just avoid thinking about av_ambient_viewing_environment_create_side_data() altogether. > > > > > av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an > > `AVBitStreamFilter *` as its first argument, but I don't think you'd say > > the argument provided "context" for the function. So would I be right in > > saying `AVBitStreamFilter *` is not a context, despite looking like one? > > This was finally dropped so this is even missing. But again, it seems > you are assuming that all the functions should take a context, which > is not the case. In that case you had: > av_register_bistream_filter(filter) > > which was registering the filter in the program global state. > > > av_buffersink_*() all take a `const AVFilterContext *` argument. > > What does the difference between av_buffersink prefix and AVFilter type mean? > > None, probabily it should have been named avfilter_buffersink since > this is a libavfilter API, seel below. > > > av_channel_description_bprint() takes a `struct AVBPrint *` as its first > > argument, then `enum AVChannel`. Is the context AVBPrint, AVChannel, > > or both? Does it make sense for a function to have two contexts? > > Again, this should be clear from the doxy: > /** > * Get a human readable string describing a given channel. > * > * @param buf pre-allocated buffer where to put the generated string > * @param buf_size size in bytes of the buffer. > * @param channel the AVChannel whose description to get > * @return amount of bytes needed to hold the output string, or a negative AVERROR > * on failure. If the returned value is bigger than buf_size, then the > * string was truncated. > */ > int av_channel_description(char *buf, size_t buf_size, enum AVChannel channel); > > /** > * bprint variant of av_channel_description(). > * > * @note the string will be appended to the bprint buffer. > */ > void av_channel_description_bprint(struct AVBPrint *bp, enum AVChannel channel_id); I think you're saying that I should look at which word appears more often in the doxy ("channel") rather than which word appears first in the argument list ("buf")? As above, the solution might be to rename the variable in a separate patch rather than teach people another special case. > > > Related to the previous question, does `av_cmp_q()` count as a function > > with two contexts? Or no contexts? > > And again, it looks like you are overgeneralizing and require that all > the functions take a context. In general, the C language is procedural > and so is the FFmpeg API. The fact that we are assuming some > constructs which might mimic or resemble OOP does not mean that all > the API was designed in that way. > > > > > Finally, a general question - functions of the form "avfoo" seem like they > > are more consistent than "av_foo". Does the underscore mean anything? > > This is due to the fact that despite the effort, implementing a > consistent API is difficult, also due to different > reviewers/contributors picking different conventions. > > In some case we prefer av_ (most of libavutil) for the generic API, > while libavcodec/libavformat/libavfilter tend to use avLIB_, but there > might be exceptions. Given all of the above (and Zhao Zhili's useful examples), I think the document might be better like: Section 1: context is a good way to think about some code constructs. If you've ever used a callback function with an arbitrary callback argument, you've used a context. <quick example of a hypothetical callback function with a context variable>. Section 2: some projects explicitly use "context" as a metaphor. <comparison of curl and FFmpeg's md5 contexts>. This should be understood as a broad metaphor that different people use to mean slightly different things. Section 3: FFmpeg has developed various conventions around contexts. <discussion of a non-AVClass example that follows most patterns but breaks some others>. Section 4: user-facing contexts use AVClass+AVOptions. This is a common and highly visible special case, but not the inevitable endpoint of all contexts. <discussion of an AVClass-enabled object>.
On date Sunday 2024-05-05 22:04:36 +0100, Andrew Sayers wrote: > I'm still travelling, so the following thoughts might be a bit > half-formed. But I wanted to get some feedback before sitting down > for a proper think. [...] > > > I've also gone through the code looking for edge cases we haven't covered. > > > Here are some questions trying to prompt an "oh yeah I forgot to mention > > > that"-type answer. Anything where the answer is more like "that should > > > probably be rewritten to be clearer", let me know and I'll avoid confusing > > > newbies with it. > > > > > > > > av_ambient_viewing_environment_create_side_data() takes an AVFrame as its > > > first argument, and returns a new AVAmbientViewingEnvironment. What is the > > > context object for that function - AVFrame or AVAmbientViewingEnvironment? > > > > But this should be clear from the doxy: > > > > /** > > * Allocate and add an AVAmbientViewingEnvironment structure to an existing > > * AVFrame as side data. > > * > > * @return the newly allocated struct, or NULL on failure > > */ > > AVAmbientViewingEnvironment *av_ambient_viewing_environment_create_side_data(AVFrame *frame); > > I'm afraid it's not clear, at least to me. I think you're saying the > AVFrame is the context because a "create" function can't have a > context any more than a C++ "new" can be called as a member. But the > function's prefix points to the other conclusion, and neither signal > is clear enough on its own. No, what I'm saying is that in some cases you don't need to think in terms of contexts, in this case there is no context at all, the function takes a frame and modify it, and returns the ambient environment to be used by the following functions. This should be very clear by reading the doxy. There is no rule dictating the first param of each FFmpeg function should be a "context". > > My current thinking is to propose separate patches renaming arguments > to `ctx` whenever I find functions I can't parse. That's not as good > as a simple rule like "the first argument is always the context", but > better than adding a paragraph or two about how to read the docs. There cannot be such rule, because it would be false in many cases. > > Also, you are assuming that all the function should have a > > context. That's not the case, as you don't always need to keep track > > of a "context" when performing operations. > [...] > > > av_channel_description_bprint() takes a `struct AVBPrint *` as its first > > > argument, then `enum AVChannel`. Is the context AVBPrint, AVChannel, > > > or both? Does it make sense for a function to have two contexts? > > > > Again, this should be clear from the doxy: > > /** > > * Get a human readable string describing a given channel. > > * > > * @param buf pre-allocated buffer where to put the generated string > > * @param buf_size size in bytes of the buffer. > > * @param channel the AVChannel whose description to get > > * @return amount of bytes needed to hold the output string, or a negative AVERROR > > * on failure. If the returned value is bigger than buf_size, then the > > * string was truncated. > > */ > > int av_channel_description(char *buf, size_t buf_size, enum AVChannel channel); > > > > /** > > * bprint variant of av_channel_description(). > > * > > * @note the string will be appended to the bprint buffer. > > */ > > void av_channel_description_bprint(struct AVBPrint *bp, enum AVChannel channel_id); > > I think you're saying that I should look at which word appears more > often in the doxy ("channel") rather than which word appears first in > the argument list ("buf")? As above, the solution might be to rename > the variable in a separate patch rather than teach people another > special case. This is more about the semantics described in English language by the doxy (which is normative). Again, thinking in terms of "contexts" is misleading in this case. In this case you have two functions, av_channel_description writing a string to a buffer with fixed size, the second modifying an "AVBPrint" struct, which is a high-level buffer providing more flexibility (and "bprint" is used as a verb in the doxy, which might be misleading). Note that both signatures are mimicing the standard C library convention: memcpy(dst, dst_size, src) which in turn is a mnemonics for: dst = src meaning that we are copying data from src to dst. You might think that in fact you are operating on a context (the dst buffer or the AVBPrint struct), but you don't need to introduce the concept of context for these simple functions.
On Wed, May 22, 2024 at 12:37:37PM +0200, Stefano Sabatini wrote: > On date Sunday 2024-05-05 22:04:36 +0100, Andrew Sayers wrote: > > I'm still travelling, so the following thoughts might be a bit > > half-formed. But I wanted to get some feedback before sitting down > > for a proper think. > [...] > > > > I've also gone through the code looking for edge cases we haven't covered. > > > > Here are some questions trying to prompt an "oh yeah I forgot to mention > > > > that"-type answer. Anything where the answer is more like "that should > > > > probably be rewritten to be clearer", let me know and I'll avoid confusing > > > > newbies with it. > > > > > > > > > > > av_ambient_viewing_environment_create_side_data() takes an AVFrame as its > > > > first argument, and returns a new AVAmbientViewingEnvironment. What is the > > > > context object for that function - AVFrame or AVAmbientViewingEnvironment? > > > > > > But this should be clear from the doxy: > > > > > > /** > > > * Allocate and add an AVAmbientViewingEnvironment structure to an existing > > > * AVFrame as side data. > > > * > > > * @return the newly allocated struct, or NULL on failure > > > */ > > > AVAmbientViewingEnvironment *av_ambient_viewing_environment_create_side_data(AVFrame *frame); > > > > I'm afraid it's not clear, at least to me. I think you're saying the > > AVFrame is the context because a "create" function can't have a > > context any more than a C++ "new" can be called as a member. But the > > function's prefix points to the other conclusion, and neither signal > > is clear enough on its own. > > No, what I'm saying is that in some cases you don't need to think in > terms of contexts, in this case there is no context at all, the > function takes a frame and modify it, and returns the ambient > environment to be used by the following functions. This should be very > clear by reading the doxy. There is no rule dictating the first param > of each FFmpeg function should be a "context". I'm still writing up a reply to your longer feedback, but on this topic... This function is in the "av_ambient_viewing_environment" namespace, and returns an object that is clearly used as a context for other functions. So saying "stop thinking about contexts" just leaves a negative space and a bad thing to fill it with (confusion in my case). I've found it useful to think about "receiving" vs. "producing" a context: * avcodec_alloc_context3() produces a context, but does not receive one * sws_init_context() receives a context, but does not produce one * av_ambient_viewing_environment_create_side_data() receives one context, and produces another How about if the document mostly talks about functions as having contexts, then follows it up with something like: There are some edge cases where this doesn't work. <examples>. If you find contexts a useful metaphor in these cases, you might prefer to think about them as "receiving" and "producing" contexts. ... or something similar that acknowledges contexts are unnecessary here, but provides a solution for people that want to use them anyway.
On date Wednesday 2024-05-22 13:47:36 +0100, Andrew Sayers wrote: > On Wed, May 22, 2024 at 12:37:37PM +0200, Stefano Sabatini wrote: > > On date Sunday 2024-05-05 22:04:36 +0100, Andrew Sayers wrote: > > > I'm still travelling, so the following thoughts might be a bit > > > half-formed. But I wanted to get some feedback before sitting down > > > for a proper think. > > [...] > > > > > I've also gone through the code looking for edge cases we haven't covered. > > > > > Here are some questions trying to prompt an "oh yeah I forgot to mention > > > > > that"-type answer. Anything where the answer is more like "that should > > > > > probably be rewritten to be clearer", let me know and I'll avoid confusing > > > > > newbies with it. > > > > > > > > > > > > > > av_ambient_viewing_environment_create_side_data() takes an AVFrame as its > > > > > first argument, and returns a new AVAmbientViewingEnvironment. What is the > > > > > context object for that function - AVFrame or AVAmbientViewingEnvironment? > > > > > > > > But this should be clear from the doxy: > > > > > > > > /** > > > > * Allocate and add an AVAmbientViewingEnvironment structure to an existing > > > > * AVFrame as side data. > > > > * > > > > * @return the newly allocated struct, or NULL on failure > > > > */ > > > > AVAmbientViewingEnvironment *av_ambient_viewing_environment_create_side_data(AVFrame *frame); > > > > > > I'm afraid it's not clear, at least to me. I think you're saying the > > > AVFrame is the context because a "create" function can't have a > > > context any more than a C++ "new" can be called as a member. But the > > > function's prefix points to the other conclusion, and neither signal > > > is clear enough on its own. > > > > No, what I'm saying is that in some cases you don't need to think in > > terms of contexts, in this case there is no context at all, the > > function takes a frame and modify it, and returns the ambient > > environment to be used by the following functions. This should be very > > clear by reading the doxy. There is no rule dictating the first param > > of each FFmpeg function should be a "context". > > I'm still writing up a reply to your longer feedback, but on this topic... > > This function is in the "av_ambient_viewing_environment" namespace, and returns > an object that is clearly used as a context for other functions. So saying > "stop thinking about contexts" just leaves a negative space and a bad thing > to fill it with (confusion in my case). >av_ambient_viewing_environment_create_side_data() takes an AVFrame as its >first argument, and returns a new AVAmbientViewingEnvironment. What is the >context object for that function - AVFrame or AVAmbientViewingEnvironment? av_ambient_viewing_environment_ defines the namespace. In this case we are not using the av_frame_ API since we want to have all the ambient API defined in a single place. Alternatively we might have extended the av_frame API (e.g. av_frame_create_ambient_viewing_environment_side_data()), but we wanted to avoid to reference the ambient API in the minimal frame API, so this approach makes perfect sense to me and I cannot see major inconsistencies. You can think about it in terms of a static or class constructor function, where AVFrame is simply an argument of the contructor. What is the approach that you would have expected in this case? > I've found it useful to think about "receiving" vs. "producing" a context: > > * avcodec_alloc_context3() produces a context, but does not receive one > * sws_init_context() receives a context, but does not produce one > * av_ambient_viewing_environment_create_side_data() receives one context, > and produces another > > How about if the document mostly talks about functions as having contexts, > then follows it up with something like: > > There are some edge cases where this doesn't work. <examples>. > If you find contexts a useful metaphor in these cases, you might > prefer to think about them as "receiving" and "producing" contexts. > > ... or something similar that acknowledges contexts are unnecessary here, > but provides a solution for people that want to use them anyway. I see, but I still see an excessive use of the context concept in place of the simpler and more natural use of parameters (in this case AVFrame is simply the input element), and this scheme is pretty common in C APIs (note that ambient is a simple structure, no need to use AVClass or options for this simple struct), so I see no need to tag this as an edge case.
diff --git a/doc/context.md b/doc/context.md new file mode 100644 index 0000000000..73caacf54f --- /dev/null +++ b/doc/context.md @@ -0,0 +1,276 @@ +# Context-oriented programming + +Like many C projects, FFmpeg has adopted the subset of object-oriented techniques +that help solve its problems. Object-like structures are called "contexts", +and this document provides a general introduction to how they work. + +Object-oriented programming usually focusses on *access* as a primary concern. +For example, members of an object are visibly designated "private", "constant" +etc. to control how they are accessed. *Reflection* is a secondary concern, +where it is provided at all. For example, C++ has no built-in way to get a +string containing the name of a variable. + +Reflection is extremely important for FFmpeg, because user-facing options are +implemented by reflecting state at runtime. Limiting access is a secondary +concern, mainly important for ensuring implementation details can change +between versions. + +An object-oriented programmer learning FFmpeg should be careful not to fixate on +FFmpeg's access control features, nor to overlook its reflection capabilities. +Both are present, but have been given the level of focus appropriate for the task. + +## Example: modify text then print it + +The example below shows a context structure that receives input strings, +modifies them in some context-dependant way, then prints them to a specified +filehandle. + +```c +/** + * Type information, accessible at runtime. + * + * Useful when configuring objects. + */ +enum ModifyThenPrintDialect { + MODIFY_THEN_PRINT_PLAIN_TEXT = 0, + MODIFY_THEN_PRINT_REGEX = 1, + MODIFY_THEN_PRINT_REGEX_PCRE = 2 +}; + +/** + * User-facing information about types + * + * Useful for describing contexts to the user. + */ +static const char* ModifyThenPrintDialectName[] = { + "plain text", + "regular expression", + "Perl-compatible regular expression" +}; + +/** + * Context for functions that modify strings before printing them + */ +struct ModifyThenPrintContext { + + /** + * Information about the type of this particular instance + * + * Object-oriented programs would probably represent this example + * as a sub-class, but some instance-specific functionality + * behaves more like a mixin. + */ + enum ModifyThenPrintDialect dialect; + + /** + * Internal context + * + * Object-oriented programs would put private members in here, + * but could also use it to store a virtual function table + * or other "invisible" information. + */ + void* priv_data; + + /** + * User-configurable options + * + * Best set through an API, but can be set directly if necessary + * + * Data from users needs to be validated before it's set, and the API + * might e.g. want to update some internal buffer when these change. + * Setting this directly is always less robust than using an API, + * but might be worthwhile if you're willing to spend the time checking + * for edge cases, and don't mind your code misbehaving if future + * versions change the API but not the structure. + * + * Object-oriented programs would likely make these protected members, + * initialised in a constructor and accessed with getters and setters. + * Making them user-configurable would be left to the programmer. + */ + char *replace_this; + char *with_this; + + /** + * Programmer-configurable variable + * + * Object-oriented programs would represent this as a public member. + */ + FILE *out; + +}; + +/** + * Allocate and initialize a ModifyThenPrintContext + * + * This creates a new pointer, then fills in some sensible defaults. + * + * We can reasonably assume this function will initialise `priv_data` + * with a dialect-specific object, but shouldn't make any assumptions + * about what that object is. + * + * Object-oriented programs would likely represent this as an + * allocator function and a constructor. + */ +int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx, + enum ModifyThenPrintDialect dialect, + FILE *out); + +/** + * Uninitialize and deallocate a ModifyThenPrintContext + * + * This does any work required by the internal context (e.g. deallocating + * `priv_data`), then deallocates the main context itself. + * + * Object-oriented programs would likely represent this as a + * destructor and a deallocator. + */ +int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx); + +/** + * Configure a ModifyThenPrintContext + * + * This checks the arguments are valid in the context's dialect, + * then updates the options as necessary + * + * Object-oriented programs would likely represent this as a + * collection of setter functions. + */ +int ModifyThenPrintContext_configure(struct ModifyThenPrintContext *ctx, + char* replace_this, + char* with_this); + +/** + * Print the contents of a ModifyThenPrintContext to a filehandle + * + * Provides human-readable information about keys and values. + * + * Object-oriented programs would likely represent this with some kind of + * `serialise` operation. + */ +int ModifyThenPrintContext_dump(struct ModifyThenPrintContext **ctx, + FILE *dump_fh); + +/** + * Print a single message + * + * Object-oriented programs would likely represent this with a member function. + */ +int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx, + char *msg); + +/** + * How this context might be used in practice + */ +int print_hello_world() +{ + + int ret = 0; + + struct ModifyThenPrintContext *ctx; + + if ( ModifyThenPrintContext_alloc_context( &ctx, MODIFY_THEN_PRINT_REGEX, stdout ) < 0 ) { + ret = -1; + goto EXIT_WITHOUT_CLEANUP; + } + + if ( ModifyThenPrintContext_configure(ctx, "Hi|Hullo", "Hello") < 0 ) { + ret = -1; + goto FINISH; + } + + if ( ModifyThenPrintContext_print( ctx, "Hi, world!\n" ) < 0 ) { + ret = -1; + goto FINISH; + } + + if ( ModifyThenPrintContext_print( ctx, "Hullo, world!\n" ) < 0 ) { + ret = -1; + goto FINISH; + } + + FINISH: + if ( ModifyThenPrintContext_free( ctx ) ) { + ret = -1; + goto EXIT_WITHOUT_CLEANUP; + } + + EXIT_WITHOUT_CLEANUP: + return ret; + +} +``` + +## FFmpeg context structures + +The example above is a generic example of a context structure and its +associated functions. Some FFmpeg contexts are no more complex than +that example, just as some objects are just key/value stores with some +functions attached. But here are some examples to show the variety of +contexts available in FFmpeg. + +AVHashContext presents a generic API for hashing data. @ref hash.h +"Its associated functions" show how to create, use and destroy a hash. +The exact algorithm is specified by passing a string to av_hash_alloc(), +and the list of strings can be retrieved from av_hash_names(). +Algorithm-specific internal context is stored in AVHashContext::ctx. + +AVCodecContext is a complex member representing data about an encoding or +decoding session. @ref avcodec.h "Its associated functions" provide an +abstract interface to encode and decode data. Like most widely-used contexts, +its first member is an AVClass pointer containing instance-specific information. +That means it's an *AVClass context structure* (discussed in more detail below). + +X264Context contains the internal context for an AVCodecContext that uses +the x264 library. @ref libx264.c "Its associated functions" provide a concrete +implementation for interacting with libx264. This class is not directly +accessible through FFmpeg's public interface, so unlike AVCodecContext it can +change significantly between releases. + +## Reflection with AVClass and AVOptions + +An *AVClass context structure* is a context whose first member is an AVClass +that reflects its contents. An *@ref avoptions "AVOptions"-enabled struct* +is a structure which reflects its contents using the @ref avoptions "AVOptions" +API. These terms have become synonymous in modern usage, but you might notice +some distinction when looking at code written after AVClass was developed but +before @ref avoptions "AVOptions" was added. + +At first glance, an object-oriented programmer might be tempted to look at AVClass +as a base class that contexts inherit from. But unlike a base class, an AVClass +doesn't imply any theoretical relationship between objects, and contexts of the +same type will often have different AVClass values. It's even theoretically possible +for a single AVClass to be shared between contexts of different types. A better +analogy would be to [C++ concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B)) +or [Java interfaces](https://en.wikipedia.org/wiki/Interface_(Java)). + +To understand how AVClass and @ref avoptions "AVOptions" work, +consider the requirements for a `libx264` encoder: + +- it has to support common encoder options like "bitrate" +- it has to support encoder-specific options like "profile" + - the exact options could change quickly if a legal ruling forces a change of backend +- it has to provide useful feedback to users about unsupported options + +Common encoder options like "bitrate" need to be stored in AVCodecContext itself +to avoid duplicating code, while encoder-specific options like "profile" have to +be stored in the X264Context instance stored in AVCodecContext::priv_data. +But both need to be presented to users (along with help text, default values etc.) +in a way that makes it easy to build user interfaces to get and set those options. + +A context's AVClass member contains a list of AVOption objects, describing +the user-configurable members of the class. It also contains functions to +iterate through a tree of AVClass objects representing internal context either +for the current object or for any class that could potentially exist in the +current version of FFmpeg. + +The @ref avoptions "AVOptions" API accepts any AVClass context structure, +looks through its AVOption data, and uses that to examine, introspect, and modify +the structure. Because the API doesn't contain any type-specific information, +it can be used to create a general user interface that adapts automatically +when e.g. a new version of FFmpeg adds a new configuration option. + +## Summary + +FFmpeg uses "contexts" in ways that are often resemble object-oriented programming. +But it focuses less on encapsulation within arbitrarily complex systems, +and more on providing reflectivity to make pleasant user interfaces.