diff mbox series

[FFmpeg-devel,v3,1/3] doc: Explain what "context" means

Message ID 20240422155836.385333-2-ffmpeg-devel@pileofstuff.org
State New
Headers show
Series all: Link to "context" from all contexts with documentation | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Andrew Sayers April 22, 2024, 3:56 p.m. UTC
Derived from detailed explanations kindly provided by Stefano Sabatini:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
---
 doc/context.md | 276 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 276 insertions(+)
 create mode 100644 doc/context.md

Comments

Stefano Sabatini April 22, 2024, 5:05 p.m. UTC | #1
On date Monday 2024-04-22 16:56:48 +0100, Andrew Sayers wrote:
> Derived from detailed explanations kindly provided by Stefano Sabatini:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
> ---
>  doc/context.md | 276 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 276 insertions(+)
>  create mode 100644 doc/context.md
> 
> diff --git a/doc/context.md b/doc/context.md
> new file mode 100644
> index 0000000000..73caacf54f
> --- /dev/null
> +++ b/doc/context.md
> @@ -0,0 +1,276 @@
> +# Context-oriented programming
> +
> +Like many C projects, FFmpeg has adopted the subset of object-oriented techniques
> +that help solve its problems.  Object-like structures are called "contexts",
> +and this document provides a general introduction to how they work.
> +
> +Object-oriented programming usually focusses on *access* as a primary concern.
> +For example, members of an object are visibly designated "private", "constant"
> +etc. to control how they are accessed.  *Reflection* is a secondary concern,
> +where it is provided at all.  For example, C++ has no built-in way to get a
> +string containing the name of a variable.
> +
> +Reflection is extremely important for FFmpeg, because user-facing options are
> +implemented by reflecting state at runtime. Limiting access is a secondary
> +concern, mainly important for ensuring implementation details can change
> +between versions.
> +
> +An object-oriented programmer learning FFmpeg should be careful not to fixate on
> +FFmpeg's access control features, nor to overlook its reflection capabilities.
> +Both are present, but have been given the level of focus appropriate for the task.
> +
> +## Example: modify text then print it
> +
> +The example below shows a context structure that receives input strings,
> +modifies them in some context-dependant way, then prints them to a specified
> +filehandle.
> +
> +```c
> +/**
> + * Type information, accessible at runtime.
> + *
> + * Useful when configuring objects.
> + */
> +enum ModifyThenPrintDialect {
> +    MODIFY_THEN_PRINT_PLAIN_TEXT = 0,
> +    MODIFY_THEN_PRINT_REGEX      = 1,
> +    MODIFY_THEN_PRINT_REGEX_PCRE = 2
> +};
> +
> +/**
> + * User-facing information about types
> + *
> + * Useful for describing contexts to the user.
> + */
> +static const char* ModifyThenPrintDialectName[] = {
> +    "plain text",
> +    "regular expression",
> +    "Perl-compatible regular expression"
> +};
> +
> +/**
> + * Context for functions that modify strings before printing them
> + */
> +struct ModifyThenPrintContext {
> +
> +    /**
> +     * Information about the type of this particular instance
> +     *
> +     * Object-oriented programs would probably represent this example
> +     * as a sub-class, but some instance-specific functionality
> +     * behaves more like a mixin.
> +     */
> +    enum ModifyThenPrintDialect dialect;
> +
> +    /**
> +     * Internal context
> +     *
> +     * Object-oriented programs would put private members in here,
> +     * but could also use it to store a virtual function table
> +     * or other "invisible" information.
> +     */

> +    void* priv_data;

style:
type *name;

here and below


> +
> +    /**
> +     * User-configurable options
> +     *
> +     * Best set through an API, but can be set directly if necessary
> +     *
> +     * Data from users needs to be validated before it's set, and the API
> +     * might e.g. want to update some internal buffer when these change.
> +     * Setting this directly is always less robust than using an API,
> +     * but might be worthwhile if you're willing to spend the time checking
> +     * for edge cases, and don't mind your code misbehaving if future
> +     * versions change the API but not the structure.
> +     *
> +     * Object-oriented programs would likely make these protected members,
> +     * initialised in a constructor and accessed with getters and setters.
> +     * Making them user-configurable would be left to the programmer.
> +     */
> +    char *replace_this;
> +    char *with_this;
> +
> +    /**
> +     * Programmer-configurable variable
> +     *
> +     * Object-oriented programs would represent this as a public member.
> +     */
> +    FILE *out;
> +
> +};
> +
> +/**
> + * Allocate and initialize a ModifyThenPrintContext
> + *
> + * This creates a new pointer, then fills in some sensible defaults.
> + *
> + * We can reasonably assume this function will initialise `priv_data`
> + * with a dialect-specific object, but shouldn't make any assumptions
> + * about what that object is.
> + *
> + * Object-oriented programs would likely represent this as an
> + * allocator function and a constructor.
> + */
> +int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx,
> +                                         enum ModifyThenPrintDialect dialect,
> +                                         FILE *out);
> +
> +/**
> + * Uninitialize and deallocate a ModifyThenPrintContext
> + *
> + * This does any work required by the internal context (e.g. deallocating
> + * `priv_data`), then deallocates the main context itself.
> + *
> + * Object-oriented programs would likely represent this as a
> + * destructor and a deallocator.
> + */
> +int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx);
> +
> +/**
> + * Configure a ModifyThenPrintContext
> + *
> + * This checks the arguments are valid in the context's dialect,
> + * then updates the options as necessary
> + *
> + * Object-oriented programs would likely represent this as a
> + * collection of setter functions.
> + */
> +int ModifyThenPrintContext_configure(struct ModifyThenPrintContext *ctx,
> +                                     char* replace_this,
> +                                     char* with_this);
> +
> +/**
> + * Print the contents of a ModifyThenPrintContext to a filehandle
> + *
> + * Provides human-readable information about keys and values.
> + *
> + * Object-oriented programs would likely represent this with some kind of
> + * `serialise` operation.
> + */
> +int ModifyThenPrintContext_dump(struct ModifyThenPrintContext **ctx,
> +                                FILE *dump_fh);
> +
> +/**
> + * Print a single message
> + *
> + * Object-oriented programs would likely represent this with a member function.
> + */
> +int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx,
> +                                 char *msg);
> +
> +/**
> + * How this context might be used in practice
> + */
> +int print_hello_world()
> +{
> +
> +    int ret = 0;
> +
> +    struct ModifyThenPrintContext *ctx;
> +
> +    if ( ModifyThenPrintContext_alloc_context( &ctx, MODIFY_THEN_PRINT_REGEX, stdout ) < 0 ) {
> +        ret = -1;
> +        goto EXIT_WITHOUT_CLEANUP;
> +    }
> +
> +    if ( ModifyThenPrintContext_configure(ctx, "Hi|Hullo", "Hello") < 0 ) {
> +        ret = -1;
> +        goto FINISH;
> +    }
> +
> +    if ( ModifyThenPrintContext_print( ctx, "Hi, world!\n" ) < 0 ) {
> +        ret = -1;
> +        goto FINISH;
> +    }
> +
> +    if ( ModifyThenPrintContext_print( ctx, "Hullo, world!\n" ) < 0 ) {
> +        ret = -1;
> +        goto FINISH;
> +    }
> +
> +    FINISH:
> +    if ( ModifyThenPrintContext_free( ctx ) ) {
> +        ret = -1;
> +        goto EXIT_WITHOUT_CLEANUP;
> +    }
> +
> +    EXIT_WITHOUT_CLEANUP:
> +    return ret;
> +
> +}
> +```

I don't have a strong opinion, but I'd probably focus on providing a
typical example of a common API (check doc/examples). Also I see here
there is a strong focus on OOP, this might be counter-productive in
case the reader is not familiar with OOP terminology.

OTOH the content might be useful for readers coming from an OOP
background and terminology. I wonder if this content might be isolated
in a dedicated section, so that non-OOP readers can simply skip it.

But this is not a strong objection, and can be possibly reworked in a
later step.

> +
> +## FFmpeg context structures
> +
> +The example above is a generic example of a context structure and its
> +associated functions.  Some FFmpeg contexts are no more complex than
> +that example, just as some objects are just key/value stores with some
> +functions attached.  But here are some examples to show the variety of
> +contexts available in FFmpeg.
> +
> +AVHashContext presents a generic API for hashing data.  @ref hash.h
> +"Its associated functions" show how to create, use and destroy a hash.
> +The exact algorithm is specified by passing a string to av_hash_alloc(),
> +and the list of strings can be retrieved from av_hash_names().
> +Algorithm-specific internal context is stored in AVHashContext::ctx.
> +
> +AVCodecContext is a complex member representing data about an encoding or
> +decoding session.  @ref avcodec.h "Its associated functions" provide an
> +abstract interface to encode and decode data.  Like most widely-used contexts,
> +its first member is an AVClass pointer containing instance-specific information.
> +That means it's an *AVClass context structure* (discussed in more detail below).
> +
> +X264Context contains the internal context for an AVCodecContext that uses
> +the x264 library.  @ref libx264.c "Its associated functions" provide a concrete

> +implementation for interacting with libx264.  This class is not directly
> +accessible through FFmpeg's public interface, so unlike AVCodecContext it can
> +change significantly between releases.

Note: still, we are not supposed to break compatibilty, for example by
changing the private option names, or this will break both scripts
using the ff* tools and applications using the libav* libraries, since
they refer to those option names. In other words, the options names
provide the public API to interact with a specific codec.

> +## Reflection with AVClass and AVOptions
> +
> +An *AVClass context structure* is a context whose first member is an AVClass
> +that reflects its contents.  An *@ref avoptions "AVOptions"-enabled struct*
> +is a structure which reflects its contents using the @ref avoptions "AVOptions"
> +API.

"reflects its contents" is a bit too strong, we want only to be able
to set and query options on such structures. In some cases the
AVOptions API is abused also to get "properties" from the structures
(see for example the "location" option in libavformation/http.c). But
the whole content in general is not available through AVOptions, only
a specific subset which is supposed to configure the structure.

> These terms have become synonymous in modern usage, but you might notice
> +some distinction when looking at code written after AVClass was developed but
> +before @ref avoptions "AVOptions" was added.
> +
> +At first glance, an object-oriented programmer might be tempted to look at AVClass
> +as a base class that contexts inherit from.  But unlike a base class, an AVClass
> +doesn't imply any theoretical relationship between objects, and contexts of the
> +same type will often have different AVClass values.  It's even theoretically possible
> +for a single AVClass to be shared between contexts of different types.  A better
> +analogy would be to [C++ concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B))
> +or [Java interfaces](https://en.wikipedia.org/wiki/Interface_(Java)).
> +
> +To understand how AVClass and @ref avoptions "AVOptions" work,
> +consider the requirements for a `libx264` encoder:
> +
> +- it has to support common encoder options like "bitrate"

> +- it has to support encoder-specific options like "profile"

> +  - the exact options could change quickly if a legal ruling forces a change of backend

but this will be a compatiblity break for most applications and
scripts (you will notice in this case because people will complain
aloud about the breakage), so we need to be careful when changing
options following the usual deprecation dance

> +- it has to provide useful feedback to users about unsupported options
> +
> +Common encoder options like "bitrate" need to be stored in AVCodecContext itself
> +to avoid duplicating code, while encoder-specific options like "profile" have to
> +be stored in the X264Context instance stored in AVCodecContext::priv_data.

> +But both need to be presented to users (along with help text, default values etc.)
> +in a way that makes it easy to build user interfaces to get and set those options.

nit: s/easy/possible/

> +
> +A context's AVClass member contains a list of AVOption objects, describing
> +the user-configurable members of the class.  It also contains functions to
> +iterate through a tree of AVClass objects representing internal context either
> +for the current object or for any class that could potentially exist in the
> +current version of FFmpeg.
> +
> +The @ref avoptions "AVOptions" API accepts any AVClass context structure,
> +looks through its AVOption data, and uses that to examine, introspect, and modify
> +the structure.  Because the API doesn't contain any type-specific information,
> +it can be used to create a general user interface that adapts automatically
> +when e.g. a new version of FFmpeg adds a new configuration option.
> +
> +## Summary
> +

> +FFmpeg uses "contexts" in ways that are often resemble object-oriented programming.
> +But it focuses less on encapsulation within arbitrarily complex systems,
> +and more on providing reflectivity to make pleasant user interfaces.

nit++: s/pleasant//, here the point is to make them possible, we don't
know if they are pleasant :-)

[...]

Note: I'll be probably offline for one week or so.
Andrew Sayers April 29, 2024, 9:10 a.m. UTC | #2
On Mon, Apr 22, 2024 at 07:05:12PM +0200, Stefano Sabatini wrote:
> On date Monday 2024-04-22 16:56:48 +0100, Andrew Sayers wrote:
> > Derived from detailed explanations kindly provided by Stefano Sabatini:
> > https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
> > ---
> >  doc/context.md | 276 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 276 insertions(+)
> >  create mode 100644 doc/context.md
> > 
> > diff --git a/doc/context.md b/doc/context.md
> > new file mode 100644
> > index 0000000000..73caacf54f
> > --- /dev/null
> > +++ b/doc/context.md
> > @@ -0,0 +1,276 @@
> > +# Context-oriented programming
> > +
> > +Like many C projects, FFmpeg has adopted the subset of object-oriented techniques
> > +that help solve its problems.  Object-like structures are called "contexts",
> > +and this document provides a general introduction to how they work.
> > +
> > +Object-oriented programming usually focusses on *access* as a primary concern.
> > +For example, members of an object are visibly designated "private", "constant"
> > +etc. to control how they are accessed.  *Reflection* is a secondary concern,
> > +where it is provided at all.  For example, C++ has no built-in way to get a
> > +string containing the name of a variable.
> > +
> > +Reflection is extremely important for FFmpeg, because user-facing options are
> > +implemented by reflecting state at runtime. Limiting access is a secondary
> > +concern, mainly important for ensuring implementation details can change
> > +between versions.
> > +
> > +An object-oriented programmer learning FFmpeg should be careful not to fixate on
> > +FFmpeg's access control features, nor to overlook its reflection capabilities.
> > +Both are present, but have been given the level of focus appropriate for the task.
> > +
> > +## Example: modify text then print it
> > +
> > +The example below shows a context structure that receives input strings,
> > +modifies them in some context-dependant way, then prints them to a specified
> > +filehandle.
> > +
> > +```c
> > +/**
> > + * Type information, accessible at runtime.
> > + *
> > + * Useful when configuring objects.
> > + */
> > +enum ModifyThenPrintDialect {
> > +    MODIFY_THEN_PRINT_PLAIN_TEXT = 0,
> > +    MODIFY_THEN_PRINT_REGEX      = 1,
> > +    MODIFY_THEN_PRINT_REGEX_PCRE = 2
> > +};
> > +
> > +/**
> > + * User-facing information about types
> > + *
> > + * Useful for describing contexts to the user.
> > + */
> > +static const char* ModifyThenPrintDialectName[] = {
> > +    "plain text",
> > +    "regular expression",
> > +    "Perl-compatible regular expression"
> > +};
> > +
> > +/**
> > + * Context for functions that modify strings before printing them
> > + */
> > +struct ModifyThenPrintContext {
> > +
> > +    /**
> > +     * Information about the type of this particular instance
> > +     *
> > +     * Object-oriented programs would probably represent this example
> > +     * as a sub-class, but some instance-specific functionality
> > +     * behaves more like a mixin.
> > +     */
> > +    enum ModifyThenPrintDialect dialect;
> > +
> > +    /**
> > +     * Internal context
> > +     *
> > +     * Object-oriented programs would put private members in here,
> > +     * but could also use it to store a virtual function table
> > +     * or other "invisible" information.
> > +     */
> 
> > +    void* priv_data;
> 
> style:
> type *name;
> 
> here and below
> 
> 
> > +
> > +    /**
> > +     * User-configurable options
> > +     *
> > +     * Best set through an API, but can be set directly if necessary
> > +     *
> > +     * Data from users needs to be validated before it's set, and the API
> > +     * might e.g. want to update some internal buffer when these change.
> > +     * Setting this directly is always less robust than using an API,
> > +     * but might be worthwhile if you're willing to spend the time checking
> > +     * for edge cases, and don't mind your code misbehaving if future
> > +     * versions change the API but not the structure.
> > +     *
> > +     * Object-oriented programs would likely make these protected members,
> > +     * initialised in a constructor and accessed with getters and setters.
> > +     * Making them user-configurable would be left to the programmer.
> > +     */
> > +    char *replace_this;
> > +    char *with_this;
> > +
> > +    /**
> > +     * Programmer-configurable variable
> > +     *
> > +     * Object-oriented programs would represent this as a public member.
> > +     */
> > +    FILE *out;
> > +
> > +};
> > +
> > +/**
> > + * Allocate and initialize a ModifyThenPrintContext
> > + *
> > + * This creates a new pointer, then fills in some sensible defaults.
> > + *
> > + * We can reasonably assume this function will initialise `priv_data`
> > + * with a dialect-specific object, but shouldn't make any assumptions
> > + * about what that object is.
> > + *
> > + * Object-oriented programs would likely represent this as an
> > + * allocator function and a constructor.
> > + */
> > +int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx,
> > +                                         enum ModifyThenPrintDialect dialect,
> > +                                         FILE *out);
> > +
> > +/**
> > + * Uninitialize and deallocate a ModifyThenPrintContext
> > + *
> > + * This does any work required by the internal context (e.g. deallocating
> > + * `priv_data`), then deallocates the main context itself.
> > + *
> > + * Object-oriented programs would likely represent this as a
> > + * destructor and a deallocator.
> > + */
> > +int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx);
> > +
> > +/**
> > + * Configure a ModifyThenPrintContext
> > + *
> > + * This checks the arguments are valid in the context's dialect,
> > + * then updates the options as necessary
> > + *
> > + * Object-oriented programs would likely represent this as a
> > + * collection of setter functions.
> > + */
> > +int ModifyThenPrintContext_configure(struct ModifyThenPrintContext *ctx,
> > +                                     char* replace_this,
> > +                                     char* with_this);
> > +
> > +/**
> > + * Print the contents of a ModifyThenPrintContext to a filehandle
> > + *
> > + * Provides human-readable information about keys and values.
> > + *
> > + * Object-oriented programs would likely represent this with some kind of
> > + * `serialise` operation.
> > + */
> > +int ModifyThenPrintContext_dump(struct ModifyThenPrintContext **ctx,
> > +                                FILE *dump_fh);
> > +
> > +/**
> > + * Print a single message
> > + *
> > + * Object-oriented programs would likely represent this with a member function.
> > + */
> > +int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx,
> > +                                 char *msg);
> > +
> > +/**
> > + * How this context might be used in practice
> > + */
> > +int print_hello_world()
> > +{
> > +
> > +    int ret = 0;
> > +
> > +    struct ModifyThenPrintContext *ctx;
> > +
> > +    if ( ModifyThenPrintContext_alloc_context( &ctx, MODIFY_THEN_PRINT_REGEX, stdout ) < 0 ) {
> > +        ret = -1;
> > +        goto EXIT_WITHOUT_CLEANUP;
> > +    }
> > +
> > +    if ( ModifyThenPrintContext_configure(ctx, "Hi|Hullo", "Hello") < 0 ) {
> > +        ret = -1;
> > +        goto FINISH;
> > +    }
> > +
> > +    if ( ModifyThenPrintContext_print( ctx, "Hi, world!\n" ) < 0 ) {
> > +        ret = -1;
> > +        goto FINISH;
> > +    }
> > +
> > +    if ( ModifyThenPrintContext_print( ctx, "Hullo, world!\n" ) < 0 ) {
> > +        ret = -1;
> > +        goto FINISH;
> > +    }
> > +
> > +    FINISH:
> > +    if ( ModifyThenPrintContext_free( ctx ) ) {
> > +        ret = -1;
> > +        goto EXIT_WITHOUT_CLEANUP;
> > +    }
> > +
> > +    EXIT_WITHOUT_CLEANUP:
> > +    return ret;
> > +
> > +}
> > +```
> 
> I don't have a strong opinion, but I'd probably focus on providing a
> typical example of a common API (check doc/examples). Also I see here
> there is a strong focus on OOP, this might be counter-productive in
> case the reader is not familiar with OOP terminology.
> 
> OTOH the content might be useful for readers coming from an OOP
> background and terminology. I wonder if this content might be isolated
> in a dedicated section, so that non-OOP readers can simply skip it.
> 
> But this is not a strong objection, and can be possibly reworked in a
> later step.
> 

This is really a document for FFmpeg newbies, so we need to assume as
little prior knowledge as possible.  After a few days to think it
over, I think we should avoid assuming...

Knowledge of object-oriented programming.  For example, this should be
useful to a research mathematician with a project that involves codec
algorithms.  So the next draft should feel less like "FFmpeg for OOP
devs" and more like "FFmpeg for newbies (with some optional OOP
background reading)".

Knowing that programming doesn't *have* to be object-oriented.
OOP has become so ubiquitous nowadays, there are plenty of programmers
who will insist everything is OOP if you just speak loudly and slowly.
This is a harder problem in some ways, because someone who doesn't
understand can always re-read until they do, while someone who jumps
to the wrong conclusion will just keep reading and try to make things
fit their assumption (e.g. my earlier messages in this thread!).
So the "how it differs from OOP" stuff needs to stay fairly prominent.

Knowing anything about FFmpeg (or multimedia in general).  I like the
idea of tweaking `doc/examples` to better introduce FFmpeg
fundamentals, but explaining "context" is a steep enough learning
curve on its own - introducing concepts like "stream" and "codec" at
the same time seems like too much.

> > +
> > +## FFmpeg context structures
> > +
> > +The example above is a generic example of a context structure and its
> > +associated functions.  Some FFmpeg contexts are no more complex than
> > +that example, just as some objects are just key/value stores with some
> > +functions attached.  But here are some examples to show the variety of
> > +contexts available in FFmpeg.
> > +
> > +AVHashContext presents a generic API for hashing data.  @ref hash.h
> > +"Its associated functions" show how to create, use and destroy a hash.
> > +The exact algorithm is specified by passing a string to av_hash_alloc(),
> > +and the list of strings can be retrieved from av_hash_names().
> > +Algorithm-specific internal context is stored in AVHashContext::ctx.
> > +
> > +AVCodecContext is a complex member representing data about an encoding or
> > +decoding session.  @ref avcodec.h "Its associated functions" provide an
> > +abstract interface to encode and decode data.  Like most widely-used contexts,
> > +its first member is an AVClass pointer containing instance-specific information.
> > +That means it's an *AVClass context structure* (discussed in more detail below).
> > +
> > +X264Context contains the internal context for an AVCodecContext that uses
> > +the x264 library.  @ref libx264.c "Its associated functions" provide a concrete
> 
> > +implementation for interacting with libx264.  This class is not directly
> > +accessible through FFmpeg's public interface, so unlike AVCodecContext it can
> > +change significantly between releases.
> 
> Note: still, we are not supposed to break compatibilty, for example by
> changing the private option names, or this will break both scripts
> using the ff* tools and applications using the libav* libraries, since
> they refer to those option names. In other words, the options names
> provide the public API to interact with a specific codec.
> 
> > +## Reflection with AVClass and AVOptions
> > +
> > +An *AVClass context structure* is a context whose first member is an AVClass
> > +that reflects its contents.  An *@ref avoptions "AVOptions"-enabled struct*
> > +is a structure which reflects its contents using the @ref avoptions "AVOptions"
> > +API.
> 
> "reflects its contents" is a bit too strong, we want only to be able
> to set and query options on such structures. In some cases the
> AVOptions API is abused also to get "properties" from the structures
> (see for example the "location" option in libavformation/http.c). But
> the whole content in general is not available through AVOptions, only
> a specific subset which is supposed to configure the structure.
> 
> > These terms have become synonymous in modern usage, but you might notice
> > +some distinction when looking at code written after AVClass was developed but
> > +before @ref avoptions "AVOptions" was added.
> > +
> > +At first glance, an object-oriented programmer might be tempted to look at AVClass
> > +as a base class that contexts inherit from.  But unlike a base class, an AVClass
> > +doesn't imply any theoretical relationship between objects, and contexts of the
> > +same type will often have different AVClass values.  It's even theoretically possible
> > +for a single AVClass to be shared between contexts of different types.  A better
> > +analogy would be to [C++ concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B))
> > +or [Java interfaces](https://en.wikipedia.org/wiki/Interface_(Java)).
> > +
> > +To understand how AVClass and @ref avoptions "AVOptions" work,
> > +consider the requirements for a `libx264` encoder:
> > +
> > +- it has to support common encoder options like "bitrate"
> 
> > +- it has to support encoder-specific options like "profile"
> 
> > +  - the exact options could change quickly if a legal ruling forces a change of backend
> 
> but this will be a compatiblity break for most applications and
> scripts (you will notice in this case because people will complain
> aloud about the breakage), so we need to be careful when changing
> options following the usual deprecation dance
> 
> > +- it has to provide useful feedback to users about unsupported options
> > +
> > +Common encoder options like "bitrate" need to be stored in AVCodecContext itself
> > +to avoid duplicating code, while encoder-specific options like "profile" have to
> > +be stored in the X264Context instance stored in AVCodecContext::priv_data.
> 
> > +But both need to be presented to users (along with help text, default values etc.)
> > +in a way that makes it easy to build user interfaces to get and set those options.
> 
> nit: s/easy/possible/
> 
> > +
> > +A context's AVClass member contains a list of AVOption objects, describing
> > +the user-configurable members of the class.  It also contains functions to
> > +iterate through a tree of AVClass objects representing internal context either
> > +for the current object or for any class that could potentially exist in the
> > +current version of FFmpeg.
> > +
> > +The @ref avoptions "AVOptions" API accepts any AVClass context structure,
> > +looks through its AVOption data, and uses that to examine, introspect, and modify
> > +the structure.  Because the API doesn't contain any type-specific information,
> > +it can be used to create a general user interface that adapts automatically
> > +when e.g. a new version of FFmpeg adds a new configuration option.
> > +
> > +## Summary
> > +
> 
> > +FFmpeg uses "contexts" in ways that are often resemble object-oriented programming.
> > +But it focuses less on encapsulation within arbitrarily complex systems,
> > +and more on providing reflectivity to make pleasant user interfaces.
> 
> nit++: s/pleasant//, here the point is to make them possible, we don't
> know if they are pleasant :-)

Comments above should be reflected in a patch I'll send after writing this.
The patch also adds more links from the code, because I noticed doxygen was
replacing all "Context for foo" comments with links to some private struct
that happened to be called "Context".

I've also gone through the code looking for edge cases we haven't covered.
Here are some questions trying to prompt an "oh yeah I forgot to mention
that"-type answer.  Anything where the answer is more like "that should
probably be rewritten to be clearer", let me know and I'll avoid confusing
newbies with it.

av_ambient_viewing_environment_create_side_data() takes an AVFrame as its
first argument, and returns a new AVAmbientViewingEnvironment.  What is the
context object for that function - AVFrame or AVAmbientViewingEnvironment?

av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an
`AVBitStreamFilter *` as its first argument, but I don't think you'd say
the argument provided "context" for the function.  So would I be right in
saying `AVBitStreamFilter *` is not a context, despite looking like one?

av_buffersink_*() all take a `const AVFilterContext *` argument.
What does the difference between av_buffersink prefix and AVFilter type mean?

av_channel_description_bprint() takes a `struct AVBPrint *` as its first
argument, then `enum AVChannel`.  Is the context AVBPrint, AVChannel,
or both?  Does it make sense for a function to have two contexts?

Related to the previous question, does `av_cmp_q()` count as a function
with two contexts?  Or no contexts?

Finally, a general question - functions of the form "avfoo" seem like they
are more consistent than "av_foo".  Does the underscore mean anything?
Andrew Sayers May 2, 2024, 10:03 a.m. UTC | #3
On Mon, Apr 29, 2024 at 10:10:35AM +0100, Andrew Sayers wrote:
> 
> I've also gone through the code looking for edge cases we haven't covered.
> Here are some questions trying to prompt an "oh yeah I forgot to mention
> that"-type answer.  Anything where the answer is more like "that should
> probably be rewritten to be clearer", let me know and I'll avoid confusing
> newbies with it.
> 
> av_ambient_viewing_environment_create_side_data() takes an AVFrame as its
> first argument, and returns a new AVAmbientViewingEnvironment.  What is the
> context object for that function - AVFrame or AVAmbientViewingEnvironment?
> 
> av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an
> `AVBitStreamFilter *` as its first argument, but I don't think you'd say
> the argument provided "context" for the function.  So would I be right in
> saying `AVBitStreamFilter *` is not a context, despite looking like one?
> 
> av_buffersink_*() all take a `const AVFilterContext *` argument.
> What does the difference between av_buffersink prefix and AVFilter type mean?
> 
> av_channel_description_bprint() takes a `struct AVBPrint *` as its first
> argument, then `enum AVChannel`.  Is the context AVBPrint, AVChannel,
> or both?  Does it make sense for a function to have two contexts?
> 
> Related to the previous question, does `av_cmp_q()` count as a function
> with two contexts?  Or no contexts?
> 
> Finally, a general question - functions of the form "avfoo" seem like they
> are more consistent than "av_foo".  Does the underscore mean anything?

One extra question I hadn't thought to ask before - when, if at all, would an
ordinary user be expected to access the contents of AVClass directly?  Or
AVOption for that matter?  For example, it's not clear to me how people are
supposed to use AVClass.category - is this an important use case we haven't
covered?  Either way, happy to propose an AVClass patch to make it clearer.

It's my turn to be off for a few days, so will pick up any responses next week.
diff mbox series

Patch

diff --git a/doc/context.md b/doc/context.md
new file mode 100644
index 0000000000..73caacf54f
--- /dev/null
+++ b/doc/context.md
@@ -0,0 +1,276 @@ 
+# Context-oriented programming
+
+Like many C projects, FFmpeg has adopted the subset of object-oriented techniques
+that help solve its problems.  Object-like structures are called "contexts",
+and this document provides a general introduction to how they work.
+
+Object-oriented programming usually focusses on *access* as a primary concern.
+For example, members of an object are visibly designated "private", "constant"
+etc. to control how they are accessed.  *Reflection* is a secondary concern,
+where it is provided at all.  For example, C++ has no built-in way to get a
+string containing the name of a variable.
+
+Reflection is extremely important for FFmpeg, because user-facing options are
+implemented by reflecting state at runtime.  Limiting access is a secondary
+concern, mainly important for ensuring implementation details can change
+between versions.
+
+An object-oriented programmer learning FFmpeg should be careful not to fixate on
+FFmpeg's access control features, nor to overlook its reflection capabilities.
+Both are present, but have been given the level of focus appropriate for the task.
+
+## Example: modify text then print it
+
+The example below shows a context structure that receives input strings,
+modifies them in some context-dependant way, then prints them to a specified
+filehandle.
+
+```c
+/**
+ * Type information, accessible at runtime.
+ *
+ * Useful when configuring objects.
+ */
+enum ModifyThenPrintDialect {
+    MODIFY_THEN_PRINT_PLAIN_TEXT = 0,
+    MODIFY_THEN_PRINT_REGEX      = 1,
+    MODIFY_THEN_PRINT_REGEX_PCRE = 2
+};
+
+/**
+ * User-facing information about types
+ *
+ * Useful for describing contexts to the user.
+ */
+static const char* ModifyThenPrintDialectName[] = {
+    "plain text",
+    "regular expression",
+    "Perl-compatible regular expression"
+};
+
+/**
+ * Context for functions that modify strings before printing them
+ */
+struct ModifyThenPrintContext {
+
+    /**
+     * Information about the type of this particular instance
+     *
+     * Object-oriented programs would probably represent this example
+     * as a sub-class, but some instance-specific functionality
+     * behaves more like a mixin.
+     */
+    enum ModifyThenPrintDialect dialect;
+
+    /**
+     * Internal context
+     *
+     * Object-oriented programs would put private members in here,
+     * but could also use it to store a virtual function table
+     * or other "invisible" information.
+     */
+    void* priv_data;
+
+    /**
+     * User-configurable options
+     *
+     * Best set through an API, but can be set directly if necessary
+     *
+     * Data from users needs to be validated before it's set, and the API
+     * might e.g. want to update some internal buffer when these change.
+     * Setting this directly is always less robust than using an API,
+     * but might be worthwhile if you're willing to spend the time checking
+     * for edge cases, and don't mind your code misbehaving if future
+     * versions change the API but not the structure.
+     *
+     * Object-oriented programs would likely make these protected members,
+     * initialised in a constructor and accessed with getters and setters.
+     * Making them user-configurable would be left to the programmer.
+     */
+    char *replace_this;
+    char *with_this;
+
+    /**
+     * Programmer-configurable variable
+     *
+     * Object-oriented programs would represent this as a public member.
+     */
+    FILE *out;
+
+};
+
+/**
+ * Allocate and initialize a ModifyThenPrintContext
+ *
+ * This creates a new pointer, then fills in some sensible defaults.
+ *
+ * We can reasonably assume this function will initialise `priv_data`
+ * with a dialect-specific object, but shouldn't make any assumptions
+ * about what that object is.
+ *
+ * Object-oriented programs would likely represent this as an
+ * allocator function and a constructor.
+ */
+int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx,
+                                         enum ModifyThenPrintDialect dialect,
+                                         FILE *out);
+
+/**
+ * Uninitialize and deallocate a ModifyThenPrintContext
+ *
+ * This does any work required by the internal context (e.g. deallocating
+ * `priv_data`), then deallocates the main context itself.
+ *
+ * Object-oriented programs would likely represent this as a
+ * destructor and a deallocator.
+ */
+int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx);
+
+/**
+ * Configure a ModifyThenPrintContext
+ *
+ * This checks the arguments are valid in the context's dialect,
+ * then updates the options as necessary
+ *
+ * Object-oriented programs would likely represent this as a
+ * collection of setter functions.
+ */
+int ModifyThenPrintContext_configure(struct ModifyThenPrintContext *ctx,
+                                     char* replace_this,
+                                     char* with_this);
+
+/**
+ * Print the contents of a ModifyThenPrintContext to a filehandle
+ *
+ * Provides human-readable information about keys and values.
+ *
+ * Object-oriented programs would likely represent this with some kind of
+ * `serialise` operation.
+ */
+int ModifyThenPrintContext_dump(struct ModifyThenPrintContext **ctx,
+                                FILE *dump_fh);
+
+/**
+ * Print a single message
+ *
+ * Object-oriented programs would likely represent this with a member function.
+ */
+int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx,
+                                 char *msg);
+
+/**
+ * How this context might be used in practice
+ */
+int print_hello_world()
+{
+
+    int ret = 0;
+
+    struct ModifyThenPrintContext *ctx;
+
+    if ( ModifyThenPrintContext_alloc_context( &ctx, MODIFY_THEN_PRINT_REGEX, stdout ) < 0 ) {
+        ret = -1;
+        goto EXIT_WITHOUT_CLEANUP;
+    }
+
+    if ( ModifyThenPrintContext_configure(ctx, "Hi|Hullo", "Hello") < 0 ) {
+        ret = -1;
+        goto FINISH;
+    }
+
+    if ( ModifyThenPrintContext_print( ctx, "Hi, world!\n" ) < 0 ) {
+        ret = -1;
+        goto FINISH;
+    }
+
+    if ( ModifyThenPrintContext_print( ctx, "Hullo, world!\n" ) < 0 ) {
+        ret = -1;
+        goto FINISH;
+    }
+
+    FINISH:
+    if ( ModifyThenPrintContext_free( ctx ) ) {
+        ret = -1;
+        goto EXIT_WITHOUT_CLEANUP;
+    }
+
+    EXIT_WITHOUT_CLEANUP:
+    return ret;
+
+}
+```
+
+## FFmpeg context structures
+
+The example above is a generic example of a context structure and its
+associated functions.  Some FFmpeg contexts are no more complex than
+that example, just as some objects are just key/value stores with some
+functions attached.  But here are some examples to show the variety of
+contexts available in FFmpeg.
+
+AVHashContext presents a generic API for hashing data.  @ref hash.h
+"Its associated functions" show how to create, use and destroy a hash.
+The exact algorithm is specified by passing a string to av_hash_alloc(),
+and the list of strings can be retrieved from av_hash_names().
+Algorithm-specific internal context is stored in AVHashContext::ctx.
+
+AVCodecContext is a complex member representing data about an encoding or
+decoding session.  @ref avcodec.h "Its associated functions" provide an
+abstract interface to encode and decode data.  Like most widely-used contexts,
+its first member is an AVClass pointer containing instance-specific information.
+That means it's an *AVClass context structure* (discussed in more detail below).
+
+X264Context contains the internal context for an AVCodecContext that uses
+the x264 library.  @ref libx264.c "Its associated functions" provide a concrete
+implementation for interacting with libx264.  This class is not directly
+accessible through FFmpeg's public interface, so unlike AVCodecContext it can
+change significantly between releases.
+
+## Reflection with AVClass and AVOptions
+
+An *AVClass context structure* is a context whose first member is an AVClass
+that reflects its contents.  An *@ref avoptions "AVOptions"-enabled struct*
+is a structure which reflects its contents using the @ref avoptions "AVOptions"
+API.  These terms have become synonymous in modern usage, but you might notice
+some distinction when looking at code written after AVClass was developed but
+before @ref avoptions "AVOptions" was added.
+
+At first glance, an object-oriented programmer might be tempted to look at AVClass
+as a base class that contexts inherit from.  But unlike a base class, an AVClass
+doesn't imply any theoretical relationship between objects, and contexts of the
+same type will often have different AVClass values.  It's even theoretically possible
+for a single AVClass to be shared between contexts of different types.  A better
+analogy would be to [C++ concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B))
+or [Java interfaces](https://en.wikipedia.org/wiki/Interface_(Java)).
+
+To understand how AVClass and @ref avoptions "AVOptions" work,
+consider the requirements for a `libx264` encoder:
+
+- it has to support common encoder options like "bitrate"
+- it has to support encoder-specific options like "profile"
+  - the exact options could change quickly if a legal ruling forces a change of backend
+- it has to provide useful feedback to users about unsupported options
+
+Common encoder options like "bitrate" need to be stored in AVCodecContext itself
+to avoid duplicating code, while encoder-specific options like "profile" have to
+be stored in the X264Context instance stored in AVCodecContext::priv_data.
+But both need to be presented to users (along with help text, default values etc.)
+in a way that makes it easy to build user interfaces to get and set those options.
+
+A context's AVClass member contains a list of AVOption objects, describing
+the user-configurable members of the class.  It also contains functions to
+iterate through a tree of AVClass objects representing internal context either
+for the current object or for any class that could potentially exist in the
+current version of FFmpeg.
+
+The @ref avoptions "AVOptions" API accepts any AVClass context structure,
+looks through its AVOption data, and uses that to examine, introspect, and modify
+the structure.  Because the API doesn't contain any type-specific information,
+it can be used to create a general user interface that adapts automatically
+when e.g. a new version of FFmpeg adds a new configuration option.
+
+## Summary
+
+FFmpeg uses "contexts" in ways that are often resemble object-oriented programming.
+But it focuses less on encapsulation within arbitrarily complex systems,
+and more on providing reflectivity to make pleasant user interfaces.