[FFmpeg-devel,v6,1/4] doc: Explain what "context" means

Message ID	20240604144919.213799-2-ffmpeg-devel@pileofstuff.org
State	New
Headers	From: Andrew Sayers <ffmpeg-devel@pileofstuff.org>; To: ffmpeg-devel@ffmpeg.org; Date: Tue, 4 Jun 2024 15:47:21 +0100; Subject: [FFmpeg-devel] [PATCH v6 1/4] doc: Explain what "context" means; In-Reply-To: <20240604144919.213799-1-ffmpeg-devel@pileofstuff.org>
Series	doc: Explain what "context" means
	[FFmpeg-devel,v6,0/4] doc: Explain what "context" means
	[FFmpeg-devel,v6,1/4] doc: Explain what "context" means
	[FFmpeg-devel,v6,2/4] lavu: Clarify relationship between AVClass, AVOption and context
	[FFmpeg-devel,v6,3/4] all: Link to "context" from all public contexts with documentation
	[FFmpeg-devel,v6,4/4] all: Rewrite documentation for contexts

Context	Check	Description
yinshiyou/make_loongarch64	success	Make finished
yinshiyou/make_fate_loongarch64	success	Make fate finished
andriy/make_x86	success	Make finished
andriy/make_fate_x86	success	Make fate finished

Andrew Sayers June 4, 2024, 2:47 p.m. UTC

Derived from explanations kindly provided by Stefano Sabatini and others:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
---
 doc/context.md | 430 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 430 insertions(+)
 create mode 100644 doc/context.md

Anton Khirnov June 5, 2024, 8:15 a.m. UTC | #1

Quoting Andrew Sayers (2024-06-04 16:47:21)
> Derived from explanations kindly provided by Stefano Sabatini and others:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
> ---
>  doc/context.md | 430 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 430 insertions(+)
>  create mode 100644 doc/context.md

430 lines to say "context is a struct storing an object's state"?

Stefano Sabatini June 12, 2024, 8:52 p.m. UTC | #2

On date Tuesday 2024-06-04 15:47:21 +0100, Andrew Sayers wrote:
> Derived from explanations kindly provided by Stefano Sabatini and others:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
> ---
>  doc/context.md | 430 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 430 insertions(+)
>  create mode 100644 doc/context.md
> diff --git a/doc/context.md b/doc/context.md
> new file mode 100644
> index 0000000000..bd8cb58696
> --- /dev/null
> +++ b/doc/context.md
> @@ -0,0 +1,430 @@
> +@page Context Introduction to contexts
> +
> +@tableofcontents
> +
> +FFmpeg uses the term “context” to refer to an idiom
> +you have probably used before:
> +
> +```c
> +// C structs often share context between functions:
> +
> +FILE *my_file; // my_file stores information about a filehandle
> +
> +fprintf(my_file, "hello "); // my_file provides context to this function,
> +fprintf(my_file, "world!"); // and also to this function
> +```
> +
> +```python
> +# Python classes provide context for the methods they contain:
> +
> +class MyClass:
> +    def print(self,message):
> +        if self.prev_message != message:
> +            self.prev_message = message
> +            print(message)
> +```
> +
> +<!-- marked "c" because Doxygen doesn't support JS highlighting: -->
> +```c
> +// Many JavaScript callbacks accept an optional context argument:
> +
> +const my_object = {};
> +
> +my_array.forEach(function_1, my_object);
> +my_array.forEach(function_2, my_object);
> +```
> +
> +Be careful comparing FFmpeg contexts to things you're already familiar with -
> +FFmpeg may sometimes happen to reuse words you recognise, but mean something
> +completely different.  For example, the AVClass struct has nothing to do with
> +[object-oriented classes](https://en.wikipedia.org/wiki/Class_(computer_programming)).
> +
> +If you've used contexts in other C projects, you may want to read
> +@ref Context_comparison before the rest of the document.

My impression is that this is growing out of scope for a
reference. The doxy is a reference, therefore it should be clean and
terse, and we should avoid adding too much information: just enough
information is the right amount. In fact, a reference is different
from a tutorial, and very different from a C tutorial. Also, this is
not a treatise comparing different languages and frameworks, as that
would confuse beginners and annoy experienced developers.

I propose to cut this patch down to the minimal information you can
expect in a reference, and no more than that. Additions can be made
later, but I think we should try to avoid any unnecessary content, in
the spirit of keeping this a reference. More extensive discussions
could go in a separate place (the wiki, a blog post etc.) rather than
here.

> +
> +@section Context_general “Context” as a general concept
> +
> +@par
> +A context is any data structure used by several functions
> +(or several instances of the same function) that all operate on the same entity.
> +
> +In the broadest sense, “context” is just a way to think about code.

> +You can even use it to think about code written by people who have never
> +heard the term, or who would disagree with you about what it means.
> +Consider the following snippet:
> +
> +```c
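> +#include <fcntl.h>   // open(), O_* flags
> +#include <stdint.h>  // uint8_t
> +#include <unistd.h>  // write(), close(), ssize_t
> +
> +struct DualWriter {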
> +    int fd1, fd2;
> +};
> +
> +ssize_t write_to_two_files(
> +    struct DualWriter *my_writer,
> +    uint8_t *buf,
> +    int buf_size
> +) {
> +
> +    ssize_t bytes_written_1 = write(my_writer->fd1, buf, buf_size);
> +    ssize_t bytes_written_2 = write(my_writer->fd2, buf, buf_size);
> +
> +    if ( bytes_written_1 != bytes_written_2 ) {
> +        // ... handle this edge case ...
> +    }
> +
> +    return bytes_written_1;
> +
> +}
> +
> +int main() {
> +
> +    struct DualWriter my_writer;
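> +    my_writer.fd1 = open("file1", O_WRONLY | O_CREAT | O_TRUNC, 0644);
> +    my_writer.fd2 = open("file2", O_WRONLY | O_CREAT | O_TRUNC, 0644);
> +
> +    // sizeof(...) - 1 skips the string literal's terminating NUL byte
> +    write_to_two_files(&my_writer, (uint8_t *)"hello ", sizeof("hello ") - 1);
> +    write_to_two_files(&my_writer, (uint8_t *)"world!", sizeof("world!") - 1);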
> +
> +    close( my_writer.fd1 );
> +    close( my_writer.fd2 );
> +
> +}
> +```
> +
> +The term “context” doesn't appear anywhere in the snippet.  But `DualWriter`
> +is passed to several instances of `write_to_two_files()` that operate on
> +the same entity, so it fits the definition of a context.
> +
> +When reading code that isn't explicitly described in terms of contexts,
> +remember that your interpretation may differ from other people's.
> +For example, FFmpeg's avio_alloc_context() accepts a set of callback functions
> +and an `opaque` argument - even though this function guarantees to *return*
> +a context, it does not require `opaque` to *provide* context for the callback
> +functions.  So you could choose to pass a struct like `DualWriter` as the
> +`opaque` argument, or you could pass callbacks that use `stdin` and `stdout`
> +and just pass a `NULL` argument for `opaque`.

I'd skip this whole part: we assume the reader is already familiar
with the C language and with data encapsulation through structs. If
they are not, this is not the right place to teach C language
fundamentals.

> +
> +When reading code that *is* explicitly described in terms of contexts,
> +remember that the term's meaning is guaranteed by *the project's community*,
> +not *the language it's written in*.  That means guarantees may be more flexible
> +and change more over time.  For example, programming languages that use
> +[encapsulation](https://en.wikipedia.org/wiki/Encapsulation_(computer_programming))
> +will simply refuse to compile code that violates its rules about access,
> +while communities can put up with special cases if they improve code quality.
> +

This looks a bit vague so I'd rather drop this.

> +The next section will discuss what specific conventions FFmpeg developers mean
> +when they describe parts of their code as using “contexts”.
> +
> +@section Context_ffmpeg FFmpeg contexts
> +
> +This section discusses specific context-related conventions used in FFmpeg.
> +Some of these are used in other projects, others are unique to this project.
> +
> +@subsection Context_indicating Indicating context: “Context”, “ctx” etc.
> +
> +```c
> +// Context struct names usually end with `Context`:
> +struct AVSomeContext {
> +  ...
> +};
> +
> +// Functions are usually named after their context,
> +// context parameters usually come first and are often called `ctx`:
> +void av_some_function(AVSomeContext *ctx, ...);
> +```
> +
> +FFmpeg struct names usually signal whether they are contexts (e.g. AVBSFContext
> +or AVCodecContext).  Exceptions to this rule include AVMD5, which is only
> +identified as a context by @ref libavutil/md5.c "the functions that call it".
> +
> +Function names usually signal the context they're associated with (e.g.
> +av_md5_alloc() or avcodec_alloc_context3()).  Exceptions to this rule include
> +@ref avformat.h "AVFormatContext's functions", many of which begin with
> +just `av_`.
> +
> +Functions usually signal their context parameter by putting it first and
> +naming it some variant of `ctx`.  Exceptions include av_bsf_alloc(), which puts
> +its context argument second to emphasise it's an out variable.
> +

> +Some functions fit awkwardly within FFmpeg's context idiom, so they send mixed
> +signals.  For example, av_ambient_viewing_environment_create_side_data() creates
> +an AVAmbientViewingEnvironment context, then adds it to the side-data of an
> +AVFrame context.  So its name hints at one context, its parameter hints at
> +another, and its documentation is silent on the issue.  You might prefer to
> +think of such functions as not having a context, or as “receiving” one context
> +and “producing” another.

I'd skip this paragraph. In fact, I think that API makes perfect
sense; OOP languages adopt such constructs all the time, for example
this could be a static module/class constructor. In other words, we
are not saying anywhere that all functions should take a "context" as
their first argument, and the documentation specifies exactly how this
works. If you feel it is unclear or silent, that is probably a sign
that the function's documentation should be extended.

> +
> +@subsection Context_data_hiding Data hiding: private contexts
> +
> +```c
> +// Context structs often hide private context:
> +struct AVSomeContext {
> +  void *priv_data; // sometimes just called "internal"
> +};
> +```
> +
> +Contexts present a public interface, so changing a context's members forces
> +everyone that uses the library to at least recompile their program,
> +if not rewrite it to remain compatible.  Many contexts reduce this problem
> +by including a private context with a type that is not exposed in the public
> +interface.  Hiding information this way ensures it can be modified without
> +affecting downstream software.
> +
> +Private contexts often store variables users aren't supposed to see
> +(similar to an [OOP](https://en.wikipedia.org/wiki/Object-oriented_programming)
> +private block), but can be used for more than just access control.  They can
> +also store information shared between some but not all instances of a context
> +(e.g. codec-specific functionality), and @ref Context_avoptions
> +"AVOptions-enabled structs" can provide user configuration options through
> +the @ref avoptions "AVOptions API".

I'd skip this section as well: data hiding is a common C technique,
and AVOptions are already covered later in the document or in another
dedicated section.

> +@subsection Context_lifetime Manage lifetime: creation, use, destruction
> +
> +```c
> +void my_function(...) {
> +
> +    // Context structs are allocated then initialized with associated functions:
> +
> +    AVSomeContext *ctx = av_some_context_alloc(...);
> +
> +    // ... configure ctx ...
> +
> +    av_some_context_init(ctx, ...);
> +
> +    // ... use ctx ...
> +
> +    // Context structs are closed then freed with associated functions:
> +
> +    av_some_context_close(ctx);
> +    av_some_context_free(ctx);
> +
> +}
> +```
> +FFmpeg contexts go through the following stages of life:
> +
> +1. allocation (often a function that ends with `_alloc`)
> +   * a range of memory is allocated for use by the structure
> +   * memory is allocated on boundaries that improve caching
> +   * memory is reset to zeroes, some internal structures may be initialized
> +2. configuration (implemented by setting values directly on the context)
> +   * no function for this - calling code populates the structure directly
> +   * memory is populated with useful values
> +   * simple contexts can skip this stage
> +3. initialization (often a function that ends with `_init`)
> +   * setup actions are performed based on the configuration (e.g. opening files)
> +4. normal usage
> +   * most functions are called in this stage
> +   * documentation implies some members are now read-only (or not used at all)
> +   * some contexts allow re-initialization
> +5. closing (often a function that ends with `_close`)
> +   * teardown actions are performed (e.g. closing files)
> +6. deallocation (often a function that ends with `_free`)
> +   * memory is returned to the pool of available memory
> +
> +This can mislead object-oriented programmers, who expect something more like:
> +
> +1. allocation (usually a `new` keyword)
> +   * a range of memory is allocated for use by the structure
> +   * memory *may* be reset (e.g. for security reasons)
> +2. initialization (usually a constructor)
> +   * memory is populated with useful values
> +   * related setup actions are performed based on arguments (e.g. opening files)
> +3. normal usage
> +   * most functions are called in this stage
> +   * compiler enforces that some members are read-only (or private)
> +   * no going back to the previous stage
> +4. finalization (usually a destructor)
> +   * teardown actions are performed (e.g. closing files)
> +5. deallocation (usually a `delete` keyword)
> +   * memory is returned to the pool of available memory
> +
> +The remainder of this section discusses FFmpeg's differences from OOP, to help
> +object-oriented programmers avoid misconceptions.  You can safely skip this
> +section if you aren't familiar with the OOP lifetime described above.
> +
> +FFmpeg's allocation stage is broadly similar to the OOP stage of the same name.
> +Both set aside some memory for use by a new entity, but FFmpeg's stage can also
> +do some higher-level operations.  For example, @ref Context_avoptions
> +"AVOptions-enabled structs" set their AVClass member during allocation.
> +
> +FFmpeg's configuration stage involves setting any variables you want before
> +you start using the context.  Complicated FFmpeg structures like AVCodecContext
> +tend to have many members you *could* set, but in practice most programs set
> +few if any of them.  The freeform configuration stage works better than bundling
> +these into the initialization stage, which would lead to functions with
> +impractically many parameters, and would mean each new option was an
> +incompatible change to the API.  One way to understand the problem is to read
> +@ref Context_avoptions "the AVOptions section below" and think how a constructor
> +would handle those options.
> +
> +FFmpeg's initialization stage involves calling a function that sets the context
> +up based on your configuration.
> +
> +FFmpeg's first three stages do the same job as OOP's first two stages.
> +This can mislead object-oriented developers, who expect to do less work in the
> +allocation stage, and more work in the initialization stage.  To simplify this,
> +most FFmpeg contexts provide a combined allocator and initializer function.
> +For historical reasons, suffixes like `_alloc`, `_init`, `_alloc_context` and
> +even `_open` can indicate the function does any combination of allocation and
> +initialization.
> +
> +FFmpeg's "closing" stage is broadly similar to OOP's "finalization" stage,
> +but some contexts allow re-initialization after finalization.  For example,
> +SwrContext lets you call swr_close() then swr_init() to reuse a context.
> +Be aware that some FFmpeg functions happen to use the word "finalize" in a way
> +that has nothing to do with the OOP stage (e.g. av_bsf_list_finalize()).
> +
> +FFmpeg's "deallocation" stage is broadly similar to OOP, but can perform some
> +higher-level functions (similar to the allocation stage).
> +
> +Closing functions usually end with "_close", while deallocation
> +functions usually end with "_free".  Very few contexts need the flexibility of
> +separate "closing" and "deallocation" stages, so many "_free" functions
> +implicitly close the context first.

About this I have mixed feelings, but to me it sounds like a posteriori
rationalization.

I don't think there is a general allocation/closing/free rule across
the various FFmpeg APIs, and giving the impression that there is one
might be misleading. In practice the user needs to master only a
single API at a time (encoding/decoding, muxing/demuxing, etc.), each
one with possibly slight differences in how the terms
close/allocation/free are used. This is probably not optimal, but in
practice it works, as the user does not really need to know all the
possible uses of the API (she will work through what she needs for
the job at hand).

> +
> +@subsection Context_avoptions Configuration options: AVOptions-enabled structs
> +

> +The @ref avoptions "AVOptions API" is a framework to configure user-facing
> +options, e.g. on the command-line or in GUI configuration forms.

This looks wrong. AVOptions is not at all about CLI or GUI options; it is
just a way to set/get fields (that is, "options") defined in a
struct (a context) using a high-level API that includes: setting multiple
options at once (through a textual encoding or a dictionary),
input/range validation, setting several fields from a single option
(e.g. the size), etc.

Then you can query the options in a given struct and create
corresponding options in a UI, but this is not the point of AVOptions.
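
To make the idea concrete, here is a minimal sketch in plain C of what
"set a named field with validation" can look like. It is not the AVOptions
implementation or API: DemoContext, DemoOption, demo_options and
demo_opt_set() are names made up for this example, and it only covers
setting one integer field by name with a range check.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context struct (not an FFmpeg type). */
typedef struct DemoContext {
    int width;
    int height;
    int bitrate;
} DemoContext;

/* One table entry describes one settable field:
 * its name, where it lives in the struct, and its valid range. */
typedef struct DemoOption {
    const char *name;
    size_t      offset;
    int         min, max;
} DemoOption;

static const DemoOption demo_options[] = {
    { "width",   offsetof(DemoContext, width),   1, 16384     },
    { "height",  offsetof(DemoContext, height),  1, 16384     },
    { "bitrate", offsetof(DemoContext, bitrate), 0, 100000000 },
};

/* Set a field by name from a textual value.
 * Returns 0 on success, -1 for an unknown name or an invalid value. */
int demo_opt_set(DemoContext *ctx, const char *name, const char *value)
{
    size_t n = sizeof(demo_options) / sizeof(demo_options[0]);

    for (size_t i = 0; i < n; i++) {
        const DemoOption *o = &demo_options[i];
        char *end;
        long v;

        if (strcmp(o->name, name) != 0)
            continue;

        v = strtol(value, &end, 10);
        /* Reject empty input, trailing garbage and out-of-range values. */
        if (end == value || *end != '\0' || v < o->min || v > o->max)
            return -1;

        /* Write through the recorded offset instead of naming the field. */
        *(int *)((char *)ctx + o->offset) = (int)v;
        return 0;
    }
    return -1;
}
```

With these definitions, demo_opt_set(&ctx, "width", "1280") returns 0 and
stores 1280, while demo_opt_set(&ctx, "height", "-5") returns -1 and leaves
height untouched. A UI could walk the same table to list the available
options, which is the secondary use described above.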

> +To understand FFmpeg's configuration requirements, run `ffmpeg -h full` on the
> +command-line, then ask yourself how you would implement all those options
> +with the C standard [`getopt` function](https://en.wikipedia.org/wiki/Getopt).
> +You can also ask the same question for other approaches - for example, how would
> +you maintain a GUI with 15,000+ configuration options?
> +
> +Most solutions assume you can just put all options in a single code block,
> +which is unworkable at FFmpeg's scale.  Instead, we split configuration
> +across many *AVOptions-enabled structs*, which use the @ref avoptions
> +"AVOptions API" to inspect and configure options, including in private contexts.
> +
> +AVOptions-accessible members of a context should be accessed through the
> +@ref avoptions "AVOptions API" whenever possible, even if they're not hidden
> +in a private context.  That ensures values are validated as they're set, and
> +means you won't have to do as much work if a future version of FFmpeg changes
> +the allowed values.
> +

> +Although not strictly required, it is best to only modify options during
> +the configuration stage.  Initialized structs may be accessed by internal
> +FFmpeg threads, and modifying them can cause weird intermittent bugs.
> +
> +@subsection Context_logging Logging: AVClass context structures
> +
> +FFmpeg's @ref lavu_log "logging facility" needs to be simple to use,
> +but flexible enough to let people debug problems.  And much like options,
> +it needs to work the same across a wide variety of unrelated structs.
> +
> +FFmpeg structs that support the logging framework are called *@ref AVClass
> +context structures*.  The name @ref AVClass was chosen early in FFmpeg's
> +development, but in practice it only came to store information about
> +logging, and about options.

OTOH, AVOptions and logging should be discussed in the relevant
files, to avoid duplication.

> +
> +@section Context_further Further information about contexts
> +
> +So far, this document has provided a theoretical guide to FFmpeg contexts.
> +This final section provides some alternative approaches to the topic,
> +which may help round out your understanding.
> +
> +@subsection Context_example Learning by example: context for a codec
> +
> +It can help to learn contexts by doing a deep dive into a specific struct.
> +This section will discuss AVCodecContext - an AVOptions-enabled struct
> +that contains information about encoding or decoding one stream of data
> +(e.g. the video in a movie).
> +
> +The name "AVCodecContext" tells us this is a context.  Many of
> +@ref libavcodec/avcodec.h "its functions" start with an `avctx` parameter,
> +indicating this parameter provides context for that function.
> +
> +AVCodecContext::internal contains the private context.  For example,
> +codec-specific information might be stored here.
> +
> +AVCodecContext is allocated with avcodec_alloc_context3(), initialized with
> +avcodec_open2(), and freed with avcodec_free_context().  Most of its members
> +are configured with the @ref avoptions "AVOptions API", but for example you
> +can set AVCodecContext::draw_horiz_band() if your program happens to need it.
> +
> +AVCodecContext provides an abstract interface to many different *codecs*.
> +Options supported by many codecs (e.g. "bitrate") are kept in AVCodecContext
> +and exposed with AVOptions.  Options that are specific to one codec are
> +stored in the private context, and also exposed with AVOptions.
> +
> +AVCodecContext::av_class contains logging metadata to ensure all codec-related
> +error messages look the same, plus implementation details about options.
> +
> +To support a specific codec, AVCodecContext's private context is set to
> +an encoder-specific data type.  For example, the video codec
> +[H.264](https://en.wikipedia.org/wiki/Advanced_Video_Coding) is supported via
> +[the x264 library](https://www.videolan.org/developers/x264.html), and
> +implemented in X264Context.  Although included in the documentation, X264Context
> +is not part of the public API.  That means FFmpeg's @ref ffmpeg_versioning
> +"strict rules about changing public structs" aren't as important here, so a
> +version of FFmpeg could modify X264Context or replace it with another type
> +altogether.  An adverse legal ruling or security problem could even force us to
> +switch to a completely different library without a major version bump.
> +
> +The design of AVCodecContext provides several important guarantees:
> +
> +- lets you use the same interface for any codec
> +- supports common encoder options like "bitrate" without duplicating code
> +- supports encoder-specific options like "profile" without bulking out the public interface
> +- exposes both types of options to users, with help text and detection of missing options
> +- provides uniform logging output
> +- hides implementation details (e.g. its encoding buffer)
> +

> +@subsection Context_comparison Learning by comparison: FFmpeg vs. Curl contexts

About this, I'm still not really convinced that this should be part of
a reference: it adds more information than is really needed, and it
covers concepts related to the C language rather than to the FFmpeg
API itself.

[...]

Andrew Sayers June 13, 2024, 2:20 p.m. UTC | #3

On Wed, Jun 12, 2024 at 10:52:00PM +0200, Stefano Sabatini wrote:
> On date Tuesday 2024-06-04 15:47:21 +0100, Andrew Sayers wrote:
[...]
> My impression is that this is growing out of scope for a
> reference. The doxy is a reference, therefore it should be clean and
> terse, and we should avoid adding too much information: just enough
> information is the right amount. In fact, a reference is different
> from a tutorial, and very different from a C tutorial. Also, this is
> not a treatise comparing different languages and frameworks, as that
> would confuse beginners and annoy experienced developers.
> 
> I propose to cut this patch down to the minimal information you can
> expect in a reference, and no more than that. Additions can be made
> later, but I think we should try to avoid any unnecessary content, in
> the spirit of keeping this a reference. More extensive discussions
> could go in a separate place (the wiki, a blog post etc.) rather than
> here.

I would agree if we had a tradition of linking to the wiki or regular blog
posts, but even proposing internal links has generated pushback in this thread,
so that feels like making the perfect the enemy of the good.  Let's get this
committed, see how people react, then look for improvements.

In fact, once this is available in the trunk version of the website,
we should ask for feedback from the libav-user ML and #ffmpeg IRC channel.
Then we can expand/move/remove stuff based on feedback.

> 
> > +
> > +@section Context_general “Context” as a general concept
[...]
> I'd skip this whole part: we assume the reader is already familiar
> with the C language and with data encapsulation through structs. If
> they are not, this is not the right place to teach C language
> fundamentals.

I disagree, for a reason I've been looking for an excuse to mention :)

Let's assume 90% of people who use FFmpeg already know something in the doc.
You could say that part of the doc is useless to 90% of the audience.
Or you could say that 90% of FFmpeg users are not our audience.

Looking at it the second way means you need to spend more time on "routing" -
linking to the document in ways that (only) attract your target audience,
making a table of contents with headings that aid skip-readers, etc.
But once you've routed people around the bits they don't care about,
it's fine to have documentation that's only needed by a minority.

Also, less interesting but equally important - context is not a C language
fundamental, it's more like an emergent property of large C projects.  A
developer that came here without knowing e.g. what a struct is could read
any of the online tutorials that explain the concept better than we could.
I'd be happy to link to a good tutorial about contexts if we found one,
but we have to meet people where they are, and this is the best solution
I've been able to find.

> 
> > +
> > +When reading code that *is* explicitly described in terms of contexts,
> > +remember that the term's meaning is guaranteed by *the project's community*,
> > +not *the language it's written in*.  That means guarantees may be more flexible
> > +and change more over time.  For example, programming languages that use
> > +[encapsulation](https://en.wikipedia.org/wiki/Encapsulation_(computer_programming))
> > +will simply refuse to compile code that violates its rules about access,
> > +while communities can put up with special cases if they improve code quality.
> > +
> 
> This looks a bit vague so I'd rather drop this.

This probably looks vague to you because you're part of the 90% of people this
paragraph isn't for.  All programming languages provide some guarantees, and
leave others up to the community to enforce (or not).  Over time, people stop
seeing the language guarantees at all, and assume the only alternative is
anarchy.  For example, if you got involved in a large JavaScript project,
you might be horrified to see almost all structs are the same type ("Object"),
and are implemented as dictionaries that are expected to have certain keys.
But in practice, this stuff gets enforced at the community level well enough.
Similarly, a JS programmer might be horrified to learn FFmpeg needs a whole
major version bump just to add a key to a struct.  This paragraph is there to
nudge people who have stopped seeing things we need them to look out for.
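
For anyone who hasn't run into the struct problem before, here is a minimal
sketch of why changing a public struct is a compatibility issue in C. The
structs and function names are hypothetical (not FFmpeg's), and it shows the
worst case, where a member is inserted before an existing one:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "version 1" of a public struct. */
typedef struct DemoFrameV1 {
    int32_t width;
    int32_t height;
    int64_t pts;
} DemoFrameV1;

/* Hypothetical "version 2": one member inserted before pts. */
typedef struct DemoFrameV2 {
    int32_t width;
    int32_t height;
    int32_t format;   /* new member */
    int64_t pts;
} DemoFrameV2;

/* A program compiled against V1 has this offset baked into its machine
 * code, and keeps using it even if the library is later upgraded. */
size_t demo_pts_offset_v1(void) { return offsetof(DemoFrameV1, pts); }

/* A library compiled against V2 reads and writes pts at this offset. */
size_t demo_pts_offset_v2(void) { return offsetof(DemoFrameV2, pts); }
```

The two offsets differ, and so does sizeof(), so an old program and a new
library would disagree about where pts lives. A JS object with an extra key
has no equivalent failure mode, which is why the rule surprises people
coming from that world.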

If you'd like to maintain an official FFmpeg blog, I'd be happy to expand the
paragraph above into a medium-sized post, then just link it from the doc.
But that post would be too subjective to be a wiki page - JavaScript is
evolving in a more strongly-typed direction, so it would only make sense to
future readers if they could say "oh yeah this was written in 2024, JS was
still like that back then".  This paragraph is an achievable compromise -
covers enough ground to give people a way to think about the code, short enough
for people who don't care to skip over, and objective enough to belong in
documentation.  We can always change it if we find a better solution.

[...]
> > +Some functions fit awkwardly within FFmpeg's context idiom, so they send mixed
> > +signals.  For example, av_ambient_viewing_environment_create_side_data() creates
> > +an AVAmbientViewingEnvironment context, then adds it to the side-data of an
> > +AVFrame context.  So its name hints at one context, its parameter hints at
> > +another, and its documentation is silent on the issue.  You might prefer to
> > +think of such functions as not having a context, or as “receiving” one context
> > +and “producing” another.
> 
> I'd skip this paragraph. In fact, I think that API makes perfect
> sense; OOP languages adopt such constructs all the time, for example
> this could be a static module/class constructor. In other words, we
> are not saying anywhere that all functions should take a "context" as
> their first argument, and the documentation specifies exactly how this
> works. If you feel it is unclear or silent, that is probably a sign
> that the function's documentation should be extended.

That would be fine if it were just this function, but FFmpeg is littered
with special cases that don't quite fit.  Another example might be
swr_alloc_set_opts2(), which can take an SwrContext in a way that resembles
a context, or can take NULL and allocate a new SwrContext.  And yes,
we could document that edge case, and the next one, and the one after that.
But even if we documented every little special case that existed today,
there's no rule, so new bits of API will just reintroduce the problem again.
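
To show the shape of the swr_alloc_set_opts2()-style edge case without
depending on libswresample, here is a minimal sketch of the
"reuse the context, or allocate one if the pointer is NULL" pattern.
DemoContext, demo_alloc_set_opts() and demo_free() are hypothetical names,
and the real function's parameters and behaviour are not reproduced here:

```c
#include <stdlib.h>

/* Hypothetical context struct (not an FFmpeg type). */
typedef struct DemoContext {
    int sample_rate;
} DemoContext;

/* If *pctx is NULL, allocate a new context and store it in *pctx;
 * otherwise reuse the existing one.  Either way, apply the option.
 * Returns 0 on success, -1 on invalid arguments or allocation failure. */
int demo_alloc_set_opts(DemoContext **pctx, int sample_rate)
{
    if (!pctx || sample_rate <= 0)
        return -1;

    if (!*pctx) {
        *pctx = calloc(1, sizeof(**pctx));
        if (!*pctx)
            return -1;
    }

    (*pctx)->sample_rate = sample_rate;
    return 0;
}

/* Free the context and reset the caller's pointer to NULL. */
void demo_free(DemoContext **pctx)
{
    if (pctx) {
        free(*pctx);
        *pctx = NULL;
    }
}
```

The first parameter behaves like a context when it points at an existing
struct, and like an out variable when it points at NULL, so neither the
name nor the signature tells a newcomer which reading is intended.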

There's a deeper issue here - as an expert, when you don't know something,
your default assumption is that it's undefined, and could evolve in future.
When a newbie doesn't know something, their default assumption is that
everybody else knows and they're just stupid.  That assumption drives
newbies away from projects, so it's important to fill in as many blanks as
possible, even if it has to be with simple answers that they eventually
evolve beyond (and feel smart for doing so).

> > +@subsection Context_lifetime Manage lifetime: creation, use, destruction
[...]
> About this I have mixed feelings, but to me it sounds like a posteriori
> rationalization.
> 
> I don't think there is a general allocation/closing/free rule across
> the various FFmpeg APIs, and giving the impression that there is one
> might be misleading. In practice the user needs to master only a
> single API at a time (encoding/decoding, muxing/demuxing, etc.), each
> one with possibly slight differences in how the terms
> close/allocation/free are used. This is probably not optimal, but in
> practice it works, as the user does not really need to know all the
> possible uses of the API (she will work through what she needs for
> the job at hand).

Note: I'm assuming "this" means "this section", not "this paragraph".
Apologies if it was intended as a specific nitpick about closing functions.

TBH, a lot of this document is about inventing memorable rules of thumb.
The alternative is to say "FFmpeg devs can't agree on an answer, so they just
left you to memorise 3,000+ special cases".

Let's assume learning the whole of FFmpeg means understanding 3,000 tokens
(I'm not sure the exact count in 7.0, but it's about that number if you don't
include individual members of structs, arguments to functions etc.).  Let's
also assume it takes an average of ten minutes to learn each token (obviously
that varies - AV_LOG_PANIC will take less, AVCodecContext will take more).
That means you'd have to spend 8 hours a day every day for over two months
to learn FFmpeg.  Obviously there are usable subsets, but they mostly cut out
the simple things, so don't save nearly as much time as you'd think.  If you
want people to pick up FFmpeg, they need to learn a useful subset in about 8
hours, which requires a drastically simplified explanation.
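
Spelling out the arithmetic behind that estimate (the inputs are only the rough assumptions stated above, not measured figures):

```latex
3000 \ \text{tokens} \times 10 \ \tfrac{\text{min}}{\text{token}}
  = 30\,000 \ \text{min} = 500 \ \text{h},
\qquad
\frac{500 \ \text{h}}{8 \ \text{h/day}} = 62.5 \ \text{days} \approx 2 \ \text{months}
```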

(the above is closely related to an argument from a recent post[1],
but the numbers might help explain the scale of the challenge)

There may not be an explicit rule for context lifetimes, but I've looked at the
code carefully enough to have a nuanced opinion about the number of tokens,
and the edgiest case I've found so far is swr_alloc_set_opts2() (see above).
I'm open to counterexamples, but the model discussed in this section feels
pretty reliable.

> 
> > +
> > +@subsection Context_avoptions Configuration options: AVOptions-enabled structs
> > +
> 
> > +The @ref avoptions "AVOptions API" is a framework to configure user-facing
> > +options, e.g. on the command-line or in GUI configuration forms.
> 
> This looks wrong. AVOptions is not at all about CLI or GUI options, is
> just some way to set/get fields (that is "options") defined in a
> struct (a context) using a high level API including: setting multiple
> options at once (through a textual encoding or a dictionary),
> input/range validation, setting more fields based on a single option
> (e.g. the size) etc.
> 
> Then you can query the options in a given struct and create
> corresponding options in a UI, but this is not the point of AVOptions.

There's a problem here I haven't been communicating clearly enough.
I think that's because I've understated the problem in the past,
so I'll try overstating instead:

"Option" is a meaningless noise word.  A new developer might ask "is this like a
command-line option, or a CMake option?  Is it like those Python functions with
a million keyword arguments, or a config file with sensible defaults?".
Answering "it can be any of those if you need it to be" might help an advanced
user (not our audience), but is bewildering to a newbie who needs a rough guide
for how they're likely to use it.  The only solution that's useful to a newbie
is to provide a frame of reference, preferably in the form of something they
already know how to use.

Having said all that, yes this particular answer is wrong.  Could you apply [2]
so I can start thinking about what to replace it with?

[...]

[1] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-June/328970.html
[2] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-June/329068.html

Stefano Sabatini June 15, 2024, 9:17 a.m. UTC | #4

On date Thursday 2024-06-13 15:20:38 +0100, Andrew Sayers wrote:
> On Wed, Jun 12, 2024 at 10:52:00PM +0200, Stefano Sabatini wrote:
[...]
> > > +@section Context_general “Context” as a general concept
> [...]
> > I'd skip all this part, as we assume the reader is already familiar
> > with C language and with data encapsulation through struct, if he is
> > not this is not the right place where to teach about C language
> > fundamentals.
> 
> I disagree, for a reason I've been looking for an excuse to mention :)
> 

> Let's assume 90% of people who use FFmpeg already know something in the doc.
> You could say that part of the doc is useless to 90% of the audience.
> Or you could say that 90% of FFmpeg users are not our audience.
> 
> Looking at it the second way means you need to spend more time on "routing" -
> linking to the document in ways that (only) attract your target audience,
> making a table of contents with headings that aid skip-readers, etc.
> But once you've routed people around the bits they don't care about,
> it's fine to have documentation that's only needed by a minority.
> 

> Also, less interesting but equally important - context is not a C language
> fundamental, it's more like an emergent property of large C projects.  A
> developer that came here without knowing e.g. what a struct is could read
> any of the online tutorials that explain the concept better than we could.
> I'd be happy to link to a good tutorial about contexts if we found one,
> but we have to meet people where they are, and this is the best solution
> I've been able to find.

A context is just another name for a struct that holds the state of an
entity operated on by several functions (in other words, an object and
its methods); it's mostly jargon used by FFmpeg (and by other C
projects as well). In addition, we provide some generic utilities
(logging + AVOptions) which can be used by means of AVClass.
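
As a rough illustration of that definition, here is a minimal sketch of the idiom. Every name in it (DemoSumContext, demo_sum_alloc() and so on) is hypothetical and invented for this example; it is not FFmpeg code and it leaves out AVClass entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context: the state of one running total. */
typedef struct DemoSumContext {
    long total;  /* state shared by the functions below */
    int  count;  /* number of values added so far */
} DemoSumContext;

/* Create a zero-initialized context; returns NULL if allocation fails. */
DemoSumContext *demo_sum_alloc(void)
{
    return calloc(1, sizeof(DemoSumContext));
}

/* Operate on the context, which is passed as the first argument. */
void demo_sum_add(DemoSumContext *ctx, int value)
{
    assert(ctx != NULL);
    ctx->total += value;
    ctx->count++;
}

/* Read state back out of the context. */
long demo_sum_total(const DemoSumContext *ctx)
{
    assert(ctx != NULL);
    return ctx->total;
}

/* Destroy the context and clear the caller's pointer. */
void demo_sum_free(DemoSumContext **ctx)
{
    if (!ctx)
        return;
    free(*ctx);
    *ctx = NULL;
}
```

A caller would allocate one DemoSumContext, pass it to demo_sum_add() as often as needed, then hand its address to demo_sum_free().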

Giving a longer explanation makes this appear more complicated than it
actually is. My point is that providing more information than is
actually needed creates a wall-of-text effect ("I need to read through
all this to understand it? Nah, I'd rather give up"), thus
discouraging readers.

> 
> > 
> > > +
> > > +When reading code that *is* explicitly described in terms of contexts,
> > > +remember that the term's meaning is guaranteed by *the project's community*,
> > > +not *the language it's written in*.  That means guarantees may be more flexible
> > > +and change more over time.  For example, programming languages that use
> > > +[encapsulation](https://en.wikipedia.org/wiki/Encapsulation_(computer_programming))
> > > +will simply refuse to compile code that violates its rules about access,
> > > +while communities can put up with special cases if they improve code quality.
> > > +
> > 
> > This looks a bit vague so I'd rather drop this.

I mean, if you read for the first time:
| [the context] term's meaning is guaranteed by *the project's
| community*, not the language it's written in.
| That means guarantees may be more flexible and change more over time.

it's very hard to figure out what these guarantees are about, and the
same statement could apply to every language and every term;
that's why I consider this "vague".

[...]
> > > +Some functions fit awkwardly within FFmpeg's context idiom, so they send mixed
> > > +signals.  For example, av_ambient_viewing_environment_create_side_data() creates
> > > +an AVAmbientViewingEnvironment context, then adds it to the side-data of an
> > > +AVFrame context.  So its name hints at one context, its parameter hints at
> > > +another, and its documentation is silent on the issue.  You might prefer to
> > > +think of such functions as not having a context, or as “receiving” one context
> > > +and “producing” another.
> > 
> > I'd skip this paragraph. In fact, I think that API makes perfect
> > sense, OOP languages adopt such constructs all the time, for example
> > this could be a static module/class constructor. In other words, we
> > are not telling anywhere that all functions should take a "context" as
> > its first argument, and the documentation specify exactly how this
> > works, if you feel this is not clear or silent probably this is a sign
> > that that function documentation should be extended.
> 

> That would be fine if it were just this function, but FFmpeg is littered
> with special cases that don't quite fit.

I still fail to see the general rule for which this is creating a
special case. If this is a special case, what is this special case
for?

> Another example might be swr_alloc_set_opts2(), which can take an
> SwrContext in a way that resembles a context, or can take NULL and
> allocate a new SwrContext.  And yes, we could document that edge
> case, and the next one, and the one after that. But even if we
> documented every little special case that existed today, there's no
> rule, so new bits of API will just reintroduce the problem again.

In this specific case I'd say the API is not very ergonomic, and it
would probably have been better to keep the operations (alloc + set_opts)
separate, but then it is largely the user's choice (you can decide to
keep alloc and set_opts as different operations). On the other hand,
*it is already documented*, so there is not much to add.
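
To make the "alloc + set_opts" distinction concrete, here is a sketch with hypothetical names (DemoResampler, demo_alloc(), demo_set_opts(), demo_alloc_set_opts()). It is only loosely modelled on the pattern discussed above and is not the libswresample implementation; in particular, freeing the context on error is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical context with two configurable fields. */
typedef struct DemoResampler {
    int in_rate;
    int out_rate;
} DemoResampler;

/* Separate operation 1: allocate only. Returns NULL on failure. */
DemoResampler *demo_alloc(void)
{
    return calloc(1, sizeof(DemoResampler));
}

/* Separate operation 2: configure only. Returns 0 or -EINVAL. */
int demo_set_opts(DemoResampler *r, int in_rate, int out_rate)
{
    assert(r != NULL);
    if (in_rate <= 0 || out_rate <= 0)
        return -EINVAL;
    r->in_rate  = in_rate;
    r->out_rate = out_rate;
    return 0;
}

/* Free the context and clear the caller's pointer. */
void demo_free(DemoResampler **r)
{
    if (!r)
        return;
    free(*r);
    *r = NULL;
}

/* Combined operation: allocates when *pr is NULL, otherwise reuses *pr.
 * Assumption of this sketch: on error the context is freed and *pr is
 * set to NULL. */
int demo_alloc_set_opts(DemoResampler **pr, int in_rate, int out_rate)
{
    int ret;

    assert(pr != NULL);
    if (!*pr) {
        *pr = demo_alloc();
        if (!*pr)
            return -ENOMEM;
    }
    ret = demo_set_opts(*pr, in_rate, out_rate);
    if (ret < 0)
        demo_free(pr);
    return ret;
}
```

With the separate functions, a failed demo_alloc() and a failed demo_set_opts() are different events by construction. The combined function has to encode both in one return value, and its first argument behaves like a context in one call and like an output parameter in another.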

> There's a deeper issue here - as an expert, when you don't know something,
> your default assumption is that it's undefined, and could evolve in future.
> When a newbie doesn't know something, their default assumption is that
> everybody else knows and they're just stupid.  That assumption drives
> newbies away from projects, so it's important to fill in as many blanks as
> possible, even if it has to be with simple answers that they eventually
> evolve beyond (and feel smart for doing so).
> 
> > > +@subsection Context_lifetime Manage lifetime: creation, use, destruction
> [...]
> > About this I have mixed feelings, but to me it sounds like a-posteriori
> > rationalization.
> > 
> > I don't think there is a general rule with the allocation/closing/free
> > rule for the various FFmpeg APIs, and giving the impression that this
> > is the case might be misleading. In practice the user needs to master
> > only a single API at a time (encodering/decoding, muxing/demuxing,
> > etc.)  each one with possibly slight differences in how the term
> > close/allocation/free are used. This is probably not optimal, but in
> > practice it works as the user do not really need to know all the
> > possible uses of the API (she will work through what she is interested
> > for the job at hand).
> 
> Note: I'm assuming "this" means "this section", not "this paragraph".
> Apologies if it was intended as a specific nitpick about closing functions.
> 
> TBH, a lot of this document is about inventing memorable rules of thumb.

> The alternative is to say "FFmpeg devs can't agree on an answer, so they just
> left you to memorise 3,000+ special cases".

This is not what most people do. You don't have to read the whole
manual, and in particular you don't have to memorize the complete API,
only the part relevant to the task at hand. If you need to learn
about decoding, you would probably start with the decoding API
(libavcodec/avcodec.h - which is unfortunately rather complex
because the problem is complex), and you can ignore pretty much
everything else. If you only need to work with hash methods, you only
need to learn about that API.

So in general, you only learn the smaller bits needed for the task at
hand. For example I have never used the av_ambient_viewing_environment
API, and I probably never will.

More realistically, people learn from examples, so that's why we
should improve doc/examples. doxy is mostly for providing a complete
reference in case you want to fine tune an already working piece of
code or you have problems understanding specific bits (how are
timestamps handled? When should I ref/unref a frame? How can I get the
name of a channel layout?).

Usually it is a mixed process (you read the API doc, or you read the
examples to get the general idea, and you interpolate between the
two).

> Let's assume learning the whole of FFmpeg means understanding 3,000 tokens
> (I'm not sure the exact count in 7.0, but it's about that number if you don't
> include individual members of structs, arguments to functions etc.).  Let's
> also assume it takes an average of ten minutes to learn each token (obviously
> that varies - AV_LOG_PANIC will take less, AVCodecContext will take more).
> That means you'd have to spend 8 hours a day every day for over two months
> to learn FFmpeg.

> Obviously there are usable subsets, but they mostly cut out the
> simple things, so don't save nearly as much time as you'd think.  If
> you want people to pick up FFmpeg, they need to learn a useful
> subset in about 8 hours, which requires a drastically simplified
> explanation.

I agree, I'm fine with giving a high-level description of what we mean
by "context", but one or two paragraphs should probably be enough;
adding too much information might have the exact opposite effect
(exposing more information than needed - assuming the reader is
familiar with the C language, which we should assume).

> (the above is closely related to an argument from a recent post[1],
> but the numbers might help explain the scale of the challenge)
> 
> There may not be an explicit rule for context lifetimes, but I've looked at the
> code carefully enough to have a nuanced opinion about the number of tokens,
> and the edgiest case I've found so far is swr_alloc_set_opts2() (see above).
> I'm open to counterexamples, but the model discussed in this section feels
> pretty reliable.

swr_alloc_set_opts() is an unfortunate case, probably it would be
better to fix the API to have a dedicated set_opts in place of
aggregating two operations - one of the issues being that the function
cannot distinguish the case of allocation failure (ENOMEM) from
invalid parameters (EINVAL).

That said, I would not mind keeping the general discussion, but
I'd probably cut the part about "finalize" to avoid the reference to
the C++ language; in general I'd avoid comparisons with C++
throughout.

> > > +
> > > +@subsection Context_avoptions Configuration options: AVOptions-enabled structs
> > > +
> > 
> > > +The @ref avoptions "AVOptions API" is a framework to configure user-facing
> > > +options, e.g. on the command-line or in GUI configuration forms.
> > 
> > This looks wrong. AVOptions is not at all about CLI or GUI options, is
> > just some way to set/get fields (that is "options") defined in a
> > struct (a context) using a high level API including: setting multiple
> > options at once (through a textual encoding or a dictionary),
> > input/range validation, setting more fields based on a single option
> > (e.g. the size) etc.
> > 
> > Then you can query the options in a given struct and create
> > corresponding options in a UI, but this is not the point of AVOptions.
> 
> There's a problem here I haven't been communicating clearly enough.
> I think that's because I've understated the problem in the past,
> so I'll try overstating instead:
> 
> "Option" is a meaningless noise word.  A new developer might ask "is this like a
> command-line option, or a CMake option?  Is it like those Python functions with
> a million keyword arguments, or a config file with sensible defaults?".
> Answering "it can be any of those if you need it to be" might help an advanced
> user (not our audience), but is bewildering to a newbie who needs a rough guide
> for how they're likely to use it.  The only solution that's useful to a newbie
> is to provide a frame of reference, preferably in the form of something they
> already know how to use.

This is the definition I gave:

[AVOptions system is] just some way to set/get fields (that is
"options") defined in a struct (a context) using a high level API
including: setting multiple options at once (through a textual
encoding or a dictionary), input/range validation, setting more fields
based on a single option (e.g. the size) etc.

"Setting options" on a struct/context with AVOptions means to set
_fields_ on the struct, following the very specific rules defined by
AVOptions.

So it's not like "it can be any of those if you need it to be", but it
has a very specific meaning.
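
A toy model of that idea, to show what "setting fields through a high level API with range validation" can look like. The names (DemoContext, DemoOption, demo_opt_set_int()) are hypothetical, and this is not how AVOptions is implemented; it only illustrates a table that maps option names to struct fields.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context whose fields are exposed as "options". */
typedef struct DemoContext {
    int width;
    int threads;
} DemoContext;

/* One table entry per option: name, field location, valid range. */
typedef struct DemoOption {
    const char *name;
    size_t      offset;  /* offset of the field inside DemoContext */
    int         min;
    int         max;
} DemoOption;

static const DemoOption demo_options[] = {
    { "width",   offsetof(DemoContext, width),   1, 4096 },
    { "threads", offsetof(DemoContext, threads), 1, 64   },
};

/* Set an int field by name.
 * Returns 0 on success, -ERANGE if the value is outside [min, max],
 * or -EINVAL if no option has that name. */
int demo_opt_set_int(DemoContext *ctx, const char *name, int value)
{
    size_t i;

    assert(ctx != NULL && name != NULL);
    for (i = 0; i < sizeof(demo_options) / sizeof(demo_options[0]); i++) {
        const DemoOption *o = &demo_options[i];

        if (strcmp(o->name, name) != 0)
            continue;
        if (value < o->min || value > o->max)
            return -ERANGE;
        /* Write through the stored offset into the struct field. */
        *(int *)((char *)ctx + o->offset) = value;
        return 0;
    }
    return -EINVAL;
}
```

Here demo_opt_set_int(&ctx, "width", 640) writes ctx.width, while an out-of-range or unknown option is rejected and the struct is left untouched.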

> Having said all that, yes this particular answer is wrong.  Could
> you apply [2] so I can start thinking about what to replace it with?

I'll have a look at it.

Andrew Sayers June 16, 2024, 6:02 p.m. UTC | #5

Meta note #1: I've replied in this thread but changed the subject line.
That's because it needs to stay focussed on solving this thread's problem,
but may be of more general interest.

Meta note #2: Stefano, I appreciate your feedback, but would rather wait
for [1] to get sorted out, then formulate my thoughts while writing a new
version.  That way I'll be more focussed on ways to improve things for readers.

This thread started with what I thought was a trivia question[2] -
what is a context?  It's short for "AVClass context structure", which is
synonymous with "AVOptions-enabled struct".  It turned out to be more complex
than that, so I wrote a little patch[3] explaining this piece of jargon.
But it turned out to be more complex again, and so on until we got a 430-line
document explaining things in voluminous detail.

Everyone agrees this isn't ideal, so here are some alternatives.
This may also inspire thoughts about FFmpeg development in general.

# Alternative: Just give up

The argument: We tried something, learnt a lot, but couldn't find a solution
we agreed on, so let's come back another day.

Obviously this is the easy way out, but essentially means leaving a critical
bug in the documentation (misleads the reader about a fundamental concept).
Even the most negative take on this document is that it's better than nothing,
so I think we can rule this one out.

# Err on the side of under-communicating

The argument: this document is on the right tracks, but explains too many things
the reader can already be assumed to know.

This argument is more complex than it appears.  To take some silly examples,
I'm not going to learn Mandarin just because FFmpeg users can't be assumed to
speak English.  But I am willing to use American spelling because it's what
more readers are used to.  This e-mail is plenty long enough already, so
I'll stick to some high-level points about this argument.

The main risk of cutting documentation is that if someone can't follow a single
step, they're lost and don't even know how to express their problem.  Imagine
teaching maths to children - you need to teach them what numbers are, then how
to add them together, then multiplication, then finally exponents.  But if you
say "we don't need to teach numbers because kids all watch Numberblocks now",
you'll cover the majority of kids who could have worked it out anyway, and
leave a minority who just give up and say "I guess I must be bad at maths".
I'd argue it's better to write more, then get feedback from actual newbies and
cut based on the evidence - we'll get it wrong either way, but at least this way
the newbies will know what they want us to cut.

Incidentally, there's a much stronger argument for *drafting* a long document,
even if it gets cut down before it's committed.  FFmpeg has lots of undocumented
nuances that experts just know and newbies don't know to ask, and this thread is
full of instances where writing more detail helped tease out a misunderstanding.
[1] is a great example - I had finally written enough detail to uncover my
assumption that all AVOptions could be set at any time, then that thread
taught me to look for a flag that tells you the options for which that's true.

If you assume I'm not the only person who has been subtly misled that way,
you could argue it's better to commit the long version.  That would give readers
more opportunities to confront their own wrong assumptions, instead of reading
something that assumed they knew one thing, but let them keep believing another.
The obvious counterargument is that we should...

# Spread the information across multiple documents

The argument: this document puts too much information in one place.  We should
instead focus on making small patches that put information people need to know
where they need to know it.

This is where things get more interesting to a general audience.

If you have repo commit access, you're probably imagining a workflow like:
write a bunch of little commits, send them out for review, then commit them
when people stop replying.  Your access is evidence that you basically know how
things work, and also lets you make plans confident in the knowledge that
anything you need committed will make it there in the end.

My workflow is nothing like that.  This thread has constantly reinforced that I
don't understand FFmpeg, so it's better for me not to have commit access.  But
that means I can only work on one patch at once, because I will probably learn
something that invalidates any other work I would have done.  It also means
a single patch not getting interest is enough to sink the project altogether.
I can put up with that when it's one big multi-faceted patch, because I can work
on one part while waiting for feedback on another part.  But my small patches
generally involve a few hours of work, a week of waiting, a ping begging for
attention, then often being rejected or ignored.  In the real world, the only
thing this approach will achieve is to burn me out.

It might be possible to revisit this idea *after* committing the document,
when we're fairly confident the answers are right and just need to tweak
the presentation based on feedback.  Or other people could write documentation
based on issues brought up in this thread, and I'll cut as appropriate.
But for now this is a non-starter.

# Write a blog post

The argument: doxygen and texinfo are good for documenting "timeless truths".
But we don't have anywhere to put transitory information like upgrade guides,
or subjective information like guidance about best practices.  This document
shoehorns information in here that really belongs somewhere like that.

This is basically true, but that doesn't solve anything.

I have neither a personal blog nor the desire to write one, and even if I wrote
an excellent guide to contexts as a general concept, nobody would actually find
the blog to read it.  So this idea would only work if we e.g. overhauled the
news area of the site[4] to look more like GIMP's news section[5].

If someone else would like to start a project like that, I can promise a good
series of posts to help get the ball rolling, and will be happy to trim down
the document as those posts go public.

# Write tutorials

The argument: this explains ideas from first principles that are better
explained by example.  This document shoehorns information in here that
really belongs somewhere like that.

This seems reasonable in principle, but as well as the "lots of small commits"
problem discussed above, FFmpeg is currently structured so nobody is actually
going to do that.

I've tried to learn FFmpeg many times over the years, and last year's attempt
involved trying to write a subtitles tutorial.  I didn't submit it to the ML
because I didn't understand the theory well enough, and was fairly sure I had
made some underlying error that made it all wrong.  Everyone who does understand
FFmpeg well enough to write a good subtitles tutorial seems to understand it
well enough to want a complete rewrite, but not care enough to get that done.
So subtitles end up falling between two stools - too ugly for newbies to learn,
not ugly enough for experts to fix.

If you want relatively new developers to have the confidence to write tutorials,
they need the theory to be well-documented first.

# Rewrite the API

The argument: instead of writing a bunch of words apologising for an interface,
just write a better interface.

In general, I'm a big believer in using documentation to drive API decisions.
But in FFmpeg's case, it wouldn't help to have a single developer trying to fix
a community-wide problem.

We've previously discussed swr_alloc_set_opts() (essentially two functions
sharing one symbol) and av_ambient_viewing_environment_create_side_data()
(receives a different context argument than its function name suggests).
A better example of the community-wide issue might be av_new_packet()
(initializes but does not allocate, despite slightly confusing documentation)
vs. av_new_program() (allocates and initializes, but has no documentation).
Both of these could be documented better, and a developer who only needs to
learn one won't be bothered by the other.  But the real problem is that they
use the word "new" to mean fundamentally incompatible things, creating a trap
for anyone reviewing code that uses the "new" they haven't seen before.
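
To illustrate the trap, here is a sketch of the two meanings side by side. The types and functions (DemoPacket, DemoProgram, demo_new_packet(), demo_new_program()) are hypothetical stand-ins that follow the descriptions above, not the real FFmpeg signatures or behaviour.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoPacket {
    int size;
    int stream_index;
} DemoPacket;

typedef struct DemoProgram {
    int id;
} DemoProgram;

/* "new" meaning 1: the caller already owns the struct; this function
 * only initializes its fields. Returns 0 on success, -1 on bad input. */
int demo_new_packet(DemoPacket *pkt, int size)
{
    assert(pkt != NULL);
    if (size < 0)
        return -1;
    pkt->size         = size;
    pkt->stream_index = -1;  /* hypothetical "not assigned yet" default */
    return 0;
}

/* "new" meaning 2: this function allocates the struct, initializes it,
 * and returns it. Returns NULL if allocation fails. */
DemoProgram *demo_new_program(int id)
{
    DemoProgram *p = calloc(1, sizeof(*p));

    if (p)
        p->id = id;
    return p;
}
```

A reviewer who has only seen the second form might expect demo_new_packet() to return a freshly allocated object, and would then go looking for a matching free that does not exist.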

Solving this problem wouldn't just involve rewriting a bunch of functions.
It would involve motivating the community to avoid writing that sort of API
in future, which would take all of us many years to achieve.

# Apply this, then iterate

The argument: OK fine, but dragging down other solutions doesn't help this one.
We should at least try to solve these problems.

This is the position I've currently landed on.  I'm increasingly able to
justify the big picture decisions behind the document, and we're getting to
the point where detailed discussions are just putting opinions where evidence
should go.

I've talked in a previous e-mail about getting feedback from #ffmpeg and the
libav-user mailing list.  We've also talked about the value of making the
document useful to people familiar with other programming languages, so we
could try reaching out to people who write bindings in other languages.

This sort of iteration can be pretty quick, because we don't need to wait for
a major release.  We just need to have something on ffmpeg.org so people know
this is a "real" project.  As such, I think a "release early, release often"
approach is the best way forward.

[1] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-June/329068.html
[2] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325811.html
[3] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
[4] https://ffmpeg.org/index.html#news
[5] https://www.gimp.org/news/

Paul B Mahol June 16, 2024, 9:20 p.m. UTC | #6

Avoid fulfilling some of the points below:

https://erikbern.com/2023/12/13/simple-sabotage-for-software.html

Especially the part about rewriting public or internal API just for the sake of rewriting.

Stefano Sabatini July 1, 2024, 10:16 p.m. UTC | #7

On date Sunday 2024-06-16 19:02:51 +0100, Andrew Sayers wrote:
> Meta note #1: I've replied in this thread but changed the subject line.
> That's because it needs to stay focussed on solving this thread's problem,
> but may be of more general interest.
> 
> Meta note #2: Stefano, I appreciate your feedback, but would rather wait
> for [1] to get sorted out, then formulate my thoughts while writing a new
> version.  That way I'll be more focussed on ways to improve things for readers.
> 
> This thread started with what I thought was a trivia question[1] -
> what is a context?  It's short for "AVClass context structure", which is
> synonymous with "AVOptions-enabled struct".  It turned out to be more complex
> than that, so I wrote a little patch[3] explaining this piece of jargon.
> But it turned out to be more complex again, and so on until we got a 430-line
> document explaining things in voluminous detail.
> 
> Everyone agrees this isn't ideal, so here are some alternatives.
> This may also inspire thoughts about FFmpeg development in general.
> 
> # Alternative: Just give up
> 
> The argument: We tried something, learnt a lot, but couldn't find a solution
> we agreed on, so let's come back another day.
> 
> Obviously this is the easy way out, but essentially means leaving a critical
> bug in the documentation (misleads the reader about a fundamental concept).
> Even the most negative take on this document is that it's better than nothing,
> so I think we can rule this one out.
> 
> # Err on the side of under-communicating
> 
> The argument: this document is on the right tracks, but explains too many things
> the reader can already be assumed to know.
> 
> This argument is more complex than it appears.  To take some silly examples,
> I'm not going to learn Mandarin just because FFmpeg users can't be assumed to
> speak English.  But I am willing to use American spelling because it's what
> more readers are used to.  This e-mail is plenty long enough already, so
> I'll stick to some high-level points about this argument.
> 
> The main risk of cutting documentation is that if someone can't follow a single
> step, they're lost and don't even know how to express their problem.  Imagine
> teaching maths to children - you need to teach them what numbers are, then how
> to add them together, then multiplication, then finally exponents.  But if you
> say "we don't need to teach numbers because kids all watch Numberblocks now",
> you'll cover the majority of kids who could have worked it out anyway, and
> leave a minority who just give up and say "I guess I must be bad at maths".
> I'd argue it's better to write more, then get feedback from actual newbies and
> cut based on the evidence - we'll get it wrong either way, but at least this way
> the newbies will know what they want us to cut.
> 
> Incidentally, there's a much stronger argument for *drafting* a long document,
> even if it gets cut down before it's committed.  FFmpeg has lots of undocumented
> nuances that experts just know and newbies don't know to ask, and this thread is
> full of instances where writing more detail helped tease out a misunderstanding.
> [1] is a great example - I had finally written enough detail to uncover my
> assumption that all AVOptions could be set at any time, then that thread
> taught me to look for a flag that tells you the options for which that's true.
> 
> If you assume I'm not the only person who has been subtly misled that way,
> you could argue it's better to commit the long version.  That would give readers
> more opportunities to confront their own wrong assumptions, instead of reading
> something that assumed they knew one thing, but let them keep believing another.
> The obvious counterargument is that we should...
> 
> # Spread the information across multiple documents
> 
> The argument: this document puts too much information in one place.  We should
> instead focus on making small patches that put information people need to know
> where they need to know it.
> 
> This is where things get more interesting to a general audience.
> 
> If you have repo commit access, you're probably imagining a workflow like:
> write a bunch of little commits, send them out for review, then commit them
> when people stop replying.  Your access is evidence that you basically know how
> things work, and also lets you make plans confident in the knowledge that
> anything you need committed will make it there in the end.
> 
> My workflow is nothing like that.  This thread has constantly reinforced that I
> don't understand FFmpeg, so it's better for me not to have commit access.  But
> that means I can only work on one patch at once, because I will probably learn
> something that invalidates any other work I would have done.  It also means
> a single patch not getting interest is enough to sink the project altogether.
> I can put up with that when it's one big multi-faceted patch, because I can work
> on one part while waiting for feedback on another part.  But my small patches
> generally involve a few hours of work, a week of waiting, a ping begging for
> attention, then often being rejected or ignored.  In the real world, the only
> thing this approach will achieve is to burn me out.
> 
> It might be possible to revisit this idea *after* committing the document,
> when we're fairly confident the answers are right and just need to tweak
> the presentation based on feedback.  Or other people could write documentation
> based on issues brought up in this thread, and I'll cut as appropriate.
> But for now this is a non-starter.
> 
> # Write a blog post
> 
> The argument: doxygen and texinfo are good for documenting "timeless truths".
> But we don't have anywhere to put transitory information like upgrade guides,
> or subjective information like guidance about best practices.  This document
> shoehorns information in here that really belongs somewhere like that.
> 
> This is basically true, but that doesn't solve anything.
> 
> I have neither a personal blog nor the desire to write one, and even if I wrote
> an excellent guide to contexts as a general concept, nobody would actually find
> the blog to read it.  So this idea would only work if we e.g. overhauled the
> news area of the site[4] to look more like GIMP's news section[5].
> 
> If someone else would like to start a project like that, I can promise a good
> series of posts to help get the ball rolling, and will be happy to trim down
> the document as those posts go public.
> 
[...]
> # Rewrite the API
> 
> The argument: instead of writing a bunch of words apologising for an interface,
> just write a better interface.
> 
> In general, I'm a big believer in using documentation to drive API decisions.
> But in FFmpeg's case, it wouldn't help to have a single developer trying to fix
> a community-wide problem.
> 
> We've previously discussed swr_alloc_set_opts() (essentially two functions
> sharing one symbol) and av_ambient_viewing_environment_create_side_data()
> (receives a different context argument than its function name suggests).
> A better example of the community-wide issue might be av_new_packet()
> (initializes but does not allocate, despite slightly confusing documentation)
> vs. av_new_program() (allocates and initializes, but has no documentation).
> Both of these could be documented better, and a developer who only needs to
> learn one won't be bothered by the other.  But the real problem is that they
> use the word "new" to mean fundamentally incompatible things, creating a trap
> for anyone reviewing code that uses the "new" they haven't seen before.
> 
> Solving this problem wouldn't just involve rewriting a bunch of functions.
> It would involve motivating the community to avoid writing that sort of API
> in future, which would take all of us many years to achieve.

I see that one difference in how you and I perceive the API is that
I don't assume that every "_new_" in a function name implies one
exact behavior. That would be nice in theory, but in practice
different APIs were written by different people in different years
and approved by different reviewers.

I would be happy enough if the documentation correctly states the
behavior of each function, but I don't pretend that two functions
can be described by the same notion of "new" just because both
contain the term. This is also why we should not create an
expectation of what a function does if its name contains "new".
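
To make the two meanings concrete, here is a minimal sketch in plain C.
The structs and the functions demo_new_packet() and demo_new_program()
are hypothetical, invented only to mirror the contrast described in the
quoted text (fill in a struct the caller already owns vs. allocate a
struct and return it). They are not the FFmpeg functions themselves.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types, invented for illustration only. */
typedef struct DemoPacket  { unsigned char *data; int size; } DemoPacket;
typedef struct DemoProgram { int id; } DemoProgram;

/* "new", meaning 1: the caller owns the struct. The function only fills
 * it in (allocating a zeroed payload), it does not allocate the struct.
 * Returns 0 on success, -1 on a bad size or allocation failure. */
int demo_new_packet(DemoPacket *pkt, int size)
{
    assert(pkt != NULL);
    if (size <= 0)
        return -1;
    pkt->data = calloc((size_t)size, 1);
    if (!pkt->data)
        return -1;
    pkt->size = size;
    return 0;
}

/* "new", meaning 2: the function allocates the struct itself,
 * initializes it, and returns it (NULL on allocation failure). */
DemoProgram *demo_new_program(int id)
{
    DemoProgram *prog = calloc(1, sizeof(*prog));
    if (prog)
        prog->id = id;
    return prog;
}
```

A caller of the first function declares a DemoPacket on the stack and
passes its address; a caller of the second receives a pointer it must
later free. Nothing in the shared word "new" tells you which is which,
so only the per-function documentation can.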

This might work if there were a naming guideline that
contributors/reviewers had to commit to, or if function names were
enforced by the language (e.g. as in the case of C++). Since this is
not the case for FFmpeg, we should not set expectations about
function terminology and behavior. The same is true for contexts: we
don't have any "contract" stating that context functions should always
take a "context" as first argument, and I have noted several times that
even OOP languages have constructs (e.g. static/class methods) which
work as constructors and therefore take no self argument. The C
language has no notion of class, so there is no way to formally
distinguish the two cases. I can come up with a more concrete
example if this is not clear enough.
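
As a rough sketch of that distinction, assuming a made-up context type:
DemoContext and the three demo_context_*() functions below are
hypothetical names, not part of FFmpeg. All three "belong" to the
context, yet only one of them takes the context as its first argument.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context type, invented for illustration only. */
typedef struct DemoContext {
    int value;
} DemoContext;

/* Constructor-like: belongs to DemoContext but takes no context
 * argument, much like a static/class method in an OOP language.
 * Returns NULL on allocation failure. */
DemoContext *demo_context_alloc(void)
{
    return calloc(1, sizeof(DemoContext));
}

/* Method-like: operates on an existing context passed as the first
 * argument, and returns the updated value. */
int demo_context_add(DemoContext *ctx, int amount)
{
    assert(ctx != NULL);
    ctx->value += amount;
    return ctx->value;
}

/* Destructor-like: takes a pointer to the pointer, so that it can
 * free the context and reset the caller's pointer to NULL. */
void demo_context_free(DemoContext **ctx)
{
    if (!ctx)
        return;
    free(*ctx);
    *ctx = NULL;
}
```

The C compiler sees three unrelated functions here. The grouping exists
only in the naming convention and in the documentation.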

Finally, this does not mean that we should not try to make the FFmpeg
API more consistent (I tried for years), but we should not make a
priori assumptions about how the API behaves when you read "context"
or "new" in a function name. The documentation of each function
should clarify what it does, without resorting to a general theory
that sets up strict expectations which do not match reality.

There are a few examples which should be fixed (e.g. the ugly
swr_alloc_set_opts()) but I don't think we should really
mention/apologize for that in the documentation.

> 
> # Apply this, then iterate
> 
> The argument: OK fine, but dragging down other solutions doesn't help this one.
> We should at least try to solve these problems.
> 
> This is the position I've currently landed up at.  I'm increasingly able to
> justify the big picture decisions behind the document, and we're getting to
> the point where detailed discussions are just putting opinions where evidence
> should go.
> 
> I've talked in a previous e-mail about getting feedback from #ffmpeg and the
> libav-user mailing list.  We've also talked about the value of making the
> document useful to people familiar with other programming languages, so we
> could try reaching out to people who write bindings in other languages.
> 
> This sort of iteration can be pretty quick, because we don't need to wait for
> a major release.  We just need to have something on ffmpeg.org so people know
> this is a "real" project.  As such, I think a "release early, release often"
> approach is the best way forward.

[...]

Andrew, sorry again for the slow reply. Thinking about the whole
discussion, I reckon I probably gave some bad advice, and I totally
understand how this feels like it is dragging on and burning you out,
and I'm sorry for that.

I still lean towards erring on the side of under-communicating in
the reference documentation (the idea being that too much information
is just too much: it would scare people away and make the
documentation harder to maintain, as you now have to check many places
when changing/updating it, resulting in contradictory content).

So at the moment I'd be willing to publish an abridged version of your
latest patch, with the suggested cuts - I can make the edits myself if
you prefer. This way we can get the uncontroversial parts
committed, and we can work on the other parts where there is still no
agreement.

Also, I'd like to hear opinions from other developers, although my
impression - from the scattered feedback I read - is that other
developers have the same feeling as me.

In general, having different channels for different targets would be
ideal, e.g. for articles and tutorials. For this it would be ideal to
have a blog for the project org, to simplify contributions from
contributors who don't want to set up a blog just for that and to
collect resources in a single place. In practice we lack this so this
is not an option at the moment (and the wiki is not the ideal place
either).

Andrew Sayers July 2, 2024, 9:56 a.m. UTC | #8

On Tue, Jul 02, 2024 at 12:16:21AM +0200, Stefano Sabatini wrote:
> On date Sunday 2024-06-16 19:02:51 +0100, Andrew Sayers wrote:
[...]
> 
> Andrew, sorry again for the slow reply. Thinking about the whole
> discussion, I reckon I probably gave some bad advice, and I totally
> understand how this feels like it is dragging on and burning you out,
> and I'm sorry for that.
> 
> I still lean towards erring on the side of under-communicating in
> the reference documentation (the idea being that too much information
> is just too much: it would scare people away and make the
> documentation harder to maintain, as you now have to check many places
> when changing/updating it, resulting in contradictory content).
> 
> So at the moment I'd be willing to publish an abridged version of your
> latest patch, with the suggested cuts - I can make the edits myself if
> you prefer. This way we can get the uncontroversial parts
> committed, and we can work on the other parts where there is still no
> agreement.
> 
> Also, I'd like to hear opinions from other developers, although my
> impression - from the scattered feedback I read - is that other
> developers have the same feeling as me.
> 
> In general, having different channels for different targets would be
> ideal, e.g. for articles and tutorials. For this it would be ideal to
> have a blog for the project org, to simplify contributions from
> contributors who don't want to set up a blog just for that and to
> collect resources in a single place. In practice we lack this so this
> is not an option at the moment (and the wiki is not the ideal place
> either).

No problem about the delay, although my thinking has moved on a little
(e.g. it turns out GIMP uses the word "context" in a completely different
way than we do[1]).  But rather than argue over today's minutiae, here's
a big picture idea...

It sounds like your vision is for smaller, more disparate documentation;
and you're willing to spend some time writing that up.  How would you feel
about taking the AVClass/AVOptions bits from this document, and working them
into the existing AVClass/AVOptions documentation?  That would require a level
of experience (and commit access) beyond what I can offer, after which we could
come back here and uncontroversially trim that stuff out of this document.

For inspiration, here are some uninformed questions a newbie might ask:

* (reading AVClass) does the struct name mean I have to learn OOP before I can
  use FFmpeg?
* (reading AVOptions) if the options API only works post-init for a subset of
  options, should I just ignore this API and set the variables directly
  whenever I like?

[1] https://developer.gimp.org/api/2.0/libgimp/libgimp-gimpcontext.html

Stefano Sabatini July 6, 2024, 11:33 a.m. UTC | #9

On date Tuesday 2024-07-02 10:56:34 +0100, Andrew Sayers wrote:
> On Tue, Jul 02, 2024 at 12:16:21AM +0200, Stefano Sabatini wrote:
> > On date Sunday 2024-06-16 19:02:51 +0100, Andrew Sayers wrote:
> [...]
> > 
> > Andrew, sorry again for the slow reply. Thinking about the whole
> > discussion, I reckon I probably gave some bad advice, and I totally
> > understand how this feels like it is dragging on and burning you out,
> > and I'm sorry for that.
> > 
> > I still lean towards erring on the side of under-communicating in
> > the reference documentation (the idea being that too much information
> > is just too much: it would scare people away and make the
> > documentation harder to maintain, as you now have to check many places
> > when changing/updating it, resulting in contradictory content).
> > 
> > So at the moment I'd be willing to publish an abridged version of your
> > latest patch, with the suggested cuts - I can make the edits myself if
> > you prefer. This way we can get the uncontroversial parts
> > committed, and we can work on the other parts where there is still no
> > agreement.
> > 
> > Also, I'd like to hear opinions from other developers, although my
> > impression - from the scattered feedback I read - is that other
> > developers have the same feeling as me.
> > 
> > In general, having different channels for different targets would be
> > ideal, e.g. for articles and tutorials. For this it would be ideal to
> > have a blog for the project org, to simplify contributions from
> > contributors who don't want to set up a blog just for that and to
> > collect resources in a single place. In practice we lack this so this
> > is not an option at the moment (and the wiki is not the ideal place
> > either).
> 
> No problem about the delay, although my thinking has moved on a little
> (e.g. it turns out GIMP uses the word "context" in a completely different
> way than we do[1]).  But rather than argue over today's minutiae, here's
> a big picture idea...
> 
> It sounds like your vision is for smaller, more disparate documentation;
> and you're willing to spend some time writing that up.  How would you feel
> about taking the AVClass/AVOptions bits from this document, and working them
> into the existing AVClass/AVOptions documentation?  That would require a level
> of experience (and commit access) beyond what I can offer, after which we could
> come back here and uncontroversially trim that stuff out of this document.
> 
> For inspiration, here are some uninformed questions a newbie might ask:
> 
> * (reading AVClass) does the struct name mean I have to learn OOP before I can
>   use FFmpeg?

The answer is definitely no; the point of AVClass is to keep the
"core" functionality for a class of contexts.

> * (reading AVOptions) if the options API only works post-init for a subset of
>   options, should I just ignore this API and set the variables directly
>   whenever I like?

Nothing prevents you from accessing a struct directly, but then you
will miss the following benefits:
* ABI/API compatibility in case of field renames in the context struct
* validation
* higher level setting functionality (e.g. to set options from
  a dictionary or from a string encoding multiple options)
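
To illustrate the first two points, here is a toy sketch of a
name-keyed option table with range validation. DemoContext, DemoOption
and demo_opt_set_int() are hypothetical names invented for this
example; this is not how AVOption is implemented, only the general
shape of the idea (in FFmpeg itself the entry points are functions
such as av_opt_set()).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context, invented for illustration only. */
typedef struct DemoContext {
    int bitrate;
    int threads;
} DemoContext;

/* One table entry per option: callers use the stable public name,
 * so a field can be renamed or moved by updating only the offset. */
typedef struct DemoOption {
    const char *name;   /* public option name */
    size_t      offset; /* where the value lives inside DemoContext */
    int         min;    /* smallest accepted value */
    int         max;    /* largest accepted value */
} DemoOption;

static const DemoOption demo_options[] = {
    { "bitrate", offsetof(DemoContext, bitrate), 1, 1000000 },
    { "threads", offsetof(DemoContext, threads), 1, 64 },
};

/* Set an integer option by name. Returns 0 on success, -1 if the name
 * is unknown, -2 if the value is out of range. On error the context
 * is left unchanged. */
int demo_opt_set_int(DemoContext *ctx, const char *name, int value)
{
    size_t count = sizeof(demo_options) / sizeof(demo_options[0]);

    assert(ctx != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        const DemoOption *opt = &demo_options[i];
        if (strcmp(opt->name, name) != 0)
            continue;
        if (value < opt->min || value > opt->max)
            return -2;
        *(int *)((char *)ctx + opt->offset) = value;
        return 0;
    }
    return -1;
}
```

With this sketch, setting "threads" to 1000 is rejected with -2 and
leaves the field untouched, whereas writing ctx.threads = 1000 directly
would silently store the bad value.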

I'll try to write something (and probably we should have a dedicated
class.h header to decouple it from the logging functionality).
