Message ID | 20230826180748.15977-4-anton@khirnov.net |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel,1/6] doc/developer: move a sentence to a more appropriate place | expand |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
Anton Khirnov: > Document our longstanding de facto policies on things like correctness, > thread-safety, UB, etc. > --- > doc/developer.texi | 50 +++++++++++++++++++++++++++++++++------------- > 1 file changed, 36 insertions(+), 14 deletions(-) > > diff --git a/doc/developer.texi b/doc/developer.texi > index df43119f98..afa0148137 100644 > --- a/doc/developer.texi > +++ b/doc/developer.texi > @@ -274,10 +274,6 @@ symbols. If in doubt, just avoid names starting with @code{_} altogether. > @section Miscellaneous conventions > > @itemize @bullet > -@item > -fprintf and printf are forbidden in libavformat and libavcodec, > -please use av_log() instead. > - > @item > Casts should be used only when necessary. Unneeded parentheses > should also be avoided if they don't make the code easier to understand. > @@ -286,6 +282,42 @@ should also be avoided if they don't make the code easier to understand. > @anchor{Development Policy} > @chapter Development Policy > > +@section Code behaviour > + > +@subheading Correctness > +The code must be valid. It must not crash, abort, access invalid pointers, leak > +memory, cause data races or signed integer overflow, or otherwise invoke > +undefined behaviour. Error codes should be checked and, when applicable, > +forwarded to the caller. > + > +@subheading Thread- and library-safety > +Our libraries may be called by multiple independent callers in the same process. > +These calls may happen from any number of threads and the different call sites > +may not be aware of each other - e.g. a user program may be calling us directly, > +and use one or more libraries that also call us. The code must behave correctly > +under such conditions. Some of this can no longer be guaranteed when FFmpeg is built without threading support, but called from multiple threads concurrently. Should this be mentioned or would it just confuse readers? (I'm leaning to the latter.) - Andreas
Quoting Andreas Rheinhardt (2023-08-26 20:36:25) > Anton Khirnov: > > Document our longstanding de facto policies on things like correctness, > > thread-safety, UB, etc. > > --- > > doc/developer.texi | 50 +++++++++++++++++++++++++++++++++------------- > > 1 file changed, 36 insertions(+), 14 deletions(-) > > > > diff --git a/doc/developer.texi b/doc/developer.texi > > index df43119f98..afa0148137 100644 > > --- a/doc/developer.texi > > +++ b/doc/developer.texi > > @@ -274,10 +274,6 @@ symbols. If in doubt, just avoid names starting with @code{_} altogether. > > @section Miscellaneous conventions > > > > @itemize @bullet > > -@item > > -fprintf and printf are forbidden in libavformat and libavcodec, > > -please use av_log() instead. > > - > > @item > > Casts should be used only when necessary. Unneeded parentheses > > should also be avoided if they don't make the code easier to understand. > > @@ -286,6 +282,42 @@ should also be avoided if they don't make the code easier to understand. > > @anchor{Development Policy} > > @chapter Development Policy > > > > +@section Code behaviour > > + > > +@subheading Correctness > > +The code must be valid. It must not crash, abort, access invalid pointers, leak > > +memory, cause data races or signed integer overflow, or otherwise invoke > > +undefined behaviour. Error codes should be checked and, when applicable, > > +forwarded to the caller. > > + > > +@subheading Thread- and library-safety > > +Our libraries may be called by multiple independent callers in the same process. > > +These calls may happen from any number of threads and the different call sites > > +may not be aware of each other - e.g. a user program may be calling us directly, > > +and use one or more libraries that also call us. The code must behave correctly > > +under such conditions. > > Some of this can no longer be guaranteed when FFmpeg is built without > threading support, but called from multiple threads concurrently. Should > this be mentioned or would it just confuse readers? (I'm leaning to the > latter.) I tend to agree, it's a very obscure use case anyway. But we should probably document more explicitly somewhere (and warn in configure) what building without threads actually implies.
Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha scritto: > Document our longstanding de facto policies on things like correctness, > thread-safety, UB, etc. > UB? --- > doc/developer.texi | 50 +++++++++++++++++++++++++++++++++------------- > 1 file changed, 36 insertions(+), 14 deletions(-) > > diff --git a/doc/developer.texi b/doc/developer.texi > index df43119f98..afa0148137 100644 > --- a/doc/developer.texi > +++ b/doc/developer.texi > @@ -274,10 +274,6 @@ symbols. If in doubt, just avoid names starting with > @code{_} altogether. > @section Miscellaneous conventions > > @itemize @bullet > -@item > -fprintf and printf are forbidden in libavformat and libavcodec, > -please use av_log() instead. > - > @item > Casts should be used only when necessary. Unneeded parentheses > should also be avoided if they don't make the code easier to understand. > @@ -286,6 +282,42 @@ should also be avoided if they don't make the code > easier to understand. > @anchor{Development Policy} > @chapter Development Policy > > +@section Code behaviour > + > +@subheading Correctness > +The code must be valid. It must not crash, abort, access invalid > pointers, leak > +memory, cause data races or signed integer overflow, or otherwise invoke > +undefined behaviour. Invoke => assume? Error codes should be checked and, when applicable, > +forwarded to the caller. > + > +@subheading Thread- and library-safety > +Our libraries may be called by multiple independent callers in the same > process. > +These calls may happen from any number of threads and the different call > sites > +may not be aware of each other - e.g. a user program may be calling us > directly, > Calling us -> calling the/our libraries +and use one or more libraries that also call us. The code must behave > correctly > +under such conditions. > + > +@subheading Robustness > +The code must treat as untrusted any bytestream received from a caller or > read > +from a file, network, etc. It must not misbehave when arbitrary data is > sent to > +it - typically it should print an error message and return > +@code{AVERROR_INVALIDDATA} on encountering invalid input data. > + > +@subheading Memory allocation > +The code must use the @code{av_malloc()} family of functions from > +@file{libavutil/mem.h} to perform all memory allocation, except in > special cases > +(e.g. when interacting with an external library that requires a specific > +allocator to be used). > + > +All allocations should be checked and @code{AVERROR(ENOMEM)} returned on > +failure. A common mistake is that error paths leak memory - make sure > that does > +not happen. > + > +@subheading stdio > +Our libraries must not access the stdio streams stdin/stdout/stderr > directly > +(e.g. via @code{printf()} family of functions), as that is not > library-safe. For > +logging, use @code{av_log()}. Lgtm otherwise, thanks. > >
Quoting Stefano Sabatini (2023-08-27 14:38:44) > Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha scritto: > > > Document our longstanding de facto policies on things like correctness, > > thread-safety, UB, etc. > > > > UB? undefined behavior > > --- > > doc/developer.texi | 50 +++++++++++++++++++++++++++++++++------------- > > 1 file changed, 36 insertions(+), 14 deletions(-) > > > > diff --git a/doc/developer.texi b/doc/developer.texi > > index df43119f98..afa0148137 100644 > > --- a/doc/developer.texi > > +++ b/doc/developer.texi > > @@ -274,10 +274,6 @@ symbols. If in doubt, just avoid names starting with > > @code{_} altogether. > > @section Miscellaneous conventions > > > > @itemize @bullet > > -@item > > -fprintf and printf are forbidden in libavformat and libavcodec, > > -please use av_log() instead. > > - > > @item > > Casts should be used only when necessary. Unneeded parentheses > > should also be avoided if they don't make the code easier to understand. > > @@ -286,6 +282,42 @@ should also be avoided if they don't make the code > > easier to understand. > > @anchor{Development Policy} > > @chapter Development Policy > > > > +@section Code behaviour > > + > > +@subheading Correctness > > +The code must be valid. It must not crash, abort, access invalid > > pointers, leak > > +memory, cause data races or signed integer overflow, or otherwise invoke > > +undefined behaviour. > > > Invoke => assume? No, 'assume' would mean something else. 'cause' maybe. > > Error codes should be checked and, when applicable, > > +forwarded to the caller. > > + > > +@subheading Thread- and library-safety > > +Our libraries may be called by multiple independent callers in the same > > process. > > +These calls may happen from any number of threads and the different call > > sites > > +may not be aware of each other - e.g. a user program may be calling us > > directly, > > > > Calling us -> calling the/our libraries I wanted to avoid using the word 'libraries' multiple times, to avoid possible confusion of which library is being talked about - whether ours or some external one.
On date Tuesday 2023-08-29 10:34:45 +0200, Anton Khirnov wrote: > Quoting Stefano Sabatini (2023-08-27 14:38:44) > > Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha scritto: > > > > > Document our longstanding de facto policies on things like correctness, > > > thread-safety, UB, etc. > > > > > > > UB? > > undefined behavior yes, let's avoid acronyms > > > > > --- > > > doc/developer.texi | 50 +++++++++++++++++++++++++++++++++------------- > > > 1 file changed, 36 insertions(+), 14 deletions(-) > > > > > > diff --git a/doc/developer.texi b/doc/developer.texi > > > index df43119f98..afa0148137 100644 > > > --- a/doc/developer.texi > > > +++ b/doc/developer.texi > > > @@ -274,10 +274,6 @@ symbols. If in doubt, just avoid names starting with > > > @code{_} altogether. > > > @section Miscellaneous conventions > > > > > > @itemize @bullet > > > -@item > > > -fprintf and printf are forbidden in libavformat and libavcodec, > > > -please use av_log() instead. > > > - > > > @item > > > Casts should be used only when necessary. Unneeded parentheses > > > should also be avoided if they don't make the code easier to understand. > > > @@ -286,6 +282,42 @@ should also be avoided if they don't make the code > > > easier to understand. > > > @anchor{Development Policy} > > > @chapter Development Policy > > > > > > +@section Code behaviour > > > + > > > +@subheading Correctness > > > +The code must be valid. It must not crash, abort, access invalid > > > pointers, leak > > > +memory, cause data races or signed integer overflow, or otherwise invoke > > > +undefined behaviour. > > > > > > Invoke => assume? > > No, 'assume' would mean something else. 'cause' maybe. "cause" sounds good. > > > > > Error codes should be checked and, when applicable, > > > +forwarded to the caller. > > > + > > > +@subheading Thread- and library-safety > > > +Our libraries may be called by multiple independent callers in the same > > > process. > > > +These calls may happen from any number of threads and the different call > > > sites > > > +may not be aware of each other - e.g. a user program may be calling us > > > directly, > > > > > > > Calling us -> calling the/our libraries > > I wanted to avoid using the word 'libraries' multiple times, to avoid > possible confusion of which library is being talked about - whether ours > or some external one. But using "us" is even more confusing (who/what is "us"?), I have no better ideas than repeating "our/the FFmpeg libraries".
On Thu, Aug 31, 2023 at 05:28:48PM +0200, Stefano Sabatini wrote: > On date Tuesday 2023-08-29 10:34:45 +0200, Anton Khirnov wrote: > > Quoting Stefano Sabatini (2023-08-27 14:38:44) > > > Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha scritto: > > > > > > > Document our longstanding de facto policies on things like correctness, > > > > thread-safety, UB, etc. > > > > > > > > > > UB? > > > > undefined behavior > > yes, let's avoid acronyms +1 (unless they are very widely established or not critical to the understaning of the text, like ISO in ISO C) thx [...]
Le torstaina 31. elokuuta 2023, 18.28.48 EEST Stefano Sabatini a écrit : > On date Tuesday 2023-08-29 10:34:45 +0200, Anton Khirnov wrote: > > Quoting Stefano Sabatini (2023-08-27 14:38:44) > > > > > Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha scritto: > > > > Document our longstanding de facto policies on things like > > > > correctness, > > > > thread-safety, UB, etc. > > > > > > UB? > > > > undefined behavior > > yes, let's avoid acronyms I think that developers should know common accronyms of the primary programming language of the project, and it is totally reasonable to expect that they are. And yes, UB is a very well known accronym in this context. In my experience, people who don't know the accronym don't know the concept either, so spelling it out wouldn't even help. TBH, this one is on you.
On Fri, Sep 1, 2023 at 7:16 PM Rémi Denis-Courmont <remi@remlab.net> wrote: > Le torstaina 31. elokuuta 2023, 18.28.48 EEST Stefano Sabatini a écrit : > > On date Tuesday 2023-08-29 10:34:45 +0200, Anton Khirnov wrote: > > > Quoting Stefano Sabatini (2023-08-27 14:38:44) > > > > > > > Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha > scritto: > > > > > Document our longstanding de facto policies on things like > > > > > correctness, > > > > > thread-safety, UB, etc. > > > > > > > > UB? > > > > > > undefined behavior > > > > yes, let's avoid acronyms > > I think that developers should know common accronyms of the primary > programming language of the project, and it is totally reasonable to > expect > that they are. > > And yes, UB is a very well known accronym in this context. In my > experience, > people who don't know the accronym don't know the concept either, so > spelling > it out wouldn't even help. > > TBH, this one is on you. > TBH, novice and newbies lack bunch of general and specific knowledge anyway. > > -- > レミ・デニ-クールモン > http://www.remlab.net/ > > > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >
On date Friday 2023-09-01 20:10:11 +0300, Rémi Denis-Courmont wrote: > Le torstaina 31. elokuuta 2023, 18.28.48 EEST Stefano Sabatini a écrit : > > On date Tuesday 2023-08-29 10:34:45 +0200, Anton Khirnov wrote: > > > Quoting Stefano Sabatini (2023-08-27 14:38:44) > > > > > > > Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha scritto: > > > > > Document our longstanding de facto policies on things like > > > > > correctness, > > > > > thread-safety, UB, etc. > > > > > > > > UB? > > > > > > undefined behavior > > > > yes, let's avoid acronyms > > I think that developers should know common accronyms of the primary > programming language of the project, and it is totally reasonable to expect > that they are. > > And yes, UB is a very well known accronym in this context. In my experience, > people who don't know the accronym don't know the concept either, so spelling > it out wouldn't even help. > > TBH, this one is on you. In general, it's good practice to avoid unnecessary use of acronyms as they add ambiguity with very small gain (some fraction of a second when typing and a few bytes in storage size). Also note that the acronym is only used in the commit log and spelled out in the texi file. That said, I'm not blocking on this but I added the note since it was not very clear from the context when reading it.
On Fri, Sep 01, 2023 at 08:10:11PM +0300, Rémi Denis-Courmont wrote: > Le torstaina 31. elokuuta 2023, 18.28.48 EEST Stefano Sabatini a écrit : > > On date Tuesday 2023-08-29 10:34:45 +0200, Anton Khirnov wrote: > > > Quoting Stefano Sabatini (2023-08-27 14:38:44) > > > > > > > Il sab 26 ago 2023, 20:08 Anton Khirnov <anton@khirnov.net> ha scritto: > > > > > Document our longstanding de facto policies on things like > > > > > correctness, > > > > > thread-safety, UB, etc. > > > > > > > > UB? > > > > > > undefined behavior > > > > yes, let's avoid acronyms > > I think that developers should know common accronyms of the primary > programming language of the project, and it is totally reasonable to expect > that they are. > > And yes, UB is a very well known accronym in this context. In my experience, > people who don't know the accronym don't know the concept either, so spelling > it out wouldn't even help. There is a difference between not knowing a concept and not knowing what concept is referenced. "Undefined behavior" can be googled, and can to some extend be understood from the meaning of the words by people with a math background. googling "UB" results in "Universitätsbibliothek Wien" here and nothing on the first page of links is related to "undefined" So i do think, in case of a policy or some coding rules it should be spelled out thx [...]
Le lauantaina 2. syyskuuta 2023, 23.03.32 EEST Michael Niedermayer a écrit : > > And yes, UB is a very well known accronym in this context. In my > > experience, people who don't know the accronym don't know the concept > > either, so spelling it out wouldn't even help. > > "Undefined behavior" can be googled, and can to some extend be understood > from the meaning of the words by people with a math background. And? So can "C language ub" or "programming ub". I would think that adult software engineers know that you need to scope when searching a short accronym. Either way, first result on Google is the Wikipedia page starting with "In computer programming, undefined behavior (UB)" (exact quote)...
On Sun, Sep 03, 2023 at 11:29:32AM +0300, Rémi Denis-Courmont wrote: > Le lauantaina 2. syyskuuta 2023, 23.03.32 EEST Michael Niedermayer a écrit : > > > And yes, UB is a very well known accronym in this context. In my > > > experience, people who don't know the accronym don't know the concept > > > either, so spelling it out wouldn't even help. > > > > "Undefined behavior" can be googled, and can to some extend be understood > > from the meaning of the words by people with a math background. > > And? So can "C language ub" or "programming ub". I would think that adult > software engineers know that you need to scope when searching a short > accronym. Its a policy not a puzzle also are all software engneers here adults ? thx [...]
diff --git a/doc/developer.texi b/doc/developer.texi index df43119f98..afa0148137 100644 --- a/doc/developer.texi +++ b/doc/developer.texi @@ -274,10 +274,6 @@ symbols. If in doubt, just avoid names starting with @code{_} altogether. @section Miscellaneous conventions @itemize @bullet -@item -fprintf and printf are forbidden in libavformat and libavcodec, -please use av_log() instead. - @item Casts should be used only when necessary. Unneeded parentheses should also be avoided if they don't make the code easier to understand. @@ -286,6 +282,42 @@ should also be avoided if they don't make the code easier to understand. @anchor{Development Policy} @chapter Development Policy +@section Code behaviour + +@subheading Correctness +The code must be valid. It must not crash, abort, access invalid pointers, leak +memory, cause data races or signed integer overflow, or otherwise invoke +undefined behaviour. Error codes should be checked and, when applicable, +forwarded to the caller. + +@subheading Thread- and library-safety +Our libraries may be called by multiple independent callers in the same process. +These calls may happen from any number of threads and the different call sites +may not be aware of each other - e.g. a user program may be calling us directly, +and use one or more libraries that also call us. The code must behave correctly +under such conditions. + +@subheading Robustness +The code must treat as untrusted any bytestream received from a caller or read +from a file, network, etc. It must not misbehave when arbitrary data is sent to +it - typically it should print an error message and return +@code{AVERROR_INVALIDDATA} on encountering invalid input data. + +@subheading Memory allocation +The code must use the @code{av_malloc()} family of functions from +@file{libavutil/mem.h} to perform all memory allocation, except in special cases +(e.g. when interacting with an external library that requires a specific +allocator to be used). + +All allocations should be checked and @code{AVERROR(ENOMEM)} returned on +failure. A common mistake is that error paths leak memory - make sure that does +not happen. + +@subheading stdio +Our libraries must not access the stdio streams stdin/stdout/stderr directly +(e.g. via @code{printf()} family of functions), as that is not library-safe. For +logging, use @code{av_log()}. + @section Patches/Committing @subheading Licenses for patches must be compatible with FFmpeg. Contributions should be licensed under the @@ -395,11 +427,6 @@ If it is a bug, the bug has to be fixed. If it is not, the code should be changed to not generate a warning unless that causes a slowdown or obfuscates the code. -@subheading Check untrusted input properly. -Never write to unallocated memory, never write over the end of arrays, -always check values read from some untrusted source before using them -as array index or other risky things. - @section Library public interfaces Every library in FFmpeg provides a set of public APIs in its installed headers, which are those listed in the variable @code{HEADERS} in that library's @@ -811,11 +838,6 @@ an explanation why to your patchset, its ok to not test if theres a reason. @item If you added YASM code please check that things still work with --disable-yasm. -@item -Make sure you check the return values of function and return appropriate -error codes. Especially memory allocation functions like @code{av_malloc()} -are notoriously left unchecked, which is a serious problem. - @item Test your code with valgrind and or Address Sanitizer to ensure it's free of leaks, out of array accesses, etc.