diff mbox series

[FFmpeg-devel,1/2] avcodec: add YUV color space metadata to AVCodec

Message ID 20240205174413.92730-1-ffmpeg@haasn.xyz
State New
Headers show
Series [FFmpeg-devel,1/2] avcodec: add YUV color space metadata to AVCodec | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Niklas Haas Feb. 5, 2024, 5:44 p.m. UTC
From: Niklas Haas <git@haasn.dev>

This is motivated primarily by a desire for YUVJ removal, which will
require signalling the supported color ranges as part of the codec
capabilities. But since we're adding YUV range, we might as well add the
YUV color matrix as well - since some codecs (e.g. VP8, JPEG) only
support certain values.

I decided to preserve the ambiguous and misleading "color_spaces" name,
for symmetry with AVFrame.colorspace. (Though this would IMO be better
called "color_matrix" or "color_system")

I also decided to omit the other AVColor* fields for now, because
vf_scale cannot handle auto-conversion between primaries/transfer/etc.
There is little value in adding metadata we cannot do anything with, and
no harm in extending the API again in the future. In theory, vf_scale
can handle conversion between chroma locations, but also the signalling
for this is annoying, so I'll defer it to a future commit.
---
 doc/APIchanges       | 3 +++
 libavcodec/codec.h   | 6 ++++++
 libavcodec/version.h | 2 +-
 3 files changed, 10 insertions(+), 1 deletion(-)

Comments

Niklas Haas Feb. 5, 2024, 5:48 p.m. UTC | #1
It would make me happy if we could get this API addition in before 7.0,
even though it's non-functional on its own, as it sets the groundwork to
be able to start properly deprecating YUVJ.

After this series, the next steps are:

1. Make decoders stop outputting YUVJ frames
2. Mark the actual YUVJ pixel formats as deprecated, add a removal API
   define, and guard internal usages by AV_NOWARN_DEPRECATED
3. After sufficient time passes, remove YUVJ formats fully

I have a branch prepared for each.
Andreas Rheinhardt Feb. 5, 2024, 6:04 p.m. UTC | #2
Niklas Haas:
> From: Niklas Haas <git@haasn.dev>
> 
> This is motivated primarily by a desire for YUVJ removal, which will
> require signalling the supported color ranges as part of the codec
> capabilities. But since we're adding YUV range, we might as well add the
> YUV color matrix as well - since some codecs (e.g. VP8, JPEG) only
> support certain values.
> 
> I decided to preserve the ambiguous and misleading "color_spaces" name,
> for symmetry with AVFrame.colorspace. (Though this would IMO be better
> called "color_matrix" or "color_system")
> 
> I also decided to omit the other AVColor* fields for now, because
> vf_scale cannot handle auto-conversion between primaries/transfer/etc.
> There is little value in adding metadata we cannot do anything with, and
> no harm in extending the API again in the future. In theory, vf_scale
> can handle conversion between chroma locations, but also the signalling
> for this is annoying, so I'll defer it to a future commit.
> ---
>  doc/APIchanges       | 3 +++
>  libavcodec/codec.h   | 6 ++++++
>  libavcodec/version.h | 2 +-
>  3 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/APIchanges b/doc/APIchanges
> index 1f5724324a..7849ce47d9 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2023-02-09
>  
>  API changes, most recent first:
>  
> +2024-02-xx - xxxxxxxxxx - lavc 60.40.100 - avcodec.h
> +  Add AVCodec.color_ranges and AVCodec.color_spaces.
> +
>  2024-02-04 - xxxxxxxxxx - lavc 60.39.100 - packet.h
>    Add AV_PKT_DATA_AMBIENT_VIEWING_ENVIRONMENT.
>  
> diff --git a/libavcodec/codec.h b/libavcodec/codec.h
> index 8034f1a53c..8bd678de7a 100644
> --- a/libavcodec/codec.h
> +++ b/libavcodec/codec.h
> @@ -235,6 +235,12 @@ typedef struct AVCodec {
>       * Array of supported channel layouts, terminated with a zeroed layout.
>       */
>      const AVChannelLayout *ch_layouts;
> +
> +    /**
> +     * Array of supported YUV color formats. Ignored for RGB/Gray formats.
> +     */
> +    const enum AVColorRange *color_ranges; ///< terminated by AVCOL_RANGE_UNSPECIFIED
> +    const enum AVColorSpace *color_spaces; ///< terminated by AVCOL_SPC_UNSPECIFIED
>  } AVCodec;
>  
>  /**
> diff --git a/libavcodec/version.h b/libavcodec/version.h
> index f2f14eaed1..19f3f4a272 100644
> --- a/libavcodec/version.h
> +++ b/libavcodec/version.h
> @@ -29,7 +29,7 @@
>  
>  #include "version_major.h"
>  
> -#define LIBAVCODEC_VERSION_MINOR  39
> +#define LIBAVCODEC_VERSION_MINOR  40
>  #define LIBAVCODEC_VERSION_MICRO 100
>  
>  #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \

This presumes the relevant states to be a cartesian product. Which need
not be true. A callback would be better; this would also allow to base
the list on other values already set in an AVCodecContext. And if this
were extended, it would also allow to remove init_static_data one day.
It is furthermore quite wasteful to store color_ranges in a list,
although there are only very few states for it.

- Andreas
Niklas Haas Feb. 5, 2024, 6:37 p.m. UTC | #3
On Mon, 05 Feb 2024 19:04:30 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> This presumes the relevant states to be a cartesian product. Which need
> not be true. A callback would be better; this would also allow to base
> the list on other values already set in an AVCodecContext. And if this
> were extended, it would also allow to remove init_static_data one day.
> It is furthermore quite wasteful to store color_ranges in a list,
> although there are only very few states for it.

What signature would you propose for such a callback?

I disagree with using a list for the range being wasteful. In terms of
binary size, they are shared (see ff_color_range_mpeg/jpeg definition).
At best you are saving a few bits from using an int bitmask instead of
a pointer.

The main reason to use a list is API simplicity and consistency. Working
with a bitmask is more awkward and would require more specialized code
as opposed to the current design which allows simply re-using macros
like DEF_CHOOSE_FORMATS in fftools, which is designed to handle lists.
Not to mention the equivalent libavfilter API also being list-based.
Niklas Haas Feb. 8, 2024, 11:30 a.m. UTC | #4
On Mon, 05 Feb 2024 19:37:48 +0100 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> On Mon, 05 Feb 2024 19:04:30 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> > This presumes the relevant states to be a cartesian product. Which need
> > not be true. A callback would be better; this would also allow to base
> > the list on other values already set in an AVCodecContext. And if this
> > were extended, it would also allow to remove init_static_data one day.
> > It is furthermore quite wasteful to store color_ranges in a list,
> > although there are only very few states for it.
> 
> What signature would you propose for such a callback?

Well, there are two fundamental approaches here:

1. The callback somehow sets or returns a list of supported colorspaces.
2. The callback accepts a particular configuration and simply returns if
   it's supported or not, with fields set to AVCOL_*_UNKNOWN being
   ignored.

The first case has the pro of simplicity, and the ability to enumerate
more easily, but the drawback of only being scalable if we return
a cartesian set (i.e. set each attribute list independently, rather than
generating one big list).

The second case has the pro of flexibility, and the ability to handle
non-cartesian use cases, but the con of being slightly more awkward to
recover a list (for e.g. filtering purpose). Fortunately, we can recover
the minimal cartesian superset of the actual supported set in O(n) time,
and then still verify the exact format chosen at encode time.

One possibility could be to add a `test()` callback which simply tests
if the current codec context has valid parameters. For example:

struct AVCodecContext {
    ...
    int (*test)(const AVCodecContext *ctx);
}

But this suggests a very stateful API, where you have to mutate
AVCodecContext in order to query what formats would be supported.
I don't like this.

So probably it makes more sense to just directly ingest a new struct
for this purpose, which we can then extend easily. (Or should we just
ingest AVFrame directly, i.e. int (*test_avframe)(ctx, AVFrame *frame)?)
Andreas Rheinhardt Feb. 8, 2024, 12:33 p.m. UTC | #5
Niklas Haas:
> On Mon, 05 Feb 2024 19:37:48 +0100 Niklas Haas <ffmpeg@haasn.xyz> wrote:
>> On Mon, 05 Feb 2024 19:04:30 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>> This presumes the relevant states to be a cartesian product. Which need
>>> not be true. A callback would be better; this would also allow to base
>>> the list on other values already set in an AVCodecContext. And if this
>>> were extended, it would also allow to remove init_static_data one day.
>>> It is furthermore quite wasteful to store color_ranges in a list,
>>> although there are only very few states for it.
>>
>> What signature would you propose for such a callback?
> 
> Well, there are two fundamental approaches here:
> 
> 1. The callback somehow sets or returns a list of supported colorspaces.
> 2. The callback accepts a particular configuration and simply returns if
>    it's supported or not, with fields set to AVCOL_*_UNKNOWN being
>    ignored.
> 
> The first case has the pro of simplicity, and the ability to enumerate
> more easily, but the drawback of only being scalable if we return
> a cartesian set (i.e. set each attribute list independently, rather than
> generating one big list).
> 
> The second case has the pro of flexibility, and the ability to handle
> non-cartesian use cases, but the con of being slightly more awkward to
> recover a list (for e.g. filtering purpose). Fortunately, we can recover
> the minimal cartesian superset of the actual supported set in O(n) time,
> and then still verify the exact format chosen at encode time.
> 
> One possibility could be to add a `test()` callback which simply tests
> if the current codec context has valid parameters. For example:
> 
> struct AVCodecContext {
>     ...
>     int (*test)(const AVCodecContext *ctx);
> }
> 
> But this suggests a very stateful API, where you have to mutate
> AVCodecContext in order to query what formats would be supported.
> I don't like this.
> 
> So probably it makes more sense to just directly ingest a new struct
> for this purpose, which we can then extend easily. (Or should we just
> ingest AVFrame directly, i.e. int (*test_avframe)(ctx, AVFrame *frame)?)

Sorry for not answering earlier.
My intention is to allow users who only want to deal with the common
case of a cartesian product to continue to do so, but to also support
other usecases.
The public function would look like

int avcodec_get_supported_config(const AVCodecContext *avctx, const
AVCodec *codec, int **supported_configs, unsigned desired_configs,
unsigned flags, void *logctx);

avctx can be NULL (in which case this allows to return all potential
configurations, irrespective of e.g. the level of strictness).
codec can be omitted if avctx->codec is set, but if both are supplied
and avctx->codec is set, they have to match (like in avcodec_open2()).
desired_configs is a bitfield of configs that the user wants to get
information about; your patch would have to add flags for color_ranges
and color_spaces.
supported_configs will on return point to something like an array of
struct { int desired_config0, desired_config1,...;}. The order of the
entries will be fixed (say coincide with the order of the bits in the
desired_configs bitfields).
If one member is the unspec value for its type, then this means that it
works with everything.
supported_configs will be allocated; ownership passes to the user.
Using a multidimensional sentinel (that would depend on desired_configs)
is clumsy, so there will be two supported ways for this (depends upon a
flag to be supplied in flags): One method that really adds a
multidimensional sentinel, the other method that writes the number of
entries into **supported_configs, so that the first entry starts at
(*supported_configs)[1]. This allows users that only want to deal with
the factors of a cartesian product separately to continue to do so.

- Andreas
Niklas Haas Feb. 8, 2024, 8:32 p.m. UTC | #6
On Thu, 08 Feb 2024 13:33:51 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> Sorry for not answering earlier.
> My intention is to allow users who only want to deal with the common
> case of a cartesian product to continue to do so, but to also support
> other usecases.
> The public function would look like
> 
> int avcodec_get_supported_config(const AVCodecContext *avctx, const
> AVCodec *codec, int **supported_configs, unsigned desired_configs,
> unsigned flags, void *logctx);
> 
> avctx can be NULL (in which case this allows to return all potential
> configurations, irrespective of e.g. the level of strictness).
> codec can be omitted if avctx->codec is set, but if both are supplied
> and avctx->codec is set, they have to match (like in avcodec_open2()).
> desired_configs is a bitfield of configs that the user wants to get
> information about; your patch would have to add flags for color_ranges
> and color_spaces.
> supported_configs will on return point to something like an array of
> struct { int desired_config0, desired_config1,...;}. The order of the
> entries will be fixed (say coincide with the order of the bits in the
> desired_configs bitfields).
> If one member is the unspec value for its type, then this means that it
> works with everything.
> supported_configs will be allocated; ownership passes to the user.
> Using a multidimensional sentinel (that would depend on desired_configs)
> is clumsy, so there will be two supported ways for this (depends upon a
> flag to be supplied in flags): One method that really adds a
> multidimensional sentinel, the other method that writes the number of
> entries into **supported_configs, so that the first entry starts at
> (*supported_configs)[1]. This allows users that only want to deal with
> the factors of a cartesian product separately to continue to do so.

OTOH this design will necessarily either result in exponential
explosion, or end up requiring the caller to make assumptions about
which fields are independent (and should thus be queried separately),
the moment a codec imposes *any* restriction (cartesian or not) on
multiple fields at the same time.

I also think that a `test()` callback, as I previously proposed, is also
overkill and doesn't actually solve anything. Codecs can already error
out on invalid configurations at open() time, and any practical use of
such an API would also end up having to make the cartesian assumption
one way or the other.

So in summary, I still think that we should enforce the assumption that
these fields form a cartesian set - it's simple, fast, useful and
doesn't overengineer for hypothetical constraints that we couldn't realistically
address one way or the other. Afaict, all current codecs support
a cartesian set of metadata, but feel free to correct me if I'm wrong.
Niklas Haas Feb. 9, 2024, 12:11 p.m. UTC | #7
On Mon, 05 Feb 2024 19:04:30 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> This presumes the relevant states to be a cartesian product. Which need
> not be true. A callback would be better; this would also allow to base
> the list on other values already set in an AVCodecContext. And if this
> were extended, it would also allow to remove init_static_data one day.
> It is furthermore quite wasteful to store color_ranges in a list,
> although there are only very few states for it.

There is also the consideration to be made that using a callback is
inconsistent with the established design. Consider that framerates,
pix_fmts, samplerates, sample fmts and channel layouts are all currently
provided as static arrays in AVCodec. There is a natural symmetry
between these items and the ones I intend to add (yuv matrix, range,
chroma location, primaries and gamma) - all of them are descriptive of
the way data is encoded, and are therefore also (or should be)
negotiable filter link properties.

If we add a new callback API, should we then extend it to also include
all of the existing items from the above list? Is there a reason that
yuv range supports needs to be more dynamic than the others?

Food for thought: mjpeg is not the only codec that puts restrictions on
the format support based on the strictness level. For example,
yuv4mpegpipe_muxer errors out with a strictness warning if you use
a non-standard pixel format. And arguably, in this case, this is
**preferred** behavior over "silently" inserting a scale filter to
convert to a supported format, as the whole point of y4m2 is to
encapsulate raw data as-is.

Should we:

1. Add a new dynamic callback that can query lists for all of the above
   in a way dependent on the strictness level, and use it as
   a replacement for the static lists currently in AVCodec?

2. Continue with the status quo of having these lists be static, plus
   dynamic checks at open() time, and continue using the "convenience
   hack" of having ffmpeg_tools automatically restrict limited range mjpeg?

It is not immediately obvious to me that an automatic conversion to
a supported format is *necessarily* preferred to erroring out unless the
user specifies a lower strictness level.

As for an API, I think that rather than having an AVCodecContext-aware
callback at all, I would just make callbacks that directly ingest the
strictness level in AVCodec. That makes it far less of a black box about
which fields of the AVCodecContext are relevant here.

i.e.

struct AVCodec {
    const enum AVColorRange (*get_color_ranges)(int strictness);
    const enum AVColorSpace (*get_color_spaces)(int strictness);
    // ditto for the other parameters?
}
Niklas Haas March 23, 2024, 5:51 p.m. UTC | #8
On Fri, 09 Feb 2024 13:11:38 +0100 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> On Mon, 05 Feb 2024 19:04:30 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> > This presumes the relevant states to be a cartesian product. Which need
> > not be true. A callback would be better; this would also allow to base
> > the list on other values already set in an AVCodecContext. And if this
> > were extended, it would also allow to remove init_static_data one day.
> > It is furthermore quite wasteful to store color_ranges in a list,
> > although there are only very few states for it.
> 
> There is also the consideration to be made that using a callback is
> inconsistent with the established design. Consider that framerates,
> pix_fmts, samplerates, sample fmts and channel layouts are all currently
> provided as static arrays in AVCodec. There is a natural symmetry
> between these items and the ones I intend to add (yuv matrix, range,
> chroma location, primaries and gamma) - all of them are descriptive of
> the way data is encoded, and are therefore also (or should be)
> negotiable filter link properties.
> 
> If we add a new callback API, should we then extend it to also include
> all of the existing items from the above list? Is there a reason that
> yuv range supports needs to be more dynamic than the others?
> 
> Food for thought: mjpeg is not the only codec that puts restrictions on
> the format support based on the strictness level. For example,
> yuv4mpegpipe_muxer errors out with a strictness warning if you use
> a non-standard pixel format. And arguably, in this case, this is
> **preferred** behavior over "silently" inserting a scale filter to
> convert to a supported format, as the whole point of y4m2 is to
> encapsulate raw data as-is.
> 
> Should we:
> 
> 1. Add a new dynamic callback that can query lists for all of the above
>    in a way dependent on the strictness level, and use it as
>    a replacement for the static lists currently in AVCodec?
> 
> 2. Continue with the status quo of having these lists be static, plus
>    dynamic checks at open() time, and continue using the "convenience
>    hack" of having ffmpeg_tools automatically restrict limited range mjpeg?
> 
> It is not immediately obvious to me that an automatic conversion to
> a supported format is *necessarily* preferred to erroring out unless the
> user specifies a lower strictness level.
> 
> As for an API, I think that rather than having an AVCodecContext-aware
> callback at all, I would just make callbacks that directly ingest the
> strictness level in AVCodec. That makes it far less of a black box about
> which fields of the AVCodecContext are relevant here.
> 
> i.e.
> 
> struct AVCodec {
>     const enum AVColorRange (*get_color_ranges)(int strictness);
>     const enum AVColorSpace (*get_color_spaces)(int strictness);
>     // ditto for the other parameters?
> }

Ping for review?
Andreas Rheinhardt March 24, 2024, 12:25 p.m. UTC | #9
Niklas Haas:
> On Thu, 08 Feb 2024 13:33:51 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>> Sorry for not answering earlier.
>> My intention is to allow users who only want to deal with the common
>> case of a cartesian product to continue to do so, but to also support
>> other usecases.
>> The public function would look like
>>
>> int avcodec_get_supported_config(const AVCodecContext *avctx, const
>> AVCodec *codec, int **supported_configs, unsigned desired_configs,
>> unsigned flags, void *logctx);
>>
>> avctx can be NULL (in which case this allows to return all potential
>> configurations, irrespective of e.g. the level of strictness).
>> codec can be omitted if avctx->codec is set, but if both are supplied
>> and avctx->codec is set, they have to match (like in avcodec_open2()).
>> desired_configs is a bitfield of configs that the user wants to get
>> information about; your patch would have to add flags for color_ranges
>> and color_spaces.
>> supported_configs will on return point to something like an array of
>> struct { int desired_config0, desired_config1,...;}. The order of the
>> entries will be fixed (say coincide with the order of the bits in the
>> desired_configs bitfields).
>> If one member is the unspec value for its type, then this means that it
>> works with everything.
>> supported_configs will be allocated; ownership passes to the user.
>> Using a multidimensional sentinel (that would depend on desired_configs)
>> is clumsy, so there will be two supported ways for this (depends upon a
>> flag to be supplied in flags): One method that really adds a
>> multidimensional sentinel, the other method that writes the number of
>> entries into **supported_configs, so that the first entry starts at
>> (*supported_configs)[1]. This allows users that only want to deal with
>> the factors of a cartesian product separately to continue to do so.
> 
> OTOH this design will necessarily either result in exponential
> explosion, or end up requiring the caller to make assumptions about
> which fields are independent (and should thus be queried separately),
> the moment a codec imposes *any* restriction (cartesian or not) on
> multiple fields at the same time.
> 
> I also think that a `test()` callback, as I previously proposed, is also
> overkill and doesn't actually solve anything. Codecs can already error
> out on invalid configurations at open() time, and any practical use of
> such an API would also end up having to make the cartesian assumption
> one way or the other.
> 
> So in summary, I still think that we should enforce the assumption that
> these fields form a cartesian set - it's simple, fast, useful and
> doesn't overengineer for hypothetical constraints that we couldn't realistically
> address one way or the other. Afaict, all current codecs support
> a cartesian set of metadata, but feel free to correct me if I'm wrong.

Ok, the gain of non-cartesian sets don't warrant the complexity incurred.

- Andreas
Andreas Rheinhardt March 24, 2024, 1:04 p.m. UTC | #10
Niklas Haas:
> On Mon, 05 Feb 2024 19:04:30 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>> This presumes the relevant states to be a cartesian product. Which need
>> not be true. A callback would be better; this would also allow to base
>> the list on other values already set in an AVCodecContext. And if this
>> were extended, it would also allow to remove init_static_data one day.
>> It is furthermore quite wasteful to store color_ranges in a list,
>> although there are only very few states for it.
> 
> There is also the consideration to be made that using a callback is
> inconsistent with the established design. Consider that framerates,
> pix_fmts, samplerates, sample fmts and channel layouts are all currently
> provided as static arrays in AVCodec. There is a natural symmetry
> between these items and the ones I intend to add (yuv matrix, range,
> chroma location, primaries and gamma) - all of them are descriptive of
> the way data is encoded, and are therefore also (or should be)
> negotiable filter link properties.
> 
> If we add a new callback API, should we then extend it to also include
> all of the existing items from the above list? Is there a reason that
> yuv range supports needs to be more dynamic than the others?
> 

It should support everything; and I'd like to remove the other (public)
static lists, too (after the necessary deprecations).

> Food for thought: mjpeg is not the only codec that puts restrictions on
> the format support based on the strictness level. For example,
> yuv4mpegpipe_muxer errors out with a strictness warning if you use
> a non-standard pixel format. And arguably, in this case, this is
> **preferred** behavior over "silently" inserting a scale filter to
> convert to a supported format, as the whole point of y4m2 is to
> encapsulate raw data as-is.
> 
> Should we:
> 
> 1. Add a new dynamic callback that can query lists for all of the above
>    in a way dependent on the strictness level, and use it as
>    a replacement for the static lists currently in AVCodec?
> 
> 2. Continue with the status quo of having these lists be static, plus
>    dynamic checks at open() time, and continue using the "convenience
>    hack" of having ffmpeg_tools automatically restrict limited range mjpeg?
> 

I really want this convenience hack removed.

> It is not immediately obvious to me that an automatic conversion to
> a supported format is *necessarily* preferred to erroring out unless the
> user specifies a lower strictness level.

I agree. (In fact, on default strictness, the current code inserts a
scale filter even if one explicitly adds "-color_range tv".)

> 
> As for an API, I think that rather than having an AVCodecContext-aware
> callback at all, I would just make callbacks that directly ingest the
> strictness level in AVCodec. That makes it far less of a black box about
> which fields of the AVCodecContext are relevant here.
> 
> i.e.
> 
> struct AVCodec {
>     const enum AVColorRange (*get_color_ranges)(int strictness);
>     const enum AVColorSpace (*get_color_spaces)(int strictness);
>     // ditto for the other parameters?
> }

Your callbacks would hardcode that the only thing that matters is
strictness. And it would be very expensive, because these fields would
be in every AVCodec, even though only a minority of AVCodecs (namely
video encoders) would use them. (supported_framerates is even only set
by two encoders. What a waste.) Adding an API like

int avcodec_get_supported_config(const AVCodecContext *avctx, const
AVCodec *codec, void **supported_configs, unsigned *num_configs, enum
AVCodecConfigs desired_config,
unsigned flags, void *logctx);
(enum AVCodecConfigs would contain a value for pix fmts, sample fmts etc.)

allows to keep the details hidden and therefore use a compact way to
store it.

- Andreas
Niklas Haas April 3, 2024, 6:55 p.m. UTC | #11
On Sun, 24 Mar 2024 14:04:37 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> Niklas Haas:
> > On Mon, 05 Feb 2024 19:04:30 +0100 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> >> This presumes the relevant states to be a cartesian product. Which need
> >> not be true. A callback would be better; this would also allow to base
> >> the list on other values already set in an AVCodecContext. And if this
> >> were extended, it would also allow to remove init_static_data one day.
> >> It is furthermore quite wasteful to store color_ranges in a list,
> >> although there are only very few states for it.
> > 
> > There is also the consideration to be made that using a callback is
> > inconsistent with the established design. Consider that framerates,
> > pix_fmts, samplerates, sample fmts and channel layouts are all currently
> > provided as static arrays in AVCodec. There is a natural symmetry
> > between these items and the ones I intend to add (yuv matrix, range,
> > chroma location, primaries and gamma) - all of them are descriptive of
> > the way data is encoded, and are therefore also (or should be)
> > negotiable filter link properties.
> > 
> > If we add a new callback API, should we then extend it to also include
> > all of the existing items from the above list? Is there a reason that
> > yuv range supports needs to be more dynamic than the others?
> > 
> 
> It should support everything; and I'd like to remove the other (public)
> static lists, too (after the necessary deprecations).
> 
> > Food for thought: mjpeg is not the only codec that puts restrictions on
> > the format support based on the strictness level. For example,
> > yuv4mpegpipe_muxer errors out with a strictness warning if you use
> > a non-standard pixel format. And arguably, in this case, this is
> > **preferred** behavior over "silently" inserting a scale filter to
> > convert to a supported format, as the whole point of y4m2 is to
> > encapsulate raw data as-is.
> > 
> > Should we:
> > 
> > 1. Add a new dynamic callback that can query lists for all of the above
> >    in a way dependent on the strictness level, and use it as
> >    a replacement for the static lists currently in AVCodec?
> > 
> > 2. Continue with the status quo of having these lists be static, plus
> >    dynamic checks at open() time, and continue using the "convenience
> >    hack" of having ffmpeg_tools automatically restrict limited range mjpeg?
> > 
> 
> I really want this convenience hack removed.
> 
> > It is not immediately obvious to me that an automatic conversion to
> > a supported format is *necessarily* preferred to erroring out unless the
> > user specifies a lower strictness level.
> 
> I agree. (In fact, on default strictness, the current code inserts a
> scale filter even if one explicitly adds "-color_range tv".)
> 
> > 
> > As for an API, I think that rather than having an AVCodecContext-aware
> > callback at all, I would just make callbacks that directly ingest the
> > strictness level in AVCodec. That makes it far less of a black box about
> > which fields of the AVCodecContext are relevant here.
> > 
> > i.e.
> > 
> > struct AVCodec {
> >     const enum AVColorRange (*get_color_ranges)(int strictness);
> >     const enum AVColorSpace (*get_color_spaces)(int strictness);
> >     // ditto for the other parameters?
> > }
> 
> Your callbacks would hardcode that the only thing that matters is
> strictness. And it would be very expensive, because these fields would
> be in every AVCodec, even though only a minority of AVCodecs (namely
> video encoders) would use them. (supported_framerates is even only set
> by two encoders. What a waste.) Adding an API like
> 
> int avcodec_get_supported_config(const AVCodecContext *avctx, const
> AVCodec *codec, void **supported_configs, unsigned *num_configs, enum
> AVCodecConfigs desired_config,
> unsigned flags, void *logctx);
> (enum AVCodecConfigs would contain a value for pix fmts, sample fmts etc.)

Having an extra `logctx` here seems redundant and inconsistent with
other avctx-taking functions, which log to the provided `avctx`.

> 
> allows to keep the details hidden and therefore use a compact way to
> store it.
> 
> - Andreas
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff mbox series

Patch

diff --git a/doc/APIchanges b/doc/APIchanges
index 1f5724324a..7849ce47d9 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@  The last version increases of all libraries were on 2023-02-09
 
 API changes, most recent first:
 
+2024-02-xx - xxxxxxxxxx - lavc 60.40.100 - avcodec.h
+  Add AVCodec.color_ranges and AVCodec.color_spaces.
+
 2024-02-04 - xxxxxxxxxx - lavc 60.39.100 - packet.h
   Add AV_PKT_DATA_AMBIENT_VIEWING_ENVIRONMENT.
 
diff --git a/libavcodec/codec.h b/libavcodec/codec.h
index 8034f1a53c..8bd678de7a 100644
--- a/libavcodec/codec.h
+++ b/libavcodec/codec.h
@@ -235,6 +235,12 @@  typedef struct AVCodec {
      * Array of supported channel layouts, terminated with a zeroed layout.
      */
     const AVChannelLayout *ch_layouts;
+
+    /**
+     * Array of supported YUV color formats. Ignored for RGB/Gray formats.
+     */
+    const enum AVColorRange *color_ranges; ///< terminated by AVCOL_RANGE_UNSPECIFIED
+    const enum AVColorSpace *color_spaces; ///< terminated by AVCOL_SPC_UNSPECIFIED
 } AVCodec;
 
 /**
diff --git a/libavcodec/version.h b/libavcodec/version.h
index f2f14eaed1..19f3f4a272 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -29,7 +29,7 @@ 
 
 #include "version_major.h"
 
-#define LIBAVCODEC_VERSION_MINOR  39
+#define LIBAVCODEC_VERSION_MINOR  40
 #define LIBAVCODEC_VERSION_MICRO 100
 
 #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \