[FFmpeg-devel,001/244] Add a new channel layout API

Message ID	20191206101710.22028-1-anton@khirnov.net
State	New
Headers	show Return-Path: <ffmpeg-devel-bounces@ffmpeg.org> From: Anton Khirnov <anton@khirnov.net> To: ffmpeg-devel@ffmpeg.org Date: Fri, 6 Dec 2019 11:16:55 +0100 Message-Id: <20191206101710.22028-1-anton@khirnov.net> In-Reply-To: <CAPYw7P4j0tgG=M8NPBRjjOA5LRq4jGWFWAwXnqXVs5GoVw8bKQ@mail.gmail.com> References: <CAPYw7P4j0tgG=M8NPBRjjOA5LRq4jGWFWAwXnqXVs5GoVw8bKQ@mail.gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 001/244] Add a new channel layout API Precedence: list Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Anton Khirnov Dec. 6, 2019, 10:16 a.m. UTC

The new API is more extensible and allows for custom layouts.
More accurate information is exported, eg for decoders that do not
set a channel layout, lavc will not make one up for them.

Deprecate the old API working with just uint64_t bitmasks.

Expanded and completed by Vittorio Giovara <vittorio.giovara@gmail.com>.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
---
 libavutil/channel_layout.c | 393 +++++++++++++++++++++++++++++------
 libavutil/channel_layout.h | 415 ++++++++++++++++++++++++++++++++++---
 libavutil/version.h        |   3 +
 3 files changed, 719 insertions(+), 92 deletions(-)

Nicolas George Dec. 7, 2019, 8:25 p.m. UTC | #1

Anton Khirnov (12019-12-06):
> The new API is more extensible and allows for custom layouts.
> More accurate information is exported, eg for decoders that do not
> set a channel layout, lavc will not make one up for them.
> 
> Deprecate the old API working with just uint64_t bitmasks.
> 
> Expanded and completed by Vittorio Giovara <vittorio.giovara@gmail.com>.
> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
> ---
>  libavutil/channel_layout.c | 393 +++++++++++++++++++++++++++++------
>  libavutil/channel_layout.h | 415 ++++++++++++++++++++++++++++++++++---
>  libavutil/version.h        |   3 +
>  3 files changed, 719 insertions(+), 92 deletions(-)

Thanks for sending this here. I will make a few preliminary remarks
below.

Before that, I want to mention I think the real trial for this API will
be integration with libavfilter, especially the format negotiation and
the user interface of the filters that allow fine control of channels.
The negotiation, in particular, promises to be an interesting
algorithmic problem. By "real trial", I mean that it is very possible
that trying to implement that will make us realize some details need to
be changed. I have these questions in mind while writing my comments.

Two important comments to consider, because they affect the structure of
the API itself:

- about av_channel_layout_describe(), a general reflection on functions
  that return strings;

- about getting a channel by name.

> 
> diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
> index 3bd5ee29b7..b7077ed5fd 100644
> --- a/libavutil/channel_layout.c
> +++ b/libavutil/channel_layout.c
> @@ -37,75 +37,89 @@ struct channel_name {
>  };
>  
>  static const struct channel_name channel_names[] = {
> -     [0] = { "FL",        "front left"            },
> -     [1] = { "FR",        "front right"           },
> -     [2] = { "FC",        "front center"          },
> -     [3] = { "LFE",       "low frequency"         },
> -     [4] = { "BL",        "back left"             },
> -     [5] = { "BR",        "back right"            },
> -     [6] = { "FLC",       "front left-of-center"  },
> -     [7] = { "FRC",       "front right-of-center" },
> -     [8] = { "BC",        "back center"           },
> -     [9] = { "SL",        "side left"             },
> -    [10] = { "SR",        "side right"            },
> -    [11] = { "TC",        "top center"            },
> -    [12] = { "TFL",       "top front left"        },
> -    [13] = { "TFC",       "top front center"      },
> -    [14] = { "TFR",       "top front right"       },
> -    [15] = { "TBL",       "top back left"         },
> -    [16] = { "TBC",       "top back center"       },
> -    [17] = { "TBR",       "top back right"        },
> -    [29] = { "DL",        "downmix left"          },
> -    [30] = { "DR",        "downmix right"         },
> -    [31] = { "WL",        "wide left"             },
> -    [32] = { "WR",        "wide right"            },
> -    [33] = { "SDL",       "surround direct left"  },
> -    [34] = { "SDR",       "surround direct right" },
> -    [35] = { "LFE2",      "low frequency 2"       },
> +    [AV_CHAN_FRONT_LEFT           ] = { "FL",        "front left"            },
> +    [AV_CHAN_FRONT_RIGHT          ] = { "FR",        "front right"           },
> +    [AV_CHAN_FRONT_CENTER         ] = { "FC",        "front center"          },
> +    [AV_CHAN_LOW_FREQUENCY        ] = { "LFE",       "low frequency"         },
> +    [AV_CHAN_BACK_LEFT            ] = { "BL",        "back left"             },
> +    [AV_CHAN_BACK_RIGHT           ] = { "BR",        "back right"            },
> +    [AV_CHAN_FRONT_LEFT_OF_CENTER ] = { "FLC",       "front left-of-center"  },
> +    [AV_CHAN_FRONT_RIGHT_OF_CENTER] = { "FRC",       "front right-of-center" },
> +    [AV_CHAN_BACK_CENTER          ] = { "BC",        "back center"           },
> +    [AV_CHAN_SIDE_LEFT            ] = { "SL",        "side left"             },
> +    [AV_CHAN_SIDE_RIGHT           ] = { "SR",        "side right"            },
> +    [AV_CHAN_TOP_CENTER           ] = { "TC",        "top center"            },
> +    [AV_CHAN_TOP_FRONT_LEFT       ] = { "TFL",       "top front left"        },
> +    [AV_CHAN_TOP_FRONT_CENTER     ] = { "TFC",       "top front center"      },
> +    [AV_CHAN_TOP_FRONT_RIGHT      ] = { "TFR",       "top front right"       },
> +    [AV_CHAN_TOP_BACK_LEFT        ] = { "TBL",       "top back left"         },
> +    [AV_CHAN_TOP_BACK_CENTER      ] = { "TBC",       "top back center"       },
> +    [AV_CHAN_TOP_BACK_RIGHT       ] = { "TBR",       "top back right"        },
> +    [AV_CHAN_STEREO_LEFT          ] = { "DL",        "downmix left"          },
> +    [AV_CHAN_STEREO_RIGHT         ] = { "DR",        "downmix right"         },
> +    [AV_CHAN_WIDE_LEFT            ] = { "WL",        "wide left"             },
> +    [AV_CHAN_WIDE_RIGHT           ] = { "WR",        "wide right"            },
> +    [AV_CHAN_SURROUND_DIRECT_LEFT ] = { "SDL",       "surround direct left"  },
> +    [AV_CHAN_SURROUND_DIRECT_RIGHT] = { "SDR",       "surround direct right" },
> +    [AV_CHAN_LOW_FREQUENCY_2      ] = { "LFE2",      "low frequency 2"       },
> +    [AV_CHAN_SILENCE              ] = { "PAD",       "silence"               },
>  };
>  
> -static const char *get_channel_name(int channel_id)
> +const char *av_channel_name(enum AVChannel channel_id)
>  {
> -    if (channel_id < 0 || channel_id >= FF_ARRAY_ELEMS(channel_names))
> -        return NULL;
> +    if ((unsigned) channel_id >= FF_ARRAY_ELEMS(channel_names) ||
> +        !channel_names[channel_id].name)
> +        return "?";
>      return channel_names[channel_id].name;
>  }
>  
> +int av_channel_from_string(const char *str)
> +{
> +    int i;
> +    for (i = 0; i < FF_ARRAY_ELEMS(channel_names); i++) {
> +        if (channel_names[i].name && !strcmp(str, channel_names[i].name))
> +            return i;
> +    }
> +    return AVERROR(EINVAL);
> +}
> +
>  static const struct {
>      const char *name;
> -    int         nb_channels;
> -    uint64_t     layout;
> +    AVChannelLayout layout;
>  } channel_layout_map[] = {
> -    { "mono",        1,  AV_CH_LAYOUT_MONO },
> -    { "stereo",      2,  AV_CH_LAYOUT_STEREO },
> -    { "2.1",         3,  AV_CH_LAYOUT_2POINT1 },
> -    { "3.0",         3,  AV_CH_LAYOUT_SURROUND },
> -    { "3.0(back)",   3,  AV_CH_LAYOUT_2_1 },
> -    { "4.0",         4,  AV_CH_LAYOUT_4POINT0 },
> -    { "quad",        4,  AV_CH_LAYOUT_QUAD },
> -    { "quad(side)",  4,  AV_CH_LAYOUT_2_2 },
> -    { "3.1",         4,  AV_CH_LAYOUT_3POINT1 },
> -    { "5.0",         5,  AV_CH_LAYOUT_5POINT0_BACK },
> -    { "5.0(side)",   5,  AV_CH_LAYOUT_5POINT0 },
> -    { "4.1",         5,  AV_CH_LAYOUT_4POINT1 },
> -    { "5.1",         6,  AV_CH_LAYOUT_5POINT1_BACK },
> -    { "5.1(side)",   6,  AV_CH_LAYOUT_5POINT1 },
> -    { "6.0",         6,  AV_CH_LAYOUT_6POINT0 },
> -    { "6.0(front)",  6,  AV_CH_LAYOUT_6POINT0_FRONT },
> -    { "hexagonal",   6,  AV_CH_LAYOUT_HEXAGONAL },
> -    { "6.1",         7,  AV_CH_LAYOUT_6POINT1 },
> -    { "6.1(back)",   7,  AV_CH_LAYOUT_6POINT1_BACK },
> -    { "6.1(front)",  7,  AV_CH_LAYOUT_6POINT1_FRONT },
> -    { "7.0",         7,  AV_CH_LAYOUT_7POINT0 },
> -    { "7.0(front)",  7,  AV_CH_LAYOUT_7POINT0_FRONT },
> -    { "7.1",         8,  AV_CH_LAYOUT_7POINT1 },
> -    { "7.1(wide)",   8,  AV_CH_LAYOUT_7POINT1_WIDE_BACK },
> -    { "7.1(wide-side)",   8,  AV_CH_LAYOUT_7POINT1_WIDE },
> -    { "octagonal",   8,  AV_CH_LAYOUT_OCTAGONAL },
> -    { "hexadecagonal", 16, AV_CH_LAYOUT_HEXADECAGONAL },
> -    { "downmix",     2,  AV_CH_LAYOUT_STEREO_DOWNMIX, },
> +    { "mono",           AV_CHANNEL_LAYOUT_MONO                },
> +    { "stereo",         AV_CHANNEL_LAYOUT_STEREO              },
> +    { "stereo",         AV_CHANNEL_LAYOUT_STEREO_DOWNMIX      },
> +    { "2.1",            AV_CHANNEL_LAYOUT_2POINT1             },
> +    { "3.0",            AV_CHANNEL_LAYOUT_SURROUND            },
> +    { "3.0(back)",      AV_CHANNEL_LAYOUT_2_1                 },
> +    { "4.0",            AV_CHANNEL_LAYOUT_4POINT0             },
> +    { "quad",           AV_CHANNEL_LAYOUT_QUAD                },
> +    { "quad(side)",     AV_CHANNEL_LAYOUT_2_2                 },
> +    { "3.1",            AV_CHANNEL_LAYOUT_3POINT1             },
> +    { "5.0",            AV_CHANNEL_LAYOUT_5POINT0_BACK        },
> +    { "5.0(side)",      AV_CHANNEL_LAYOUT_5POINT0             },
> +    { "4.1",            AV_CHANNEL_LAYOUT_4POINT1             },
> +    { "5.1",            AV_CHANNEL_LAYOUT_5POINT1_BACK        },
> +    { "5.1(side)",      AV_CHANNEL_LAYOUT_5POINT1             },
> +    { "6.0",            AV_CHANNEL_LAYOUT_6POINT0             },
> +    { "6.0(front)",     AV_CHANNEL_LAYOUT_6POINT0_FRONT       },
> +    { "hexagonal",      AV_CHANNEL_LAYOUT_HEXAGONAL           },
> +    { "6.1",            AV_CHANNEL_LAYOUT_6POINT1             },
> +    { "6.1(back)",      AV_CHANNEL_LAYOUT_6POINT1_BACK        },
> +    { "6.1(front)",     AV_CHANNEL_LAYOUT_6POINT1_FRONT       },
> +    { "7.0",            AV_CHANNEL_LAYOUT_7POINT0             },
> +    { "7.0(front)",     AV_CHANNEL_LAYOUT_7POINT0_FRONT       },
> +    { "7.1",            AV_CHANNEL_LAYOUT_7POINT1             },
> +    { "7.1(wide)",      AV_CHANNEL_LAYOUT_7POINT1_WIDE_BACK   },
> +    { "7.1(wide-side)", AV_CHANNEL_LAYOUT_7POINT1_WIDE        },
> +    { "octagonal",      AV_CHANNEL_LAYOUT_OCTAGONAL           },
> +    { "hexadecagonal",  AV_CHANNEL_LAYOUT_HEXADECAGONAL       },
> +    { "downmix",        AV_CHANNEL_LAYOUT_STEREO_DOWNMIX,     },
>  };
>  
> +#if FF_API_OLD_CHANNEL_LAYOUT
> +FF_DISABLE_DEPRECATION_WARNINGS
>  static uint64_t get_channel_layout_single(const char *name, int name_len)
>  {
>      int i;
> @@ -115,7 +129,7 @@ static uint64_t get_channel_layout_single(const char *name, int name_len)
>      for (i = 0; i < FF_ARRAY_ELEMS(channel_layout_map); i++) {
>          if (strlen(channel_layout_map[i].name) == name_len &&
>              !memcmp(channel_layout_map[i].name, name, name_len))
> -            return channel_layout_map[i].layout;
> +            return channel_layout_map[i].layout.u.mask;
>      }
>      for (i = 0; i < FF_ARRAY_ELEMS(channel_names); i++)
>          if (channel_names[i].name &&
> @@ -183,8 +197,8 @@ void av_bprint_channel_layout(struct AVBPrint *bp,
>          nb_channels = av_get_channel_layout_nb_channels(channel_layout);
>  
>      for (i = 0; i < FF_ARRAY_ELEMS(channel_layout_map); i++)
> -        if (nb_channels    == channel_layout_map[i].nb_channels &&
> -            channel_layout == channel_layout_map[i].layout) {
> +        if (nb_channels    == channel_layout_map[i].layout.nb_channels &&
> +            channel_layout == channel_layout_map[i].layout.u.mask) {
>              av_bprintf(bp, "%s", channel_layout_map[i].name);
>              return;
>          }
> @@ -195,7 +209,7 @@ void av_bprint_channel_layout(struct AVBPrint *bp,
>          av_bprintf(bp, " (");
>          for (i = 0, ch = 0; i < 64; i++) {
>              if ((channel_layout & (UINT64_C(1) << i))) {
> -                const char *name = get_channel_name(i);
> +                const char *name = av_channel_name(i);
>                  if (name) {
>                      if (ch > 0)
>                          av_bprintf(bp, "+");
> @@ -225,8 +239,8 @@ int av_get_channel_layout_nb_channels(uint64_t channel_layout)
>  int64_t av_get_default_channel_layout(int nb_channels) {
>      int i;
>      for (i = 0; i < FF_ARRAY_ELEMS(channel_layout_map); i++)
> -        if (nb_channels == channel_layout_map[i].nb_channels)
> -            return channel_layout_map[i].layout;
> +        if (nb_channels == channel_layout_map[i].layout.nb_channels)
> +            return channel_layout_map[i].layout.u.mask;
>      return 0;
>  }
>  
> @@ -247,7 +261,7 @@ const char *av_get_channel_name(uint64_t channel)
>          return NULL;
>      for (i = 0; i < 64; i++)
>          if ((1ULL<<i) & channel)
> -            return get_channel_name(i);
> +            return av_channel_name(i);
>      return NULL;
>  }
>  
> @@ -281,7 +295,254 @@ int av_get_standard_channel_layout(unsigned index, uint64_t *layout,
>  {
>      if (index >= FF_ARRAY_ELEMS(channel_layout_map))
>          return AVERROR_EOF;
> -    if (layout) *layout = channel_layout_map[index].layout;
> +    if (layout) *layout = channel_layout_map[index].layout.u.mask;
>      if (name)   *name   = channel_layout_map[index].name;
>      return 0;
>  }
> +FF_ENABLE_DEPRECATION_WARNINGS
> +#endif
> +

> +int av_channel_layout_from_mask(AVChannelLayout *channel_layout,
> +                                uint64_t mask)

I find that "channel_layout" all over the place makes the code bulky,
harder to read. And tiring to write. We could decide on a standard
shortening for it everywhere. For example chlayout.

> +{
> +    if (!mask)
> +        return AVERROR(EINVAL);
> +
> +    channel_layout->order       = AV_CHANNEL_ORDER_NATIVE;
> +    channel_layout->nb_channels = av_popcount64(mask);
> +    channel_layout->u.mask      = mask;
> +
> +    return 0;
> +}
> +
> +int av_channel_layout_from_string(AVChannelLayout *channel_layout,
> +                                  const char *str)
> +{
> +    int i, channels;
> +    const char *dup = str;
> +    uint64_t mask = 0;
> +
> +    /* channel layout names */
> +    for (i = 0; i < FF_ARRAY_ELEMS(channel_layout_map); i++) {
> +        if (channel_layout_map[i].name && !strcmp(str, channel_layout_map[i].name)) {
> +            *channel_layout = channel_layout_map[i].layout;
> +            return 0;
> +        }
> +    }
> +
> +    /* channel names */
> +    while (*dup) {
> +        char *chname = av_get_token(&dup, "|");
> +        if (!chname)
> +            return AVERROR(ENOMEM);
> +        if (*dup)
> +            dup++; // skip separator
> +        for (i = 0; i < FF_ARRAY_ELEMS(channel_names); i++) {
> +            if (channel_names[i].name && !strcmp(chname, channel_names[i].name)) {
> +                mask |= 1ULL << i;
> +            }
> +        }
> +        av_free(chname);
> +    }
> +    if (mask) {
> +        av_channel_layout_from_mask(channel_layout, mask);
> +        return 0;
> +    }
> +
> +    /* channel layout mask */
> +    if (!strncmp(str, "0x", 2) && sscanf(str + 2, "%"SCNu64, &mask) == 1) {
> +        av_channel_layout_from_mask(channel_layout, mask);
> +        return 0;
> +    }
> +
> +    /* number of channels */
> +    if (sscanf(str, "%d", &channels) == 1) {
> +        av_channel_layout_default(channel_layout, channels);
> +        return 0;
> +    }
> +
> +    /* number of unordered channels */
> +    if (sscanf(str, "%d channels", &channel_layout->nb_channels) == 1) {
> +        channel_layout->order = AV_CHANNEL_ORDER_UNSPEC;
> +        return 0;
> +    }
> +
> +    return AVERROR_INVALIDDATA;
> +}
> +
> +void av_channel_layout_uninit(AVChannelLayout *channel_layout)
> +{
> +    if (channel_layout->order == AV_CHANNEL_ORDER_CUSTOM)
> +        av_freep(&channel_layout->u.map);
> +    memset(channel_layout, 0, sizeof(*channel_layout));
> +}
> +
> +int av_channel_layout_copy(AVChannelLayout *dst, const AVChannelLayout *src)
> +{
> +    av_channel_layout_uninit(dst);
> +    *dst = *src;
> +    if (src->order == AV_CHANNEL_ORDER_CUSTOM) {
> +        dst->u.map = av_malloc(src->nb_channels * sizeof(*dst->u.map));
> +        if (!dst->u.map)
> +            return AVERROR(ENOMEM);
> +        memcpy(dst->u.map, src->u.map, src->nb_channels * sizeof(*src->u.map));
> +    }
> +    return 0;
> +}
> +
> +char *av_channel_layout_describe(const AVChannelLayout *channel_layout)
> +{
> +    int i;
> +
> +    switch (channel_layout->order) {
> +    case AV_CHANNEL_ORDER_NATIVE:
> +        for (i = 0; channel_layout_map[i].name; i++)
> +            if (channel_layout->u.mask == channel_layout_map[i].layout.u.mask)
> +                return av_strdup(channel_layout_map[i].name);
> +        // fall-through
> +    case AV_CHANNEL_ORDER_CUSTOM: {
> +        // max 4 bytes for channel name + a separator
> +        int size = 5 * channel_layout->nb_channels + 1;
> +        char *ret;
> +
> +        ret = av_mallocz(size);
> +        if (!ret)
> +            return NULL;
> +
> +        for (i = 0; i < channel_layout->nb_channels; i++) {
> +            enum AVChannel ch = av_channel_layout_get_channel(channel_layout, i);
> +            const char *ch_name = av_channel_name(ch);
> +
> +            if (i)
> +                av_strlcat(ret, "|", size);
> +            av_strlcat(ret, ch_name, size);
> +        }
> +        return ret;
> +        }
> +    case AV_CHANNEL_ORDER_UNSPEC: {
> +        char buf[64];
> +        snprintf(buf, sizeof(buf), "%d channels", channel_layout->nb_channels);
> +        return av_strdup(buf);
> +        }
> +    default:
> +        return NULL;
> +    }
> +}
> +
> +int av_channel_layout_get_channel(const AVChannelLayout *channel_layout, int idx)
> +{
> +    int i;
> +
> +    if (idx < 0 || idx >= channel_layout->nb_channels)
> +        return AVERROR(EINVAL);
> +
> +    switch (channel_layout->order) {
> +    case AV_CHANNEL_ORDER_CUSTOM:
> +        return channel_layout->u.map[idx];
> +    case AV_CHANNEL_ORDER_NATIVE:
> +        for (i = 0; i < 64; i++) {
> +            if ((1ULL << i) & channel_layout->u.mask && !idx--)
> +                return i;
> +        }
> +    default:
> +        return AVERROR(EINVAL);
> +    }
> +}
> +
> +int av_channel_layout_channel_index(const AVChannelLayout *channel_layout,
> +                                    enum AVChannel channel)
> +{
> +    int i;
> +
> +    switch (channel_layout->order) {
> +    case AV_CHANNEL_ORDER_CUSTOM:
> +        for (i = 0; i < channel_layout->nb_channels; i++)
> +            if (channel_layout->u.map[i] == channel)
> +                return i;
> +        return AVERROR(EINVAL);
> +    case AV_CHANNEL_ORDER_NATIVE: {
> +        uint64_t mask = channel_layout->u.mask;
> +        if (!(mask & (1ULL << channel)))
> +            return AVERROR(EINVAL);
> +        mask &= (1ULL << channel) - 1;
> +        return av_popcount64(mask);
> +        }
> +    default:
> +        return AVERROR(EINVAL);
> +    }
> +}
> +
> +int av_channel_layout_check(const AVChannelLayout *channel_layout)
> +{
> +    if (!channel_layout || channel_layout->nb_channels <= 0)
> +        return 0;
> +
> +    switch (channel_layout->order) {
> +    case AV_CHANNEL_ORDER_NATIVE:
> +        return av_popcount64(channel_layout->u.mask) == channel_layout->nb_channels;
> +    case AV_CHANNEL_ORDER_CUSTOM:
> +        return !!channel_layout->u.map;
> +    case AV_CHANNEL_ORDER_UNSPEC:
> +        return 1;
> +    default:
> +        return 0;
> +    }
> +}
> +
> +int av_channel_layout_compare(const AVChannelLayout *chl, const AVChannelLayout *chl1)
> +{
> +    int i;
> +
> +    /* different channel counts -> not equal */
> +    if (chl->nb_channels != chl1->nb_channels)
> +        return 1;
> +
> +    /* if only one is unspecified -> not equal */
> +    if ((chl->order  == AV_CHANNEL_ORDER_UNSPEC) !=
> +        (chl1->order == AV_CHANNEL_ORDER_UNSPEC))
> +        return 1;
> +    /* both are unspecified -> equal */
> +    else if (chl->order == AV_CHANNEL_ORDER_UNSPEC)
> +        return 0;
> +
> +    /* can compare masks directly */
> +    if (chl->order != AV_CHANNEL_ORDER_CUSTOM &&
> +        chl->order == chl1->order)
> +        return chl->u.mask != chl1->u.mask;
> +
> +    /* compare channel by channel */
> +    for (i = 0; i < chl->nb_channels; i++)
> +        if (av_channel_layout_get_channel(chl,  i) !=
> +            av_channel_layout_get_channel(chl1, i))
> +            return 1;
> +    return 0;
> +}
> +
> +void av_channel_layout_default(AVChannelLayout *ch_layout, int nb_channels)
> +{
> +    int i;
> +    for (i = 0; i < FF_ARRAY_ELEMS(channel_layout_map); i++)
> +        if (nb_channels == channel_layout_map[i].layout.nb_channels) {
> +            *ch_layout = channel_layout_map[i].layout;
> +            return;
> +        }
> +
> +    ch_layout->order       = AV_CHANNEL_ORDER_UNSPEC;
> +    ch_layout->nb_channels = nb_channels;
> +}
> +
> +uint64_t av_channel_layout_subset(const AVChannelLayout *channel_layout,
> +                                  uint64_t mask)
> +{
> +    uint64_t ret = 0;
> +    int i;
> +
> +    if (channel_layout->order == AV_CHANNEL_ORDER_NATIVE)
> +        return channel_layout->u.mask & mask;
> +
> +    for (i = 0; i < 64; i++)
> +        if (mask & (1ULL << i) && av_channel_layout_channel_index(channel_layout, i) >= 0)
> +            ret |= (1ULL << i);
> +
> +    return ret;
> +}
> diff --git a/libavutil/channel_layout.h b/libavutil/channel_layout.h
> index 50bb8f03c5..8ba56651f6 100644
> --- a/libavutil/channel_layout.h
> +++ b/libavutil/channel_layout.h
> @@ -24,6 +24,9 @@
>  
>  #include <stdint.h>
>  
> +#include "version.h"
> +#include "attributes.h"
> +
>  /**
>   * @file
>   * audio channel layout utility functions
> @@ -34,6 +37,60 @@
>   * @{
>   */
>  
> +enum AVChannel {
> +    AV_CHAN_FRONT_LEFT,
> +    AV_CHAN_FRONT_RIGHT,
> +    AV_CHAN_FRONT_CENTER,
> +    AV_CHAN_LOW_FREQUENCY,
> +    AV_CHAN_BACK_LEFT,
> +    AV_CHAN_BACK_RIGHT,
> +    AV_CHAN_FRONT_LEFT_OF_CENTER,
> +    AV_CHAN_FRONT_RIGHT_OF_CENTER,
> +    AV_CHAN_BACK_CENTER,
> +    AV_CHAN_SIDE_LEFT,
> +    AV_CHAN_SIDE_RIGHT,
> +    AV_CHAN_TOP_CENTER,
> +    AV_CHAN_TOP_FRONT_LEFT,
> +    AV_CHAN_TOP_FRONT_CENTER,
> +    AV_CHAN_TOP_FRONT_RIGHT,
> +    AV_CHAN_TOP_BACK_LEFT,
> +    AV_CHAN_TOP_BACK_CENTER,
> +    AV_CHAN_TOP_BACK_RIGHT,
> +    /** Stereo downmix. */
> +    AV_CHAN_STEREO_LEFT = 29,
> +    /** See above. */
> +    AV_CHAN_STEREO_RIGHT,
> +    AV_CHAN_WIDE_LEFT,
> +    AV_CHAN_WIDE_RIGHT,
> +    AV_CHAN_SURROUND_DIRECT_LEFT,
> +    AV_CHAN_SURROUND_DIRECT_RIGHT,
> +    AV_CHAN_LOW_FREQUENCY_2,
> +
> +    /** Channel is empty can be safely skipped. */
> +    AV_CHAN_SILENCE = 64,
> +};
> +
> +enum AVChannelOrder {
> +    /**
> +     * The native channel order, i.e. the channels are in the same order in
> +     * which they are defined in the AVChannel enum. This supports up to 63
> +     * different channels.
> +     */
> +    AV_CHANNEL_ORDER_NATIVE,
> +    /**
> +     * The channel order does not correspond to any other predefined order and
> +     * is stored as an explicit map. For example, this could be used to support

> +     * layouts with 64 or more channels, or with channels that could be skipped.

[ Already said on 2019-10-30 ]
"or with channels that could be skipped"? What does that mean. If a
channel is skipped, it is not present, that can be expressed in the
mask. Is this a mistake for "duplicated"?

> +     */
> +    AV_CHANNEL_ORDER_CUSTOM,
> +    /**
> +     * Only the channel count is specified, without any further information
> +     * about the channel order.
> +     */
> +    AV_CHANNEL_ORDER_UNSPEC,
> +};
> +
> +
>  /**
>   * @defgroup channel_masks Audio channel masks
>   *
> @@ -46,36 +103,41 @@
>   *
>   * @{
>   */
> -#define AV_CH_FRONT_LEFT             0x00000001
> -#define AV_CH_FRONT_RIGHT            0x00000002
> -#define AV_CH_FRONT_CENTER           0x00000004
> -#define AV_CH_LOW_FREQUENCY          0x00000008
> -#define AV_CH_BACK_LEFT              0x00000010
> -#define AV_CH_BACK_RIGHT             0x00000020
> -#define AV_CH_FRONT_LEFT_OF_CENTER   0x00000040
> -#define AV_CH_FRONT_RIGHT_OF_CENTER  0x00000080
> -#define AV_CH_BACK_CENTER            0x00000100
> -#define AV_CH_SIDE_LEFT              0x00000200
> -#define AV_CH_SIDE_RIGHT             0x00000400
> -#define AV_CH_TOP_CENTER             0x00000800
> -#define AV_CH_TOP_FRONT_LEFT         0x00001000
> -#define AV_CH_TOP_FRONT_CENTER       0x00002000
> -#define AV_CH_TOP_FRONT_RIGHT        0x00004000
> -#define AV_CH_TOP_BACK_LEFT          0x00008000
> -#define AV_CH_TOP_BACK_CENTER        0x00010000
> -#define AV_CH_TOP_BACK_RIGHT         0x00020000
> -#define AV_CH_STEREO_LEFT            0x20000000  ///< Stereo downmix.
> -#define AV_CH_STEREO_RIGHT           0x40000000  ///< See AV_CH_STEREO_LEFT.
> -#define AV_CH_WIDE_LEFT              0x0000000080000000ULL
> -#define AV_CH_WIDE_RIGHT             0x0000000100000000ULL
> -#define AV_CH_SURROUND_DIRECT_LEFT   0x0000000200000000ULL
> -#define AV_CH_SURROUND_DIRECT_RIGHT  0x0000000400000000ULL
> -#define AV_CH_LOW_FREQUENCY_2        0x0000000800000000ULL
> +#define AV_CH_FRONT_LEFT             (1ULL << AV_CHAN_FRONT_LEFT           )
> +#define AV_CH_FRONT_RIGHT            (1ULL << AV_CHAN_FRONT_RIGHT          )
> +#define AV_CH_FRONT_CENTER           (1ULL << AV_CHAN_FRONT_CENTER         )
> +#define AV_CH_LOW_FREQUENCY          (1ULL << AV_CHAN_LOW_FREQUENCY        )
> +#define AV_CH_BACK_LEFT              (1ULL << AV_CHAN_BACK_LEFT            )
> +#define AV_CH_BACK_RIGHT             (1ULL << AV_CHAN_BACK_RIGHT           )
> +#define AV_CH_FRONT_LEFT_OF_CENTER   (1ULL << AV_CHAN_FRONT_LEFT_OF_CENTER )
> +#define AV_CH_FRONT_RIGHT_OF_CENTER  (1ULL << AV_CHAN_FRONT_RIGHT_OF_CENTER)
> +#define AV_CH_BACK_CENTER            (1ULL << AV_CHAN_BACK_CENTER          )
> +#define AV_CH_SIDE_LEFT              (1ULL << AV_CHAN_SIDE_LEFT            )
> +#define AV_CH_SIDE_RIGHT             (1ULL << AV_CHAN_SIDE_RIGHT           )
> +#define AV_CH_TOP_CENTER             (1ULL << AV_CHAN_TOP_CENTER           )
> +#define AV_CH_TOP_FRONT_LEFT         (1ULL << AV_CHAN_TOP_FRONT_LEFT       )
> +#define AV_CH_TOP_FRONT_CENTER       (1ULL << AV_CHAN_TOP_FRONT_CENTER     )
> +#define AV_CH_TOP_FRONT_RIGHT        (1ULL << AV_CHAN_TOP_FRONT_RIGHT      )
> +#define AV_CH_TOP_BACK_LEFT          (1ULL << AV_CHAN_TOP_BACK_LEFT        )
> +#define AV_CH_TOP_BACK_CENTER        (1ULL << AV_CHAN_TOP_BACK_CENTER      )
> +#define AV_CH_TOP_BACK_RIGHT         (1ULL << AV_CHAN_TOP_BACK_RIGHT       )
> +#define AV_CH_STEREO_LEFT            (1ULL << AV_CHAN_STEREO_LEFT          )
> +#define AV_CH_STEREO_RIGHT           (1ULL << AV_CHAN_STEREO_RIGHT         )
> +#define AV_CH_WIDE_LEFT              (1ULL << AV_CHAN_WIDE_LEFT            )
> +#define AV_CH_WIDE_RIGHT             (1ULL << AV_CHAN_WIDE_RIGHT           )
> +#define AV_CH_SURROUND_DIRECT_LEFT   (1ULL << AV_CHAN_SURROUND_DIRECT_LEFT )
> +#define AV_CH_SURROUND_DIRECT_RIGHT  (1ULL << AV_CHAN_SURROUND_DIRECT_RIGHT)
> +#define AV_CH_LOW_FREQUENCY_2        (1ULL << AV_CHAN_LOW_FREQUENCY_2      )
>  
> +#if FF_API_OLD_CHANNEL_LAYOUT
>  /** Channel mask value used for AVCodecContext.request_channel_layout
>      to indicate that the user requests the channel order of the decoder output
> -    to be the native codec channel order. */
> +    to be the native codec channel order.
> +    @deprecated channel order is now indicated in a special field in
> +                AVChannelLayout
> +    */
>  #define AV_CH_LAYOUT_NATIVE          0x8000000000000000ULL
> +#endif
>  
>  /**
>   * @}
> @@ -122,6 +184,139 @@ enum AVMatrixEncoding {
>      AV_MATRIX_ENCODING_NB
>  };
>  
> +/**
> + * @}
> + */
> +
> +/**
> + * An AVChannelLayout holds information about the channel layout of audio data.
> + *
> + * A channel layout here is defined as a set of channels ordered in a specific
> + * way (unless the channel order is AV_CHANNEL_ORDER_UNSPEC, in which case an
> + * AVChannelLayout carries only the channel count).
> + *

> + * Unlike most structures in Libav, sizeof(AVChannelLayout) is a part of the
> + * public ABI and may be used by the caller. E.g. it may be allocated on stack.
> + * In particular, this structure can be initialized as follows:
> + * - default initialization with {0} or by setting all used fields correctly
> + * - with predefined layout as initializer (AV_CHANNEL_LAYOUT_STEREO, etc.)
> + * - with a constructor function such as av_channel_layout_default()
> + * On that note, this also applies:
> + * - copy via assigning is forbidden, av_channel_layout_copy() must be used
> + *   instead (and its return value should be checked)
> + * - if order is AV_CHANNEL_ORDER_CUSTOM, then it must be uninitialized with
> + *   av_channel_layout_uninit().
> + *
> + * No new fields may be added to it without a major version bump, except for
> + * new elements of the union fitting in sizeof(uint64_t).

[ Already said on 2019-10-30 ]
This is a significant difference from the other APIs, and we need to be
really sure we can live with it.

I am rather for this version, but I am afraid we will find cases that
need extending the structure.

I suggest we explicitly declare this API unstable for a few months,
during which we consider acceptable to add fields.

> + *
> + * An AVChannelLayout can be constructed using the convenience function
> + * av_channel_layout_from_mask() / av_channel_layout_from_string(), or it can be
> + * built manually by the caller.
> + */
> +typedef struct AVChannelLayout {
> +    /**
> +     * Channel order used in this layout.
> +     * This is a mandatory field, will default to AV_CHANNEL_ORDER_NATIVE.
> +     */
> +    enum AVChannelOrder order;
> +
> +    /**
> +     * Number of channels in this layout. Mandatory field.
> +     */

> +    int nb_channels;

[ Already said on 2019-10-30 ]
unsigned

> +
> +    /**
> +     * Details about which channels are present in this layout.
> +     * For AV_CHANNEL_ORDER_UNSPEC, this field is undefined and must not be
> +     * used.
> +     */
> +    union {
> +        /**
> +         * This member must be used for AV_CHANNEL_ORDER_NATIVE.
> +         * It is a bitmask, where the position of each set bit means that the
> +         * AVChannel with the corresponding value is present.
> +         *
> +         * I.e. when (mask & (1 << AV_CHAN_FOO)) is non-zero, then AV_CHAN_FOO
> +         * is present in the layout. Otherwise it is not present.
> +         *
> +         * @note when a channel layout using a bitmask is constructed or
> +         * modified manually (i.e.  not using any of the av_channel_layout_*
> +         * functions), the code doing it must ensure that the number of set bits
> +         * is equal to nb_channels.
> +         */
> +        uint64_t mask;
> +        /**
> +         * This member must be used when the channel order is
> +         * AV_CHANNEL_ORDER_CUSTOM. It is a nb_channels-sized array, with each
> +         * element signalling the presend of the AVChannel with the
> +         * corresponding value.
> +         *
> +         * I.e. when map[i] is equal to AV_CHAN_FOO, then AV_CH_FOO is the i-th
> +         * channel in the audio data.
> +         */

> +        enum AVChannel *map;

I suggest:

    typedef struct AVChannelMapped {
        enum AVChannel id;
        const char *descr;
    } AVChannelMapped;

    AVChannelMapped *map;

That way, the application can provide a description for each channel. It
is useful if the same AVChannel value is present several times.

> +    } u;
> +} AVChannelLayout;
> +

> +#define AV_CHANNEL_LAYOUT_MONO \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 1,  .u = { .mask = AV_CH_LAYOUT_MONO }}

[ Already said on 2019-10-30 ]
These macros can only be used to init structures. This is unusual for
macros in ALL_CAPS, and therefore it needs to be documented.

> +#define AV_CHANNEL_LAYOUT_STEREO \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 2,  .u = { .mask = AV_CH_LAYOUT_STEREO }}
> +#define AV_CHANNEL_LAYOUT_2POINT1 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 3,  .u = { .mask = AV_CH_LAYOUT_2POINT1 }}
> +#define AV_CHANNEL_LAYOUT_2_1 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 3,  .u = { .mask = AV_CH_LAYOUT_2_1 }}
> +#define AV_CHANNEL_LAYOUT_SURROUND \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 3,  .u = { .mask = AV_CH_LAYOUT_SURROUND }}
> +#define AV_CHANNEL_LAYOUT_3POINT1 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 4,  .u = { .mask = AV_CH_LAYOUT_3POINT1 }}
> +#define AV_CHANNEL_LAYOUT_4POINT0 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 4,  .u = { .mask = AV_CH_LAYOUT_4POINT0 }}
> +#define AV_CHANNEL_LAYOUT_4POINT1 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 5,  .u = { .mask = AV_CH_LAYOUT_4POINT1 }}
> +#define AV_CHANNEL_LAYOUT_2_2 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 4,  .u = { .mask = AV_CH_LAYOUT_2_2 }}
> +#define AV_CHANNEL_LAYOUT_QUAD \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 4,  .u = { .mask = AV_CH_LAYOUT_QUAD }}
> +#define AV_CHANNEL_LAYOUT_5POINT0 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 5,  .u = { .mask = AV_CH_LAYOUT_5POINT0 }}
> +#define AV_CHANNEL_LAYOUT_5POINT1 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 6,  .u = { .mask = AV_CH_LAYOUT_5POINT1 }}
> +#define AV_CHANNEL_LAYOUT_5POINT0_BACK \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 5,  .u = { .mask = AV_CH_LAYOUT_5POINT0_BACK }}
> +#define AV_CHANNEL_LAYOUT_5POINT1_BACK \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 6,  .u = { .mask = AV_CH_LAYOUT_5POINT1_BACK }}
> +#define AV_CHANNEL_LAYOUT_6POINT0 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 6,  .u = { .mask = AV_CH_LAYOUT_6POINT0 }}
> +#define AV_CHANNEL_LAYOUT_6POINT0_FRONT \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 6,  .u = { .mask = AV_CH_LAYOUT_6POINT0_FRONT }}
> +#define AV_CHANNEL_LAYOUT_HEXAGONAL \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 6,  .u = { .mask = AV_CH_LAYOUT_HEXAGONAL }}
> +#define AV_CHANNEL_LAYOUT_6POINT1 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 7,  .u = { .mask = AV_CH_LAYOUT_6POINT1 }}
> +#define AV_CHANNEL_LAYOUT_6POINT1_BACK \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 7,  .u = { .mask = AV_CH_LAYOUT_6POINT1_BACK }}
> +#define AV_CHANNEL_LAYOUT_6POINT1_FRONT \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 7,  .u = { .mask = AV_CH_LAYOUT_6POINT1_FRONT }}
> +#define AV_CHANNEL_LAYOUT_7POINT0 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 7,  .u = { .mask = AV_CH_LAYOUT_7POINT0 }}
> +#define AV_CHANNEL_LAYOUT_7POINT0_FRONT \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 7,  .u = { .mask = AV_CH_LAYOUT_7POINT0_FRONT }}
> +#define AV_CHANNEL_LAYOUT_7POINT1 \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 8,  .u = { .mask = AV_CH_LAYOUT_7POINT1 }}
> +#define AV_CHANNEL_LAYOUT_7POINT1_WIDE \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 8,  .u = { .mask = AV_CH_LAYOUT_7POINT1_WIDE }}
> +#define AV_CHANNEL_LAYOUT_7POINT1_WIDE_BACK \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 8,  .u = { .mask = AV_CH_LAYOUT_7POINT1_WIDE_BACK }}
> +#define AV_CHANNEL_LAYOUT_OCTAGONAL \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 8,  .u = { .mask = AV_CH_LAYOUT_OCTAGONAL }}
> +#define AV_CHANNEL_LAYOUT_HEXADECAGONAL \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 16, .u = { .mask = AV_CH_LAYOUT_HEXAGONAL }}
> +#define AV_CHANNEL_LAYOUT_STEREO_DOWNMIX \
> +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 2,  .u = { .mask = AV_CH_LAYOUT_STEREO_DOWNMIX }}
> +
> +#if FF_API_OLD_CHANNEL_LAYOUT
>  /**
>   * Return a channel layout id that matches name, or 0 if no match is found.
>   *
> @@ -138,7 +333,10 @@ enum AVMatrixEncoding {
>   *   AV_CH_* macros).
>   *
>   * Example: "stereo+FC" = "2c+FC" = "2c+1c" = "0x7"
> + *
> + * @deprecated use av_channel_layout_from_string()
>   */
> +attribute_deprecated
>  uint64_t av_get_channel_layout(const char *name);
>  
>  /**
> @@ -161,7 +359,9 @@ int av_get_extended_channel_layout(const char *name, uint64_t* channel_layout, i
>   *
>   * @param buf put here the string containing the channel layout
>   * @param buf_size size in bytes of the buffer
> + * @deprecated use av_channel_layout_describe()
>   */
> +attribute_deprecated
>  void av_get_channel_layout_string(char *buf, int buf_size, int nb_channels, uint64_t channel_layout);
>  
>  struct AVBPrint;
> @@ -172,12 +372,17 @@ void av_bprint_channel_layout(struct AVBPrint *bp, int nb_channels, uint64_t cha
>  
>  /**
>   * Return the number of channels in the channel layout.
> + * @deprecated use AVChannelLayout.nb_channels
>   */
> +attribute_deprecated
>  int av_get_channel_layout_nb_channels(uint64_t channel_layout);
>  
>  /**
>   * Return default channel layout for a given number of channels.
> + *
> + * @deprecated use av_channel_layout_default()
>   */
> +attribute_deprecated
>  int64_t av_get_default_channel_layout(int nb_channels);
>  
>  /**
> @@ -188,21 +393,179 @@ int64_t av_get_default_channel_layout(int nb_channels);
>   *
>   * @return index of channel in channel_layout on success, a negative AVERROR
>   *         on error.
> + *
> + * @deprecated use av_channel_layout_channel_index()
>   */
> +attribute_deprecated
>  int av_get_channel_layout_channel_index(uint64_t channel_layout,
>                                          uint64_t channel);
>  
>  /**
>   * Get the channel with the given index in channel_layout.
> + * @deprecated use av_channel_layout_get_channel()
>   */
> +attribute_deprecated
>  uint64_t av_channel_layout_extract_channel(uint64_t channel_layout, int index);
>  
>  /**
>   * Get the name of a given channel.
>   *
>   * @return channel name on success, NULL on error.
> + *
> + * @deprecated use av_channel_name()
>   */
> +attribute_deprecated
>  const char *av_get_channel_name(uint64_t channel);
> +#endif
> +
> +/**
> + * This is the inverse function of @ref av_channel_from_string().
> + *
> + * @return a string describing a given channel, "?" if not found.
> + */
> +const char *av_channel_name(enum AVChannel channel);
> +
> +/**
> + * This is the inverse function of @ref av_channel_name().
> + *
> + * @return a channel described by the given string, or a negative AVERROR value.
> + */
> +int av_channel_from_string(const char *name);
> +
> +/**
> + * Initialize a native channel layout from a bitmask indicating which channels
> + * are present.
> + *
> + * @note channel_layout should be properly allocated as described above.
> + *
> + * @param channel_layout the layout structure to be initialized
> + * @param mask bitmask describing the channel layout
> + *
> + * @return 0 on success
> + *         AVERROR(EINVAL) for invalid mask values
> + */
> +int av_channel_layout_from_mask(AVChannelLayout *channel_layout, uint64_t mask);
> +
> +/**
> + * Initialize a channel layout from a given string description.
> + * The input string can be represented by:
> + *  - the formal channel layout name (returned by av_channel_layout_describe())
> + *  - single or multiple channel names (returned by av_channel_name()
> + *    or concatenated with "|")
> + *  - a hexadecimal value of a channel layout (eg. "0x4")
> + *  - the number of channels with default layout (eg. "5")
> + *  - the number of unordered channels (eg. "4 channels")
> + *
> + * @note channel_layout should be properly allocated as described above.
> + *
> + * @param channel_layout input channel layout
> + * @param str string describing the channel layout
> + * @return 0 channel layout was detected, AVERROR_INVALIDATATA otherwise
> + */
> +int av_channel_layout_from_string(AVChannelLayout *channel_layout,
> +                                  const char *str);
> +
> +/**
> + * Get the default channel layout for a given number of channels.
> + *
> + * @note channel_layout should be properly allocated as described above.
> + *
> + * @param channel_layout the layout structure to be initialized
> + * @param nb_channels number of channels
> + */

> +void av_channel_layout_default(AVChannelLayout *ch_layout, int nb_channels);

[ Already said on 2019-10-30 ]
unsigned nb_channels

> +
> +/**
> + * Free any allocated data in the channel layout and reset the channel
> + * count to 0.
> + *
> + * @note this only used for structure initialization and for freeing the
> + *       allocated memory for AV_CHANNEL_ORDER_CUSTOM order.
> + *
> + * @param channel_layout the layout structure to be uninitialized
> + */
> +void av_channel_layout_uninit(AVChannelLayout *channel_layout);
> +
> +/**
> + * Make a copy of a channel layout. This differs from just assigning src to dst
> + * in that it allocates and copies the map for AV_CHANNEL_ORDER_CUSTOM.
> + *
> + * @note the destination channel_layout will be always uninitialized before copy.
> + *
> + * @param dst destination channel layout
> + * @param src source channel layout
> + * @return 0 on success, a negative AVERROR on error.
> + */
> +int av_channel_layout_copy(AVChannelLayout *dst, const AVChannelLayout *src);
> +

> +/**
> + * Get a human-readable string describing the channel layout properties.
> + *
> + * @note The returned string is allocated with av_malloc(),
> + *       and must be freed by the caller with av_free().
> + *
> + * @param channel_layout channel layout to be described
> + * @return a string describing the structure or NULL on failure in the same
> + *         format that is accepted by @ref av_channel_layout_from_string().
> + */

> +char *av_channel_layout_describe(const AVChannelLayout *channel_layout);

The usual function that returns a string. I have always found these
functions very clumsy. They either take a buffer and size, which is
simple but very inconvenient when the resulting string can be long, or
they do memory allocations and require annoying error checks. Both are
inefficient because the returned string will probably be immediately
copied into another buffer or written somewhere.

I have been thinking about a better interface for a long time, and I
recently found the right way to do it. I implemented a first draft, it
works. And it also makes the implementation of this function simpler.

I propose you this:

void av_channel_layout_write(AVWriter wr, const AVChannelLayout *channel_layout);

It can be used several ways:

    char buf[arbitrary];
    av_channel_layout_write(av_buf_writer_array(buf), layout);

Or:

    AVWriter wr = av_dynbuf_writer();
    av_channel_layout_write(wr, layout);
    ret = av_dynbuf_writer_finalize(wr, &data, &size);

Or several other possibilities to write directly to av_log() or to
user-supplied callbacks.

> +
> +/**
> + * Get the channel with the given index in a channel layout.
> + *
> + * @param channel_layout input channel layout
> + * @return channel with the index idx in channel_layout on success or a negative
> + *                 AVERROR on failure (if idx is not valid or the channel order
> + *                 is unspecified)
> + */

> +int av_channel_layout_get_channel(const AVChannelLayout *channel_layout, int idx);

[ Already said on 2019-10-30 ]
unsigned idx

But do we really need this as part of the API?

[ New ]

On the other hand, I am pretty sure we will need a function to let the
user specify one channel within the layout. This is important with this
new API, since the channel name is not enough: since channels can be
duplicated, the name of a channel is not enough to identify it. We need
something to express "the third front-left channel" for example.

I suggest

int av_channel_layout_find_channel(const AVChannelLayout *channel_layout, 
                                   const char *name);

where name could be "FL" if there is only one, "c7" for the channel
number 7 whatever the layout, "FL3" for the third FL, and possibly a
name that appears in AVChannelMapped.descr.

> +
> +/**
> + * Get the index of a given channel in a channel layout. In case multiple
> + * channels are found, only the first match will be returned.
> + *
> + * @param channel_layout input channel layout
> + * @return index of channel in channel_layout on success or a negative number if
> + *         channel is not present in channel_layout.
> + */

> +int av_channel_layout_channel_index(const AVChannelLayout *channel_layout,
> +                                    enum AVChannel channel);

[ Already said on 2019-10-30 ]
Same: do we really need this?

> +
> +/**
> + * Find out what channels from a given set are present in a channel layout,
> + * without regard for their positions.
> + *
> + * @param channel_layout input channel layout
> + * @param mask a combination of AV_CH_* representing a set of channels
> + * @return a bitfield representing all the channels from mask that are present
> + *         in channel_layout
> + */

> +uint64_t av_channel_layout_subset(const AVChannelLayout *channel_layout,
> +                                  uint64_t mask);

[ Already said on 2019-10-30 ]
Same.

> +
> +/**
> + * Check whether a channel layout is valid, i.e. can possibly describe audio
> + * data.
> + *
> + * @param channel_layout input channel layout
> + * @return 1 if channel_layout is valid, 0 otherwise.
> + */
> +int av_channel_layout_check(const AVChannelLayout *channel_layout);
> +
> +/**
> + * Check whether two channel layouts are semantically the same, i.e. the same
> + * channels are present on the same positions in both.
> + *
> + * If one of the channel layouts is AV_CHANNEL_ORDER_UNSPEC, while the other is
> + * not, they are considered to be unequal. If both are AV_CHANNEL_ORDER_UNSPEC,
> + * they are considered equal iff the channel counts are the same in both.
> + *
> + * @param chl input channel layout
> + * @param chl1 input channel layout
> + * @return 0 if chl and chl1 are equal, 1 if they are not equal. A negative
> + *         AVERROR code if one or both are invalid.
> + */
> +int av_channel_layout_compare(const AVChannelLayout *chl, const AVChannelLayout *chl1);
>  
>  /**
>   * Get the description of a given channel.
> diff --git a/libavutil/version.h b/libavutil/version.h
> index e18163388d..0aa669fd16 100644
> --- a/libavutil/version.h
> +++ b/libavutil/version.h
> @@ -129,6 +129,9 @@
>  #ifndef FF_API_PSEUDOPAL
>  #define FF_API_PSEUDOPAL                (LIBAVUTIL_VERSION_MAJOR < 57)
>  #endif
> +#ifndef FF_API_OLD_CHANNEL_LAYOUT
> +#define FF_API_OLD_CHANNEL_LAYOUT       (LIBAVUTIL_VERSION_MAJOR < 57)
> +#endif
>  
>  
>  /**

Regards,

Michael Niedermayer Dec. 8, 2019, 11:21 p.m. UTC | #2

On Sat, Dec 07, 2019 at 09:25:53PM +0100, Nicolas George wrote:
> Anton Khirnov (12019-12-06):
> > The new API is more extensible and allows for custom layouts.
> > More accurate information is exported, eg for decoders that do not
> > set a channel layout, lavc will not make one up for them.
> > 
> > Deprecate the old API working with just uint64_t bitmasks.
> > 
> > Expanded and completed by Vittorio Giovara <vittorio.giovara@gmail.com>.
> > Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
> > ---
> >  libavutil/channel_layout.c | 393 +++++++++++++++++++++++++++++------
> >  libavutil/channel_layout.h | 415 ++++++++++++++++++++++++++++++++++---
> >  libavutil/version.h        |   3 +
> >  3 files changed, 719 insertions(+), 92 deletions(-)
> 
> Thanks for sending this here. I will make a few preliminary remarks
> below.
[...]
> > + * Unlike most structures in Libav, sizeof(AVChannelLayout) is a part of the
> > + * public ABI and may be used by the caller. E.g. it may be allocated on stack.
> > + * In particular, this structure can be initialized as follows:
> > + * - default initialization with {0} or by setting all used fields correctly
> > + * - with predefined layout as initializer (AV_CHANNEL_LAYOUT_STEREO, etc.)
> > + * - with a constructor function such as av_channel_layout_default()
> > + * On that note, this also applies:
> > + * - copy via assigning is forbidden, av_channel_layout_copy() must be used
> > + *   instead (and its return value should be checked)
> > + * - if order is AV_CHANNEL_ORDER_CUSTOM, then it must be uninitialized with
> > + *   av_channel_layout_uninit().
> > + *
> > + * No new fields may be added to it without a major version bump, except for
> > + * new elements of the union fitting in sizeof(uint64_t).
> 
> [ Already said on 2019-10-30 ]
> This is a significant difference from the other APIs, and we need to be
> really sure we can live with it.
> 
> I am rather for this version, but I am afraid we will find cases that
> need extending the structure.
> 
> I suggest we explicitly declare this API unstable for a few months,
> during which we consider acceptable to add fields.
> 

> > + *
> > + * An AVChannelLayout can be constructed using the convenience function
> > + * av_channel_layout_from_mask() / av_channel_layout_from_string(), or it can be
> > + * built manually by the caller.
> > + */
> > +typedef struct AVChannelLayout {
> > +    /**
> > +     * Channel order used in this layout.
> > +     * This is a mandatory field, will default to AV_CHANNEL_ORDER_NATIVE.
> > +     */
> > +    enum AVChannelOrder order;
> > +
> > +    /**
> > +     * Number of channels in this layout. Mandatory field.
> > +     */
> 
> > +    int nb_channels;
> 
> [ Already said on 2019-10-30 ]
> unsigned

One problem with unsigned is that in expressions you cannot have negative
values, nor can you compare to negative values without casting to signed.
That has some risk for producing unexpected behavior bugs

for example 
if (ch >= s->nb_channels) {
    ...
} else if (ch < 0)
    ...

would not work as expected

[...]

Nicolas George Dec. 8, 2019, 11:36 p.m. UTC | #3

Michael Niedermayer (12019-12-09):
> One problem with unsigned is that in expressions you cannot have negative
> values, nor can you compare to negative values without casting to signed.
> That has some risk for producing unexpected behavior bugs
> 
> for example 
> if (ch >= s->nb_channels) {
>     ...
> } else if (ch < 0)
>     ...
> 
> would not work as expected

I do not see it as a problem, I see it as exactly what we want. If a
value cannot meaningfully be negative, there is no sense in wasting time
and code allowing it to be negative and then testing it.

In the above code, ch should be unsigned too. Or, if it has a good
reason to be signed (negative values meaning something else?), test them
first.

Regards,

Michael Niedermayer Dec. 9, 2019, 6:23 p.m. UTC | #4

On Mon, Dec 09, 2019 at 12:36:11AM +0100, Nicolas George wrote:
> Michael Niedermayer (12019-12-09):
> > One problem with unsigned is that in expressions you cannot have negative
> > values, nor can you compare to negative values without casting to signed.
> > That has some risk for producing unexpected behavior bugs
> > 
> > for example 
> > if (ch >= s->nb_channels) {
> >     ...
> > } else if (ch < 0)
> >     ...
> > 
> > would not work as expected
> 
> I do not see it as a problem, I see it as exactly what we want. If a
> value cannot meaningfully be negative, there is no sense in wasting time
> and code allowing it to be negative and then testing it.
> 
> In the above code, ch should be unsigned too. Or, if it has a good
> reason to be signed (negative values meaning something else?), test them
> first.

mixing unsigned and signed int of course works if one is aware of
* what is signed, what is unsigned
* exact semantics of expressions mixing them
* does not miss any corner cases

OTOH if everything is signed, then the developer does not need to worry
about these things, and its easier to remember "all is signed int" vs.
"these specific fields are unsigned"

What i meant really is
"Its easier to maintain code that is all int, instead of code mixing
 signed and unsigned int"
Sometimes we need unsigned and thats fine, i think though when we dont
"need" it, its better to use plain int. 

Just my oppinion, not objecting to anything

Thanks

[...]

Nicolas George Dec. 9, 2019, 7:34 p.m. UTC | #5

Michael Niedermayer (12019-12-09):
> mixing unsigned and signed int of course works if one is aware of
> * what is signed, what is unsigned
> * exact semantics of expressions mixing them
> * does not miss any corner cases

I think it is reasonable to demand from FFmpeg developers that they know
enough C to be at ease with this.

And for remembering exactly and not missing corner cases, there are
compiler warnings. IIRC they are not enabled in FFmpeg, but they do lead
to a much better code hygiene.

> OTOH if everything is signed, then the developer does not need to worry
> about these things, and its easier to remember "all is signed int" vs.
> "these specific fields are unsigned"

Indeed. Instead, if everything is signed, they have to worry about the
many undefined behaviors attached with signed. Plus the risks that are
not related to UB, like checking that an array index is small enough but
forgetting to check non-negative.

Considering the time you spent recently "fixing" signed integer
overflows, I suspect you may be receptive to the fact that unsigned
arithmetic is entirely specified.

Regards,

Anton Khirnov Dec. 10, 2019, 11:57 a.m. UTC | #6

Quoting Nicolas George (2019-12-09 20:34:45)
> Michael Niedermayer (12019-12-09):
> > mixing unsigned and signed int of course works if one is aware of
> > * what is signed, what is unsigned
> > * exact semantics of expressions mixing them
> > * does not miss any corner cases
> 
> I think it is reasonable to demand from FFmpeg developers that they know
> enough C to be at ease with this.

I disagree. I think I have a fair amount of skill and experience as a C
developer, but I still get hit by those issues frequently. It's extra
trouble for only theoretical gain.

Will reply to the rest of your remarks later.

Nicolas George Dec. 17, 2019, 7:20 p.m. UTC | #7

Hi.

Anton Khirnov (12019-12-10):
> I disagree. I think I have a fair amount of skill and experience as a C
> developer, but I still get hit by those issues frequently. It's extra
> trouble for only theoretical gain.

I concede this to you, except for the last sentence: I do not agree that
the gain is only theoretical. We only need to look at the number of
recent commits fixing possible integer overflows. Signed arithmetic is
tricky, possibly as tricky as the traps of mixing signed and unsigned.
The reason we are seeing this only now is because compilers have only
recently started to optimize these undefined behaviors aggressively. And
we have only recently started to care. But these are real traps that do
not happen with unsigned.

I would advise everybody to adopt the policy of always using unsigned
unless there is a good reason to use signed, instead of the default now
which is the opposite.

Of course, for your own code and everybody else, I would not block the
patch if my advice is not followed. But this is public API, it should be
clean, so I will insist a little more.

Using the logical type has practical benefits. One is that it guarantees
to the API users that negative values cannot happen, need not to be
tested. With an unsigned, the value can be injected in size and pointer
arithmetic directly. With a signed, the API user need either to add an
useless test or to blindly trust the library. Blindly trusting a piece
of code should always put people ill at ease.

Regards,

Anton Khirnov Dec. 29, 2019, 3:05 p.m. UTC | #8

Quoting Nicolas George (2019-12-17 20:20:00)
> Hi.
> 
> Anton Khirnov (12019-12-10):
> > I disagree. I think I have a fair amount of skill and experience as a C
> > developer, but I still get hit by those issues frequently. It's extra
> > trouble for only theoretical gain.
> 
> I concede this to you, except for the last sentence: I do not agree that
> the gain is only theoretical. We only need to look at the number of
> recent commits fixing possible integer overflows. Signed arithmetic is
> tricky, possibly as tricky as the traps of mixing signed and unsigned.
> The reason we are seeing this only now is because compilers have only
> recently started to optimize these undefined behaviors aggressively. And
> we have only recently started to care. But these are real traps that do
> not happen with unsigned.

Maybe I missed something, but I am not aware of the UB-ness of signed
overflow being a practical problem. Typically, your computation will
return a meaningless result. You would get a similarly meaningless
result from the analogous perfectly well-defined unsigned computation.

to be clear: I am not objecting against fixing UB, but clarifying my
'theoretical gain' comment above.

> 
> I would advise everybody to adopt the policy of always using unsigned
> unless there is a good reason to use signed, instead of the default now
> which is the opposite.
> 
> Of course, for your own code and everybody else, I would not block the
> patch if my advice is not followed. But this is public API, it should be
> clean, so I will insist a little more.

I would agree to that only if it was done consistently across the entire
API. Since the channel count tends to be used in expressions with other
values which are signed (e.g. sample rate). Which means that the users
will have to mix signed and unsigned, making things harder for them.

> 
> Using the logical type has practical benefits. One is that it guarantees
> to the API users that negative values cannot happen, need not to be
> tested. With an unsigned, the value can be injected in size and pointer
> arithmetic directly. With a signed, the API user need either to add an
> useless test or to blindly trust the library. Blindly trusting a piece
> of code should always put people ill at ease.

The guarantee that it cannot be negative in valid cases is already
present, there is no need for users to test that explicitly. The only
conceivable case where it will be negative is a bug in the code
populating the channel layout. Making the value unsigned would not make
such bugs go away, it would only make the value absurdly large instead
of negative.

Anton Khirnov Dec. 29, 2019, 4:39 p.m. UTC | #9

Quoting Nicolas George (2019-12-07 21:25:53)
> Anton Khirnov (12019-12-06):
> > The new API is more extensible and allows for custom layouts.
> > More accurate information is exported, eg for decoders that do not
> > set a channel layout, lavc will not make one up for them.
> > 
> > Deprecate the old API working with just uint64_t bitmasks.
> > 
> > Expanded and completed by Vittorio Giovara <vittorio.giovara@gmail.com>.
> > Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
> > ---
> >  libavutil/channel_layout.c | 393 +++++++++++++++++++++++++++++------
> >  libavutil/channel_layout.h | 415 ++++++++++++++++++++++++++++++++++---
> >  libavutil/version.h        |   3 +
> >  3 files changed, 719 insertions(+), 92 deletions(-)
> 
> Thanks for sending this here. I will make a few preliminary remarks
> below.
> 
> Before that, I want to mention I think the real trial for this API will
> be integration with libavfilter, especially the format negotiation and
> the user interface of the filters that allow fine control of channels.
> The negotiation, in particular, promises to be an interesting
> algorithmic problem. By "real trial", I mean that it is very possible
> that trying to implement that will make us realize some details need to
> be changed. I have these questions in mind while writing my comments.
> 
> Two important comments to consider, because they affect the structure of
> the API itself:
> 
> - about av_channel_layout_describe(), a general reflection on functions
>   that return strings;
> 
> - about getting a channel by name.
> 

> 
> > +int av_channel_layout_from_mask(AVChannelLayout *channel_layout,
> > +                                uint64_t mask)
> 
> I find that "channel_layout" all over the place makes the code bulky,
> harder to read. And tiring to write. We could decide on a standard
> shortening for it everywhere. For example chlayout.

In the API namespace (function names) or the parameter names? For the
latter, it can be changed at any time without problem and I don't really
care much. For the former, the header is called channel_layout and I'd
lean towards keeping that aligned with the namespace. Not a very strong
opinion though.

> > +enum AVChannel {
> > +    AV_CHAN_FRONT_LEFT,
> > +    AV_CHAN_FRONT_RIGHT,
> > +    AV_CHAN_FRONT_CENTER,
> > +    AV_CHAN_LOW_FREQUENCY,
> > +    AV_CHAN_BACK_LEFT,
> > +    AV_CHAN_BACK_RIGHT,
> > +    AV_CHAN_FRONT_LEFT_OF_CENTER,
> > +    AV_CHAN_FRONT_RIGHT_OF_CENTER,
> > +    AV_CHAN_BACK_CENTER,
> > +    AV_CHAN_SIDE_LEFT,
> > +    AV_CHAN_SIDE_RIGHT,
> > +    AV_CHAN_TOP_CENTER,
> > +    AV_CHAN_TOP_FRONT_LEFT,
> > +    AV_CHAN_TOP_FRONT_CENTER,
> > +    AV_CHAN_TOP_FRONT_RIGHT,
> > +    AV_CHAN_TOP_BACK_LEFT,
> > +    AV_CHAN_TOP_BACK_CENTER,
> > +    AV_CHAN_TOP_BACK_RIGHT,
> > +    /** Stereo downmix. */
> > +    AV_CHAN_STEREO_LEFT = 29,
> > +    /** See above. */
> > +    AV_CHAN_STEREO_RIGHT,
> > +    AV_CHAN_WIDE_LEFT,
> > +    AV_CHAN_WIDE_RIGHT,
> > +    AV_CHAN_SURROUND_DIRECT_LEFT,
> > +    AV_CHAN_SURROUND_DIRECT_RIGHT,
> > +    AV_CHAN_LOW_FREQUENCY_2,
> > +
> > +    /** Channel is empty can be safely skipped. */
> > +    AV_CHAN_SILENCE = 64,
> > +};
> > +
> > +enum AVChannelOrder {
> > +    /**
> > +     * The native channel order, i.e. the channels are in the same order in
> > +     * which they are defined in the AVChannel enum. This supports up to 63
> > +     * different channels.
> > +     */
> > +    AV_CHANNEL_ORDER_NATIVE,
> > +    /**
> > +     * The channel order does not correspond to any other predefined order and
> > +     * is stored as an explicit map. For example, this could be used to support
> 
> > +     * layouts with 64 or more channels, or with channels that could be skipped.
> 
> [ Already said on 2019-10-30 ]
> "or with channels that could be skipped"? What does that mean. If a
> channel is skipped, it is not present, that can be expressed in the
> mask. Is this a mistake for "duplicated"?

Hmm, this was apparently added by Vittorio so I'm not sure, but would
assume it refers to AV_CHAN_SILENCE.

> 
> > +     */
> > +    AV_CHANNEL_ORDER_CUSTOM,
> > +    /**
> > +     * Only the channel count is specified, without any further information
> > +     * about the channel order.
> > +     */
> > +    AV_CHANNEL_ORDER_UNSPEC,
> > +};
> > +
> > +
> >  /**
> >   * @defgroup channel_masks Audio channel masks
> >   *
> > @@ -46,36 +103,41 @@
> >   *
> >   * @{
> >   */
> > -#define AV_CH_FRONT_LEFT             0x00000001
> > -#define AV_CH_FRONT_RIGHT            0x00000002
> > -#define AV_CH_FRONT_CENTER           0x00000004
> > -#define AV_CH_LOW_FREQUENCY          0x00000008
> > -#define AV_CH_BACK_LEFT              0x00000010
> > -#define AV_CH_BACK_RIGHT             0x00000020
> > -#define AV_CH_FRONT_LEFT_OF_CENTER   0x00000040
> > -#define AV_CH_FRONT_RIGHT_OF_CENTER  0x00000080
> > -#define AV_CH_BACK_CENTER            0x00000100
> > -#define AV_CH_SIDE_LEFT              0x00000200
> > -#define AV_CH_SIDE_RIGHT             0x00000400
> > -#define AV_CH_TOP_CENTER             0x00000800
> > -#define AV_CH_TOP_FRONT_LEFT         0x00001000
> > -#define AV_CH_TOP_FRONT_CENTER       0x00002000
> > -#define AV_CH_TOP_FRONT_RIGHT        0x00004000
> > -#define AV_CH_TOP_BACK_LEFT          0x00008000
> > -#define AV_CH_TOP_BACK_CENTER        0x00010000
> > -#define AV_CH_TOP_BACK_RIGHT         0x00020000
> > -#define AV_CH_STEREO_LEFT            0x20000000  ///< Stereo downmix.
> > -#define AV_CH_STEREO_RIGHT           0x40000000  ///< See AV_CH_STEREO_LEFT.
> > -#define AV_CH_WIDE_LEFT              0x0000000080000000ULL
> > -#define AV_CH_WIDE_RIGHT             0x0000000100000000ULL
> > -#define AV_CH_SURROUND_DIRECT_LEFT   0x0000000200000000ULL
> > -#define AV_CH_SURROUND_DIRECT_RIGHT  0x0000000400000000ULL
> > -#define AV_CH_LOW_FREQUENCY_2        0x0000000800000000ULL
> > +#define AV_CH_FRONT_LEFT             (1ULL << AV_CHAN_FRONT_LEFT           )
> > +#define AV_CH_FRONT_RIGHT            (1ULL << AV_CHAN_FRONT_RIGHT          )
> > +#define AV_CH_FRONT_CENTER           (1ULL << AV_CHAN_FRONT_CENTER         )
> > +#define AV_CH_LOW_FREQUENCY          (1ULL << AV_CHAN_LOW_FREQUENCY        )
> > +#define AV_CH_BACK_LEFT              (1ULL << AV_CHAN_BACK_LEFT            )
> > +#define AV_CH_BACK_RIGHT             (1ULL << AV_CHAN_BACK_RIGHT           )
> > +#define AV_CH_FRONT_LEFT_OF_CENTER   (1ULL << AV_CHAN_FRONT_LEFT_OF_CENTER )
> > +#define AV_CH_FRONT_RIGHT_OF_CENTER  (1ULL << AV_CHAN_FRONT_RIGHT_OF_CENTER)
> > +#define AV_CH_BACK_CENTER            (1ULL << AV_CHAN_BACK_CENTER          )
> > +#define AV_CH_SIDE_LEFT              (1ULL << AV_CHAN_SIDE_LEFT            )
> > +#define AV_CH_SIDE_RIGHT             (1ULL << AV_CHAN_SIDE_RIGHT           )
> > +#define AV_CH_TOP_CENTER             (1ULL << AV_CHAN_TOP_CENTER           )
> > +#define AV_CH_TOP_FRONT_LEFT         (1ULL << AV_CHAN_TOP_FRONT_LEFT       )
> > +#define AV_CH_TOP_FRONT_CENTER       (1ULL << AV_CHAN_TOP_FRONT_CENTER     )
> > +#define AV_CH_TOP_FRONT_RIGHT        (1ULL << AV_CHAN_TOP_FRONT_RIGHT      )
> > +#define AV_CH_TOP_BACK_LEFT          (1ULL << AV_CHAN_TOP_BACK_LEFT        )
> > +#define AV_CH_TOP_BACK_CENTER        (1ULL << AV_CHAN_TOP_BACK_CENTER      )
> > +#define AV_CH_TOP_BACK_RIGHT         (1ULL << AV_CHAN_TOP_BACK_RIGHT       )
> > +#define AV_CH_STEREO_LEFT            (1ULL << AV_CHAN_STEREO_LEFT          )
> > +#define AV_CH_STEREO_RIGHT           (1ULL << AV_CHAN_STEREO_RIGHT         )
> > +#define AV_CH_WIDE_LEFT              (1ULL << AV_CHAN_WIDE_LEFT            )
> > +#define AV_CH_WIDE_RIGHT             (1ULL << AV_CHAN_WIDE_RIGHT           )
> > +#define AV_CH_SURROUND_DIRECT_LEFT   (1ULL << AV_CHAN_SURROUND_DIRECT_LEFT )
> > +#define AV_CH_SURROUND_DIRECT_RIGHT  (1ULL << AV_CHAN_SURROUND_DIRECT_RIGHT)
> > +#define AV_CH_LOW_FREQUENCY_2        (1ULL << AV_CHAN_LOW_FREQUENCY_2      )
> >  
> > +#if FF_API_OLD_CHANNEL_LAYOUT
> >  /** Channel mask value used for AVCodecContext.request_channel_layout
> >      to indicate that the user requests the channel order of the decoder output
> > -    to be the native codec channel order. */
> > +    to be the native codec channel order.
> > +    @deprecated channel order is now indicated in a special field in
> > +                AVChannelLayout
> > +    */
> >  #define AV_CH_LAYOUT_NATIVE          0x8000000000000000ULL
> > +#endif
> >  
> >  /**
> >   * @}
> > @@ -122,6 +184,139 @@ enum AVMatrixEncoding {
> >      AV_MATRIX_ENCODING_NB
> >  };
> >  
> > +/**
> > + * @}
> > + */
> > +
> > +/**
> > + * An AVChannelLayout holds information about the channel layout of audio data.
> > + *
> > + * A channel layout here is defined as a set of channels ordered in a specific
> > + * way (unless the channel order is AV_CHANNEL_ORDER_UNSPEC, in which case an
> > + * AVChannelLayout carries only the channel count).
> > + *
> 
> > + * Unlike most structures in Libav, sizeof(AVChannelLayout) is a part of the
> > + * public ABI and may be used by the caller. E.g. it may be allocated on stack.
> > + * In particular, this structure can be initialized as follows:
> > + * - default initialization with {0} or by setting all used fields correctly
> > + * - with predefined layout as initializer (AV_CHANNEL_LAYOUT_STEREO, etc.)
> > + * - with a constructor function such as av_channel_layout_default()
> > + * On that note, this also applies:
> > + * - copy via assigning is forbidden, av_channel_layout_copy() must be used
> > + *   instead (and its return value should be checked)
> > + * - if order is AV_CHANNEL_ORDER_CUSTOM, then it must be uninitialized with
> > + *   av_channel_layout_uninit().
> > + *
> > + * No new fields may be added to it without a major version bump, except for
> > + * new elements of the union fitting in sizeof(uint64_t).
> 
> [ Already said on 2019-10-30 ]
> This is a significant difference from the other APIs, and we need to be
> really sure we can live with it.
> 
> I am rather for this version, but I am afraid we will find cases that
> need extending the structure.
> 
> I suggest we explicitly declare this API unstable for a few months,
> during which we consider acceptable to add fields.

I am aware that it is different, but making the struct dynamic would
make using it significantly more cumbersome. Given how long we have
lived with the bitmask, I do not expect there to be a significant need
to add more fields to it.

Plus, it can still be extended by adding a new AV_CHANNEL_ORDER types
and a corresponding new union member.

> 
> > +
> > +    /**
> > +     * Details about which channels are present in this layout.
> > +     * For AV_CHANNEL_ORDER_UNSPEC, this field is undefined and must not be
> > +     * used.
> > +     */
> > +    union {
> > +        /**
> > +         * This member must be used for AV_CHANNEL_ORDER_NATIVE.
> > +         * It is a bitmask, where the position of each set bit means that the
> > +         * AVChannel with the corresponding value is present.
> > +         *
> > +         * I.e. when (mask & (1 << AV_CHAN_FOO)) is non-zero, then AV_CHAN_FOO
> > +         * is present in the layout. Otherwise it is not present.
> > +         *
> > +         * @note when a channel layout using a bitmask is constructed or
> > +         * modified manually (i.e.  not using any of the av_channel_layout_*
> > +         * functions), the code doing it must ensure that the number of set bits
> > +         * is equal to nb_channels.
> > +         */
> > +        uint64_t mask;
> > +        /**
> > +         * This member must be used when the channel order is
> > +         * AV_CHANNEL_ORDER_CUSTOM. It is a nb_channels-sized array, with each
> > +         * element signalling the presend of the AVChannel with the
> > +         * corresponding value.
> > +         *
> > +         * I.e. when map[i] is equal to AV_CHAN_FOO, then AV_CH_FOO is the i-th
> > +         * channel in the audio data.
> > +         */
> 
> > +        enum AVChannel *map;
> 
> I suggest:
> 
>     typedef struct AVChannelMapped {
>         enum AVChannel id;
>         const char *descr;
>     } AVChannelMapped;
> 
>     AVChannelMapped *map;
> 
> That way, the application can provide a description for each channel. It
> is useful if the same AVChannel value is present several times.

Given that a copy of this struct is embedded in every single frame, I'd
rather not add extensive dynamically allocated metadata to it. Those
should rather appear in some higher level API.

> 
> > +    } u;
> > +} AVChannelLayout;
> > +
> 
> > +#define AV_CHANNEL_LAYOUT_MONO \
> > +    { .order = AV_CHANNEL_ORDER_NATIVE, .nb_channels = 1,  .u = { .mask = AV_CH_LAYOUT_MONO }}
> 
> [ Already said on 2019-10-30 ]
> These macros can only be used to init structures. This is unusual for
> macros in ALL_CAPS, and therefore it needs to be documented.

They also can (and are, in the patchset) used as
(AVChannelLayout)AV_CHANNEL_LAYOUT_FOO

We might want to add a macro for that.

> > +
> > +/**
> > + * Free any allocated data in the channel layout and reset the channel
> > + * count to 0.
> > + *
> > + * @note this only used for structure initialization and for freeing the
> > + *       allocated memory for AV_CHANNEL_ORDER_CUSTOM order.
> > + *
> > + * @param channel_layout the layout structure to be uninitialized
> > + */
> > +void av_channel_layout_uninit(AVChannelLayout *channel_layout);
> > +
> > +/**
> > + * Make a copy of a channel layout. This differs from just assigning src to dst
> > + * in that it allocates and copies the map for AV_CHANNEL_ORDER_CUSTOM.
> > + *
> > + * @note the destination channel_layout will be always uninitialized before copy.
> > + *
> > + * @param dst destination channel layout
> > + * @param src source channel layout
> > + * @return 0 on success, a negative AVERROR on error.
> > + */
> > +int av_channel_layout_copy(AVChannelLayout *dst, const AVChannelLayout *src);
> > +
> 
> > +/**
> > + * Get a human-readable string describing the channel layout properties.
> > + *
> > + * @note The returned string is allocated with av_malloc(),
> > + *       and must be freed by the caller with av_free().
> > + *
> > + * @param channel_layout channel layout to be described
> > + * @return a string describing the structure or NULL on failure in the same
> > + *         format that is accepted by @ref av_channel_layout_from_string().
> > + */
> 
> > +char *av_channel_layout_describe(const AVChannelLayout *channel_layout);
> 
> The usual function that returns a string. I have always found these
> functions very clumsy. They either take a buffer and size, which is
> simple but very inconvenient when the resulting string can be long, or
> they do memory allocations and require annoying error checks. Both are
> inefficient because the returned string will probably be immediately
> copied into another buffer or written somewhere.
> 
> I have been thinking about a better interface for a long time, and I
> recently found the right way to do it. I implemented a first draft, it
> works. And it also makes the implementation of this function simpler.
> 
> I propose you this:
> 
> void av_channel_layout_write(AVWriter wr, const AVChannelLayout *channel_layout);
> 
> It can be used several ways:
> 
>     char buf[arbitrary];
>     av_channel_layout_write(av_buf_writer_array(buf), layout);
> 
> Or:
> 
>     AVWriter wr = av_dynbuf_writer();
>     av_channel_layout_write(wr, layout);
>     ret = av_dynbuf_writer_finalize(wr, &data, &size);
> 
> Or several other possibilities to write directly to av_log() or to
> user-supplied callbacks.

That still requires error checks and adds considerable complexity.

There is no hiding from the fact that the string size is not bounded, so
either you make up an arbitrary limit and hope it's enough (and have it
fail for special cases), or you do dynamic allocation with error checks.

> 
> > +
> > +/**
> > + * Get the channel with the given index in a channel layout.
> > + *
> > + * @param channel_layout input channel layout
> > + * @return channel with the index idx in channel_layout on success or a negative
> > + *                 AVERROR on failure (if idx is not valid or the channel order
> > + *                 is unspecified)
> > + */
> 
> > +int av_channel_layout_get_channel(const AVChannelLayout *channel_layout, int idx);
> 
> [ Already said on 2019-10-30 ]
> unsigned idx

ok

> 
> But do we really need this as part of the API?

Yes. The point of the API is to provide a logical mapping between
audio channels and their indices in a stream. That way, the callers that
only read the channel layout do not (typically) need to know how exactly
it is stored.

This function is the canonical way of answering the question 'what
semantics does n-th channel in this stream have?', so the callers don't
need to handle ORDER_NATIVE vs ORDER_CUSTOM, or do any messing with
bitmasks.

> 
> [ New ]
> 
> On the other hand, I am pretty sure we will need a function to let the
> user specify one channel within the layout. This is important with this
> new API, since the channel name is not enough: since channels can be
> duplicated, the name of a channel is not enough to identify it. We need
> something to express "the third front-left channel" for example.
> 
> I suggest
> 
> int av_channel_layout_find_channel(const AVChannelLayout *channel_layout, 
>                                    const char *name);
> 
> where name could be "FL" if there is only one, "c7" for the channel
> number 7 whatever the layout, "FL3" for the third FL, and possibly a
> name that appears in AVChannelMapped.descr.

I do not agree. Duplicated channels in a layout are expected to be a
fringe thing and how you handle them highly depends on the specific use
case. I expect a typical caller will want to disregard that possibility
and just take the first channel of each semantics.
So I do not believe a dedicated function for this makes sense. We could
always add something later though, if it turns out to be necessary.

> 
> > +
> > +/**
> > + * Get the index of a given channel in a channel layout. In case multiple
> > + * channels are found, only the first match will be returned.
> > + *
> > + * @param channel_layout input channel layout
> > + * @return index of channel in channel_layout on success or a negative number if
> > + *         channel is not present in channel_layout.
> > + */
> 
> > +int av_channel_layout_channel_index(const AVChannelLayout *channel_layout,
> > +                                    enum AVChannel channel);
> 
> [ Already said on 2019-10-30 ]
> Same: do we really need this?

Again, yes. E.g. you are decoding an audio stream with multiple channels
and want to know which is the 'front left' one. This function gives you
the canonical answer to that.

> 
> > +
> > +/**
> > + * Find out what channels from a given set are present in a channel layout,
> > + * without regard for their positions.
> > + *
> > + * @param channel_layout input channel layout
> > + * @param mask a combination of AV_CH_* representing a set of channels
> > + * @return a bitfield representing all the channels from mask that are present
> > + *         in channel_layout
> > + */
> 
> > +uint64_t av_channel_layout_subset(const AVChannelLayout *channel_layout,
> > +                                  uint64_t mask);
> 
> [ Already said on 2019-10-30 ]
> Same.

It turns out to be useful in flacenc, matroskaenc and mlpdec
conversions. And of course can be potentially useful to the callers.
Since we are not deprecating the AV_CH_FOO macros, I do not see a
problem with having it public.

Anton Khirnov Jan. 7, 2020, 6:34 p.m. UTC | #10

Quoting Nicolas George (2019-12-31 16:17:49)
> Anton Khirnov (12019-12-29):
> > In the API namespace (function names) or the parameter names? For the
> > latter, it can be changed at any time without problem and I don't really
> > care much. For the former, the header is called channel_layout and I'd
> > lean towards keeping that aligned with the namespace. Not a very strong
> > opinion though.
> 
> I meant both, but of course the names of the public symbols (and the name
> of the header, which is not set in marble) are what we are stuck with
> and requires careful planning. I do not insist on it, but it is
> something to consider.
> 
> In fact, I just noticed that you used chlayout at some places in the
> public API instead of channel_layout.

Yes, fields in structs where channel_layout already exists. I don't see
what else can be done there.

> 
> > Hmm, this was apparently added by Vittorio so I'm not sure, but would
> > assume it refers to AV_CHAN_SILENCE.
> 
> It will need to be clarified before it's done.

Right, explanation added.

> 
> > I am aware that it is different, but making the struct dynamic would
> > make using it significantly more cumbersome. Given how long we have
> > lived with the bitmask, I do not expect there to be a significant need
> > to add more fields to it.
> 
> The bitmask was constraining, but it was very simple: we can accept to
> be constrained as the cost of simplicity. This new API is not very
> simple: if it is constraining, the it is not worth it.

This API is the simplest way I could think of that achieves the desired
goals (bundling the channel count+layout together, allowing arbitrary
channel counts, ambisonic,...). Most things doable with the current API
are just as simple in the new one.

> 
> > Plus, it can still be extended by adding a new AV_CHANNEL_ORDER types
> > and a corresponding new union member.
> 
> Not all extensions can be done like that.
> 
> > Given that a copy of this struct is embedded in every single frame, I'd
> > rather not add extensive dynamically allocated metadata to it. Those
> > should rather appear in some higher level API.
> 
> What higher level API? Maybe we need to reconsider embedding the
> structure in every single frame, and possibly using reference counting
> instead.

Reference counting means dynamic allocation and a whole bunch of extra
complixity. I believe it is not worth it.

> 
> > They also can (and are, in the patchset) used as
> > (AVChannelLayout)AV_CHANNEL_LAYOUT_FOO
> > 
> > We might want to add a macro for that.
> 
> Ok. Or just mention it in the doc.
> 
> > That still requires error checks and adds considerable complexity.
> 
> This is not true, please have a look at the API I have proposed, since I
> have posted it on the mailing-list:
> 
> https://ffmpeg.org/pipermail/ffmpeg-devel/2019-December/254901.html
> 
> As you can see, all the complexity is inside the implementation. For the
> code that uses the API, it actually makes things simpler.
> 
> In particular, with your version, error checking must be done twice:
> once in av_channel_layout_describe() and once when calling it. With
> AVWriter, the first one is unnecessary.

And the caller gets an extra opportunity to receive an invalid truncated
result if he doesn't go out of his way to check the result.

> 
> > There is no hiding from the fact that the string size is not bounded, so
> > either you make up an arbitrary limit and hope it's enough (and have it
> > fail for special cases), or you do dynamic allocation with error checks.
> 
> All that is taken care of by the API and hidden from the user.
> 
> > Yes. The point of the API is to provide a logical mapping between
> > audio channels and their indices in a stream. That way, the callers that
> > only read the channel layout do not (typically) need to know how exactly
> > it is stored.
> > 
> > This function is the canonical way of answering the question 'what
> > semantics does n-th channel in this stream have?', so the callers don't
> > need to handle ORDER_NATIVE vs ORDER_CUSTOM, or do any messing with
> > bitmasks.
> 
> Ok. But my point is: is the question "what semantics does n-th channel
> in this stream have?" really relevant?
> 
> (Similar with strings: the question "what is the n-th character" is
> hardly ever relevant, but people keep asking it and implementing APIs to
> answer it.)

What questions about a channel layout would you consider relevant then?

A channel layout IS - by definition - a mapping from stream indices to
semantics. 'What semantics does the n-th channel have' is one of just
two fundamental questions you can ask about a layout (the other one is
'how many channels are there'), everything else is fluff.

> 
> > I do not agree. Duplicated channels in a layout are expected to be a
> > fringe thing and how you handle them highly depends on the specific use
> > case. I expect a typical caller will want to disregard that possibility
> > and just take the first channel of each semantics.
> > So I do not believe a dedicated function for this makes sense. We could
> > always add something later though, if it turns out to be necessary.
> 
> I think you are making a mistake. I think that as soon as it will be
> technically possible, we will see cases with duplicated channels. And I
> know that some filters will do exactly that as soon as they are ported
> to this new API.
> 
> Therefore, I insist, we need a clean user interface to handle duplicated
> channels.

Why would such filters exist and why would we accept them? I do not see
how can there be a clean user interface for a broken and undefined use
case.

> 
> > Again, yes. E.g. you are decoding an audio stream with multiple channels
> > and want to know which is the 'front left' one. This function gives you
> > the canonical answer to that.
> 
> Same as above: I do not insist, but I doubt that "which one is the front
> left" is a very interesting question.

See above.

Anton Khirnov Jan. 7, 2020, 6:38 p.m. UTC | #11

Quoting Nicolas George (2019-12-31 15:01:03)
> Anton Khirnov (12019-12-29):
> > Maybe I missed something, but I am not aware of the UB-ness of signed
> > overflow being a practical problem. Typically, your computation will
> > return a meaningless result. You would get a similarly meaningless
> > result from the analogous perfectly well-defined unsigned computation.
> > 
> > to be clear: I am not objecting against fixing UB, but clarifying my
> > 'theoretical gain' comment above.
> 
> It seems you have missed some of the drama that happened on this list
> recently. Michael and others have been intent on fixing the UB caused by
> integer overflows, including cases that I personally find futile.
> 
> But I do not consider this case futile: a channel number will typically
> be used as an array index, and an invalid value will cause an invalid
> memory access and a segmentation fault, or an exploitable memory
> corruption.
> 
> What makes signed overflows, which are UB, worse than unsigned
> overflows, which are completely specified, is that you cannot guard
> against it. For example, if your write:
> 
>         if (ch < 0 || ch >= nb_channels)
>             return AVERROR_BUG;
> 
> you think you are safe, but if the compiler detects that ch cannot be
> negative without overflow, it will silently discard the ch<0 test. Then,
> if the overflows does happen, there is no protection against invalid
> memory access.
> 
> And this is not theoretical: I have seen entire blogs and twitter
> accounts dedicated to posting examples of actual cases where an
> optimizing compiler produces very unintuitive code because there is a
> tiny UB in the middle.

How is it any better in the unsigned case? You do a well-defined
unsigned overflow and end up with an invalid channel count (which might
even look sane).

Nicolas George Jan. 12, 2020, 1:28 p.m. UTC | #12

Anton Khirnov (12020-01-07):
> How is it any better in the unsigned case? You do a well-defined
> unsigned overflow and end up with an invalid channel count (which might
> even look sane).

I explained this: you can guard against defined behaviors, not against
undefined behaviors, because the compilers are allowed to shunt the
checks, and they do.

Regards,

Nicolas George Jan. 12, 2020, 1:33 p.m. UTC | #13

Anton Khirnov (12020-01-07):
> This API is the simplest way I could think of that achieves the desired
> goals (bundling the channel count+layout together, allowing arbitrary
> channel counts, ambisonic,...). Most things doable with the current API
> are just as simple in the new one.

What I am saying is that the goals were not far enough.

> Reference counting means dynamic allocation and a whole bunch of extra
> complixity. I believe it is not worth it.

This new API is not worth it if it does not allow a proper user
interface for naming channels including when they are duplicated.

> And the caller gets an extra opportunity to receive an invalid truncated
> result if he doesn't go out of his way to check the result.

If they chose so, it is their problem. But let us discuss that part in
the other thread.

> What questions about a channel layout would you consider relevant then?
> 
> A channel layout IS - by definition - a mapping from stream indices to
> semantics. 'What semantics does the n-th channel have' is one of just
> two fundamental questions you can ask about a layout (the other one is
> 'how many channels are there'), everything else is fluff.

Ok.

> Why would such filters exist and why would we accept them? I do not see
> how can there be a clean user interface for a broken and undefined use
> case.

They are already there, they work very well, and people use them. Their
behavior is perfectly well defined, but to give them a good user
interface requires channel names and duplication. It's too bad you
invested a lot of work while forgetting about them.

Regards,

Anton Khirnov Jan. 12, 2020, 7:55 p.m. UTC | #14

Quoting Nicolas George (2020-01-12 14:28:06)
> Anton Khirnov (12020-01-07):
> > How is it any better in the unsigned case? You do a well-defined
> > unsigned overflow and end up with an invalid channel count (which might
> > even look sane).
> 
> I explained this: you can guard against defined behaviors, not against
> undefined behaviors, because the compilers are allowed to shunt the
> checks, and they do.

Your explanation does not make sense to me. Checking for negative values
is not guarding against overflow, it's "checking after the fact whether
overflow occurred". Any such checks, whether signed or unsigned, are
necessarily invalid and broken (hence the quotes). Guarding against
overflow must always be done by checking BEFORE the operation that might
overflow - again both for signed and unsigned.

From this angle, there is no difference between using signed and
unsigned values. The fact that in one case the overflow would have been
UB and the other wouldn't changes nothing here.

Anton Khirnov Jan. 13, 2020, 9:37 a.m. UTC | #15

Quoting Nicolas George (2020-01-12 14:33:19)
> Anton Khirnov (12020-01-07):
> > Why would such filters exist and why would we accept them? I do not see
> > how can there be a clean user interface for a broken and undefined use
> > case.
> 
> They are already there, they work very well, and people use them. Their
> behavior is perfectly well defined, but to give them a good user
> interface requires channel names and duplication. It's too bad you
> invested a lot of work while forgetting about them.

You still did not say which filters those are, why do they need to
create streams with duplicate channels, or for that matter how they can
even do so when the current API does not support it.

Nicolas George Jan. 14, 2020, 2:46 p.m. UTC | #16

Anton Khirnov (12020-01-12):
> Your explanation does not make sense to me. Checking for negative values
> is not guarding against overflow, it's "checking after the fact whether
> overflow occurred". Any such checks, whether signed or unsigned, are
> necessarily invalid and broken (hence the quotes). Guarding against
> overflow must always be done by checking BEFORE the operation that might
> overflow - again both for signed and unsigned.
> 
> From this angle, there is no difference between using signed and
> unsigned values. The fact that in one case the overflow would have been
> UB and the other wouldn't changes nothing here.

You are right: if the code is known to be 100% bug-free, then it makes
no difference. But even TeX's code is not known to be 100% bug-free. And
when there may be bugs, I think I have given ample proof that signed
with UB are more dangerous than unsigned with modular arithmetic.

Regards,

Nicolas George Jan. 14, 2020, 2:53 p.m. UTC | #17

Anton Khirnov (12020-01-13):
> You still did not say which filters those are, why do they need to
> create streams with duplicate channels, or for that matter how they can
> even do so when the current API does not support it.

The one that comes to mind immediately is amerge, which is meant to be
used in conjunction with pan to perform arbitrary mix between different
audio streams. It was here significantly before the fork introduced
amix, and had features that are not available there.

With the current code, when the layouts overlap, it resorts to declaring
an unknown layout and relying on channel order after printing a warning.
But the natural way of doing it is to have repeated channels:

	[sfx][music][cmt]amerge=3 ;
	pan=5.1 | FC<FC.sfx+FC.cmt |
	          FL<FL.sfx+FL.music |
	          FR<FR.sfx+FR.music |
	          BL<BL.sfx |
	          BR<BR.sfx |
	          LF<LF.sfx

Any new channel layout API must contain provisions to handle this
pattern.

Regards,

Anton Khirnov Jan. 14, 2020, 3:51 p.m. UTC | #18

Quoting Nicolas George (2020-01-14 15:46:17)
> Anton Khirnov (12020-01-12):
> > Your explanation does not make sense to me. Checking for negative values
> > is not guarding against overflow, it's "checking after the fact whether
> > overflow occurred". Any such checks, whether signed or unsigned, are
> > necessarily invalid and broken (hence the quotes). Guarding against
> > overflow must always be done by checking BEFORE the operation that might
> > overflow - again both for signed and unsigned.
> > 
> > From this angle, there is no difference between using signed and
> > unsigned values. The fact that in one case the overflow would have been
> > UB and the other wouldn't changes nothing here.
> 
> You are right: if the code is known to be 100% bug-free, then it makes
> no difference. But even TeX's code is not known to be 100% bug-free. And
> when there may be bugs, I think I have given ample proof that signed
> with UB are more dangerous than unsigned with modular arithmetic.

No you certainly have not. If buggy code produces an invalid channel
count through overflow, then you are screwed no matter what. The fact
that one of the overflows is UB and the other is not has zero impact on
the fact that the channel count is unusable garbage (as I already
said).

Anton Khirnov Jan. 14, 2020, 4:04 p.m. UTC | #19

Quoting Nicolas George (2020-01-14 15:53:26)
> Anton Khirnov (12020-01-13):
> > You still did not say which filters those are, why do they need to
> > create streams with duplicate channels, or for that matter how they can
> > even do so when the current API does not support it.
> 
> The one that comes to mind immediately is amerge, which is meant to be
> used in conjunction with pan to perform arbitrary mix between different
> audio streams. It was here significantly before the fork introduced
> amix, and had features that are not available there.
> 
> With the current code, when the layouts overlap, it resorts to declaring
> an unknown layout and relying on channel order after printing a warning.
> But the natural way of doing it is to have repeated channels:
> 
>         [sfx][music][cmt]amerge=3 ;
>         pan=5.1 | FC<FC.sfx+FC.cmt |
>                   FL<FL.sfx+FL.music |
>                   FR<FR.sfx+FR.music |
>                   BL<BL.sfx |
>                   BR<BR.sfx |
>                   LF<LF.sfx
> 
> Any new channel layout API must contain provisions to handle this
> pattern.

No. If you want to mix multiple streams, then your mixing filter should
support multiple streams. It is certainly in no way "natural" or correct
to invent a scheme for stream multiplexing through channel layouts.

Nicolas George Jan. 14, 2020, 4:07 p.m. UTC | #20

Anton Khirnov (12020-01-14):
> No. If you want to mix multiple streams, then your mixing filter should
> support multiple streams. It is certainly in no way "natural" or correct
> to invent a scheme for stream multiplexing through channel layouts.

You could argue that if the channel layout API was done and the filter
was in the process of being written. But it is the other way around: the
filter exists, it has needs, the new API must adapt.

Regards,

Paul B Mahol Jan. 14, 2020, 4:28 p.m. UTC | #21

On 1/14/20, Nicolas George <george@nsup.org> wrote:
> Anton Khirnov (12020-01-14):
>> No. If you want to mix multiple streams, then your mixing filter should
>> support multiple streams. It is certainly in no way "natural" or correct
>> to invent a scheme for stream multiplexing through channel layouts.
>
> You could argue that if the channel layout API was done and the filter
> was in the process of being written. But it is the other way around: the
> filter exists, it has needs, the new API must adapt.
>

This is pure nonsense, nothing must adapt.

> Regards,
>
> --
>   Nicolas George
>

Anton Khirnov Jan. 28, 2020, 6:07 p.m. UTC | #22

Quoting Nicolas George (2020-01-14 17:07:56)
> Anton Khirnov (12020-01-14):
> > No. If you want to mix multiple streams, then your mixing filter should
> > support multiple streams. It is certainly in no way "natural" or correct
> > to invent a scheme for stream multiplexing through channel layouts.
> 
> You could argue that if the channel layout API was done and the filter
> was in the process of being written. But it is the other way around: the
> filter exists, it has needs, the new API must adapt.

That makes no sense. The filter cannot "have needs" when the current API
does not support the use case you have in mind (which is good). The
filter can either be modified to allow multiple inputs or a new filter
can be added.

Nicolas George Feb. 1, 2020, 1:35 p.m. UTC | #23

Anton Khirnov (12020-01-28):
> That makes no sense. The filter cannot "have needs" when the current API
> does not support the use case you have in mind (which is good). The
> filter can either be modified to allow multiple inputs or a new filter
> can be added.

It makes perfect sense: I know what I planned when I wrote them, and the
plan was always that when we would replace the channel layout API, it
would be able to express duplicated channels. And we will not demand
users to learn new filters for that.

You can continue to disregard my arguments, and getting close to being
insulting while you are at it, but you will not get me to change my mind
like that. The API, as it is, is rejected. Either we design a good API,
one that can satisfy the needs you see and the needs I see, or we keep
the current, very simple, API. No half-measure that brings all the
drawbacks and almost no benefit.

Regards,

Paul B Mahol Feb. 1, 2020, 2:06 p.m. UTC | #24

On 2/1/20, Nicolas George <george@nsup.org> wrote:
> Anton Khirnov (12020-01-28):
>> That makes no sense. The filter cannot "have needs" when the current API
>> does not support the use case you have in mind (which is good). The
>> filter can either be modified to allow multiple inputs or a new filter
>> can be added.
>
> It makes perfect sense: I know what I planned when I wrote them, and the
> plan was always that when we would replace the channel layout API, it
> would be able to express duplicated channels. And we will not demand
> users to learn new filters for that.
>
> You can continue to disregard my arguments, and getting close to being
> insulting while you are at it, but you will not get me to change my mind
> like that. The API, as it is, is rejected. Either we design a good API,
> one that can satisfy the needs you see and the needs I see, or we keep
> the current, very simple, API. No half-measure that brings all the
> drawbacks and almost no benefit.

Why you think you are not insulting other developers by your way of
expressing your side of view.

Very short sighted reasoning.

Marton Balint Feb. 5, 2020, 6:55 p.m. UTC | #25

On Tue, 7 Jan 2020, Anton Khirnov wrote:

> Quoting Nicolas George (2019-12-31 16:17:49)
>> Anton Khirnov (12019-12-29):

>> > I do not agree. Duplicated channels in a layout are expected to be a
>> > fringe thing and how you handle them highly depends on the specific use
>> > case. I expect a typical caller will want to disregard that possibility
>> > and just take the first channel of each semantics.
>> > So I do not believe a dedicated function for this makes sense. We could
>> > always add something later though, if it turns out to be necessary.
>> 
>> I think you are making a mistake. I think that as soon as it will be
>> technically possible, we will see cases with duplicated channels. And I
>> know that some filters will do exactly that as soon as they are ported
>> to this new API.

Quicktime also allows duplicated channels in a single audio track, this is 
unfortunately a commonly used feature. So if a new API is introduced to 
overcome the limitations of the existing one, supporting this should be 
seriously considered.

Regards,
Marton

Anton Khirnov Feb. 18, 2020, 8:59 p.m. UTC | #26

Quoting Marton Balint (2020-02-05 19:55:24)
> 
> 
> On Tue, 7 Jan 2020, Anton Khirnov wrote:
> 
> > Quoting Nicolas George (2019-12-31 16:17:49)
> >> Anton Khirnov (12019-12-29):
> 
> >> > I do not agree. Duplicated channels in a layout are expected to be a
> >> > fringe thing and how you handle them highly depends on the specific use
> >> > case. I expect a typical caller will want to disregard that possibility
> >> > and just take the first channel of each semantics.
> >> > So I do not believe a dedicated function for this makes sense. We could
> >> > always add something later though, if it turns out to be necessary.
> >> 
> >> I think you are making a mistake. I think that as soon as it will be
> >> technically possible, we will see cases with duplicated channels. And I
> >> know that some filters will do exactly that as soon as they are ported
> >> to this new API.
> 
> Quicktime also allows duplicated channels in a single audio track, this is 
> unfortunately a commonly used feature. So if a new API is introduced to 
> overcome the limitations of the existing one, supporting this should be 
> seriously considered.

Can you provide a link to more information about this? I'd like to know
the specifics about how it's handled by quicktime. And do note that this
API does have some ammount of support for duplicated channels (though I
still consider streams containing them to be broken).

Marton Balint Feb. 18, 2020, 10:54 p.m. UTC | #27

On Tue, 18 Feb 2020, Anton Khirnov wrote:

> Quoting Marton Balint (2020-02-05 19:55:24)
>> 
>> 
>> On Tue, 7 Jan 2020, Anton Khirnov wrote:
>> 
>> > Quoting Nicolas George (2019-12-31 16:17:49)
>> >> Anton Khirnov (12019-12-29):
>> 
>> >> > I do not agree. Duplicated channels in a layout are expected to be a
>> >> > fringe thing and how you handle them highly depends on the specific use
>> >> > case. I expect a typical caller will want to disregard that possibility
>> >> > and just take the first channel of each semantics.
>> >> > So I do not believe a dedicated function for this makes sense. We could
>> >> > always add something later though, if it turns out to be necessary.
>> >> 
>> >> I think you are making a mistake. I think that as soon as it will be
>> >> technically possible, we will see cases with duplicated channels. And I
>> >> know that some filters will do exactly that as soon as they are ported
>> >> to this new API.
>> 
>> Quicktime also allows duplicated channels in a single audio track, this is 
>> unfortunately a commonly used feature. So if a new API is introduced to 
>> overcome the limitations of the existing one, supporting this should be 
>> seriously considered.
>
> Can you provide a link to more information about this? I'd like to know
> the specifics about how it's handled by quicktime. And do note that this
> API does have some ammount of support for duplicated channels (though I
> still consider streams containing them to be broken).

The 'chan' atom can hold this information:

https://developer.apple.com/library/archive/documentation/QuickTime/QTFF/QTFFChap3/qtff3.html#//apple_ref/doc/uid/TP40000939-CH205-SW53

It holds an AudioChannelLayout structure as defined in CoreAudioTypes.h:

https://developer.apple.com/documentation/coreaudiotypes/audiochannellayout
https://github.com/phracker/MacOSX-SDKs/blob/master/MacOSX10.15.sdk/System/Library/Frameworks/CoreAudioTypes.framework/Versions/A/Headers/CoreAudioBaseTypes.h#L1353

Which can hold an AudioChannelDescription array of channel descriptions 
for each channel:

https://developer.apple.com/documentation/coreaudiotypes/audiochannellayout
https://github.com/phracker/MacOSX-SDKs/blob/master/MacOSX10.15.sdk/System/Library/Frameworks/CoreAudioTypes.framework/Versions/A/Headers/CoreAudioBaseTypes.h#L1335

Which can signal the same AudioChannelLabel for every channel in a track 
if needed.

There is some support for it in libavformat/mov_chan.c.

Regards,
Marton

Anton Khirnov Feb. 19, 2020, 12:56 p.m. UTC | #28

Quoting Marton Balint (2020-02-18 23:54:56)
> 
> 
> On Tue, 18 Feb 2020, Anton Khirnov wrote:
> 
> > Quoting Marton Balint (2020-02-05 19:55:24)
> >> 
> >> 
> >> On Tue, 7 Jan 2020, Anton Khirnov wrote:
> >> 
> >> > Quoting Nicolas George (2019-12-31 16:17:49)
> >> >> Anton Khirnov (12019-12-29):
> >> 
> >> >> > I do not agree. Duplicated channels in a layout are expected to be a
> >> >> > fringe thing and how you handle them highly depends on the specific use
> >> >> > case. I expect a typical caller will want to disregard that possibility
> >> >> > and just take the first channel of each semantics.
> >> >> > So I do not believe a dedicated function for this makes sense. We could
> >> >> > always add something later though, if it turns out to be necessary.
> >> >> 
> >> >> I think you are making a mistake. I think that as soon as it will be
> >> >> technically possible, we will see cases with duplicated channels. And I
> >> >> know that some filters will do exactly that as soon as they are ported
> >> >> to this new API.
> >> 
> >> Quicktime also allows duplicated channels in a single audio track, this is 
> >> unfortunately a commonly used feature. So if a new API is introduced to 
> >> overcome the limitations of the existing one, supporting this should be 
> >> seriously considered.
> >
> > Can you provide a link to more information about this? I'd like to know
> > the specifics about how it's handled by quicktime. And do note that this
> > API does have some ammount of support for duplicated channels (though I
> > still consider streams containing them to be broken).
> 
> The 'chan' atom can hold this information:
> 
> https://developer.apple.com/library/archive/documentation/QuickTime/QTFF/QTFFChap3/qtff3.html#//apple_ref/doc/uid/TP40000939-CH205-SW53
> 
> It holds an AudioChannelLayout structure as defined in CoreAudioTypes.h:
> 
> https://developer.apple.com/documentation/coreaudiotypes/audiochannellayout
> https://github.com/phracker/MacOSX-SDKs/blob/master/MacOSX10.15.sdk/System/Library/Frameworks/CoreAudioTypes.framework/Versions/A/Headers/CoreAudioBaseTypes.h#L1353
> 
> Which can hold an AudioChannelDescription array of channel descriptions 
> for each channel:
> 
> https://developer.apple.com/documentation/coreaudiotypes/audiochannellayout
> https://github.com/phracker/MacOSX-SDKs/blob/master/MacOSX10.15.sdk/System/Library/Frameworks/CoreAudioTypes.framework/Versions/A/Headers/CoreAudioBaseTypes.h#L1335
> 
> Which can signal the same AudioChannelLabel for every channel in a track 
> if needed.

Thanks for the links. As far as I can tell this can be mapped to the
proposed API just fine (except for signalling precise speaker
coordinates, which I don't think anything uses).

Nicolas George Feb. 25, 2020, 11:47 a.m. UTC | #29

Anton Khirnov (12020-02-19):
> Thanks for the links. As far as I can tell this can be mapped to the
> proposed API just fine

Except for the user interface part, as I already pointed: if there are
several times the same channel, the API needs to provide a standard way
for the user to specify one.

>			 (except for signalling precise speaker
> coordinates, which I don't think anything uses).

I think somebody uses it, because somebody felt the need to include it
in the standard. Therefore, we need to support it.

Regards,

Hendrik Leppkes Feb. 25, 2020, 12:28 p.m. UTC | #30

On Tue, Feb 25, 2020 at 12:47 PM Nicolas George <george@nsup.org> wrote:
> >                        (except for signalling precise speaker
> > coordinates, which I don't think anything uses).
>
> I think somebody uses it, because somebody felt the need to include it
> in the standard. Therefore, we need to support it.
>

Standards designed by committee like all this MPEG stuff are full of
features that noone uses. Its usually indicative of the following
replacement standard dumping them again. Instead, they threw in a
bunch of new things of questionable use that will disappear in the
next standard once again.

I don't think we should be blindly following what some other group
thinks, but critically judge everything we're implementing here.

- Hendrik

Anton Khirnov Feb. 25, 2020, 3:36 p.m. UTC | #31

Quoting Nicolas George (2020-02-25 12:47:03)
> Anton Khirnov (12020-02-19):
> > Thanks for the links. As far as I can tell this can be mapped to the
> > proposed API just fine
> 
> Except for the user interface part, as I already pointed: if there are
> several times the same channel, the API needs to provide a standard way
> for the user to specify one.

As far as I can tell, the Apple API linked above does not support that
either. The way of describing the channel layout is given by
mChannelLayoutTag, which can be either
- kAudioChannelLayoutTag_UseChannelBitmap, which is effectively
  equivalent to our current API, or the new API's LAYOUT_NATIVE
- one of several predefined layouts, which can be mapped either to
  LAYOUT_NATIVE or LAYOUT_CUSTOM
- kAudioChannelLayoutTag_UseChannelDescriptions, which cannot be
  represented in the current API, but is effectively equivalent to the
  new API's LAYOUT_CUSTOM
  The AudioChannelDescription struct contains:
    * AudioChannelFlags, which apply to coordinates
    * three floats, which are the coordinates
    * AudioChannelLabel, which is uint32 similar to our AVChannel

I see no support for any custom free-form text labels of the kind you
want me to add.

> 
> >                        (except for signalling precise speaker
> > coordinates, which I don't think anything uses).
> 
> I think somebody uses it, because somebody felt the need to include it
> in the standard. Therefore, we need to support it.

In addition to Hendrik's reply (which I agree with), support for this
can be later added through a new layout type if someone really needs it.
I see no reason to spend effort implementing functionality that is not
actually used for anything.

Nicolas George Feb. 26, 2020, 11:10 a.m. UTC | #32

Hendrik Leppkes (12020-02-25):
> Standards designed by committee like all this MPEG stuff are full of
> features that noone uses. Its usually indicative of the following
> replacement standard dumping them again. Instead, they threw in a
> bunch of new things of questionable use that will disappear in the
> next standard once again.

I agree, it's a very weak argument. But it's still infinitely superior
to the absence of arguments in "I don't think anything uses".

> I don't think we should be blindly following what some other group
> thinks, but critically judge everything we're implementing here.

I am not suggesting following anything blindly. But files using this
feature can exist out there, so the question is whether to accept an API
that cannot handle them without loss of information.

Regards,

Nicolas George Feb. 26, 2020, 11:26 a.m. UTC | #33

Anton Khirnov (12020-02-25):
> As far as I can tell, the Apple API linked above does not support that
> either. The way of describing the channel layout is given by
> mChannelLayoutTag, which can be either
> - kAudioChannelLayoutTag_UseChannelBitmap, which is effectively
>   equivalent to our current API, or the new API's LAYOUT_NATIVE
> - one of several predefined layouts, which can be mapped either to
>   LAYOUT_NATIVE or LAYOUT_CUSTOM
> - kAudioChannelLayoutTag_UseChannelDescriptions, which cannot be
>   represented in the current API, but is effectively equivalent to the
>   new API's LAYOUT_CUSTOM
>   The AudioChannelDescription struct contains:
>     * AudioChannelFlags, which apply to coordinates
>     * three floats, which are the coordinates
>     * AudioChannelLabel, which is uint32 similar to our AVChannel
> 
> I see no support for any custom free-form text labels of the kind you
> want me to add.

The link above does not describe an API, it describe a format. A format
that can contain several channels with the FL (example) disposition.
Even if there are no labels attached, users need an interface to specify
one: "the first FL channel", "the second FL channel", for example. Your
proposal does not have it, it needs it.

> In addition to Hendrik's reply (which I agree with), support for this
> can be later added through a new layout type if someone really needs it.
> I see no reason to spend effort implementing functionality that is not
> actually used for anything.

It's the second time you propose to extend the API later like that. I am
glad you realize the API you proposed is insufficient and needs
extending, but I am worried that you don't realize that extending later
for needs that we see now is a terrible idea.

It's a terrible idea because it requires more work, and because the
extension will always stay second class compared to the initial code,
and unsupported in all the places in the code that will not be updated
for it.

Since we were lucky to see we need a way of attaching data to channels
before including the API, we'll include it in the API from the start.

It's too bad it will require dynamic allocation and make the handling of
the structure as a whole more clumsy, but it seems it was unavoidable.

We could use simply a per-channel metadata dictionary, it would allow
both the speaker coordinate in this spec and the free-form label that I
want. Alternatively, we can use a more typed approach, like side data.

Regards,

[FFmpeg-devel,001/244] Add a new channel layout API

Commit Message

Comments

Patch