diff mbox series

[FFmpeg-devel,1/2] avcodec/s302m: enable non-PCM decoding

Message ID 20240123064948.455-1-ffmpeg@gyani.pro
State New
Headers show
Series [FFmpeg-devel,1/2] avcodec/s302m: enable non-PCM decoding | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Gyan Doshi Jan. 23, 2024, 6:49 a.m. UTC
Set up framework for non-PCM decoding in-place and
add support for Dolby-E decoding.

Useful for direct transcoding of non-PCM audio in live inputs.
---
 configure          |   1 +
 doc/decoders.texi  |  40 +++
 libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
 3 files changed, 543 insertions(+), 107 deletions(-)

Comments

Kieran Kunhya Jan. 23, 2024, 7:56 a.m. UTC | #1
On Tue, 23 Jan 2024 at 06:50, Gyan Doshi <ffmpeg@gyani.pro> wrote:

> Set up framework for non-PCM decoding in-place and
> add support for Dolby-E decoding.
>
> Useful for direct transcoding of non-PCM audio in live inputs.
>

Does this handle a Dolby E packet spanning multiple S302M packets? I'm not
saying you should support this as it's rare and very annoying to implement
but it does happen.

Kieran
Gyan Doshi Jan. 23, 2024, 8:32 a.m. UTC | #2
On 2024-01-23 01:26 pm, Kieran Kunhya wrote:
> On Tue, 23 Jan 2024 at 06:50, Gyan Doshi <ffmpeg@gyani.pro> wrote:
>
>> Set up framework for non-PCM decoding in-place and
>> add support for Dolby-E decoding.
>>
>> Useful for direct transcoding of non-PCM audio in live inputs.
>>
> Does this handle a Dolby E packet spanning multiple S302M packets? I'm not
> saying you should support this as it's rare and very annoying to implement
> but it does happen.

No, this does not handle spanned payload.
I have access to a sample where it looks like the packetizer got 
mis-aligned and I think
cases like those are better handled by a parser.

Adding support for multiple payload packets in one 302m packet is on my 
to-do although
the parser is better suited for that as well as the last payload packet 
might be fragmented.

Regards,
Gyan
Kieran Kunhya Jan. 23, 2024, 9:05 a.m. UTC | #3
On Tue, 23 Jan 2024 at 08:33, Gyan Doshi <ffmpeg@gyani.pro> wrote:

>
>
> On 2024-01-23 01:26 pm, Kieran Kunhya wrote:
> > On Tue, 23 Jan 2024 at 06:50, Gyan Doshi <ffmpeg@gyani.pro> wrote:
> >
> >> Set up framework for non-PCM decoding in-place and
> >> add support for Dolby-E decoding.
> >>
> >> Useful for direct transcoding of non-PCM audio in live inputs.
> >>
> > Does this handle a Dolby E packet spanning multiple S302M packets? I'm
> not
> > saying you should support this as it's rare and very annoying to
> implement
> > but it does happen.
>
> No, this does not handle spanned payload.
> I have access to a sample where it looks like the packetizer got
> mis-aligned and I think
> cases like those are better handled by a parser.
>
> Adding support for multiple payload packets in one 302m packet is on my
> to-do although
> the parser is better suited for that as well as the last payload packet
> might be fragmented.
>
> Regards,
> Gyan
>

Yes, and the other uncertainty is that the PTS must be cotimed with the
video so you have to guess whether it's the previous PTS or the next one.

Kieran
Nicolas Gaullier Jan. 23, 2024, 10:28 a.m. UTC | #4
>+#define IS_NONPCMSYNC_16(state)   ((state & 0xFFFF0FFFF0)     == NONPCMSYNC_16MARKER)

Is this single 32 bits marker enough to avoid a fake detection ?

Nicolas
Gyan Doshi Jan. 23, 2024, 11:18 a.m. UTC | #5
On 2024-01-23 03:58 pm, Nicolas Gaullier wrote:
>> +#define IS_NONPCMSYNC_16(state)   ((state & 0xFFFF0FFFF0)     == NONPCMSYNC_16MARKER)
> Is this single 32 bits marker enough to avoid a fake detection ?

It will have to do. The modal number of payload packets expected in a 
single s302m packet is one, so there's no redundancy.
And since the AES3 sync (time slots 0-3) is stripped off (in all samples 
I have), we can't identify byte 0 of the channel status word for a 
double-check.

However, unlike s337m probe, which is dealing with data of unknown 
provenance, the context here is a PES packet emitted of a stream 
identified as s302m by the demuxer so the odds are low of a false positive.

Regards,
Gyan
Devin Heitmueller Jan. 23, 2024, 2:50 p.m. UTC | #6
On Tue, Jan 23, 2024 at 4:05 AM Kieran Kunhya <kierank@obe.tv> wrote:
> Yes, and the other uncertainty is that the PTS must be cotimed with the
> video so you have to guess whether it's the previous PTS or the next one.

Isn't the correct behavior to determine the offset within the raw PCM
payload, and use that to determine the actual PTS of the embedded
non-PCM packet?  Unless I"m misreading the code, it seems like it's
simply setting the PTS of the non-pcm packet to be the PTS of the
original S302 packet, which will result in incorrect A/V sync.

Devin
Kieran Kunhya Jan. 23, 2024, 2:53 p.m. UTC | #7
On Tue, 23 Jan 2024 at 14:51, Devin Heitmueller <
devin.heitmueller@ltnglobal.com> wrote:

> On Tue, Jan 23, 2024 at 4:05 AM Kieran Kunhya <kierank@obe.tv> wrote:
> > Yes, and the other uncertainty is that the PTS must be cotimed with the
> > video so you have to guess whether it's the previous PTS or the next one.
>
> Isn't the correct behavior to determine the offset within the raw PCM
> payload, and use that to determine the actual PTS of the embedded
> non-PCM packet?  Unless I"m misreading the code, it seems like it's
> simply setting the PTS of the non-pcm packet to be the PTS of the
> original S302 packet, which will result in incorrect A/V sync.
>

Quite the opposite, Dolby E by design is cotimed with the video frame and
S302M PTS is also cotimed with the video frame.
The ambiguity is when Dolby E is misaligned, is it misaligned to the next
video frame, or the previous one.

For AC-3 yes, the offset matters but that's a different story.

Kieran
Devin Heitmueller Jan. 23, 2024, 3:04 p.m. UTC | #8
On Tue, Jan 23, 2024 at 9:54 AM Kieran Kunhya <kierank@obe.tv> wrote:
> Quite the opposite, Dolby E by design is cotimed with the video frame and
> S302M PTS is also cotimed with the video frame.
> The ambiguity is when Dolby E is misaligned, is it misaligned to the next
> video frame, or the previous one.

I knew the Dolby-E packet had to be cotimed with the video frame, but
was unaware that the S302M packets also had that requirement.  But
yeah, now that I look at the spec I see that to be the case
(S302M-2007 Sec 6.10).

Devin
Andreas Rheinhardt Jan. 25, 2024, 4:59 a.m. UTC | #9
Gyan Doshi:
> Set up framework for non-PCM decoding in-place and
> add support for Dolby-E decoding.
> 
> Useful for direct transcoding of non-PCM audio in live inputs.
> ---
>  configure          |   1 +
>  doc/decoders.texi  |  40 +++
>  libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>  3 files changed, 543 insertions(+), 107 deletions(-)
> 
> diff --git a/configure b/configure
> index c8ae0a061d..8db3fa3f4b 100755
> --- a/configure
> +++ b/configure
> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>  rv20_encoder_select="h263_encoder"
>  rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>  rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
> +s302m_decoder_select="dolby_e_decoder"
>  screenpresso_decoder_deps="zlib"
>  shorten_decoder_select="bswapdsp"
>  sipr_decoder_select="lsp"
> diff --git a/doc/decoders.texi b/doc/decoders.texi
> index 293c82c2ba..9f85c876bf 100644
> --- a/doc/decoders.texi
> +++ b/doc/decoders.texi
> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure the build with
>  An FFmpeg native decoder for Opus exists, so users can decode Opus
>  without this library.
>  
> +@section s302m
> +
> +SMPTE ST 302 decoder.
> +
> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG Transport
> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8 channels with a
> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
> +

This sounds like we should add bitstream filters to extract the proper
underlying streams instead.
(I see only two problems with this approach: The BSF API needs to set
the CodecID of the output during init, but at this point no packet has
reached the BSF to determine it. And changing codec IDs mid-stream is
also not supported.)

> +Decoding non-PCM streams directly requires that the necessary stream decoder be
> +present in the build. At present, only Dolby-E decoding is supported.
> +
> +@subsection Options
> +
> +The following options are supported by the s302m decoder.
> +
> +@table @option
> +@item non_pcm_mode @var{mode}
> +Specify how to process non-PCM streams
> +
> +@table @samp
> +@item copy
> +Treat data as if it were LPCM.
> +@item drop
> +Discard the stream.
> +@item decode_copy
> +Decode if possible eise treat the same as @code{copy}.
> +@item decode_drop
> +Decode if possible eise treat the same as @code{drop}.
> +@end table
> +
> +The default value is @code{decode_drop}. This option does not affect the processing of
> +LPCM streams.
> +
> +@item non_pcm_options @var{options}
> +Set options for non-PCM decoder using a list of key=value pairs separated by ":".
> +Consult the docs for the non-PCM decoder for its options.
> +
> +@end table
> +
>  @c man end AUDIO DECODERS
>  
>  @chapter Subtitles Decoders
> diff --git a/libavcodec/s302m.c b/libavcodec/s302m.c
> index f1b41608f3..d6a75cfa73 100644
> --- a/libavcodec/s302m.c
> +++ b/libavcodec/s302m.c
> @@ -24,21 +24,264 @@
>  #include "libavutil/intreadwrite.h"
>  #include "libavutil/opt.h"
>  #include "libavutil/log.h"
> +#include "libavutil/dict.h"
>  #include "libavutil/reverse.h"
>  #include "avcodec.h"
>  #include "codec_internal.h"
> +#include "get_bits.h"
>  #include "decode.h"
>  
>  #define AES3_HEADER_LEN 4
>  
> +#define NONPCMSYNC_16MARKER      0x4E1F0F8720
> +#define NONPCMSYNC_20MARKER      0x4E1F60F872A0
> +#define NONPCMSYNC_24MARKER      0x7E1F690F872A50
> +
> +#define NONPCMSYNC_16_IN_20MARKER      0x04E1F00F8720
> +#define NONPCMSYNC_20_IN_24MARKER      0x04E1F600F872A0
> +
> +#define IS_NONPCMSYNC_16(state)   ((state & 0xFFFF0FFFF0)     == NONPCMSYNC_16MARKER)
> +#define IS_NONPCMSYNC_20(state)   ((state & 0xFFFFF0FFFFF0)   == NONPCMSYNC_20MARKER)
> +#define IS_NONPCMSYNC_24(state)   ((state & 0xFFFFFF0FFFFFF0) == NONPCMSYNC_24MARKER)
> +
> +#define IS_NONPCMSYNC_16_IN_20(state)   ((state & 0x0FFFF00FFFF0)   == NONPCMSYNC_16_IN_20MARKER)
> +#define IS_NONPCMSYNC_20_IN_24(state)   ((state & 0x0FFFFF00FFFFF0) == NONPCMSYNC_20_IN_24MARKER)
> +
> +#define IS_NONPCMSYNC(bit,state)  ( ((bit == 16) &&  IS_NONPCMSYNC_16(state)) || \
> +                                    ((bit == 20) && (IS_NONPCMSYNC_20(state) || IS_NONPCMSYNC_16_IN_20(state))) || \
> +                                    ((bit == 24) && (IS_NONPCMSYNC_24(state) || IS_NONPCMSYNC_20_IN_24(state))) \
> +                                  )
> +
> +enum non_pcm_modes {
> +    NON_PCM_COPY,
> +    NON_PCM_DROP,
> +    NON_PCM_DEC_ELSE_COPY,
> +    NON_PCM_DEC_ELSE_DROP,
> +};
> +
>  typedef struct S302Context {
>      AVClass *class;
> +
> +    int avctx_props_set;
> +
> +    int channels;
> +    int bits;
> +
>      int non_pcm_mode;
> +    int non_pcm_data_type;
> +    int non_pcm_bits;
> +    int non_pcm_dec;
> +
> +    AVCodecContext *non_pcm_ctx;
> +    AVDictionary   *non_pcm_opts;
> +    AVPacket *packet;
> +    AVFrame  *frame;
>  } S302Context;
>  
> +static av_cold int s302m_init(AVCodecContext *avctx)
> +{
> +    S302Context *s = avctx->priv_data;
> +
> +    s->non_pcm_data_type = -1;
> +
> +    return 0;
> +}
> +
> +static int s302m_non_pcm_inspect(AVCodecContext *avctx, const uint8_t *buf, int buf_size,
> +                                  int *offset, int *length)
> +{
> +    S302Context *s = avctx->priv_data;
> +    GetBitContext gb;
> +    int ret, aes_frm_size, data_type, length_code = 0;
> +    uint64_t state = 0;
> +    uint32_t size;
> +
> +    if (s->channels != 2) {
> +        goto end;
> +    }
> +
> +    ret = init_get_bits8(&gb, buf, buf_size);
> +    if (ret < 0)
> +        return ret;
> +
> +    aes_frm_size = (s->bits + 4) * 2 / 8;
> +    if (buf_size < aes_frm_size * 2)  // not enough to contain data_type & length_code
> +        return AVERROR_INVALIDDATA;
> +
> +    state = get_bits64(&gb, aes_frm_size * 8);
> +
> +    while (!IS_NONPCMSYNC(s->bits,state) && (get_bits_left(&gb) >= 8))
> +        state = (state << 8) | get_bits(&gb, 8);

Reading byte-aligned data with a GetBit context is very suboptimal.

> +
> +    if (IS_NONPCMSYNC(s->bits,state)) {
> +        if (get_bits_left(&gb) < aes_frm_size * 8) {
> +            av_log(avctx, AV_LOG_ERROR, "truncated non-pcm frame detected\n");
> +            return AVERROR_INVALIDDATA;
> +        }
> +
> +        if (s->bits == 16) {
> +            s->non_pcm_bits = 16;
> +        } else if (s->bits == 20) {
> +            if (IS_NONPCMSYNC_16_IN_20(state))
> +                s->non_pcm_bits = 16;
> +            else
> +                s->non_pcm_bits = 20;
> +        } else if (s->bits == 24) {
> +            if (IS_NONPCMSYNC_20_IN_24(state))
> +                s->non_pcm_bits = 20;
> +            else
> +                s->non_pcm_bits = 24;
> +        }
> +
> +        skip_bits(&gb, s->bits - 16);
> +
> +        data_type = ff_reverse[(uint8_t)get_bits(&gb, 5)] >> 3;
> +
> +        if (s->non_pcm_data_type == -1) {
> +            s->non_pcm_data_type = data_type;
> +            av_log(avctx, AV_LOG_INFO, "stream has non-pcm data of type %d with %d-bit words in aes3 payload of size %d bits\n", data_type, s->non_pcm_bits, s->bits);
> +        } else if (s->non_pcm_data_type != data_type) {
> +            av_log(avctx, AV_LOG_DEBUG, "type mismatch of non-pcm type in packet %d vs stream %d. Dropping.\n", data_type, s->non_pcm_data_type);
> +            return AVERROR_INVALIDDATA;
> +        }
> +
> +        if (s->non_pcm_mode == NON_PCM_COPY)
> +            return 0;
> +        else if (s->non_pcm_mode == NON_PCM_DROP)
> +            return AVERROR_INVALIDDATA;
> +
> +        skip_bits(&gb, 15);
> +
> +        size = get_bits(&gb, s->bits);
> +
> +        length_code = ((ff_reverse[(uint8_t)((size & 0xFF)          )] << 16) |
> +                       (ff_reverse[(uint8_t)((size & 0xFF00)   >> 8 )] << 8 ) |
> +                       (ff_reverse[(uint8_t)((size & 0xFF0000) >> 16)]      ) ) >> (24 - s->non_pcm_bits);
> +
> +        skip_bits(&gb, 4);
> +
> +        *offset = get_bits_count(&gb)/8;
> +        *length = length_code;
> +
> +        av_log(avctx, AV_LOG_TRACE, "located non-pcm packet at offset %d length code %d.\n", AES3_HEADER_LEN + *offset, length_code);
> +
> +        return data_type;
> +    }
> +
> +end:
> +    if (s->non_pcm_data_type == -1) {
> +        s->non_pcm_data_type = 32;  // indicate stream should be treated as LPCM
> +        return 0;
> +    } else
> +        return AVERROR_INVALIDDATA;
> +}
> +
> +static int s302m_setup_non_pcm_handling(AVCodecContext *avctx, const uint8_t *buf, int buf_size)
> +{
> +    S302Context *s = avctx->priv_data;
> +    const AVCodec *codec;
> +    enum AVCodecID codec_id;
> +    AVDictionary *dec_opts = NULL;
> +    int ret;
> +
> +    if (s->non_pcm_mode > NON_PCM_DROP) {
> +        switch (s->non_pcm_data_type) {
> +        case 0x1C:
> +            codec_id = AV_CODEC_ID_DOLBY_E;
> +            break;
> +        default:
> +            avpriv_report_missing_feature(avctx, "decode of non-pcm data type %d", s->non_pcm_data_type);
> +            ret = AVERROR_PATCHWELCOME;
> +            goto fail;
> +        }
> +
> +        codec = avcodec_find_decoder(codec_id);
> +        if (!codec) {
> +            ret = AVERROR_DECODER_NOT_FOUND;
> +            goto fail;
> +        }
> +
> +        s->non_pcm_ctx = avcodec_alloc_context3(codec);
> +        if (!s->non_pcm_ctx) {
> +            ret = AVERROR(ENOMEM);
> +            goto fail;
> +        }
> +
> +        av_dict_copy(&dec_opts, s->non_pcm_opts, 0);
> +
> +        ret = avcodec_open2(s->non_pcm_ctx, codec, &dec_opts);
> +        av_dict_free(&dec_opts);
> +        if (ret < 0)
> +            goto fail;
> +
> +        s->packet = av_packet_alloc();
> +        if (!s->packet) {
> +            ret = AVERROR(ENOMEM);
> +            goto fail;
> +        }
> +
> +        s->frame = av_frame_alloc();
> +        if (!s->frame) {
> +            ret = AVERROR(ENOMEM);
> +            goto fail;
> +        }
> +    } else
> +        return 0;
> +
> +    s->non_pcm_dec = 1;
> +    return 0;
> +
> +fail:
> +    avcodec_free_context(&s->non_pcm_ctx);
> +    av_packet_free(&s->packet);
> +    av_frame_free(&s->frame);
> +
> +    if (s->non_pcm_mode == NON_PCM_DEC_ELSE_COPY)
> +        s->non_pcm_mode = NON_PCM_COPY;
> +    else if (s->non_pcm_mode == NON_PCM_DEC_ELSE_DROP)
> +        s->non_pcm_mode = NON_PCM_DROP;
> +
> +    return ret;
> +}
> +
> +static int s302m_get_non_pcm_pkt_size(AVCodecContext *avctx, int buf_size, int offset,
> +                                      int length_code, int *dec_pkt_size)
> +{
> +    S302Context *s = avctx->priv_data;
> +    int nb_words, word_size, aesframe_size, s302m_read_size;
> +
> +    if (offset < 0 || offset >= buf_size)
> +        return AVERROR_INVALIDDATA;
> +
> +    switch (s->non_pcm_data_type) {
> +    case 0x1C:
> +        goto dolby_e;
> +    default:
> +        return AVERROR_INVALIDDATA;
> +    }
> +
> +dolby_e:
> +    {
> +    nb_words = length_code / s->non_pcm_bits;
> +    nb_words += nb_words & 1;
> +
> +    word_size = s->non_pcm_bits + 7 >> 3;
> +    aesframe_size = (s->bits + 4) * 2 / 8;  // 2 subframes, each with payload + VUCF bits
> +
> +    *dec_pkt_size = nb_words * word_size;
> +    s302m_read_size = aesframe_size * nb_words/2;
> +
> +    if (offset + s302m_read_size > buf_size)
> +        return AVERROR_INVALIDDATA;
> +
> +    return s302m_read_size;
> +    }
> +}
> +
>  static int s302m_parse_frame_header(AVCodecContext *avctx, const uint8_t *buf,
>                                      int buf_size)
>  {
> +    S302Context *s = avctx->priv_data;
>      uint32_t h;
>      int frame_size, channels, bits;
>  
> @@ -66,33 +309,15 @@ static int s302m_parse_frame_header(AVCodecContext *avctx, const uint8_t *buf,
>          return AVERROR_INVALIDDATA;
>      }
>  
> -    /* Set output properties */
> -    avctx->bits_per_raw_sample = bits;
> -    if (bits > 16)
> -        avctx->sample_fmt = AV_SAMPLE_FMT_S32;
> -    else
> -        avctx->sample_fmt = AV_SAMPLE_FMT_S16;
> -
> -    av_channel_layout_uninit(&avctx->ch_layout);
> -    switch(channels) {
> -        case 2:
> -            avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO;
> -            break;
> -        case 4:
> -            avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD;
> -            break;
> -        case 6:
> -            avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT1_BACK;
> -            break;
> -        case 8:
> -            av_channel_layout_from_mask(&avctx->ch_layout,
> -                                        AV_CH_LAYOUT_5POINT1_BACK | AV_CH_LAYOUT_STEREO_DOWNMIX);
> -            break;
> -        default:
> -            avctx->ch_layout.order       = AV_CHANNEL_ORDER_UNSPEC;
> -            avctx->ch_layout.nb_channels = channels;
> -            break;
> -    }
> +    if (!s->channels)
> +        s->channels = channels;
> +    else if (s->channels != channels)
> +        return AVERROR_INVALIDDATA;
> +
> +    if (!s->bits)
> +        s->bits = bits;
> +    else if (s->bits != bits)
> +        return AVERROR_INVALIDDATA;
>  
>      return frame_size;
>  }
> @@ -103,119 +328,286 @@ static int s302m_decode_frame(AVCodecContext *avctx, AVFrame *frame,
>      S302Context *s = avctx->priv_data;
>      const uint8_t *buf = avpkt->data;
>      int buf_size       = avpkt->size;
> -    int block_size, ret, channels;
> -    int i;
> -    int non_pcm_data_type = -1;
> +    int block_size, ret, channels, frame_size;
> +    int non_pcm_offset = -1, non_pcm_length = 0;
> +    int dec_pkt_size = 0;
> +
> +    if (s->non_pcm_mode == NON_PCM_DROP && s->non_pcm_data_type != -1 && s->non_pcm_data_type != 32)
> +        return avpkt->size;
>  
> -    int frame_size = s302m_parse_frame_header(avctx, buf, buf_size);
> +    frame_size = s302m_parse_frame_header(avctx, buf, buf_size);
>      if (frame_size < 0)
>          return frame_size;
>  
>      buf_size -= AES3_HEADER_LEN;
>      buf      += AES3_HEADER_LEN;
>  
> -    /* get output buffer */
> -    block_size = (avctx->bits_per_raw_sample + 4) / 4;
> -    channels = avctx->ch_layout.nb_channels;
> -    frame->nb_samples = 2 * (buf_size / block_size) / channels;
> -    if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
> -        return ret;
> +    // set non-pcm status if not determined
> +    // else extract offset and length if non-pcm can be decoded
> +    if (s->non_pcm_data_type == -1 || s->non_pcm_dec) {
> +        ret = s302m_non_pcm_inspect(avctx, buf, buf_size, &non_pcm_offset, &non_pcm_length);
> +        if (ret >= 0 && s->non_pcm_data_type != 32 && !s->non_pcm_dec)
> +            ret = s302m_setup_non_pcm_handling(avctx, buf, buf_size);
> +        else if (ret < 0)
> +            return avpkt->size;
> +    }
> +
> +    if (s->non_pcm_data_type == -1)
> +        return AVERROR_INVALIDDATA;  // we should know data type in order to proceed with output
> +
> +    if (!s->non_pcm_dec && !s->avctx_props_set) {
> +        /* Set output properties */
> +        avctx->bits_per_raw_sample = s->non_pcm_bits ? s->non_pcm_bits : s->bits;
> +        if (avctx->bits_per_raw_sample > 16)
> +            avctx->sample_fmt = AV_SAMPLE_FMT_S32;
> +        else
> +            avctx->sample_fmt = AV_SAMPLE_FMT_S16;
>  
> -    avctx->bit_rate = 48000 * channels * (avctx->bits_per_raw_sample + 4) +
> -                      32 * 48000 / frame->nb_samples;
> -    buf_size = (frame->nb_samples * channels / 2) * block_size;
> +        av_channel_layout_uninit(&avctx->ch_layout);
> +        switch(s->channels) {
> +            case 2:
> +                avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO;
> +                break;
> +            case 4:
> +                avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD;
> +                break;
> +            case 6:
> +                avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT1_BACK;
> +                break;
> +            case 8:
> +                av_channel_layout_from_mask(&avctx->ch_layout,
> +                                            AV_CH_LAYOUT_5POINT1_BACK | AV_CH_LAYOUT_STEREO_DOWNMIX);
> +                break;
> +            default:
> +                avctx->ch_layout.order       = AV_CHANNEL_ORDER_UNSPEC;
> +                avctx->ch_layout.nb_channels = channels;
> +                break;
> +        }
> +
> +        avctx->sample_rate = 48000;
> +        s->avctx_props_set = 1;
> +    }
> +
> +    if (s->non_pcm_dec) {
> +        buf_size = s302m_get_non_pcm_pkt_size(avctx, buf_size, non_pcm_offset, non_pcm_length, &dec_pkt_size);
> +        if (buf_size < 0)
> +            return AVERROR_INVALIDDATA;
> +        buf += non_pcm_offset;
> +
> +        if (dec_pkt_size > s->packet->size) {
> +            ret = av_grow_packet(s->packet, dec_pkt_size - s->packet->size);
> +            if (ret < 0)
> +                return ret;
> +        }
> +        ret = av_packet_make_writable(s->packet);
> +        if (ret < 0)
> +            return ret;
> +        memset(s->packet->data, 0, s->packet->size);
> +
> +        ret = av_packet_copy_props(s->packet, avpkt);
> +        if (ret < 0)
> +            return ret;
> +    } else {
> +        /* get output buffer */
> +        block_size = (s->bits + 4) / 4;
> +        channels = s->channels;
> +        frame->nb_samples = 2 * (buf_size / block_size) / channels;
> +        if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
> +            return ret;
> +
> +        buf_size = (frame->nb_samples * channels / 2) * block_size;
> +
> +        avctx->bit_rate = 48000 * s->channels * ((s->non_pcm_bits ? s->non_pcm_bits : s->bits) + 4) +
> +                          32 * 48000 / frame->nb_samples;
> +    }
> +
> +    if (s->bits == 24) {
> +        uint8_t  *p   = NULL;
> +        uint32_t *f32 = NULL;
> +
> +        if (s->non_pcm_dec)
> +            p = (uint8_t *)s->packet->data;
> +        else
> +            f32 = (uint32_t *)frame->data[0];
>  
> -    if (avctx->bits_per_raw_sample == 24) {
> -        uint32_t *o = (uint32_t *)frame->data[0];
>          for (; buf_size > 6; buf_size -= 7) {
> -            *o++ = ((unsigned)ff_reverse[buf[2]]        << 24) |
> -                   (ff_reverse[buf[1]]        << 16) |
> -                   (ff_reverse[buf[0]]        <<  8);
> -            *o++ = ((unsigned)ff_reverse[buf[6] & 0xf0] << 28) |
> -                   (ff_reverse[buf[5]]        << 20) |
> -                   (ff_reverse[buf[4]]        << 12) |
> -                   (ff_reverse[buf[3] & 0x0f] <<  4);
> +            uint8_t b[6];
> +
> +            b[0] = (ff_reverse[buf[2]       ]      ) ;
> +            b[1] = (ff_reverse[buf[1]       ]      ) ;
> +            b[2] = (ff_reverse[buf[0]       ]      ) ;
> +            b[3] = (ff_reverse[buf[6] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[5] & 0x0f] >>  4) ;
> +            b[4] = (ff_reverse[buf[5] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[4] & 0x0f] >>  4) ;
> +            b[5] = (ff_reverse[buf[4] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[3] & 0x0f] >>  4) ;
> +
> +            if (s->non_pcm_bits == 20) {
> +                b[2] &= 0xf0;
> +                b[5] &= 0xf0;
> +            }
> +
> +            if (s->non_pcm_dec)
> +                for (int i = 0; i < 6; i++)
> +                    *p++ = b[i];
> +            else {
> +                *f32++ = (b[0] << 24) |
> +                         (b[1] << 16) |
> +                         (b[2] <<  8) ;
> +                *f32++ = (b[3] << 24) |
> +                         (b[4] << 16) |
> +                         (b[5] <<  8) ;
> +            }
>              buf += 7;
>          }
> -        o = (uint32_t *)frame->data[0];
> -        if (channels == 2)
> -            for (i=0; i<frame->nb_samples * 2 - 6; i+=2) {
> -                if (o[i] || o[i+1] || o[i+2] || o[i+3])
> -                    break;
> -                if (o[i+4] == 0x96F87200U && o[i+5] == 0xA54E1F00) {
> -                    non_pcm_data_type = (o[i+6] >> 16) & 0x1F;
> -                    break;
> +    } else if (s->bits == 20) {
> +        uint8_t  *p   = NULL;
> +        uint16_t *f16 = NULL;
> +        uint32_t *f32 = NULL;
> +
> +        if (s->non_pcm_dec)
> +            p = (uint8_t *)s->packet->data;
> +        else if (s->non_pcm_bits == 16)
> +            f16 = (uint16_t *)frame->data[0];
> +        else
> +            f32 = (uint32_t *)frame->data[0];
> +
> +        for (; buf_size > 5; buf_size -= 6) {
> +            uint8_t b[6];
> +
> +            b[0] = (ff_reverse[buf[2] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[1] & 0x0f] >>  4) ;
> +            b[1] = (ff_reverse[buf[1] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[0] & 0x0f] >>  4) ;
> +            b[2] = (ff_reverse[buf[0] & 0xf0] <<  4) ;
> +            b[3] = (ff_reverse[buf[5] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[4] & 0x0f] >>  4) ;
> +            b[4] = (ff_reverse[buf[4] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[3] & 0x0f] >>  4) ;
> +            b[5] = (ff_reverse[buf[3] & 0xf0] <<  4) ;
> +
> +            if (s->non_pcm_dec) {
> +                for (int i = 0; i < 6; i++) {
> +                    if (s->non_pcm_bits == 16 && (i % 3 == 2))
> +                        continue;
> +                    *p++ = b[i];
>                  }
> +            } else if (s->non_pcm_bits == 16) {
> +                *f16++ = (b[0] << 8) |
> +                         (b[1]     ) ;
> +                *f16++ = (b[3] << 8) |
> +                         (b[4]     ) ;
> +            } else {
> +                *f32++ = (b[0] << 24) |
> +                         (b[1] << 16) |
> +                         (b[2] <<  8) ;
> +                *f32++ = (b[3] << 24) |
> +                         (b[4] << 16) |
> +                         (b[5] <<  8) ;
>              }
> -    } else if (avctx->bits_per_raw_sample == 20) {
> -        uint32_t *o = (uint32_t *)frame->data[0];
> -        for (; buf_size > 5; buf_size -= 6) {
> -            *o++ = ((unsigned)ff_reverse[buf[2] & 0xf0] << 28) |
> -                   (ff_reverse[buf[1]]        << 20) |
> -                   (ff_reverse[buf[0]]        << 12);
> -            *o++ = ((unsigned)ff_reverse[buf[5] & 0xf0] << 28) |
> -                   (ff_reverse[buf[4]]        << 20) |
> -                   (ff_reverse[buf[3]]        << 12);
>              buf += 6;
>          }
> -        o = (uint32_t *)frame->data[0];
> -        if (channels == 2)
> -            for (i=0; i<frame->nb_samples * 2 - 6; i+=2) {
> -                if (o[i] || o[i+1] || o[i+2] || o[i+3])
> -                    break;
> -                if (o[i+4] == 0x6F872000U && o[i+5] == 0x54E1F000) {
> -                    non_pcm_data_type = (o[i+6] >> 16) & 0x1F;
> -                    break;
> -                }
> -            }
>      } else {
> -        uint16_t *o = (uint16_t *)frame->data[0];
> +        uint8_t  *p   = NULL;
> +        uint16_t *f16 = NULL;
> +
> +        if (s->non_pcm_dec)
> +            p = (uint8_t *)s->packet->data;
> +        else
> +            f16 = (uint16_t *)frame->data[0];
> +
>          for (; buf_size > 4; buf_size -= 5) {
> -            *o++ = (ff_reverse[buf[1]]        <<  8) |
> -                    ff_reverse[buf[0]];
> -            *o++ = (ff_reverse[buf[4] & 0xf0] << 12) |
> -                   (ff_reverse[buf[3]]        <<  4) |
> -                   (ff_reverse[buf[2]]        >>  4);
> +            uint8_t b[4];
> +
> +            b[0] = (ff_reverse[buf[1]       ]      ) ;
> +            b[1] = (ff_reverse[buf[0]       ]      ) ;
> +            b[2] = (ff_reverse[buf[4] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[3] & 0x0f] >>  4) ;
> +            b[3] = (ff_reverse[buf[3] & 0xf0] <<  4) |
> +                   (ff_reverse[buf[2] & 0x0f] >>  4) ;
> +
> +            if (s->non_pcm_dec)
> +                for (int i = 0; i < 4; i++)
> +                    *p++ = b[i];
> +            else {
> +                *f16++ = (b[0] << 8) |
> +                         (b[1]     ) ;

AV_RB16(b)

> +                *f16++ = (b[2] << 8) |
> +                         (b[3]     ) ;
> +            }
>              buf += 5;
>          }
> -        o = (uint16_t *)frame->data[0];
> -        if (channels == 2)
> -            for (i=0; i<frame->nb_samples * 2 - 6; i+=2) {
> -                if (o[i] || o[i+1] || o[i+2] || o[i+3])
> -                    break;
> -                if (o[i+4] == 0xF872U && o[i+5] == 0x4E1F) {
> -                    non_pcm_data_type = (o[i+6] & 0x1F);
> -                    break;
> -                }
> -            }
>      }
>  
> -    if (non_pcm_data_type != -1) {
> -        if (s->non_pcm_mode == 3) {
> -            av_log(avctx, AV_LOG_ERROR,
> -                   "S302 non PCM mode with data type %d not supported\n",
> -                   non_pcm_data_type);
> -            return AVERROR_PATCHWELCOME;
> +    if (s->non_pcm_dec) {
> +        ret = avcodec_send_packet(s->non_pcm_ctx, s->packet);
> +        if (ret < 0) {
> +            av_log(avctx, AV_LOG_ERROR, "error %d submitting non-pcm packet with pts %"PRId64" for decoding\n", ret, s->packet->pts);
> +            return ret;
>          }
> -        if (s->non_pcm_mode & 1) {
> -            return avpkt->size;
> +        ret = avcodec_receive_frame(s->non_pcm_ctx, s->frame);
> +        if (ret < 0) {
> +            av_log(avctx, AV_LOG_ERROR, "error %d receiving non-pcm decoded frame for packet with pts %"PRId64"\n", ret, s->packet->pts);
> +            return ret;
>          }
> -    }
>  
> -    avctx->sample_rate = 48000;
> +        if (!s->avctx_props_set) {
> +            avctx->sample_fmt  = s->non_pcm_ctx->sample_fmt;
> +            avctx->sample_rate = s->non_pcm_ctx->sample_rate;
> +
> +            av_channel_layout_uninit(&avctx->ch_layout);
> +            ret = av_channel_layout_copy(&avctx->ch_layout, &s->non_pcm_ctx->ch_layout);
> +            if (ret < 0) {
> +                av_log(avctx, AV_LOG_ERROR, "error %d when copying channel layout from non-pcm decoder context to parent context.\n", ret);
> +                return ret;
> +            }
> +            s->avctx_props_set = 1;
> +        }
> +
> +        frame->nb_samples = s->frame->nb_samples;
> +        ret = ff_get_buffer(avctx, frame, 0);
> +        if (ret < 0)
> +            return ret;
> +
> +        for (int ch = 0; ch < s->frame->ch_layout.nb_channels; ch++)
> +            memcpy(frame->extended_data[ch], s->frame->extended_data[ch],
> +                   av_get_bytes_per_sample(s->non_pcm_ctx->sample_fmt) * s->frame->nb_samples);

Would you please explain to me why this extra frame s->frame exists at
all? (Is it just the assert due to the missing FrameDecodeData? If so,
then this should be changed instead.)

> +    }
>  
>      *got_frame_ptr = 1;
>  
>      return avpkt->size;
>  }
>  
> +static void s302m_flush(AVCodecContext *avctx)
> +{
> +    S302Context *s = avctx->priv_data;
> +
> +    if (s->non_pcm_dec && s->non_pcm_ctx)
> +        avcodec_flush_buffers(s->non_pcm_ctx);
> +}
> +
> +static av_cold int s302m_close(AVCodecContext *avctx)
> +{
> +    S302Context *s = avctx->priv_data;
> +
> +    avcodec_free_context(&s->non_pcm_ctx);
> +    av_packet_free(&s->packet);
> +    av_frame_free(&s->frame);
> +    av_dict_free(&s->non_pcm_opts);

non_pcm_opts is an av_opt-enabled field and is therefore freed generically.

> +
> +    return 0;
> +}
> +
>  #define FLAGS AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_DECODING_PARAM
>  static const AVOption s302m_options[] = {
> -    {"non_pcm_mode", "Chooses what to do with NON-PCM", offsetof(S302Context, non_pcm_mode), AV_OPT_TYPE_INT, {.i64 = 3}, 0, 3, FLAGS, "non_pcm_mode"},
> -    {"copy"        , "Pass NON-PCM through unchanged"     , 0, AV_OPT_TYPE_CONST, {.i64 = 0}, 0, 3, FLAGS, "non_pcm_mode"},
> -    {"drop"        , "Drop NON-PCM"                       , 0, AV_OPT_TYPE_CONST, {.i64 = 1}, 0, 3, FLAGS, "non_pcm_mode"},
> -    {"decode_copy" , "Decode if possible else passthrough", 0, AV_OPT_TYPE_CONST, {.i64 = 2}, 0, 3, FLAGS, "non_pcm_mode"},
> -    {"decode_drop" , "Decode if possible else drop"       , 0, AV_OPT_TYPE_CONST, {.i64 = 3}, 0, 3, FLAGS, "non_pcm_mode"},
> +    {"non_pcm_mode", "Chooses what to do with NON-PCM", offsetof(S302Context, non_pcm_mode), AV_OPT_TYPE_INT, {.i64 = NON_PCM_DEC_ELSE_DROP}, NON_PCM_COPY, NON_PCM_DEC_ELSE_DROP, FLAGS, "non_pcm_mode"},
> +    {"copy"        , "Pass NON-PCM through unchanged"     , 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_COPY},          0, 3, FLAGS, "non_pcm_mode"},
> +    {"drop"        , "Drop NON-PCM"                       , 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_DROP},          0, 3, FLAGS, "non_pcm_mode"},
> +    {"decode_copy" , "Decode if possible else passthrough", 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_DEC_ELSE_COPY}, 0, 3, FLAGS, "non_pcm_mode"},
> +    {"decode_drop" , "Decode if possible else drop"       , 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_DEC_ELSE_DROP}, 0, 3, FLAGS, "non_pcm_mode"},
> +    {"non_pcm_options", "Set options for non-pcm decoder",  offsetof(S302Context, non_pcm_opts), AV_OPT_TYPE_DICT, {.str = NULL}, 0, 0, FLAGS},
>      {NULL}
>  };
>  
> @@ -231,6 +623,9 @@ const FFCodec ff_s302m_decoder = {
>      CODEC_LONG_NAME("SMPTE 302M"),
>      .p.type         = AVMEDIA_TYPE_AUDIO,
>      .p.id           = AV_CODEC_ID_S302M,
> +    .init           = s302m_init,
> +    .close          = s302m_close,
> +    .flush          = s302m_flush,
>      .p.priv_class   = &s302m_class,
>      .priv_data_size = sizeof(S302Context),
>      FF_CODEC_DECODE_CB(s302m_decode_frame),
Gyan Doshi Jan. 25, 2024, 7:11 a.m. UTC | #10
On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
> Gyan Doshi:
>> Set up framework for non-PCM decoding in-place and
>> add support for Dolby-E decoding.
>>
>> Useful for direct transcoding of non-PCM audio in live inputs.
>> ---
>>   configure          |   1 +
>>   doc/decoders.texi  |  40 +++
>>   libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>>   3 files changed, 543 insertions(+), 107 deletions(-)
>>
>> diff --git a/configure b/configure
>> index c8ae0a061d..8db3fa3f4b 100755
>> --- a/configure
>> +++ b/configure
>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>   rv20_encoder_select="h263_encoder"
>>   rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>   rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>> +s302m_decoder_select="dolby_e_decoder"
>>   screenpresso_decoder_deps="zlib"
>>   shorten_decoder_select="bswapdsp"
>>   sipr_decoder_select="lsp"
>> diff --git a/doc/decoders.texi b/doc/decoders.texi
>> index 293c82c2ba..9f85c876bf 100644
>> --- a/doc/decoders.texi
>> +++ b/doc/decoders.texi
>> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure the build with
>>   An FFmpeg native decoder for Opus exists, so users can decode Opus
>>   without this library.
>>   
>> +@section s302m
>> +
>> +SMPTE ST 302 decoder.
>> +
>> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG Transport
>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8 channels with a
>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>> +
> This sounds like we should add bitstream filters to extract the proper
> underlying streams instead.
> (I see only two problems with this approach: The BSF API needs to set
> the CodecID of the output during init, but at this point no packet has
> reached the BSF to determine it. And changing codec IDs mid-stream is
> also not supported.)

In theory, this decoder shouldn't exist, as it is just a carrier, 
whether of LPCM or non-PCM.
FFmpeg architecture also imposes a fundamental limitation in that one 
s302m stream may
carry multiple payload streams and we support only one decoding context 
per input stream
neither can a bsf spawn streams (not sure). So proper, full support 
seems not possible.

[...]

>> +
>> +    ret = init_get_bits8(&gb, buf, buf_size);
>> +    if (ret < 0)
>> +        return ret;
>> +
>> +    aes_frm_size = (s->bits + 4) * 2 / 8;
>> +    if (buf_size < aes_frm_size * 2)  // not enough to contain data_type & length_code
>> +        return AVERROR_INVALIDDATA;
>> +
>> +    state = get_bits64(&gb, aes_frm_size * 8);
>> +
>> +    while (!IS_NONPCMSYNC(s->bits,state) && (get_bits_left(&gb) >= 8))
>> +        state = (state << 8) | get_bits(&gb, 8);
> Reading byte-aligned data with a GetBit context is very suboptimal.

What is the performance difference vs. uint8 pointers?
Note that if stream is LPCM or non-decodable non-PCM, this isn't called 
again. If it is Dolby-E, the data traversed can typically be measured in 
the dozens of bytes. And further on, I do read and skip some 
non-byte-aligned lengths.

[...]

>> +
>> +            if (s->non_pcm_dec)
>> +                for (int i = 0; i < 4; i++)
>> +                    *p++ = b[i];
>> +            else {
>> +                *f16++ = (b[0] << 8) |
>> +                         (b[1]     ) ;
> AV_RB16(b)

Ok.

[...]
>> +
>> +        for (int ch = 0; ch < s->frame->ch_layout.nb_channels; ch++)
>> +            memcpy(frame->extended_data[ch], s->frame->extended_data[ch],
>> +                   av_get_bytes_per_sample(s->non_pcm_ctx->sample_fmt) * s->frame->nb_samples);
> Would you please explain to me why this extra frame s->frame exists at
> all? (Is it just the assert due to the missing FrameDecodeData? If so,
> then this should be changed instead.)

Yes, that assert was triggered. I haven't looked into the ramifications 
of altering decode_receive_frame_internal and it's out of scope for this 
patch.
If you feel strongly about it, I invite you to change that code and I'll 
update this patch accordingly.

[...]

>> +static av_cold int s302m_close(AVCodecContext *avctx)
>> +{
>> +    S302Context *s = avctx->priv_data;
>> +
>> +    avcodec_free_context(&s->non_pcm_ctx);
>> +    av_packet_free(&s->packet);
>> +    av_frame_free(&s->frame);
>> +    av_dict_free(&s->non_pcm_opts);
> non_pcm_opts is an av_opt-enabled field and is therefore freed generically.

Will remove.

Regards,
Gyan
Andreas Rheinhardt Jan. 25, 2024, 1:17 p.m. UTC | #11
Gyan Doshi:
> 
> 
> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
>> Gyan Doshi:
>>> Set up framework for non-PCM decoding in-place and
>>> add support for Dolby-E decoding.
>>>
>>> Useful for direct transcoding of non-PCM audio in live inputs.
>>> ---
>>>   configure          |   1 +
>>>   doc/decoders.texi  |  40 +++
>>>   libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>>>   3 files changed, 543 insertions(+), 107 deletions(-)
>>>
>>> diff --git a/configure b/configure
>>> index c8ae0a061d..8db3fa3f4b 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>>   rv20_encoder_select="h263_encoder"
>>>   rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>   rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>> +s302m_decoder_select="dolby_e_decoder"
>>>   screenpresso_decoder_deps="zlib"
>>>   shorten_decoder_select="bswapdsp"
>>>   sipr_decoder_select="lsp"
>>> diff --git a/doc/decoders.texi b/doc/decoders.texi
>>> index 293c82c2ba..9f85c876bf 100644
>>> --- a/doc/decoders.texi
>>> +++ b/doc/decoders.texi
>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure
>>> the build with
>>>   An FFmpeg native decoder for Opus exists, so users can decode Opus
>>>   without this library.
>>>   +@section s302m
>>> +
>>> +SMPTE ST 302 decoder.
>>> +
>>> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG
>>> Transport
>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
>>> channels with a
>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>>> +
>> This sounds like we should add bitstream filters to extract the proper
>> underlying streams instead.
>> (I see only two problems with this approach: The BSF API needs to set
>> the CodecID of the output during init, but at this point no packet has
>> reached the BSF to determine it. And changing codec IDs mid-stream is
>> also not supported.)
> 
> In theory, this decoder shouldn't exist, as it is just a carrier,
> whether of LPCM or non-PCM.
> FFmpeg architecture also imposes a fundamental limitation in that one
> s302m stream may
> carry multiple payload streams and we support only one decoding context
> per input stream

Then why does the demuxer not separate the data into multiple streams?

> neither can a bsf spawn streams (not sure). So proper, full support
> seems not possible.

Hypothetically a BSF can output packets with different stream_ids. But
no BSF does this intentionally (all BSFs actually ignore stream_id; they
just copy the id from the input.

> 
> [...]
> 
>>> +
>>> +    ret = init_get_bits8(&gb, buf, buf_size);
>>> +    if (ret < 0)
>>> +        return ret;
>>> +
>>> +    aes_frm_size = (s->bits + 4) * 2 / 8;
>>> +    if (buf_size < aes_frm_size * 2)  // not enough to contain
>>> data_type & length_code
>>> +        return AVERROR_INVALIDDATA;
>>> +
>>> +    state = get_bits64(&gb, aes_frm_size * 8);
>>> +
>>> +    while (!IS_NONPCMSYNC(s->bits,state) && (get_bits_left(&gb) >= 8))
>>> +        state = (state << 8) | get_bits(&gb, 8);
>> Reading byte-aligned data with a GetBit context is very suboptimal.
> 
> What is the performance difference vs. uint8 pointers?

Typically worse by a constant factor.

> Note that if stream is LPCM or non-decodable non-PCM, this isn't called
> again. If it is Dolby-E, the data traversed can typically be measured in
> the dozens of bytes. And further on, I do read and skip some
> non-byte-aligned lengths.
> 
> [...]
> 
>>> +
>>> +            if (s->non_pcm_dec)
>>> +                for (int i = 0; i < 4; i++)
>>> +                    *p++ = b[i];
>>> +            else {
>>> +                *f16++ = (b[0] << 8) |
>>> +                         (b[1]     ) ;
>> AV_RB16(b)
> 
> Ok.
> 
> [...]
>>> +
>>> +        for (int ch = 0; ch < s->frame->ch_layout.nb_channels; ch++)
>>> +            memcpy(frame->extended_data[ch],
>>> s->frame->extended_data[ch],
>>> +                  
>>> av_get_bytes_per_sample(s->non_pcm_ctx->sample_fmt) *
>>> s->frame->nb_samples);
>> Would you please explain to me why this extra frame s->frame exists at
>> all? (Is it just the assert due to the missing FrameDecodeData? If so,
>> then this should be changed instead.)
> 
> Yes, that assert was triggered. I haven't looked into the ramifications
> of altering decode_receive_frame_internal and it's out of scope for this
> patch.
> If you feel strongly about it, I invite you to change that code and I'll
> update this patch accordingly.
> 
> [...]
> 
>>> +static av_cold int s302m_close(AVCodecContext *avctx)
>>> +{
>>> +    S302Context *s = avctx->priv_data;
>>> +
>>> +    avcodec_free_context(&s->non_pcm_ctx);
>>> +    av_packet_free(&s->packet);
>>> +    av_frame_free(&s->frame);
>>> +    av_dict_free(&s->non_pcm_opts);
>> non_pcm_opts is an av_opt-enabled field and is therefore freed
>> generically.
> 
> Will remove.
> 
> Regards,
> Gyan
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Gyan Doshi Jan. 26, 2024, 4:23 a.m. UTC | #12
On 2024-01-25 06:47 pm, Andreas Rheinhardt wrote:
> Gyan Doshi:
>>
>> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
>>> Gyan Doshi:
>>>> Set up framework for non-PCM decoding in-place and
>>>> add support for Dolby-E decoding.
>>>>
>>>> Useful for direct transcoding of non-PCM audio in live inputs.
>>>> ---
>>>>    configure          |   1 +
>>>>    doc/decoders.texi  |  40 +++
>>>>    libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>>>>    3 files changed, 543 insertions(+), 107 deletions(-)
>>>>
>>>> diff --git a/configure b/configure
>>>> index c8ae0a061d..8db3fa3f4b 100755
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>>>    rv20_encoder_select="h263_encoder"
>>>>    rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>    rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>> +s302m_decoder_select="dolby_e_decoder"
>>>>    screenpresso_decoder_deps="zlib"
>>>>    shorten_decoder_select="bswapdsp"
>>>>    sipr_decoder_select="lsp"
>>>> diff --git a/doc/decoders.texi b/doc/decoders.texi
>>>> index 293c82c2ba..9f85c876bf 100644
>>>> --- a/doc/decoders.texi
>>>> +++ b/doc/decoders.texi
>>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure
>>>> the build with
>>>>    An FFmpeg native decoder for Opus exists, so users can decode Opus
>>>>    without this library.
>>>>    +@section s302m
>>>> +
>>>> +SMPTE ST 302 decoder.
>>>> +
>>>> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG
>>>> Transport
>>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
>>>> channels with a
>>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>>>> +
>>> This sounds like we should add bitstream filters to extract the proper
>>> underlying streams instead.
>>> (I see only two problems with this approach: The BSF API needs to set
>>> the CodecID of the output during init, but at this point no packet has
>>> reached the BSF to determine it. And changing codec IDs mid-stream is
>>> also not supported.)
>> In theory, this decoder shouldn't exist, as it is just a carrier,
>> whether of LPCM or non-PCM.
>> FFmpeg architecture also imposes a fundamental limitation in that one
>> s302m stream may
>> carry multiple payload streams and we support only one decoding context
>> per input stream
> Then why does the demuxer not separate the data into multiple streams?

I didn't add demuxing support for this codec in MPEGTS, but I can venture

a) it would mean essentially inlining this decoder in the demuxer.
b) it would lead to a mapping of 1 actual to N virtual streams. How 
would stream_ids be assigned?
c) this codec specifies for multiple payloads but I haven't seen an 
actual sample


>>>> +
>>>> +    ret = init_get_bits8(&gb, buf, buf_size);
>>>> +    if (ret < 0)
>>>> +        return ret;
>>>> +
>>>> +    aes_frm_size = (s->bits + 4) * 2 / 8;
>>>> +    if (buf_size < aes_frm_size * 2)  // not enough to contain
>>>> data_type & length_code
>>>> +        return AVERROR_INVALIDDATA;
>>>> +
>>>> +    state = get_bits64(&gb, aes_frm_size * 8);
>>>> +
>>>> +    while (!IS_NONPCMSYNC(s->bits,state) && (get_bits_left(&gb) >= 8))
>>>> +        state = (state << 8) | get_bits(&gb, 8);
>>> Reading byte-aligned data with a GetBit context is very suboptimal.
>> What is the performance difference vs. uint8 pointers?
> Typically worse by a constant factor.

But in this scenario (of a few dozen bytes per packet if dolby_e), how 
significant is it?

Regards,
Gyan
Andreas Rheinhardt Jan. 26, 2024, 6:42 a.m. UTC | #13
Gyan Doshi:
> 
> 
> On 2024-01-25 06:47 pm, Andreas Rheinhardt wrote:
>> Gyan Doshi:
>>>
>>> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
>>>> Gyan Doshi:
>>>>> Set up framework for non-PCM decoding in-place and
>>>>> add support for Dolby-E decoding.
>>>>>
>>>>> Useful for direct transcoding of non-PCM audio in live inputs.
>>>>> ---
>>>>>    configure          |   1 +
>>>>>    doc/decoders.texi  |  40 +++
>>>>>    libavcodec/s302m.c | 609
>>>>> +++++++++++++++++++++++++++++++++++++--------
>>>>>    3 files changed, 543 insertions(+), 107 deletions(-)
>>>>>
>>>>> diff --git a/configure b/configure
>>>>> index c8ae0a061d..8db3fa3f4b 100755
>>>>> --- a/configure
>>>>> +++ b/configure
>>>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>>>>    rv20_encoder_select="h263_encoder"
>>>>>    rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>>    rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>> +s302m_decoder_select="dolby_e_decoder"
>>>>>    screenpresso_decoder_deps="zlib"
>>>>>    shorten_decoder_select="bswapdsp"
>>>>>    sipr_decoder_select="lsp"
>>>>> diff --git a/doc/decoders.texi b/doc/decoders.texi
>>>>> index 293c82c2ba..9f85c876bf 100644
>>>>> --- a/doc/decoders.texi
>>>>> +++ b/doc/decoders.texi
>>>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure
>>>>> the build with
>>>>>    An FFmpeg native decoder for Opus exists, so users can decode Opus
>>>>>    without this library.
>>>>>    +@section s302m
>>>>> +
>>>>> +SMPTE ST 302 decoder.
>>>>> +
>>>>> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG
>>>>> Transport
>>>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
>>>>> channels with a
>>>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>>>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>>>>> +
>>>> This sounds like we should add bitstream filters to extract the proper
>>>> underlying streams instead.
>>>> (I see only two problems with this approach: The BSF API needs to set
>>>> the CodecID of the output during init, but at this point no packet has
>>>> reached the BSF to determine it. And changing codec IDs mid-stream is
>>>> also not supported.)
>>> In theory, this decoder shouldn't exist, as it is just a carrier,
>>> whether of LPCM or non-PCM.
>>> FFmpeg architecture also imposes a fundamental limitation in that one
>>> s302m stream may
>>> carry multiple payload streams and we support only one decoding context
>>> per input stream
>> Then why does the demuxer not separate the data into multiple streams?
> 
> I didn't add demuxing support for this codec in MPEGTS, but I can venture
> 
> a) it would mean essentially inlining this decoder in the demuxer.
> b) it would lead to a mapping of 1 actual to N virtual streams. How
> would stream_ids be assigned?
> c) this codec specifies for multiple payloads but I haven't seen an
> actual sample
> 
> 
>>>>> +
>>>>> +    ret = init_get_bits8(&gb, buf, buf_size);
>>>>> +    if (ret < 0)
>>>>> +        return ret;
>>>>> +
>>>>> +    aes_frm_size = (s->bits + 4) * 2 / 8;
>>>>> +    if (buf_size < aes_frm_size * 2)  // not enough to contain
>>>>> data_type & length_code
>>>>> +        return AVERROR_INVALIDDATA;
>>>>> +
>>>>> +    state = get_bits64(&gb, aes_frm_size * 8);
>>>>> +
>>>>> +    while (!IS_NONPCMSYNC(s->bits,state) && (get_bits_left(&gb) >=
>>>>> 8))
>>>>> +        state = (state << 8) | get_bits(&gb, 8);
>>>> Reading byte-aligned data with a GetBit context is very suboptimal.
>>> What is the performance difference vs. uint8 pointers?
>> Typically worse by a constant factor.
> 
> But in this scenario (of a few dozen bytes per packet if dolby_e), how
> significant is it?
> 

You are only looking at the ordinary case and not fuzzing samples where
it is more than a few dozen bytes.

- Andreas
Anton Khirnov Jan. 28, 2024, 10:54 a.m. UTC | #14
Quoting Gyan Doshi (2024-01-26 05:23:50)
> 
> 
> On 2024-01-25 06:47 pm, Andreas Rheinhardt wrote:
> > Gyan Doshi:
> >>
> >> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
> >>> Gyan Doshi:
> >>>> Set up framework for non-PCM decoding in-place and
> >>>> add support for Dolby-E decoding.
> >>>>
> >>>> Useful for direct transcoding of non-PCM audio in live inputs.
> >>>> ---
> >>>>    configure          |   1 +
> >>>>    doc/decoders.texi  |  40 +++
> >>>>    libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
> >>>>    3 files changed, 543 insertions(+), 107 deletions(-)
> >>>>
> >>>> diff --git a/configure b/configure
> >>>> index c8ae0a061d..8db3fa3f4b 100755
> >>>> --- a/configure
> >>>> +++ b/configure
> >>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
> >>>>    rv20_encoder_select="h263_encoder"
> >>>>    rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
> >>>>    rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
> >>>> +s302m_decoder_select="dolby_e_decoder"
> >>>>    screenpresso_decoder_deps="zlib"
> >>>>    shorten_decoder_select="bswapdsp"
> >>>>    sipr_decoder_select="lsp"
> >>>> diff --git a/doc/decoders.texi b/doc/decoders.texi
> >>>> index 293c82c2ba..9f85c876bf 100644
> >>>> --- a/doc/decoders.texi
> >>>> +++ b/doc/decoders.texi
> >>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure
> >>>> the build with
> >>>>    An FFmpeg native decoder for Opus exists, so users can decode Opus
> >>>>    without this library.
> >>>>    +@section s302m
> >>>> +
> >>>> +SMPTE ST 302 decoder.
> >>>> +
> >>>> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG
> >>>> Transport
> >>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
> >>>> channels with a
> >>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
> >>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
> >>>> +
> >>> This sounds like we should add bitstream filters to extract the proper
> >>> underlying streams instead.
> >>> (I see only two problems with this approach: The BSF API needs to set
> >>> the CodecID of the output during init, but at this point no packet has
> >>> reached the BSF to determine it. And changing codec IDs mid-stream is
> >>> also not supported.)
> >> In theory, this decoder shouldn't exist, as it is just a carrier,
> >> whether of LPCM or non-PCM.
> >> FFmpeg architecture also imposes a fundamental limitation in that one
> >> s302m stream may
> >> carry multiple payload streams and we support only one decoding context
> >> per input stream
> > Then why does the demuxer not separate the data into multiple streams?
> 
> I didn't add demuxing support for this codec in MPEGTS, but I can venture
> 
> a) it would mean essentially inlining this decoder in the demuxer.

Why is that a problem? This decoder seems like it shouldn't be a
decoder.

I agree with Andreas that this seems like it's a demuxer pretending to
be a decoder.
Kieran Kunhya Jan. 28, 2024, 9:29 p.m. UTC | #15
>
> Why is that a problem? This decoder seems like it shouldn't be a
> decoder.
>
> I agree with Andreas that this seems like it's a demuxer pretending to
> be a decoder.
>

The framing is in the PCM layer itself, you have the same issue repeated in
every container that accepts PCM (and Dolby E does exist in those).
FFmpeg only allows one layer of demux so it has to go somewhere. This is a
second layer of demux.

Kieran
Gyan Doshi Jan. 29, 2024, 4 a.m. UTC | #16
On 2024-01-28 04:24 pm, Anton Khirnov wrote:
> Quoting Gyan Doshi (2024-01-26 05:23:50)
>>
>> On 2024-01-25 06:47 pm, Andreas Rheinhardt wrote:
>>> Gyan Doshi:
>>>> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
>>>>> Gyan Doshi:
>>>>>> Set up framework for non-PCM decoding in-place and
>>>>>> add support for Dolby-E decoding.
>>>>>>
>>>>>> Useful for direct transcoding of non-PCM audio in live inputs.
>>>>>> ---
>>>>>>     configure          |   1 +
>>>>>>     doc/decoders.texi  |  40 +++
>>>>>>     libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>>>>>>     3 files changed, 543 insertions(+), 107 deletions(-)
>>>>>>
>>>>>> diff --git a/configure b/configure
>>>>>> index c8ae0a061d..8db3fa3f4b 100755
>>>>>> --- a/configure
>>>>>> +++ b/configure
>>>>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>>>>>     rv20_encoder_select="h263_encoder"
>>>>>>     rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>>>     rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>>> +s302m_decoder_select="dolby_e_decoder"
>>>>>>     screenpresso_decoder_deps="zlib"
>>>>>>     shorten_decoder_select="bswapdsp"
>>>>>>     sipr_decoder_select="lsp"
>>>>>> diff --git a/doc/decoders.texi b/doc/decoders.texi
>>>>>> index 293c82c2ba..9f85c876bf 100644
>>>>>> --- a/doc/decoders.texi
>>>>>> +++ b/doc/decoders.texi
>>>>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure
>>>>>> the build with
>>>>>>     An FFmpeg native decoder for Opus exists, so users can decode Opus
>>>>>>     without this library.
>>>>>>     +@section s302m
>>>>>> +
>>>>>> +SMPTE ST 302 decoder.
>>>>>> +
>>>>>> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG
>>>>>> Transport
>>>>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
>>>>>> channels with a
>>>>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>>>>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>>>>>> +
>>>>> This sounds like we should add bitstream filters to extract the proper
>>>>> underlying streams instead.
>>>>> (I see only two problems with this approach: The BSF API needs to set
>>>>> the CodecID of the output during init, but at this point no packet has
>>>>> reached the BSF to determine it. And changing codec IDs mid-stream is
>>>>> also not supported.)
>>>> In theory, this decoder shouldn't exist, as it is just a carrier,
>>>> whether of LPCM or non-PCM.
>>>> FFmpeg architecture also imposes a fundamental limitation in that one
>>>> s302m stream may
>>>> carry multiple payload streams and we support only one decoding context
>>>> per input stream
>>> Then why does the demuxer not separate the data into multiple streams?
>> I didn't add demuxing support for this codec in MPEGTS, but I can venture
>>
>> a) it would mean essentially inlining this decoder in the demuxer.
> Why is that a problem? This decoder seems like it shouldn't be a
> decoder.
>
> I agree with Andreas that this seems like it's a demuxer pretending to
> be a decoder.

This module transforms the entire raw payload data to generate its 
output, even if the syntax is simple which
essentially makes it a de-coder. The de-multiplexer aspect of multiple 
streams is an academic possibility allowed
by the standard but not seen in any sample which makes me suspect it's 
used for carriage between broadcast
facilities rather than something ever sent to an OTT provider, let alone 
an end user.

Regards,
Gyan
Nicolas Gaullier Jan. 29, 2024, 9:27 a.m. UTC | #17
>On 2024-01-28 04:24 pm, Anton Khirnov wrote:
>> Quoting Gyan Doshi (2024-01-26 05:23:50)
>>>
>>> On 2024-01-25 06:47 pm, Andreas Rheinhardt wrote:
>>>> Gyan Doshi:
>>>>>> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
>>>>>> Gyan Doshi:
>>>>>>> Set up framework for non-PCM decoding in-place and add support 
>>>>>>> for Dolby-E decoding.
>>>>>>>
>>>>>>> Useful for direct transcoding of non-PCM audio in live inputs.
>>>>>>> ---
>>>>>>>     configure          |   1 +
>>>>>>>     doc/decoders.texi  |  40 +++
>>>>>>>     libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>>>>>>>     3 files changed, 543 insertions(+), 107 deletions(-)
>>>>>>>
>>>>>>> diff --git a/configure b/configure index c8ae0a061d..8db3fa3f4b 
>>>>>>> 100755
>>>>>>> --- a/configure
>>>>>>> +++ b/configure
>>>>>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>>>>>>     rv20_encoder_select="h263_encoder"
>>>>>>>     rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>>>>     rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>>>> +s302m_decoder_select="dolby_e_decoder"
>>>>>>>     screenpresso_decoder_deps="zlib"
>>>>>>>     shorten_decoder_select="bswapdsp"
>>>>>>>     sipr_decoder_select="lsp"
>>>>>>> diff --git a/doc/decoders.texi b/doc/decoders.texi index 
>>>>>>> 293c82c2ba..9f85c876bf 100644
>>>>>>> --- a/doc/decoders.texi
>>>>>>> +++ b/doc/decoders.texi
>>>>>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly 
>>>>>>> configure the build with
>>>>>>>     An FFmpeg native decoder for Opus exists, so users can decode Opus
>>>>>>>     without this library.
>>>>>>>     +@section s302m
>>>>>>> +
>>>>>>> +SMPTE ST 302 decoder.
>>>>>>> +
>>>>>>> +SMPTE ST 302 is a method for storing AES3 data format within an 
>>>>>>> +MPEG
>>>>>>> Transport
>>>>>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
>>>>>>> channels with a
>>>>>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>>>>>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>>>>>>> +
>>>>>> This sounds like we should add bitstream filters to extract the 
>>>>>> proper underlying streams instead.
>>>>>> (I see only two problems with this approach: The BSF API needs to 
>>>>>> set the CodecID of the output during init, but at this point no 
>>>>>> packet has reached the BSF to determine it. And changing codec IDs 
>>>>>> mid-stream is also not supported.)
>>>>> In theory, this decoder shouldn't exist, as it is just a carrier, 
>>>>> whether of LPCM or non-PCM.
>>>>> FFmpeg architecture also imposes a fundamental limitation in that 
>>>>> one s302m stream may carry multiple payload streams and we support 
>>>>> only one decoding context per input stream
>>>> Then why does the demuxer not separate the data into multiple streams?
>>> I didn't add demuxing support for this codec in MPEGTS, but I can 
>>> venture
>>>
>>> a) it would mean essentially inlining this decoder in the demuxer.
>> Why is that a problem? This decoder seems like it shouldn't be a 
>> decoder.
>>
>> I agree with Andreas that this seems like it's a demuxer pretending to 
>> be a decoder.
>
>This module transforms the entire raw payload data to generate its output, even if the syntax is simple which essentially makes it a de-coder. The de-multiplexer aspect of multiple streams is an academic possibility allowed by the >standard but not seen in any sample which makes me suspect it's used for carriage between broadcast facilities rather than something ever sent to an OTT provider, let alone an end user.
>
>Regards,
>Gyan

AFAIK, DolbyE may be found on satellite feeds, for carriage between broadcast facilities, and thus it makes them accessible so they may be grabbed by "end users". Otherwise, it is "broadcast professional end users", which are users too. AFAIK, its most common form is 20Bits and you simply "cannot" have a single stream in a 20Bit carrier; but indeed, most of the time only the first stream ("program") is used and the second is a downmix; but not always. For example, you can have a first program which is standard 5.1 and a second program with Audio Description.
Anyway, I like this s302m patch support as it makes the existing DolbyE decoder really available (not requiring two pass).
If this design is not acceptable and you want to get all the picture, then I don't think there is any other option as introducing a new filter, 1 pin in 48KHz, N pins out (XXKhz) with stream metadata inserted in each (downmix meta, dialnorm..). That would be nice, indeed from a user point of view, and this was already discussed in some forum discussions. But are you ready to go for it ? A filter, dedicated to DolbyE, and that would link to lavc ?
Nicolas
Gyan Doshi Jan. 29, 2024, 10:17 a.m. UTC | #18
On 2024-01-29 02:57 pm, Nicolas Gaullier wrote:
>> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
>>> Quoting Gyan Doshi (2024-01-26 05:23:50)
>>>> On 2024-01-25 06:47 pm, Andreas Rheinhardt wrote:
>>>>> Gyan Doshi:
>>>>>>> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
>>>>>>> Gyan Doshi:
>>>>>>>> Set up framework for non-PCM decoding in-place and add support
>>>>>>>> for Dolby-E decoding.
>>>>>>>>
>>>>>>>> Useful for direct transcoding of non-PCM audio in live inputs.
>>>>>>>> ---
>>>>>>>>      configure          |   1 +
>>>>>>>>      doc/decoders.texi  |  40 +++
>>>>>>>>      libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>>>>>>>>      3 files changed, 543 insertions(+), 107 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/configure b/configure index c8ae0a061d..8db3fa3f4b
>>>>>>>> 100755
>>>>>>>> --- a/configure
>>>>>>>> +++ b/configure
>>>>>>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>>>>>>>      rv20_encoder_select="h263_encoder"
>>>>>>>>      rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>>>>>      rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>>>>>> +s302m_decoder_select="dolby_e_decoder"
>>>>>>>>      screenpresso_decoder_deps="zlib"
>>>>>>>>      shorten_decoder_select="bswapdsp"
>>>>>>>>      sipr_decoder_select="lsp"
>>>>>>>> diff --git a/doc/decoders.texi b/doc/decoders.texi index
>>>>>>>> 293c82c2ba..9f85c876bf 100644
>>>>>>>> --- a/doc/decoders.texi
>>>>>>>> +++ b/doc/decoders.texi
>>>>>>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly
>>>>>>>> configure the build with
>>>>>>>>      An FFmpeg native decoder for Opus exists, so users can decode Opus
>>>>>>>>      without this library.
>>>>>>>>      +@section s302m
>>>>>>>> +
>>>>>>>> +SMPTE ST 302 decoder.
>>>>>>>> +
>>>>>>>> +SMPTE ST 302 is a method for storing AES3 data format within an
>>>>>>>> +MPEG
>>>>>>>> Transport
>>>>>>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
>>>>>>>> channels with a
>>>>>>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>>>>>>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>>>>>>>> +
>>>>>>> This sounds like we should add bitstream filters to extract the
>>>>>>> proper underlying streams instead.
>>>>>>> (I see only two problems with this approach: The BSF API needs to
>>>>>>> set the CodecID of the output during init, but at this point no
>>>>>>> packet has reached the BSF to determine it. And changing codec IDs
>>>>>>> mid-stream is also not supported.)
>>>>>> In theory, this decoder shouldn't exist, as it is just a carrier,
>>>>>> whether of LPCM or non-PCM.
>>>>>> FFmpeg architecture also imposes a fundamental limitation in that
>>>>>> one s302m stream may carry multiple payload streams and we support
>>>>>> only one decoding context per input stream
>>>>> Then why does the demuxer not separate the data into multiple streams?
>>>> I didn't add demuxing support for this codec in MPEGTS, but I can
>>>> venture
>>>>
>>>> a) it would mean essentially inlining this decoder in the demuxer.
>>> Why is that a problem? This decoder seems like it shouldn't be a
>>> decoder.
>>>
>>> I agree with Andreas that this seems like it's a demuxer pretending to
>>> be a decoder.
>> This module transforms the entire raw payload data to generate its output, even if the syntax is simple which essentially makes it a de-coder. The de-multiplexer aspect of multiple streams is an academic possibility allowed by the >standard but not seen in any sample which makes me suspect it's used for carriage between broadcast facilities rather than something ever sent to an OTT provider, let alone an end user.
>>
>> Regards,
>> Gyan
> AFAIK, DolbyE may be found on satellite feeds, for carriage between broadcast facilities, and thus it makes them accessible so they may be grabbed by "end users". Otherwise, it is "broadcast professional end users", which are users too. AFAIK, its most common form is 20Bits and you simply "cannot" have a single stream in a 20Bit carrier; but indeed, most of the time only the first stream ("program") is used and the second is a downmix; but not always. For example, you can have a first program which is standard 5.1 and a second program with Audio Description.
In the samples I have, including yours, the outermost layer is AES3 
which contains one inner stream Dolby-E, which in turn contains 8 
channels constituting multiple programs. Those are programs downstream 
of the dolby_e decoder. That's not what we are talking about. The 
discussion here is about multiple payload streams within the AES3 layer 
- streams stored in subframe mode e.g. a Dolby-E and AC-3 or LPCM in 
alternate subframes/frames. I've no such samples of that sort.

Regards,
Gyan
Kieran Kunhya Jan. 29, 2024, 10:18 a.m. UTC | #19
On Mon, 29 Jan 2024 at 10:17, Gyan Doshi <ffmpeg@gyani.pro> wrote:

>
>
> On 2024-01-29 02:57 pm, Nicolas Gaullier wrote:
> >> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
> >>> Quoting Gyan Doshi (2024-01-26 05:23:50)
> >>>> On 2024-01-25 06:47 pm, Andreas Rheinhardt wrote:
> >>>>> Gyan Doshi:
> >>>>>>> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
> >>>>>>> Gyan Doshi:
> >>>>>>>> Set up framework for non-PCM decoding in-place and add support
> >>>>>>>> for Dolby-E decoding.
> >>>>>>>>
> >>>>>>>> Useful for direct transcoding of non-PCM audio in live inputs.
> >>>>>>>> ---
> >>>>>>>>      configure          |   1 +
> >>>>>>>>      doc/decoders.texi  |  40 +++
> >>>>>>>>      libavcodec/s302m.c | 609
> +++++++++++++++++++++++++++++++++++++--------
> >>>>>>>>      3 files changed, 543 insertions(+), 107 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/configure b/configure index c8ae0a061d..8db3fa3f4b
> >>>>>>>> 100755
> >>>>>>>> --- a/configure
> >>>>>>>> +++ b/configure
> >>>>>>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
> >>>>>>>>      rv20_encoder_select="h263_encoder"
> >>>>>>>>      rv30_decoder_select="golomb h264pred h264qpel mpegvideodec
> rv34dsp"
> >>>>>>>>      rv40_decoder_select="golomb h264pred h264qpel mpegvideodec
> rv34dsp"
> >>>>>>>> +s302m_decoder_select="dolby_e_decoder"
> >>>>>>>>      screenpresso_decoder_deps="zlib"
> >>>>>>>>      shorten_decoder_select="bswapdsp"
> >>>>>>>>      sipr_decoder_select="lsp"
> >>>>>>>> diff --git a/doc/decoders.texi b/doc/decoders.texi index
> >>>>>>>> 293c82c2ba..9f85c876bf 100644
> >>>>>>>> --- a/doc/decoders.texi
> >>>>>>>> +++ b/doc/decoders.texi
> >>>>>>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly
> >>>>>>>> configure the build with
> >>>>>>>>      An FFmpeg native decoder for Opus exists, so users can
> decode Opus
> >>>>>>>>      without this library.
> >>>>>>>>      +@section s302m
> >>>>>>>> +
> >>>>>>>> +SMPTE ST 302 decoder.
> >>>>>>>> +
> >>>>>>>> +SMPTE ST 302 is a method for storing AES3 data format within an
> >>>>>>>> +MPEG
> >>>>>>>> Transport
> >>>>>>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
> >>>>>>>> channels with a
> >>>>>>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
> >>>>>>>> +They can also contain non-PCM codec streams such as AC-3 or
> Dolby-E.
> >>>>>>>> +
> >>>>>>> This sounds like we should add bitstream filters to extract the
> >>>>>>> proper underlying streams instead.
> >>>>>>> (I see only two problems with this approach: The BSF API needs to
> >>>>>>> set the CodecID of the output during init, but at this point no
> >>>>>>> packet has reached the BSF to determine it. And changing codec IDs
> >>>>>>> mid-stream is also not supported.)
> >>>>>> In theory, this decoder shouldn't exist, as it is just a carrier,
> >>>>>> whether of LPCM or non-PCM.
> >>>>>> FFmpeg architecture also imposes a fundamental limitation in that
> >>>>>> one s302m stream may carry multiple payload streams and we support
> >>>>>> only one decoding context per input stream
> >>>>> Then why does the demuxer not separate the data into multiple
> streams?
> >>>> I didn't add demuxing support for this codec in MPEGTS, but I can
> >>>> venture
> >>>>
> >>>> a) it would mean essentially inlining this decoder in the demuxer.
> >>> Why is that a problem? This decoder seems like it shouldn't be a
> >>> decoder.
> >>>
> >>> I agree with Andreas that this seems like it's a demuxer pretending to
> >>> be a decoder.
> >> This module transforms the entire raw payload data to generate its
> output, even if the syntax is simple which essentially makes it a de-coder.
> The de-multiplexer aspect of multiple streams is an academic possibility
> allowed by the >standard but not seen in any sample which makes me suspect
> it's used for carriage between broadcast facilities rather than something
> ever sent to an OTT provider, let alone an end user.
> >>
> >> Regards,
> >> Gyan
> > AFAIK, DolbyE may be found on satellite feeds, for carriage between
> broadcast facilities, and thus it makes them accessible so they may be
> grabbed by "end users". Otherwise, it is "broadcast professional end
> users", which are users too. AFAIK, its most common form is 20Bits and you
> simply "cannot" have a single stream in a 20Bit carrier; but indeed, most
> of the time only the first stream ("program") is used and the second is a
> downmix; but not always. For example, you can have a first program which is
> standard 5.1 and a second program with Audio Description.
> In the samples I have, including yours, the outermost layer is AES3
> which contains one inner stream Dolby-E, which in turn contains 8
> channels constituting multiple programs. Those are programs downstream
> of the dolby_e decoder. That's not what we are talking about. The
> discussion here is about multiple payload streams within the AES3 layer
> - streams stored in subframe mode e.g. a Dolby-E and AC-3 or LPCM in
> alternate subframes/frames. I've no such samples of that sort.
>

I have not seen such streams in the wild.
Kieran
Anton Khirnov Feb. 15, 2024, 10:47 a.m. UTC | #20
Hi,
sorry for the delay, I've been busy fixing things for the release
Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
> >> a) it would mean essentially inlining this decoder in the demuxer.
> > Why is that a problem? This decoder seems like it shouldn't be a
> > decoder.
> >
> > I agree with Andreas that this seems like it's a demuxer pretending to
> > be a decoder.
> 
> This module transforms the entire raw payload data to generate its 
> output, even if the syntax is simple which
> essentially makes it a de-coder. The de-multiplexer aspect of multiple 
> streams is an academic possibility allowed
> by the standard but not seen in any sample which makes me suspect it's 
> used for carriage between broadcast
> facilities rather than something ever sent to an OTT provider, let alone 
> an end user.

If it dynamically generates nested decoders, then it's not a proper
codec in our model. It should be either a part of the demuxer, or a
bitstream filter (possibly inserted automatically by the demuxer).
Gyan Doshi Feb. 15, 2024, 12:31 p.m. UTC | #21
On 2024-02-15 04:17 pm, Anton Khirnov wrote:
> Hi,
> sorry for the delay, I've been busy fixing things for the release
> Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
>> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
>>>> a) it would mean essentially inlining this decoder in the demuxer.
>>> Why is that a problem? This decoder seems like it shouldn't be a
>>> decoder.
>>>
>>> I agree with Andreas that this seems like it's a demuxer pretending to
>>> be a decoder.
>> This module transforms the entire raw payload data to generate its
>> output, even if the syntax is simple which
>> essentially makes it a de-coder. The de-multiplexer aspect of multiple
>> streams is an academic possibility allowed
>> by the standard but not seen in any sample which makes me suspect it's
>> used for carriage between broadcast
>> facilities rather than something ever sent to an OTT provider, let alone
>> an end user.
> If it dynamically generates nested decoders, then it's not a proper
> codec in our model. It should be either a part of the demuxer, or a
> bitstream filter (possibly inserted automatically by the demuxer).

s302m is a hybrid creature and does not slot cleanly into any role. So 
there is no theoretically proper place for this component - any choice 
is a least-out-of-place accommodation.

But it is much more out of place inside a demuxer. Analyzing packet 
payload and then manipulating that entire payload is much closer to a 
decoding role than data chunk extraction for packetization. And the 
stream extracted from the container is meant to be SMPTE ST 302 not PCM* 
or Dolby-E/AC-3..etc, which will both misrepresent what the container 
carries
and possibly discard S-ADM metadata, if present, in the packet. With 
passthrough demuxing, a stream can be mapped for both decoding and 
streamcopying.

A bsf in principle would work but in practice, can't as Andreas 
clarified that bsfs can't set or alter codec_id after init. And 
resetting the codec id requires packet inspection.

Nested decoders are used without issue in components like imm5 or ftr 
(upto 64 nested decoders!) among others. There's no breaking of new 
ground here.

If you feel strongly about this, please refer this to the TC.

Regards,
Gyan
Anton Khirnov Feb. 15, 2024, 4:10 p.m. UTC | #22
Quoting Gyan Doshi (2024-02-15 13:31:59)
> On 2024-02-15 04:17 pm, Anton Khirnov wrote:
> > Hi,
> > sorry for the delay, I've been busy fixing things for the release
> > Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
> >> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
> >>>> a) it would mean essentially inlining this decoder in the demuxer.
> >>> Why is that a problem? This decoder seems like it shouldn't be a
> >>> decoder.
> >>>
> >>> I agree with Andreas that this seems like it's a demuxer pretending to
> >>> be a decoder.
> >> This module transforms the entire raw payload data to generate its
> >> output, even if the syntax is simple which
> >> essentially makes it a de-coder. The de-multiplexer aspect of multiple
> >> streams is an academic possibility allowed
> >> by the standard but not seen in any sample which makes me suspect it's
> >> used for carriage between broadcast
> >> facilities rather than something ever sent to an OTT provider, let alone
> >> an end user.
> > If it dynamically generates nested decoders, then it's not a proper
> > codec in our model. It should be either a part of the demuxer, or a
> > bitstream filter (possibly inserted automatically by the demuxer).
> 
> s302m is a hybrid creature and does not slot cleanly into any role. So 
> there is no theoretically proper place for this component - any choice 
> is a least-out-of-place accommodation.
> 
> But it is much more out of place inside a demuxer. Analyzing packet 
> payload and then manipulating that entire payload is much closer to a 
> decoding role than data chunk extraction for packetization.

I don't see why specifically this property should be the one
distinguishing demuxers from decoders, it sounds pretty arbitrary to me.
Many demuxers apply transformations of far higher complexity to the
bytestream before exporting it, e.g. in matroska the packet data may be
compressed, laced, etc.

> And the stream extracted from the container is meant to be SMPTE ST
> 302 not PCM* or Dolby-E/AC-3..etc,

"meant to be"? By whom?

The point of libavformat is to abstract away the differences between
containers as much as is reasonably feasible, and export the data in the
format most useful to the caller for decoding or other processing.

> which will both misrepresent what the container carries

Why should the caller care?

> and possibly discard S-ADM metadata, if present, in the packet.

Why could that not be exported as side data?

> A bsf in principle would work but in practice, can't as Andreas 
> clarified that bsfs can't set or alter codec_id after init. And 
> resetting the codec id requires packet inspection.

There are two possibilities then - either extend the BSF API to support
multiple output streams, or implement it inside libavformat as a
post-demuxer hook called in the same place as parsing.

> Nested decoders are used without issue in components like imm5 or ftr 
> (upto 64 nested decoders!) among others. There's no breaking of new 
> ground here.

Nested decoders are certainly not "without issue" - they are a constant
source of issues, since implementing nesting properly is very tricky. I
am fairly sure most nested decoders we have are subtly broken in various
ways.

> If you feel strongly about this, please refer this to the TC.

The TC is supposed to be invoked as a last resort.
Gyan Doshi Feb. 15, 2024, 4:47 p.m. UTC | #23
On 2024-02-15 09:40 pm, Anton Khirnov wrote:
> Quoting Gyan Doshi (2024-02-15 13:31:59)
>> On 2024-02-15 04:17 pm, Anton Khirnov wrote:
>>> Hi,
>>> sorry for the delay, I've been busy fixing things for the release
>>> Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
>>>> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
>>>>>> a) it would mean essentially inlining this decoder in the demuxer.
>>>>> Why is that a problem? This decoder seems like it shouldn't be a
>>>>> decoder.
>>>>>
>>>>> I agree with Andreas that this seems like it's a demuxer pretending to
>>>>> be a decoder.
>>>> This module transforms the entire raw payload data to generate its
>>>> output, even if the syntax is simple which
>>>> essentially makes it a de-coder. The de-multiplexer aspect of multiple
>>>> streams is an academic possibility allowed
>>>> by the standard but not seen in any sample which makes me suspect it's
>>>> used for carriage between broadcast
>>>> facilities rather than something ever sent to an OTT provider, let alone
>>>> an end user.
>>> If it dynamically generates nested decoders, then it's not a proper
>>> codec in our model. It should be either a part of the demuxer, or a
>>> bitstream filter (possibly inserted automatically by the demuxer).
>> s302m is a hybrid creature and does not slot cleanly into any role. So
>> there is no theoretically proper place for this component - any choice
>> is a least-out-of-place accommodation.
>>
>> But it is much more out of place inside a demuxer. Analyzing packet
>> payload and then manipulating that entire payload is much closer to a
>> decoding role than data chunk extraction for packetization.
> I don't see why specifically this property should be the one
> distinguishing demuxers from decoders, it sounds pretty arbitrary to me.
> Many demuxers apply transformations of far higher complexity to the
> bytestream before exporting it, e.g. in matroska the packet data may be
> compressed, laced, etc.
>
>> And the stream extracted from the container is meant to be SMPTE ST
>> 302 not PCM* or Dolby-E/AC-3..etc,
> "meant to be"? By whom?
>
> The point of libavformat is to abstract away the differences between
> containers as much as is reasonably feasible, and export the data in the
> format most useful to the caller for decoding or other processing.
>
>> which will both misrepresent what the container carries
> Why should the caller care?
>
>> and possibly discard S-ADM metadata, if present, in the packet.
> Why could that not be exported as side data?
>
>> A bsf in principle would work but in practice, can't as Andreas
>> clarified that bsfs can't set or alter codec_id after init. And
>> resetting the codec id requires packet inspection.
> There are two possibilities then - either extend the BSF API to support
> multiple output streams, or implement it inside libavformat as a
> post-demuxer hook called in the same place as parsing.
>
>> Nested decoders are used without issue in components like imm5 or ftr
>> (upto 64 nested decoders!) among others. There's no breaking of new
>> ground here.
> Nested decoders are certainly not "without issue" - they are a constant
> source of issues, since implementing nesting properly is very tricky. I
> am fairly sure most nested decoders we have are subtly broken in various
> ways.

This patch facilitates a certain productive use of ffmpeg with respect 
to processing of live inputs that wasn't possible earlier,
and which currently is being used successfully by multiple people over 
the past few weeks.
It applies a processing model already implemented in multiple other 
decoders for a number of years. I haven't seen many reports
of issues with them. And surely something being 'a constant source of 
issues' would be a lot more than 'subtly broken' as you describe 
them.You're the only one who has objected on architectural grounds and 
this looks to be a fundamental disagreement.
If you are blocking this patch, do acknowledge here within 24 hours and 
we can send this to the TC else I'll push it after that period.

Regards,
Gyan
Kieran Kunhya Feb. 15, 2024, 8:26 p.m. UTC | #24
On Thu, 15 Feb 2024 at 16:48, Gyan Doshi <ffmpeg@gyani.pro> wrote:

>
>
> On 2024-02-15 09:40 pm, Anton Khirnov wrote:
> > Quoting Gyan Doshi (2024-02-15 13:31:59)
> >> On 2024-02-15 04:17 pm, Anton Khirnov wrote:
> >>> Hi,
> >>> sorry for the delay, I've been busy fixing things for the release
> >>> Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
> >>>> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
> >>>>>> a) it would mean essentially inlining this decoder in the demuxer.
> >>>>> Why is that a problem? This decoder seems like it shouldn't be a
> >>>>> decoder.
> >>>>>
> >>>>> I agree with Andreas that this seems like it's a demuxer pretending
> to
> >>>>> be a decoder.
> >>>> This module transforms the entire raw payload data to generate its
> >>>> output, even if the syntax is simple which
> >>>> essentially makes it a de-coder. The de-multiplexer aspect of multiple
> >>>> streams is an academic possibility allowed
> >>>> by the standard but not seen in any sample which makes me suspect it's
> >>>> used for carriage between broadcast
> >>>> facilities rather than something ever sent to an OTT provider, let
> alone
> >>>> an end user.
> >>> If it dynamically generates nested decoders, then it's not a proper
> >>> codec in our model. It should be either a part of the demuxer, or a
> >>> bitstream filter (possibly inserted automatically by the demuxer).
> >> s302m is a hybrid creature and does not slot cleanly into any role. So
> >> there is no theoretically proper place for this component - any choice
> >> is a least-out-of-place accommodation.
> >>
> >> But it is much more out of place inside a demuxer. Analyzing packet
> >> payload and then manipulating that entire payload is much closer to a
> >> decoding role than data chunk extraction for packetization.
> > I don't see why specifically this property should be the one
> > distinguishing demuxers from decoders, it sounds pretty arbitrary to me.
> > Many demuxers apply transformations of far higher complexity to the
> > bytestream before exporting it, e.g. in matroska the packet data may be
> > compressed, laced, etc.
> >
> >> And the stream extracted from the container is meant to be SMPTE ST
> >> 302 not PCM* or Dolby-E/AC-3..etc,
> > "meant to be"? By whom?
> >
> > The point of libavformat is to abstract away the differences between
> > containers as much as is reasonably feasible, and export the data in the
> > format most useful to the caller for decoding or other processing.
> >
> >> which will both misrepresent what the container carries
> > Why should the caller care?
> >
> >> and possibly discard S-ADM metadata, if present, in the packet.
> > Why could that not be exported as side data?
> >
> >> A bsf in principle would work but in practice, can't as Andreas
> >> clarified that bsfs can't set or alter codec_id after init. And
> >> resetting the codec id requires packet inspection.
> > There are two possibilities then - either extend the BSF API to support
> > multiple output streams, or implement it inside libavformat as a
> > post-demuxer hook called in the same place as parsing.
> >
> >> Nested decoders are used without issue in components like imm5 or ftr
> >> (upto 64 nested decoders!) among others. There's no breaking of new
> >> ground here.
> > Nested decoders are certainly not "without issue" - they are a constant
> > source of issues, since implementing nesting properly is very tricky. I
> > am fairly sure most nested decoders we have are subtly broken in various
> > ways.
>
> This patch facilitates a certain productive use of ffmpeg with respect
> to processing of live inputs that wasn't possible earlier,
> and which currently is being used successfully by multiple people over
> the past few weeks.
> It applies a processing model already implemented in multiple other
> decoders for a number of years. I haven't seen many reports
> of issues with them. And surely something being 'a constant source of
> issues' would be a lot more than 'subtly broken' as you describe
> them.You're the only one who has objected on architectural grounds and
> this looks to be a fundamental disagreement.
> If you are blocking this patch, do acknowledge here within 24 hours and
> we can send this to the TC else I'll push it after that period.
>

Dolby E can exist in other containers apart from S302M. Raising the TC is
premature.

Kieran
Gyan Doshi Feb. 16, 2024, 4:12 a.m. UTC | #25
On 2024-02-16 01:56 am, Kieran Kunhya wrote:
> On Thu, 15 Feb 2024 at 16:48, Gyan Doshi <ffmpeg@gyani.pro> wrote:
>
>>
>> On 2024-02-15 09:40 pm, Anton Khirnov wrote:
>>> Quoting Gyan Doshi (2024-02-15 13:31:59)
>>>> On 2024-02-15 04:17 pm, Anton Khirnov wrote:
>>>>> Hi,
>>>>> sorry for the delay, I've been busy fixing things for the release
>>>>> Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
>>>>>> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
>>>>>>>> a) it would mean essentially inlining this decoder in the demuxer.
>>>>>>> Why is that a problem? This decoder seems like it shouldn't be a
>>>>>>> decoder.
>>>>>>>
>>>>>>> I agree with Andreas that this seems like it's a demuxer pretending
>> to
>>>>>>> be a decoder.
>>>>>> This module transforms the entire raw payload data to generate its
>>>>>> output, even if the syntax is simple which
>>>>>> essentially makes it a de-coder. The de-multiplexer aspect of multiple
>>>>>> streams is an academic possibility allowed
>>>>>> by the standard but not seen in any sample which makes me suspect it's
>>>>>> used for carriage between broadcast
>>>>>> facilities rather than something ever sent to an OTT provider, let
>> alone
>>>>>> an end user.
>>>>> If it dynamically generates nested decoders, then it's not a proper
>>>>> codec in our model. It should be either a part of the demuxer, or a
>>>>> bitstream filter (possibly inserted automatically by the demuxer).
>>>> s302m is a hybrid creature and does not slot cleanly into any role. So
>>>> there is no theoretically proper place for this component - any choice
>>>> is a least-out-of-place accommodation.
>>>>
>>>> But it is much more out of place inside a demuxer. Analyzing packet
>>>> payload and then manipulating that entire payload is much closer to a
>>>> decoding role than data chunk extraction for packetization.
>>> I don't see why specifically this property should be the one
>>> distinguishing demuxers from decoders, it sounds pretty arbitrary to me.
>>> Many demuxers apply transformations of far higher complexity to the
>>> bytestream before exporting it, e.g. in matroska the packet data may be
>>> compressed, laced, etc.
>>>
>>>> And the stream extracted from the container is meant to be SMPTE ST
>>>> 302 not PCM* or Dolby-E/AC-3..etc,
>>> "meant to be"? By whom?
>>>
>>> The point of libavformat is to abstract away the differences between
>>> containers as much as is reasonably feasible, and export the data in the
>>> format most useful to the caller for decoding or other processing.
>>>
>>>> which will both misrepresent what the container carries
>>> Why should the caller care?
>>>
>>>> and possibly discard S-ADM metadata, if present, in the packet.
>>> Why could that not be exported as side data?
>>>
>>>> A bsf in principle would work but in practice, can't as Andreas
>>>> clarified that bsfs can't set or alter codec_id after init. And
>>>> resetting the codec id requires packet inspection.
>>> There are two possibilities then - either extend the BSF API to support
>>> multiple output streams, or implement it inside libavformat as a
>>> post-demuxer hook called in the same place as parsing.
>>>
>>>> Nested decoders are used without issue in components like imm5 or ftr
>>>> (upto 64 nested decoders!) among others. There's no breaking of new
>>>> ground here.
>>> Nested decoders are certainly not "without issue" - they are a constant
>>> source of issues, since implementing nesting properly is very tricky. I
>>> am fairly sure most nested decoders we have are subtly broken in various
>>> ways.
>> This patch facilitates a certain productive use of ffmpeg with respect
>> to processing of live inputs that wasn't possible earlier,
>> and which currently is being used successfully by multiple people over
>> the past few weeks.
>> It applies a processing model already implemented in multiple other
>> decoders for a number of years. I haven't seen many reports
>> of issues with them. And surely something being 'a constant source of
>> issues' would be a lot more than 'subtly broken' as you describe
>> them.You're the only one who has objected on architectural grounds and
>> this looks to be a fundamental disagreement.
>> If you are blocking this patch, do acknowledge here within 24 hours and
>> we can send this to the TC else I'll push it after that period.
>>
> Dolby E can exist in other containers apart from S302M. Raising the TC is
> premature.

This patch only concerns s302m and sets up a framework for non-PCM 
decode. Dolby-E is incidental.

Regards,
Gyan
Anton Khirnov Feb. 16, 2024, 9:03 a.m. UTC | #26
Quoting Gyan Doshi (2024-02-15 17:47:49)
> This patch facilitates a certain productive use of ffmpeg with respect 
> to processing of live inputs that wasn't possible earlier,
> and which currently is being used successfully by multiple people over 
> the past few weeks.
> It applies a processing model already implemented in multiple other 
> decoders for a number of years. I haven't seen many reports
> of issues with them. And surely something being 'a constant source of 
> issues' would be a lot more than 'subtly broken' as you describe 
> them.

This reads very much like "I can't be bothered to do it properly and
would rather someone else fix it in the future". Given past experience,
that someone is highly likely to be me, and fixing past architectural
decisions requires a lot more effort than doing things properly in the
first place..

> You're the only one who has objected on architectural grounds and 

Not true, Andreas has objected as well.

> If you are blocking this patch, do acknowledge here within 24 hours and 
> we can send this to the TC else I'll push it after that period.

I must say I rather dislike this "my way or the highway" attitude. So
yes, I am objecting to the patch in its current form.
Andreas Rheinhardt Feb. 16, 2024, 1:55 p.m. UTC | #27
Gyan Doshi:
> 
> 
> On 2024-02-15 04:17 pm, Anton Khirnov wrote:
>> Hi,
>> sorry for the delay, I've been busy fixing things for the release
>> Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
>>> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
>>>>> a) it would mean essentially inlining this decoder in the demuxer.
>>>> Why is that a problem? This decoder seems like it shouldn't be a
>>>> decoder.
>>>>
>>>> I agree with Andreas that this seems like it's a demuxer pretending to
>>>> be a decoder.
>>> This module transforms the entire raw payload data to generate its
>>> output, even if the syntax is simple which
>>> essentially makes it a de-coder. The de-multiplexer aspect of multiple
>>> streams is an academic possibility allowed
>>> by the standard but not seen in any sample which makes me suspect it's
>>> used for carriage between broadcast
>>> facilities rather than something ever sent to an OTT provider, let alone
>>> an end user.
>> If it dynamically generates nested decoders, then it's not a proper
>> codec in our model. It should be either a part of the demuxer, or a
>> bitstream filter (possibly inserted automatically by the demuxer).
> 
> s302m is a hybrid creature and does not slot cleanly into any role. So
> there is no theoretically proper place for this component - any choice
> is a least-out-of-place accommodation.
> 
> But it is much more out of place inside a demuxer. Analyzing packet
> payload and then manipulating that entire payload is much closer to a
> decoding role than data chunk extraction for packetization. And the
> stream extracted from the container is meant to be SMPTE ST 302 not PCM*
> or Dolby-E/AC-3..etc, which will both misrepresent what the container
> carries
> and possibly discard S-ADM metadata, if present, in the packet. With
> passthrough demuxing, a stream can be mapped for both decoding and
> streamcopying.

This is not true: It can not be streamcopied into formats expecting
ordinary PCM or Dolby-e/AC-3. Which is why exporting the data without
the unnecessary packetization is preferable.

> 
> A bsf in principle would work but in practice, can't as Andreas
> clarified that bsfs can't set or alter codec_id after init. And
> resetting the codec id requires packet inspection.
> 
> Nested decoders are used without issue in components like imm5 or ftr
> (upto 64 nested decoders!) among others. There's no breaking of new
> ground here.
>
Gyan Doshi Feb. 17, 2024, 11:44 a.m. UTC | #28
On 2024-02-16 07:25 pm, Andreas Rheinhardt wrote:
> Gyan Doshi:
>>
>> On 2024-02-15 04:17 pm, Anton Khirnov wrote:
>>> Hi,
>>> sorry for the delay, I've been busy fixing things for the release
>>> Quoting Gyan Doshi via ffmpeg-devel (2024-01-29 05:00:33)
>>>> On 2024-01-28 04:24 pm, Anton Khirnov wrote:
>>>>>> a) it would mean essentially inlining this decoder in the demuxer.
>>>>> Why is that a problem? This decoder seems like it shouldn't be a
>>>>> decoder.
>>>>>
>>>>> I agree with Andreas that this seems like it's a demuxer pretending to
>>>>> be a decoder.
>>>> This module transforms the entire raw payload data to generate its
>>>> output, even if the syntax is simple which
>>>> essentially makes it a de-coder. The de-multiplexer aspect of multiple
>>>> streams is an academic possibility allowed
>>>> by the standard but not seen in any sample which makes me suspect it's
>>>> used for carriage between broadcast
>>>> facilities rather than something ever sent to an OTT provider, let alone
>>>> an end user.
>>> If it dynamically generates nested decoders, then it's not a proper
>>> codec in our model. It should be either a part of the demuxer, or a
>>> bitstream filter (possibly inserted automatically by the demuxer).
>> s302m is a hybrid creature and does not slot cleanly into any role. So
>> there is no theoretically proper place for this component - any choice
>> is a least-out-of-place accommodation.
>>
>> But it is much more out of place inside a demuxer. Analyzing packet
>> payload and then manipulating that entire payload is much closer to a
>> decoding role than data chunk extraction for packetization. And the
>> stream extracted from the container is meant to be SMPTE ST 302 not PCM*
>> or Dolby-E/AC-3..etc, which will both misrepresent what the container
>> carries
>> and possibly discard S-ADM metadata, if present, in the packet. With
>> passthrough demuxing, a stream can be mapped for both decoding and
>> streamcopying.
> This is not true: It can not be streamcopied into formats expecting
> ordinary PCM or Dolby-e/AC-3. Which is why exporting the data without
> the unnecessary packetization is preferable.


With this decoder patch, the packet exported from the demuxer remains 
s302m. That may contain either PCM or non-PCM payload[+ ADM metadata].
A streamcopied output into TS or raw data allows the user to parse out 
the ADM on their own. I don't have the standard or samples for ADM so I 
can't extract it with either a decoder or demuxer patch.
If they want the non-PCM coded stream then they select the pre-existing 
method of non_pcm_mode copy in the decoder and a suitable pcm_* encoder 
to extract the coded stream, else set the non pcm mode to decode to 
obtain playable LPCM AVFrames.

With a demuxer patch, the packet extracted from the demuxer is either 
PCM* or some non-PCM codec. The ADM portion if present is lost since 
there is no parser for it. And if we add an option for passthrough 
demuxing and leave the s302m decoder as it is, then we're back with the 
status quo of a 2-step transcoding pipeline.

Regards,
Gyan
Gyan Doshi Feb. 17, 2024, 11:46 a.m. UTC | #29
On 2024-02-16 02:33 pm, Anton Khirnov wrote:
> Quoting Gyan Doshi (2024-02-15 17:47:49)
>> This patch facilitates a certain productive use of ffmpeg with respect
>> to processing of live inputs that wasn't possible earlier,
>> and which currently is being used successfully by multiple people over
>> the past few weeks.
>> It applies a processing model already implemented in multiple other
>> decoders for a number of years. I haven't seen many reports
>> of issues with them. And surely something being 'a constant source of
>> issues' would be a lot more than 'subtly broken' as you describe
>> them.
> This reads very much like "I can't be bothered to do it properly and
> would rather someone else fix it in the future". Given past experience,
> that someone is highly likely to be me, and fixing past architectural
> decisions requires a lot more effort than doing things properly in the
> first place..
>
>> You're the only one who has objected on architectural grounds and
> Not true, Andreas has objected as well.
>
>> If you are blocking this patch, do acknowledge here within 24 hours and
>> we can send this to the TC else I'll push it after that period.
> I must say I rather dislike this "my way or the highway" attitude. So
> yes, I am objecting to the patch in its current form.

Thanks for the clear signal.

I've presented my case to the TC on the ML.

As a TC member who is part of the disagreement, I believe your 
participation is recused.

Regards,
Gyan
Anton Khirnov Feb. 17, 2024, 12:22 p.m. UTC | #30
Quoting Gyan Doshi (2024-02-17 12:46:27)
> As a TC member who is part of the disagreement, I believe your 
> participation is recused.

No, I do not think "TC members who commented on a patch lose their right
to vote" is a reasonable interpretation of that rule.
Rémi Denis-Courmont Feb. 17, 2024, 12:31 p.m. UTC | #31
Le lauantaina 17. helmikuuta 2024, 13.46.27 EET Gyan Doshi a écrit :
> As a TC member who is part of the disagreement, I believe your
> participation is recused.

Obviously not. We don't want to get into a situation whence TC members have an 
incentive not to participate in regular code reviews just so that they can 
participate in the hypothetical making of later related TC decisions. That 
would be both dumb and counter-productive.

Somebody disagreeing with you does not necessarily mean that they have a 
conflict of interest in the subject matter.
Gyan Doshi Feb. 17, 2024, 12:37 p.m. UTC | #32
On 2024-02-17 05:52 pm, Anton Khirnov wrote:
> Quoting Gyan Doshi (2024-02-17 12:46:27)
>> As a TC member who is part of the disagreement, I believe your
>> participation is recused.
> No, I do not think "TC members who commented on a patch lose their right
> to vote" is a reasonable interpretation of that rule.

I refer to

"If the disagreement involves a member of the TC, that member should 
recuse themselves from the decision"

at

https://ffmpeg.org/community.html#Announcement

You clearly are one of the parties to the disagreement, and "recuse 
themselves from the decision" is self-explanatory.


Regards,
Gyan
Anton Khirnov Feb. 17, 2024, 7:55 p.m. UTC | #33
Quoting Gyan Doshi (2024-02-17 13:37:38)
> On 2024-02-17 05:52 pm, Anton Khirnov wrote:
> > Quoting Gyan Doshi (2024-02-17 12:46:27)
> >> As a TC member who is part of the disagreement, I believe your
> >> participation is recused.
> > No, I do not think "TC members who commented on a patch lose their right
> > to vote" is a reasonable interpretation of that rule.
> 
> I refer to
> 
> "If the disagreement involves a member of the TC, that member should 
> recuse themselves from the decision"
> 
> at
> 
> https://ffmpeg.org/community.html#Announcement
> 
> You clearly are one of the parties to the disagreement, and "recuse 
> themselves from the decision" is self-explanatory.

Such a maximalist interpretation makes no sense - why should my opinion
become invalid because I commented on a patch, but not if I kept it to
myself and let someone else object to your patch?

And as Rémi said, it would lead to perverse counterproductive
incentives, such as TC members refraining from reviewing code.
Michael Niedermayer Feb. 18, 2024, 12:43 a.m. UTC | #34
On Sat, Feb 17, 2024 at 08:55:43PM +0100, Anton Khirnov wrote:
> Quoting Gyan Doshi (2024-02-17 13:37:38)
> > On 2024-02-17 05:52 pm, Anton Khirnov wrote:
> > > Quoting Gyan Doshi (2024-02-17 12:46:27)
> > >> As a TC member who is part of the disagreement, I believe your
> > >> participation is recused.
> > > No, I do not think "TC members who commented on a patch lose their right
> > > to vote" is a reasonable interpretation of that rule.
> > 
> > I refer to
> > 
> > "If the disagreement involves a member of the TC, that member should 
> > recuse themselves from the decision"
> > 
> > at
> > 
> > https://ffmpeg.org/community.html#Announcement
> > 
> > You clearly are one of the parties to the disagreement, and "recuse 
> > themselves from the decision" is self-explanatory.
> 
> Such a maximalist interpretation makes no sense - why should my opinion
> become invalid because I commented on a patch,

"If the disagreement involves a member of the TC"
does IMHO not preclude commenting on a patch.

For a disagreement we need 2 parties. For example one party who
wants a patch in and one who blocks the patch. or 2 parties where both
block the other.

Being a party of a disagreement would not make anyones opinon invalid.
But
I think it is reasonable that parties of a disagreement cannot be
the judge of the disagreement.


> but not if I kept it to
> myself and let someone else object to your patch?

I think the deatils of this would matter.
If a TC member simply does nothing and has no involvement and someone else objects,
sure thet TC member could vote.

OTOH

if a TC member asks someone else to object to hide a conflict of interrest and then
votes. IMO Thats a severe breach of trust and that person should not be in any committee

thx

[...]
Gyan Doshi Feb. 18, 2024, 4:06 a.m. UTC | #35
On 2024-02-18 01:25 am, Anton Khirnov wrote:
> Quoting Gyan Doshi (2024-02-17 13:37:38)
>> On 2024-02-17 05:52 pm, Anton Khirnov wrote:
>>> Quoting Gyan Doshi (2024-02-17 12:46:27)
>>>> As a TC member who is part of the disagreement, I believe your
>>>> participation is recused.
>>> No, I do not think "TC members who commented on a patch lose their right
>>> to vote" is a reasonable interpretation of that rule.
>> I refer to
>>
>> "If the disagreement involves a member of the TC, that member should
>> recuse themselves from the decision"
>>
>> at
>>
>> https://ffmpeg.org/community.html#Announcement
>>
>> You clearly are one of the parties to the disagreement, and "recuse
>> themselves from the decision" is self-explanatory.
> Such a maximalist interpretation makes no sense - why should my opinion
> become invalid because I commented on a patch, but not if I kept it to
> myself and let someone else object to your patch?

a) I didn't make that rule, just pointed it out
b) what "maximalist" interpretation? - I think the current patch is 
fine, you don't. That's a disagreement and you're involved in it, so the 
rule tells you to not be a judge. This is not a complex body of law or 
obscure legalese , just a trivial application of the rule composed in 
school-level English.
c) your opinion hasn't been made invalid - it just can't be a weighted 
vote in the TC decision. And it's not merely because you 'commented' - 
see (b).

Regards,
Gyan
Anton Khirnov Feb. 18, 2024, 6:03 p.m. UTC | #36
Quoting Gyan Doshi (2024-02-18 05:06:30)
> b) what "maximalist" interpretation?

A non-maximalist interpretation would be that a TC member is only
excluded from voting when they authored the patch that is being
disputed.

> - I think the current patch is fine, you don't. That's a disagreement
> and you're involved in it, so the rule tells you to not be a judge.

Anything handled by the TC is a disagreement. Then according to your
interpretation, any TC member who expressed an opinion on a patch
(either positive or negative) becomes a party to the disagreement and
cannot vote. That is absurd and makes no sense.
Anton Khirnov Feb. 18, 2024, 6:20 p.m. UTC | #37
Quoting Michael Niedermayer (2024-02-18 01:43:14)
> "If the disagreement involves a member of the TC"
> does IMHO not preclude commenting on a patch.
> 
> For a disagreement we need 2 parties. For example one party who
> wants a patch in and one who blocks the patch. or 2 parties where both
> block the other.
> 
> Being a party of a disagreement would not make anyones opinon invalid.

Anything that goes to TC is a disagreement. Anyone who expressed an
opinion on the patch then is 'a party to the disagreement'.

> But I think it is reasonable that parties of a disagreement cannot be
> the judge of the disagreement.

Why not? This is one of those truthy-sounding statements that does not
actually hold up to scrutiny.

TC members are not supposed to be impartial judges, because there is no
body of law for us to interpret. TC members are elected for their
opinions. And I see no good reason why those opinions should suddenly
become invalid just because they've been expressed before.

And again, interpreting this rule in this way means that TC members
are incentivized not to review patches. Given that TC members are also
often among the most active contributors, is that really what you want?
Nicolas George Feb. 18, 2024, 6:40 p.m. UTC | #38
Anton Khirnov (12024-02-18):
> A non-maximalist interpretation would be that a TC member is only
> excluded from voting when they authored the patch that is being
> disputed.

If the rules were meant to be interpreted that way, they would have been
written “if the patch was proposed by a member of the TC“, not “if the
disagreement involves a member of the TC”.

The world is “involves”, its meaning is inherently maximalist.

> Anything handled by the TC is a disagreement. Then according to your
> interpretation, any TC member who expressed an opinion on a patch
> (either positive or negative) becomes a party to the disagreement and
> cannot vote.

Exactly: you were part of this disagreement, you should recuse yourself.

>		That is absurd and makes no sense.

That makes absolute sense, unless you consider your opinion is worth
more than the opinion of the other people in the project.

A spot on the TC is not a license to consider one's opinion above the
other's, and therefore comes with a trade-off, and it is exactly that.

There are other members of the TC who did not take part in the
discussion: recusing yourself is not an issue.

Sitting on the TC is a duty, a responsibility to examine in details
changes one do not care about to make an informed decision. Somebody who
sees it as a means to power rather than a responsibility should be
evicted as soon as possible.
Rémi Denis-Courmont Feb. 18, 2024, 6:50 p.m. UTC | #39
Le sunnuntaina 18. helmikuuta 2024, 2.43.14 EET Michael Niedermayer a écrit :
> > > You clearly are one of the parties to the disagreement, and "recuse
> > > themselves from the decision" is self-explanatory.
> > 
> > Such a maximalist interpretation makes no sense - why should my opinion
> > become invalid because I commented on a patch,
> 
> "If the disagreement involves a member of the TC"
> does IMHO not preclude commenting on a patch.
> 
> For a disagreement we need 2 parties.
> For example one party who wants a patch in and one who blocks the patch. or
> 2 parties where both block the other.

This is an utterly absurd interpretation. By that logic, a TC member would 
automatically become party to the disagreement by expressing an opinion within 
even the TC itself. In fact, if you would read it maximally that way, any who 
has an opinion, even if they have not expressed it, would be a party.

So what then, the FFmpeg thought police?

You can argue that the rule is vague, and it is. But if anything, we can at 
least eliminate absurd interpretations. (And in any case, it says "should", 
not "must".)
Nicolas George Feb. 18, 2024, 6:55 p.m. UTC | #40
Rémi Denis-Courmont (12024-02-18):
> This is an utterly absurd interpretation. By that logic, a TC member would 
> automatically become party to the disagreement by expressing an opinion within 
> even the TC itself.

This is the most hypocritical argument put forward in this discussion
yet.

>		      In fact, if you would read it maximally that way, any who 
> has an opinion, even if they have not expressed it, would be a party.
> 
> So what then, the FFmpeg thought police?

And you break your own record in the very next sentence.

> You can argue that the rule is vague, and it is. But if anything, we can at 
> least eliminate absurd interpretations.

The rule is not vague at all.

>					   (And in any case, it says "should", 
> not "must".)

Indeed. I wondered when somebody dishonest would try to exploit that
loophole.

The obvious answer is: if somebody in the TC does not do what the rules
say they SHOULD, then the general assembly SHOULD vote them out at the
next election. Or earlier, because a vote of no confidence can be
brought at any time.
Gyan Doshi Feb. 18, 2024, 7:02 p.m. UTC | #41
On 2024-02-18 11:33 pm, Anton Khirnov wrote:
> Quoting Gyan Doshi (2024-02-18 05:06:30)
>> b) what "maximalist" interpretation?
> A non-maximalist interpretation would be that a TC member is only
> excluded from voting when they authored the patch that is being
> disputed.

If the promulgators meant to only prevent proposers of the disputed 
change to not take part, then
the verbiage would be different.

In looking up how this clause came to be present, I came across the 
following messages:

https://ffmpeg.org/pipermail/ffmpeg-devel/2020-December/273443.html
(Nicolas George originally proposes this clause - wording is more 
restrictive)

https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274822.html
(this one is interesting, you objected to the clause but on the grounds 
that it was all-encompassing i.e.  anyone commenting on the dispute was 
potentially subjected to recusal and referred to some 'model' 
discussion, so your describing my reading as maximalist is weird since 
that is how you read it - you just happen to object to this rule)

https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274826.html
(Ronald clarifies that "involved" should be constrained to just be one 
of the two parties -- of which you happen to be one)

There's the matter of what the rule currently is, distinct from what it 
should be. What it ideally should be is that the decision should be 
taken by a fresh set of eyes consisting of those who haven't become or 
are seen to be publicly invested in the outcome. So the TC should have a 
set of alternates - those who can make up the quorum and constitute an 
odd number of voters when some from the first 5 are recused.

Regards,
Gyan
Rémi Denis-Courmont Feb. 18, 2024, 7:03 p.m. UTC | #42
Le sunnuntaina 18. helmikuuta 2024, 20.40.14 EET Nicolas George a écrit :
> The world is “involves”, its meaning is inherently maximalist.

The wording is very clear (emphasis added): "If the disagreement involves a 
member of the TC, that member SHOULD recuse themselves from the decision."

I trust that you do know the meaning of the auxillary "should". That very 
definitely and very obviously eliminates any "maximalist" interpretations.

> Exactly: you were part of this disagreement, you should recuse yourself.

The maximalist interpretation is clearly nonsensical as per reductio ad 
absurdum. This is a *guideline* or *recommendation*, and obviously it _cannot_ 
be applied to its logic extreme.

Instead Anton, and later rach other TC member will have to each determine for 
themselves if they are so implicated in the disagreement as to justify 
recusing themselves.

Gyan does not get to dictate it, and neither do you or I. There is no point 
arguing this further, and I won't.
Nicolas George Feb. 18, 2024, 7:11 p.m. UTC | #43
Rémi Denis-Courmont (12024-02-18):
> I trust that you do know the meaning of the auxillary "should". That very 
> definitely and very obviously eliminates any "maximalist" interpretations.

Indeed. And I repeat what I already said in another mail:

If somebody dishonest wants to exploit that loophole, if somebody does
not do what the rules say they do, then the GA should do it for them.

> The maximalist interpretation is clearly nonsensical as per reductio ad 
> absurdum.

The maximalist interpretation was what was intended.

The maximalist interpretation also was what Anton understood when the
rule was put in place. It is a little hypocritical to pretend is means
something else now.

In fact, if you want to know everything, the current situation was
EXACTLY what I had in mind when I proposed the rule in the first place.
Including the identity of the TC with a conflict of interest.

>	    This is a *guideline* or *recommendation*, and obviously it _cannot_ 
> be applied to its logic extreme.

Human rules are never meant to be applied to their logic extreme,
arguments of this kind are bullshit.

> Instead Anton, and later rach other TC member will have to each determine for 
> themselves if they are so implicated in the disagreement as to justify 
> recusing themselves.
> 
> Gyan does not get to dictate it, and neither do you or I. There is no point 
> arguing this further, and I won't.

Gyan can call for a vote of no confidence in the TC or one of its
members.
Vittorio Giovara Feb. 18, 2024, 9:06 p.m. UTC | #44
On Sun, Feb 18, 2024 at 7:40 PM Nicolas George <george@nsup.org> wrote:

> Anton Khirnov (12024-02-18):
> >               That is absurd and makes no sense.
>
> That makes absolute sense, unless you consider your opinion is worth
> more than the opinion of the other people in the project.
>
> A spot on the TC is not a license to consider one's opinion above the
> other's, and therefore comes with a trade-off, and it is exactly that.
>
> There are other members of the TC who did not take part in the
> discussion: recusing yourself is not an issue.
>
> Sitting on the TC is a duty, a responsibility to examine in details
> changes one do not care about to make an informed decision. Somebody who
> sees it as a means to power rather than a responsibility should be
> evicted as soon as possible.
>

Thank you for your perspective on TC responsibilities within the FFmpeg
project. While I value your insights, I'd like to offer a different
viewpoint regarding the practice of recusing oneself from discussions.

It's essential to recognize that each TC member holds a duty to actively
participate in discussions and decision-making processes. Recusing oneself
from discussions should be a rare occurrence reserved for situations where
a clear conflict of interest exists, rather than a default response to
certain topics.

Serving on the TC requires a commitment to thoroughly examine proposed
changes and contribute to informed decision-making. It's through active
engagement and thoughtful consideration of all viewpoints that we can
ensure the best outcomes for the project.

While I understand the importance of avoiding conflicts of interest, it's
equally vital for TC members to fulfill their responsibility to the project
by actively engaging in discussions and providing valuable insights.

In conclusion, let's uphold the principle of active participation and
responsibility in TC discussions to maintain the integrity and
effectiveness of our governance processes within the FFmpeg project.
Nicolas George Feb. 18, 2024, 9:25 p.m. UTC | #45
Vittorio Giovara (12024-02-18):
>	   While I value your insights, I'd like to offer a different
> viewpoint regarding the practice of recusing oneself from discussions.

<snip>

That might be your viewpoint, but that is not what the rules we all
agreed upon say.
Vittorio Giovara Feb. 18, 2024, 9:46 p.m. UTC | #46
On Sun, Feb 18, 2024 at 8:02 PM Gyan Doshi <ffmpeg@gyani.pro> wrote:

>
>
> On 2024-02-18 11:33 pm, Anton Khirnov wrote:
> > Quoting Gyan Doshi (2024-02-18 05:06:30)
> >> b) what "maximalist" interpretation?
> > A non-maximalist interpretation would be that a TC member is only
> > excluded from voting when they authored the patch that is being
> > disputed.
>
> If the promulgators meant to only prevent proposers of the disputed
> change to not take part, then
> the verbiage would be different.
>
> In looking up how this clause came to be present, I came across the
> following messages:
>
> https://ffmpeg.org/pipermail/ffmpeg-devel/2020-December/273443.html
> (Nicolas George originally proposes this clause - wording is more
> restrictive)
>
> https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274822.html
> (this one is interesting, you objected to the clause but on the grounds
> that it was all-encompassing i.e.  anyone commenting on the dispute was
> potentially subjected to recusal and referred to some 'model'
> discussion, so your describing my reading as maximalist is weird since
> that is how you read it - you just happen to object to this rule)
>
> https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274826.html
> (Ronald clarifies that "involved" should be constrained to just be one
> of the two parties -- of which you happen to be one)
>
> There's the matter of what the rule currently is, distinct from what it
> should be. What it ideally should be is that the decision should be
> taken by a fresh set of eyes consisting of those who haven't become or
> are seen to be publicly invested in the outcome. So the TC should have a
> set of alternates - those who can make up the quorum and constitute an
> odd number of voters when some from the first 5 are recused.
>

I'd like to offer a lighter interpretation of the rule, the mailing list is
the common playing ground, where discussions and disagreements can be had.
In case of a technical "maximalist" disagreement, then either party can
invoke the TC to judge on the matter. If anyone in the TC is involved in
the patch, as if it's an author or significantly contributed to it, then
they should step away from voting. In other words the "level of
involvement" rule takes place at the TC level, not at the ffmpeg-devel
discussion. Also consider that even in a vote recusal, the member's
arguments will still be read and by all likelihood taken into consideration
by the TC, so yours seems to be a literal interpretation of the rule,
instead of the spirit of the rule, which in my opinion matters more.
Vittorio Giovara Feb. 18, 2024, 9:55 p.m. UTC | #47
On Sun, Feb 18, 2024 at 10:25 PM Nicolas George <george@nsup.org> wrote:

> Vittorio Giovara (12024-02-18):
> >          While I value your insights, I'd like to offer a different
> > viewpoint regarding the practice of recusing oneself from discussions.
>
> <snip>
>
> That might be your viewpoint, but that is not what the rules we all
> agreed upon say.
>

Thank you for your prompt response, Nicolas. I appreciate the opportunity
to delve deeper into our governance rules and their interpretation within
the FFmpeg community.

While I understand that you're referencing the existing rules that we've
collectively agreed upon, I believe it's crucial for us to periodically
review and refine these rules to ensure they remain aligned with our
evolving community values and goals.

The rules we've agreed upon serve as a foundation for our governance
structure, providing guidelines for decision-making and ensuring fairness
and transparency in our processes. However, as our community grows and
evolves, it's natural for interpretations and perspectives to shift,
necessitating occasional reassessment of these rules.

Your reminder about the importance of adhering to the established rules is
valid, and I share your commitment to upholding them. However, I also
believe that our rules should be flexible enough to accommodate diverse
viewpoints and adapt to changing circumstances.

Therefore, I propose that we engage in a collaborative effort to revisit
and clarify our existing rules. By fostering open dialogue and soliciting
input from all members of the community, we can ensure that our governance
framework reflects the collective wisdom and values of our community.

Furthermore, this process of review and refinement will help address any
ambiguities or discrepancies in our rules, promoting greater clarity and
understanding among all stakeholders. It will also provide an opportunity
to incorporate new insights and best practices that may have emerged since
the rules were initially established.

In conclusion, I appreciate your reminder about the importance of adhering
to our agreed-upon rules. However, I believe that revisiting and clarifying
these rules is a proactive step toward strengthening our governance
processes and ensuring the continued success and sustainability of the
FFmpeg project.

Thank you for your dedication to the community, and I look forward to
engaging in productive discussions to enhance our governance framework.
Michael Niedermayer Feb. 18, 2024, 10:34 p.m. UTC | #48
Hi

On Sun, Feb 18, 2024 at 07:20:43PM +0100, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2024-02-18 01:43:14)
> > "If the disagreement involves a member of the TC"
> > does IMHO not preclude commenting on a patch.
> > 
> > For a disagreement we need 2 parties. For example one party who
> > wants a patch in and one who blocks the patch. or 2 parties where both
> > block the other.
> > 
> > Being a party of a disagreement would not make anyones opinon invalid.
> 
> Anything that goes to TC is a disagreement.

probably, thats true, yes


> Anyone who expressed an
> opinion on the patch then is 'a party to the disagreement'.

no, i dont see it that way
A developer blocking a patch is a party to the disagreement
So is the developer who calls the TC because of that.


Similarly if you look at real world court cases
parties to the lawsuit are the one who is filling the lawsuit and
the defendant.
The thousand people expressing an oppinion in some random place are
not parties to the disagreement

More formally, you could define a "party to a disagreement" as
all minimal sets of people whos non existence would resolve the disagreement

That is if 3 people block a patch all of them are parties to a disagreement
but a person expressing an oppinion is not. Also the person(s) calling on the
TC are parties to a disagreement. And the main author(s) of the patch too are


> 
> > But I think it is reasonable that parties of a disagreement cannot be
> > the judge of the disagreement.
> 
> Why not? This is one of those truthy-sounding statements that does not
> actually hold up to scrutiny.

Imagine a judge kills someone and judges himself innocent afterwards in a panel of 5 judges

* A disagreement implies that there are 2 parties
* And we assume here that what one party wants is better for FFmpeg than what the other wants.
* The TC needs to find out which partys choice is better or suggest a 3rd choice.
* If one but not the other party is a member of the TC then this decission becomes biased if that member votes

Your interpretation suggests that the TC members are "above" everyone and should
prevail in arguments they have with others.

I dont think the GA has given that power to the TC.

thx

[...]
Vittorio Giovara Feb. 18, 2024, 10:47 p.m. UTC | #49
On Sun, Feb 18, 2024 at 11:34 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> Hi
>
> On Sun, Feb 18, 2024 at 07:20:43PM +0100, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2024-02-18 01:43:14)
> > > "If the disagreement involves a member of the TC"
> > > does IMHO not preclude commenting on a patch.
> > >
> > > For a disagreement we need 2 parties. For example one party who
> > > wants a patch in and one who blocks the patch. or 2 parties where both
> > > block the other.
> > >
> > > Being a party of a disagreement would not make anyones opinon invalid.
> >
> > Anything that goes to TC is a disagreement.
>
> probably, thats true, yes
>
>
> > Anyone who expressed an
> > opinion on the patch then is 'a party to the disagreement'.
>
> no, i dont see it that way
> A developer blocking a patch is a party to the disagreement
> So is the developer who calls the TC because of that.
>

Disagree. If that basically means that no patches will be ever blocked by
the members of the TC!
They should express the best technical excellence of the whole community,
not be stifled from participating in the discussion

If it helps, I'll block the patch so that Anton can vote in the TC.
Do you see how slippery (and insane) this interpretation of the rule
becomes?



> Similarly if you look at real world court cases
> parties to the lawsuit are the one who is filling the lawsuit and
> the defendant.
> The thousand people expressing an oppinion in some random place are
> not parties to the disagreement
>

This is a false dichotomy, we're not a court case where we're interpreting
the law, we're trying to solve a technical disagreement.


> More formally, you could define a "party to a disagreement" as
> all minimal sets of people whos non existence would resolve the
> disagreement
>

By that reasoning you shouldn't vote either since you touched basically all
of ffmpeg codebase!


> * A disagreement implies that there are 2 parties
> * And we assume here that what one party wants is better for FFmpeg than
> what the other wants.
> * The TC needs to find out which partys choice is better or suggest a 3rd
> choice.
> * If one but not the other party is a member of the TC then this decission
> becomes biased if that member votes
>
> Your interpretation suggests that the TC members are "above" everyone and
> should
> prevail in arguments they have with others.
>

Nobody is saying that!!!


> I dont think the GA has given that power to the TC.
>

Are you suggesting to have a vote to rewrite/reinterpret the rule?
Hendrik Leppkes Feb. 18, 2024, 10:48 p.m. UTC | #50
On Sun, Feb 18, 2024 at 11:34 PM Michael Niedermayer
<michael@niedermayer.cc> wrote:
>
> * A disagreement implies that there are 2 parties
> * And we assume here that what one party wants is better for FFmpeg than what the other wants.
> * The TC needs to find out which partys choice is better or suggest a 3rd choice.
> * If one but not the other party is a member of the TC then this decission becomes biased if that member votes
>
> Your interpretation suggests that the TC members are "above" everyone and should
> prevail in arguments they have with others.
>

Noone is above the rules, but just because someone has an opinion and
shared it shouldn't disqualify them, because they were specifically
voted into the TC for their opinions on technical matters.
Would their opinion, and therefore their vote, change if someone else
was seen as the person "blocking"?

What if multiple people had expressed disagreement with a patch, and
most of the TC was involved in the public discussion already? Do the
remaining "uninvolved" people on the TC get all the decision power? Or
do we consider most of the TC already opposing it publicly as perhaps
an indicator that the patch might not be the way to go?
Thats what the TC was voted in for, to give their opinion on technical
matters and decide if needed, so why deprive them of their opinion,
just because they already stated it publicly? That makes no sense to
me.

- Hendrik
Michael Niedermayer Feb. 19, 2024, 1:17 a.m. UTC | #51
On Sun, Feb 18, 2024 at 11:48:59PM +0100, Hendrik Leppkes wrote:
> On Sun, Feb 18, 2024 at 11:34 PM Michael Niedermayer
> <michael@niedermayer.cc> wrote:
> >
> > * A disagreement implies that there are 2 parties
> > * And we assume here that what one party wants is better for FFmpeg than what the other wants.
> > * The TC needs to find out which partys choice is better or suggest a 3rd choice.
> > * If one but not the other party is a member of the TC then this decission becomes biased if that member votes
> >
> > Your interpretation suggests that the TC members are "above" everyone and should
> > prevail in arguments they have with others.
> >
> 
> Noone is above the rules, but just because someone has an opinion and
> shared it shouldn't disqualify them, because they were specifically
> voted into the TC for their opinions on technical matters.
> Would their opinion, and therefore their vote, change if someone else
> was seen as the person "blocking"?

I think you are mixing the concept of an oppinion and blocking a patch.
following is how i see the concept

If you state that you prefer a linked list but dont mind the patch as it is
thats an oppinion

If you state that you prefer a linked list and i tell you that i would
prefer to keep an array and you say you are ok, again thats an oppinion
the patch is not blocked

If you state that you prefer a linked list and i tell you that i would
prefer to keep an array and you now tell me that if i want an array i have
to go to the TC then you are blocking the patch. You and me in this case
are the cause of the TC being involved.
Only at this point we would be parties to the disagreement IMHO
and we cannot be the judge here


> 
> What if multiple people had expressed disagreement with a patch, and
> most of the TC was involved in the public discussion already? Do the

The question would be who is actually blocking it and not just stating
their oppinion.


> remaining "uninvolved" people on the TC get all the decision power? Or
> do we consider most of the TC already opposing it publicly as perhaps
> an indicator that the patch might not be the way to go?
> Thats what the TC was voted in for, to give their opinion on technical
> matters and decide if needed, so why deprive them of their opinion,
> just because they already stated it publicly? That makes no sense to
> me.

You certainly have a point but, again I think there are big differences
between a TC oppinion and someone blocking a patch

If a TC member states an oppinion and clearly explains the reasoning behind it
that should have no impact on the TC members ability to vote. In fact it should
lead to all parties discussing and resolving the conflict probably without the
need to formally involve the TC

IMHO, invoking the TC is already an exceptional situation and failure.
and it shouldnt give the parties of that failure more influence in the decission.
(which is another orthogonal reason why the parties of a conflict should not
 be judges of the conflict)

Its really strange.

You know, if a judge files a lawsuit, that judge cannot be the judge in
that lawsuit.
This is a very simple concept.
It seems some people here see "their friend" not being allowed to vote
but thats not what this is about.
If a TC member blocks a patch, that TC member cannot vote on how to resolve
that blockage.

If a TC member chooses not to block a patch so he retains the power in a
potential future vote. Thats a game theoretic decission he makes to maximize
his blocking power. But really if he doesnt block it it could be applied
so this is not a logic decission. The logic decission is to block the patch
if thats what he wants and if noone else is blocking it.
game theoretically the example you provide above would never happen
as there would never be more than 1 TC member blocking a patch.

So IMO arguing that a person should be party to a disagreement and judge of
it. But making this dependant on an argument where people have to act in an
illogic way is really odd

thx

[...]
Ronald S. Bultje Feb. 19, 2024, 2:07 a.m. UTC | #52
Hi,

On Sun, Feb 18, 2024 at 5:34 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> More formally, you could define a "party to a disagreement" as
> all minimal sets of people whos non existence would resolve the
> disagreement
>

I think I agree with this interpretation of the rules.

Ronald
Marvin Scholz Feb. 19, 2024, 2:16 a.m. UTC | #53
On 17 Feb 2024, at 13:31, Rémi Denis-Courmont wrote:

> Le lauantaina 17. helmikuuta 2024, 13.46.27 EET Gyan Doshi a écrit :
>> As a TC member who is part of the disagreement, I believe your
>> participation is recused.
>
> Obviously not. We don't want to get into a situation whence TC members have an
> incentive not to participate in regular code reviews just so that they can
> participate in the hypothetical making of later related TC decisions. That
> would be both dumb and counter-productive.
>

I agree…

> Somebody disagreeing with you does not necessarily mean that they have a
> conflict of interest in the subject matter.

My understanding of this rule was too that it would be for conflict-of-interest
kind of cases, which I fail to see here…

>
> -- 
> 雷米‧德尼-库尔蒙
> http://www.remlab.net/
>
>
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Vittorio Giovara Feb. 19, 2024, 2:26 a.m. UTC | #54
On Mon, Feb 19, 2024 at 2:17 AM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> On Sun, Feb 18, 2024 at 11:48:59PM +0100, Hendrik Leppkes wrote:
> > On Sun, Feb 18, 2024 at 11:34 PM Michael Niedermayer
> > <michael@niedermayer.cc> wrote:
> > >
> > > * A disagreement implies that there are 2 parties
> > > * And we assume here that what one party wants is better for FFmpeg
> than what the other wants.
> > > * The TC needs to find out which partys choice is better or suggest a
> 3rd choice.
> > > * If one but not the other party is a member of the TC then this
> decission becomes biased if that member votes
> > >
> > > Your interpretation suggests that the TC members are "above" everyone
> and should
> > > prevail in arguments they have with others.
> > >
> >
> > Noone is above the rules, but just because someone has an opinion and
> > shared it shouldn't disqualify them, because they were specifically
> > voted into the TC for their opinions on technical matters.
> > Would their opinion, and therefore their vote, change if someone else
> > was seen as the person "blocking"?
>
> I think you are mixing the concept of an oppinion and blocking a patch.
> following is how i see the concept
>
> If you state that you prefer a linked list but dont mind the patch as it is
> thats an oppinion
>
> If you state that you prefer a linked list and i tell you that i would
> prefer to keep an array and you say you are ok, again thats an oppinion
> the patch is not blocked
>
> If you state that you prefer a linked list and i tell you that i would
> prefer to keep an array and you now tell me that if i want an array i have
> to go to the TC then you are blocking the patch. You and me in this case
> are the cause of the TC being involved.
> Only at this point we would be parties to the disagreement IMHO
> and we cannot be the judge here
>
>
> >
> > What if multiple people had expressed disagreement with a patch, and
> > most of the TC was involved in the public discussion already? Do the
>
> The question would be who is actually blocking it and not just stating
> their oppinion.
>
>
> > remaining "uninvolved" people on the TC get all the decision power? Or
> > do we consider most of the TC already opposing it publicly as perhaps
> > an indicator that the patch might not be the way to go?
> > Thats what the TC was voted in for, to give their opinion on technical
> > matters and decide if needed, so why deprive them of their opinion,
> > just because they already stated it publicly? That makes no sense to
> > me.
>
> You certainly have a point but, again I think there are big differences
> between a TC oppinion and someone blocking a patch
>
> If a TC member states an oppinion and clearly explains the reasoning
> behind it
> that should have no impact on the TC members ability to vote. In fact it
> should
> lead to all parties discussing and resolving the conflict probably without
> the
> need to formally involve the TC
>
> IMHO, invoking the TC is already an exceptional situation and failure.
> and it shouldnt give the parties of that failure more influence in the
> decission.
> (which is another orthogonal reason why the parties of a conflict should
> not
>  be judges of the conflict)
>
> Its really strange.
>
> You know, if a judge files a lawsuit, that judge cannot be the judge in
> that lawsuit.
> This is a very simple concept.
> It seems some people here see "their friend" not being allowed to vote
> but thats not what this is about.
> If a TC member blocks a patch, that TC member cannot vote on how to resolve
> that blockage.
>
> If a TC member chooses not to block a patch so he retains the power in a
> potential future vote. Thats a game theoretic decission he makes to
> maximize
> his blocking power. But really if he doesnt block it it could be applied
> so this is not a logic decission. The logic decission is to block the patch
> if thats what he wants and if noone else is blocking it.
> game theoretically the example you provide above would never happen
> as there would never be more than 1 TC member blocking a patch.
>
> So IMO arguing that a person should be party to a disagreement and judge of
> it. But making this dependant on an argument where people have to act in an
> illogic way is really odd
>

i long for the day in which ffmpeg might actually seem like a functioning
community, where we would not constantly and needlessly bikeshed rules and
other politics,but that day is clearly not today
Gyan Doshi Feb. 19, 2024, 5:10 a.m. UTC | #55
On 2024-02-19 03:16 am, Vittorio Giovara wrote:
> On Sun, Feb 18, 2024 at 8:02 PM Gyan Doshi <ffmpeg@gyani.pro> wrote:
>
>>
>> On 2024-02-18 11:33 pm, Anton Khirnov wrote:
>>> Quoting Gyan Doshi (2024-02-18 05:06:30)
>>>> b) what "maximalist" interpretation?
>>> A non-maximalist interpretation would be that a TC member is only
>>> excluded from voting when they authored the patch that is being
>>> disputed.
>> If the promulgators meant to only prevent proposers of the disputed
>> change to not take part, then
>> the verbiage would be different.
>>
>> In looking up how this clause came to be present, I came across the
>> following messages:
>>
>> https://ffmpeg.org/pipermail/ffmpeg-devel/2020-December/273443.html
>> (Nicolas George originally proposes this clause - wording is more
>> restrictive)
>>
>> https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274822.html
>> (this one is interesting, you objected to the clause but on the grounds
>> that it was all-encompassing i.e.  anyone commenting on the dispute was
>> potentially subjected to recusal and referred to some 'model'
>> discussion, so your describing my reading as maximalist is weird since
>> that is how you read it - you just happen to object to this rule)
>>
>> https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274826.html
>> (Ronald clarifies that "involved" should be constrained to just be one
>> of the two parties -- of which you happen to be one)
>>
>> There's the matter of what the rule currently is, distinct from what it
>> should be. What it ideally should be is that the decision should be
>> taken by a fresh set of eyes consisting of those who haven't become or
>> are seen to be publicly invested in the outcome. So the TC should have a
>> set of alternates - those who can make up the quorum and constitute an
>> odd number of voters when some from the first 5 are recused.
>>
> I'd like to offer a lighter interpretation of the rule, the mailing list is
> the common playing ground, where discussions and disagreements can be had.
> In case of a technical "maximalist" disagreement, then either party can
> invoke the TC to judge on the matter. If anyone in the TC is involved in
> the patch, as if it's an author or significantly contributed to it, then
> they should step away from voting. In other words the "level of
> involvement" rule takes place at the TC level, not at the ffmpeg-devel
> discussion.

The TC is invoked when there's an intractable dispute. So the dispute 
precedes the TC activity hence the parties to the dispute are the main 
opposing participants at the venue of the dispute wherever that is, and 
the rules applies to all main parties. Imagine there's a new feature to 
be added which doesn't exist in the codebase in any form so there's no 
status quo. Member A submits a patch using design pattern X. Member B 
objects and wants design pattern Y. Now let's say if only A was on the 
TC, then as per the arguments of some here, A should recuse themselves 
but if only B was on the TC, B gets to vote. That asymmetry is not 
supported in the wording nor would it be fair.


> Also consider that even in a vote recusal, the member's
> arguments will still be read and by all likelihood taken into consideration
> by the TC, so yours seems to be a literal interpretation of the rule,
> instead of the spirit of the rule, which in my opinion matters more.

Of course. There's no mind-reading or mind-control machine here. Humans 
aren't automatons either. The judges on any Supreme Court are older 
human beings with all the deep convictions that one acquires during a 
long lifetime but that's the best we can do. The rules are meant to be 
the most that is practically feasible within mutually observable 
reality, not ideally efficient within an omniscient universe.

Regards,
Gyan
Nicolas George Feb. 19, 2024, 8:45 a.m. UTC | #56
Vittorio Giovara (12024-02-18):
> If it helps, I'll block the patch so that Anton can vote in the TC.
> Do you see how slippery (and insane) this interpretation of the rule
> becomes?

The rules are written assuming people in the project are working in good
faith for the benefit of the project.

Proposing that kind of cheat to work-around the rules is the opposite of
acting in good faith, it is acting like a corrupt politician. And that
should disqualify anybody of participating in the governance of the
project.

If a member of the TC uses a sock puppet to oppose a patch without
having to recuse themselves, when it inevitably gets noticed they should
be evicted from the committees, the general assembly and the git write
access.
Nicolas George Feb. 19, 2024, 8:54 a.m. UTC | #57
Vittorio Giovara (12024-02-18):
> While I understand that you're referencing the existing rules that we've
> collectively agreed upon, I believe it's crucial for us to periodically
> review and refine these rules to ensure they remain aligned with our
> evolving community values and goals.

<snip>

This is true, but you neglect two very important facts:

First, the rule we are discussing is a direct consequence of the more
general rule that nobody is above anybody in the project. A change to
this rule is a very deep change and must be discussed knowing the
consequences.

Second, the current committee has been elected based on the current
rules. If the rule get changed, then the election must be redone. And
you can count on me for emphasizing that somebody who did not want to
adhere to the current rules is unfit to be elected even under the new
ones.

> Therefore, I propose that we engage in a collaborative effort to revisit
> and clarify our existing rules.

Sure. I propose, to be discussed and refined:

- Let us add a preamble stating the general spirit of the project:

  * FFmpeg is a Libre Software project meant to be useful to users, all
    users, not just large or numerous ones.
  * If a feature is a legitimate use case and not easily possible
    otherwise then it belongs in the project.
  * FFmpeg is the place to experiment, try new solutions for old
    problems, invent new way of using the C language.

- People who caused the libav fork or sided with it are excluded from
  the committees and general assembly.

- People who work on FFmpeg for the continued benefit of a commercial
  entity have a smaller vote in the general assembly (and all employees
  of said entity get excluded if one happens to lie).

- On the opposite, people with a long continued involvement in the
  project and people whose contribution touch a wide part of the project
  get a larger vote.

> Thank you for your dedication to the community, and I look forward to
> engaging in productive discussions to enhance our governance framework.

PS: this fake, condescending politeness is worse than sincerity.
Vittorio Giovara Feb. 19, 2024, 2:15 p.m. UTC | #58
On Mon, Feb 19, 2024 at 9:45 AM Nicolas George <george@nsup.org> wrote:

> Vittorio Giovara (12024-02-18):
> > If it helps, I'll block the patch so that Anton can vote in the TC.
> > Do you see how slippery (and insane) this interpretation of the rule
> > becomes?
>
> The rules are written assuming people in the project are working in good
> faith for the benefit of the project.
>
> Proposing that kind of cheat to work-around the rules is the opposite of
> acting in good faith, it is acting like a corrupt politician. And that
> should disqualify anybody of participating in the governance of the
> project.
>

By that reasoning, someone could argue that you forced the inclusion of
this rule being discussed only to set up a backdoor in the process and
thwart any chance of a functioning process for the community
Of course we're talking by hypotheticals, right?
Vittorio Giovara Feb. 19, 2024, 2:21 p.m. UTC | #59
On Mon, Feb 19, 2024 at 9:54 AM Nicolas George <george@nsup.org> wrote:

> Vittorio Giovara (12024-02-18):
> > While I understand that you're referencing the existing rules that we've
> > collectively agreed upon, I believe it's crucial for us to periodically
> > review and refine these rules to ensure they remain aligned with our
> > evolving community values and goals.
>
> <snip>
>
> This is true, but you neglect two very important facts:
>
> First, the rule we are discussing is a direct consequence of the more
> general rule that nobody is above anybody in the project. A change to
> this rule is a very deep change and must be discussed knowing the
> consequences.
>
> Second, the current committee has been elected based on the current
> rules. If the rule get changed, then the election must be redone. And
> you can count on me for emphasizing that somebody who did not want to
> adhere to the current rules is unfit to be elected even under the new
> ones.
>
I understand your concerns regarding the potential consequences of changing
this rule, and I acknowledge the importance of upholding the principles
that underpin our project's governance. However, I must express my
disappointment in the insinuation that I am acting in bad faith.

It is essential to approach discussions with mutual respect and trust,
assuming positive intent from all parties involved. Implying bad faith
undermines the spirit of constructive dialogue and collaboration that is
crucial for the progress of our community.

I must address the concerning implication of bad faith in your response.
Accusations of acting in bad faith are detrimental to our collaborative
efforts and undermine the mutual respect necessary for productive discourse
within our community.

It is imperative that we engage in discussions with goodwill and a
presumption of positive intent from all parties involved. Such insinuations
hinder our ability to work together effectively and erode the trust
essential for maintaining a healthy community environment.

While I value your dedication to upholding the integrity of our project's
rules, I urge us to refrain from casting doubt on each other's motivations.
Let us refocus our efforts on constructive dialogue aimed at finding
solutions that serve the best interests of the FFmpeg project.

Thank you for your understanding, and I remain committed to fostering a
culture of respect and collaboration within our community.
Nicolas George Feb. 19, 2024, 2:28 p.m. UTC | #60
Vittorio Giovara (12024-02-19):
> By that reasoning, someone could argue that you forced the inclusion of
> this rule being discussed only to set up a backdoor in the process and
> thwart any chance of a functioning process for the community

Can you explain the part of your reasoning where you establish that I
“forced” anything?

Otherwise, you are just being libelous.
Vittorio Giovara Feb. 19, 2024, 2:30 p.m. UTC | #61
On Mon, Feb 19, 2024 at 6:11 AM Gyan Doshi <ffmpeg@gyani.pro> wrote:

>
>
> On 2024-02-19 03:16 am, Vittorio Giovara wrote:
> > On Sun, Feb 18, 2024 at 8:02 PM Gyan Doshi <ffmpeg@gyani.pro> wrote:
> >
> >>
> >> On 2024-02-18 11:33 pm, Anton Khirnov wrote:
> >>> Quoting Gyan Doshi (2024-02-18 05:06:30)
> >>>> b) what "maximalist" interpretation?
> >>> A non-maximalist interpretation would be that a TC member is only
> >>> excluded from voting when they authored the patch that is being
> >>> disputed.
> >> If the promulgators meant to only prevent proposers of the disputed
> >> change to not take part, then
> >> the verbiage would be different.
> >>
> >> In looking up how this clause came to be present, I came across the
> >> following messages:
> >>
> >> https://ffmpeg.org/pipermail/ffmpeg-devel/2020-December/273443.html
> >> (Nicolas George originally proposes this clause - wording is more
> >> restrictive)
> >>
> >> https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274822.html
> >> (this one is interesting, you objected to the clause but on the grounds
> >> that it was all-encompassing i.e.  anyone commenting on the dispute was
> >> potentially subjected to recusal and referred to some 'model'
> >> discussion, so your describing my reading as maximalist is weird since
> >> that is how you read it - you just happen to object to this rule)
> >>
> >> https://ffmpeg.org/pipermail/ffmpeg-devel/2021-January/274826.html
> >> (Ronald clarifies that "involved" should be constrained to just be one
> >> of the two parties -- of which you happen to be one)
> >>
> >> There's the matter of what the rule currently is, distinct from what it
> >> should be. What it ideally should be is that the decision should be
> >> taken by a fresh set of eyes consisting of those who haven't become or
> >> are seen to be publicly invested in the outcome. So the TC should have a
> >> set of alternates - those who can make up the quorum and constitute an
> >> odd number of voters when some from the first 5 are recused.
> >>
> > I'd like to offer a lighter interpretation of the rule, the mailing list
> is
> > the common playing ground, where discussions and disagreements can be
> had.
> > In case of a technical "maximalist" disagreement, then either party can
> > invoke the TC to judge on the matter. If anyone in the TC is involved in
> > the patch, as if it's an author or significantly contributed to it, then
> > they should step away from voting. In other words the "level of
> > involvement" rule takes place at the TC level, not at the ffmpeg-devel
> > discussion.
>
> The TC is invoked when there's an intractable dispute. So the dispute
> precedes the TC activity hence the parties to the dispute are the main
> opposing participants at the venue of the dispute wherever that is, and
> the rules applies to all main parties. Imagine there's a new feature to
> be added which doesn't exist in the codebase in any form so there's no
> status quo. Member A submits a patch using design pattern X. Member B
> objects and wants design pattern Y. Now let's say if only A was on the
> TC, then as per the arguments of some here, A should recuse themselves
> but if only B was on the TC, B gets to vote. That asymmetry is not
> supported in the wording nor would it be fair.
>

The asymmetry is that the TC should be protecting the good of the project
and the community interests, while the member of the community proposing
the patch is protecting their own interest. For the better or the worse of
course. The rule you keep bringing forth is stated to avoid a conflict of
interest where the member of the TC is also the author of the patch, but
was never meant to exclude one party from voting in the TC.

> Also consider that even in a vote recusal, the member's
> > arguments will still be read and by all likelihood taken into
> consideration
> > by the TC, so yours seems to be a literal interpretation of the rule,
> > instead of the spirit of the rule, which in my opinion matters more.
>
> Of course. There's no mind-reading or mind-control machine here. Humans
> aren't automatons either. The judges on any Supreme Court are older
> human beings with all the deep convictions that one acquires during a
> long lifetime but that's the best we can do. The rules are meant to be
> the most that is practically feasible within mutually observable
> reality, not ideally efficient within an omniscient universe.
>

If you believe the interpretation of the rule is dubious or incorrect, you
should propose a formal vote to change or clarify its wording.
Nicolas George Feb. 19, 2024, 2:30 p.m. UTC | #62
Vittorio Giovara (12024-02-19):
> I understand your concerns regarding the potential consequences of changing
> this rule, and I acknowledge the importance of upholding the principles
> that underpin our project's governance. However, I must express my
> disappointment in the insinuation that I am acting in bad faith.

There is no insinuation in the mail you are answering to.
Vittorio Giovara Feb. 19, 2024, 2:33 p.m. UTC | #63
On Mon, Feb 19, 2024 at 3:30 PM Nicolas George <george@nsup.org> wrote:

> Vittorio Giovara (12024-02-19):
> > I understand your concerns regarding the potential consequences of
> changing
> > this rule, and I acknowledge the importance of upholding the principles
> > that underpin our project's governance. However, I must express my
> > disappointment in the insinuation that I am acting in bad faith.
>
> There is no insinuation in the mail you are answering to.
>

There are many in the section snipped below the original email.
Nicolas George Feb. 19, 2024, 2:34 p.m. UTC | #64
Vittorio Giovara (12024-02-19):
> There are many in the section snipped below the original email.

You are making accusations. You know what it implies.
Vittorio Giovara Feb. 19, 2024, 2:37 p.m. UTC | #65
On Mon, Feb 19, 2024 at 3:28 PM Nicolas George <george@nsup.org> wrote:

> Vittorio Giovara (12024-02-19):
> > By that reasoning, someone could argue that you forced the inclusion of
> > this rule being discussed only to set up a backdoor in the process and
> > thwart any chance of a functioning process for the community
>
> Can you explain the part of your reasoning where you establish that I
> “forced” anything?
>
> Otherwise, you are just being libelous.
>

Not my intention, allow me to rephrase:
> By that reasoning, it could be argued that someone proposed the inclusion
of
> this rule being discussed only to set up a backdoor in the process and
> thwart any chance of a functioning process for the community

As mentioned, it's just a hyperbole, extending the deviance of this thought
logic.
Nicolas George Feb. 19, 2024, 2:41 p.m. UTC | #66
Vittorio Giovara (12024-02-19):
> > By that reasoning, it could be argued that someone proposed the inclusion of
> > this rule being discussed only to set up a backdoor in the process and
> > thwart any chance of a functioning process for the community
> As mentioned, it's just a hyperbole, extending the deviance of this thought
> logic.

Indeed, somebody stupid enough to not how a rule to prevent somebody
from placing themselves above the others is necessary could make that
argument.
Gyan Doshi Feb. 19, 2024, 3:39 p.m. UTC | #67
On 2024-02-19 08:00 pm, Vittorio Giovara wrote:
> On Mon, Feb 19, 2024 at 6:11 AM Gyan Doshi<ffmpeg@gyani.pro>  wrote:
>
>> The TC is invoked when there's an intractable dispute. So the dispute
>> precedes the TC activity hence the parties to the dispute are the main
>> opposing participants at the venue of the dispute wherever that is, and
>> the rules applies to all main parties. Imagine there's a new feature to
>> be added which doesn't exist in the codebase in any form so there's no
>> status quo. Member A submits a patch using design pattern X. Member B
>> objects and wants design pattern Y. Now let's say if only A was on the
>> TC, then as per the arguments of some here, A should recuse themselves
>> but if only B was on the TC, B gets to vote. That asymmetry is not
>> supported in the wording nor would it be fair.
>>
> The asymmetry is that the TC should be protecting the good of the project
> and the community interests, while the member of the community proposing
> the patch is protecting their own interest.

Both the proposer and disputer of a patch may have a vested interest in 
steering decisions one way or the
other, or both may believe they're furthering the good of the project. 
There is no asymmetry inherent in the
roles of the participants. There shouldn't be in the rules either.

>   The rule you keep bringing forth is stated to avoid a conflict of
> interest where the member of the TC is also the author of the patch, but
> was never meant to exclude one party from voting in the TC.
We've already had the proposer of the rule participate in this thread 
and I cited discussion from the time of drafting of the rule that it is 
meant to apply to both sides. Treating the rule as applying to only the 
author is the aberrant interpretation. Regards, Gyan
Anton Khirnov Feb. 19, 2024, 9:37 p.m. UTC | #68
Quoting Michael Niedermayer (2024-02-18 23:34:39)
> More formally, you could define a "party to a disagreement" as
> all minimal sets of people whos non existence would resolve the disagreement

That is a useless definition in practice, because it is unknowable. It
is very common that developers to not bother voicing their opinion when
someone else is doing that already. E.g. in this very case Andreas
prompted me to reply to the patch, presumably because he also does not
want it to go in in its current form.

And yet if he, or someone else, had argued forcefully against the patch,
I probably would not have - not out of some political calculations, but
simply because it saves me time. Then my TC vote would not be questioned
and we would not be having this discussion at all. It seems absurd to me
that TC members should be prevented from voting based on such random
factors.

> > > But I think it is reasonable that parties of a disagreement cannot be
> > > the judge of the disagreement.
> > 
> > Why not? This is one of those truthy-sounding statements that does not
> > actually hold up to scrutiny.
> 
> * A disagreement implies that there are 2 parties
> * And we assume here that what one party wants is better for FFmpeg than what the other wants.
> * The TC needs to find out which partys choice is better or suggest a 3rd choice.
> * If one but not the other party is a member of the TC then this decission becomes biased if that member votes

This example is flawed in at least two following ways:

First, you keep comparing TC members to judges in a legal system. As I
said above - in a paragraph you ignored - I do not think that is a
meaningful comparison. We have no law, TC members are not judges and
decide based on their experience and opinions.

> Imagine a judge kills someone and judges himself innocent afterwards in a panel of 5 judges

Second, in this example the judge in question has two roles in the
situation: that of a criminal who wants to avoid being found guilty and that
of a judge who is supposed to find criminals guilty. The interests of
these roles are in conflict, hence we have a conflict of interest.

That does not translate to the situation we are actually dealing with.
My interests in my role as a patch reviewer and as a TC member are
exactly the same. There is thus no conflict of interest.

There might have been conflict of interest if e.g. I was being paid for
ensuring the code works a certain specfic way, but I am not.

I am explaining all this in such detail because people in this thread
keep using this term apparently without realizing that in order to have
a conflict of interest there must in fact be multiple interests that are
in conflict, not just a person having multiple roles at once.

> Your interpretation suggests that the TC members are "above" everyone and should
> prevail in arguments they have with others.

I have no idea how you arrived at that conclusion, it does not follow
from anything I wrote.
Nicolas George Feb. 19, 2024, 9:54 p.m. UTC | #69
Anton Khirnov (12024-02-19):
> I am explaining all this in such detail because people in this thread
> keep using this term apparently without realizing that in order to have
> a conflict of interest there must in fact be multiple interests that are
> in conflict, not just a person having multiple roles at once.

In legal systems, they recuse people not only if they have a conflict of
interest but also if they have a clear bias about the issue.

In fact, all members of the TC who had a strong opinion on the issue
before it was brought in front of the TC should have the honesty to
recuse themselves, so that the other members of the TC can discuss the
arguments without being biassed by their original opinion.

But the most telling in this whole discussion is that somebody who would
spend so much time and energy into bypassing a minor rule instead of
just saying “fine, I recuse myself, I had better things to do anyway”
and just TRUST THE OTHER MEMBERS to choose right shows a lack of
team-spirit and a greed for power that might make them fit as a manager
for an open-source project but certainly unfit to be in a position of
authority and trust in a Libre Software like project FFmpeg.
Vittorio Giovara Feb. 20, 2024, 3:02 a.m. UTC | #70
On Mon, Feb 19, 2024 at 4:40 PM Gyan Doshi <ffmpeg@gyani.pro> wrote:

>
>
> On 2024-02-19 08:00 pm, Vittorio Giovara wrote:
> > On Mon, Feb 19, 2024 at 6:11 AM Gyan Doshi<ffmpeg@gyani.pro>  wrote:
> >
> >> The TC is invoked when there's an intractable dispute. So the dispute
> >> precedes the TC activity hence the parties to the dispute are the main
> >> opposing participants at the venue of the dispute wherever that is, and
> >> the rules applies to all main parties. Imagine there's a new feature to
> >> be added which doesn't exist in the codebase in any form so there's no
> >> status quo. Member A submits a patch using design pattern X. Member B
> >> objects and wants design pattern Y. Now let's say if only A was on the
> >> TC, then as per the arguments of some here, A should recuse themselves
> >> but if only B was on the TC, B gets to vote. That asymmetry is not
> >> supported in the wording nor would it be fair.
> >>
> > The asymmetry is that the TC should be protecting the good of the project
> > and the community interests, while the member of the community proposing
> > the patch is protecting their own interest.
>
> Both the proposer and disputer of a patch may have a vested interest in
> steering decisions one way or the
> other, or both may believe they're furthering the good of the project.
> There is no asymmetry inherent in the
> roles of the participants. There shouldn't be in the rules either.
>

As a matter of fact there is no asymmetry rule: if the author of the patch
is a member of TC and there is a disagreement, then they shall not vote.
Applies equally, to all members of the TC, of course it does not apply if a
TC member is involved in a technical discussion outside of the TC
discussion itself. It wouldn't make sense otherwise!


> >   The rule you keep bringing forth is stated to avoid a conflict of
> > interest where the member of the TC is also the author of the patch, but
> > was never meant to exclude one party from voting in the TC.
> We've already had the proposer of the rule participate in this thread
> and I cited discussion from the time of drafting of the rule that it is
> meant to apply to both sides. Treating the rule as applying to only the
> author is the aberrant interpretation. Regards, Gyan
>

While you may find my interpretation of the rule "aberrant" for the local
case, I find yours abhorrent for the health of the project itself. I
understand your frustration about this situation (and I am sure some people
are enjoying the show) but I am starting to suspect bad faith here, and I
invite you to reconsider your views on the matter.
Michael Niedermayer Feb. 20, 2024, 9:39 p.m. UTC | #71
On Mon, Feb 19, 2024 at 10:37:15PM +0100, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2024-02-18 23:34:39)
[...]
> > > > But I think it is reasonable that parties of a disagreement cannot be
> > > > the judge of the disagreement.
> > > 
> > > Why not? This is one of those truthy-sounding statements that does not
> > > actually hold up to scrutiny.
> > 
> > * A disagreement implies that there are 2 parties
> > * And we assume here that what one party wants is better for FFmpeg than what the other wants.
> > * The TC needs to find out which partys choice is better or suggest a 3rd choice.
> > * If one but not the other party is a member of the TC then this decission becomes biased if that member votes
> 
> This example is flawed in at least two following ways:
> 
> First, you keep comparing TC members to judges in a legal system. As I
> said above - in a paragraph you ignored - I do not think that is a
> meaningful comparison. We have no law, TC members are not judges and
> decide based on their experience and opinions.

well, TC members make decissions, lets call them decission makers then

lets see how that would look
so if we had judges and law we seem to agree that a decission maker cannot
be a party to the very disagreement she decides on.

lets take away the law
so a courtroom with 5 judges, deciding on a persons fate, no more law everyone
can choose as they prefer.
The judge accusing another man of something, can now be a judge of that case ?

No, the removial of "the law" isnt making the common sense rule any less common
sense

so lets now not call teh judges "judges" anymore, and lets pick them identically
to how TC members are picked
Does this change anything ?

The decission maker accusing another man of something, can now be a decission maker of that case ?

No, the issue remains. A party to a disagreement cannot be decission maker in
the disagreement. There is bias and if the goal is a optimal decission
we want no bias.


> 
> > Imagine a judge kills someone and judges himself innocent afterwards in a panel of 5 judges
> 
> Second, in this example the judge in question has two roles in the
> situation: that of a criminal who wants to avoid being found guilty and that
> of a judge who is supposed to find criminals guilty. The interests of
> these roles are in conflict, hence we have a conflict of interest.
> 

> That does not translate to the situation we are actually dealing with.
> My interests in my role as a patch reviewer and as a TC member are
> exactly the same. There is thus no conflict of interest.

i disagree

A TC member who wants to block a patch and wants to decide if a patch should be
blocked is in the same situation as

a Judge who wants to sue someone and wants to judge that someone.

Again, the judge being a party to a lawsuite cannot be judge in that lawsuite
Similarly a TC member initiating a conflict cannot judge in that same conflict
The TC member surely does the same thing in both cases, he wants something
like blocking a patch.
Similarly the judge also wants to whatever is the goal of his lawsuite

And there are in fact more problems
Consider this:
Normally we have PartyA and PartyB in a conflict and then we have 5 TC members
who look into that, both parties can argue their case in front of the TC (publically)
and the TC then discusses and make a decission, possibly asking more people an so on

But now the case here changes we have PartyA and PartyTC
Theres a disagreement between a developer and a TC member

The TC member is part of the TC and discusses with the other 4 members, he will
support himself in all votes and potential private arguments.
PartyA here has substantially worse chance to win even if PartyA is correct and
has the better solution.

Iam sorry but i insist that the TC member in this case cannot act both as a party
to teh disagreement and as a member of the TC in the same disagreement.

It is completely unfair to partyA in this example above, they are not on equal ground.

thx

[...]
Kieran Kunhya Feb. 20, 2024, 9:56 p.m. UTC | #72
>
> i disagree
>
> A TC member who wants to block a patch and wants to decide if a patch
> should be
> blocked is in the same situation as
>
> a Judge who wants to sue someone and wants to judge that someone.
>

Whilst I am not getting into a whole philosophical legal discussion about
this (to follow on from last month's Nobel Prize winning mailing list
lectures on economics and business).

This isn't the same thing. The TC is more like a jury where a juror can
have an opinion and their opinion can be swayed by arguments during private
jury discussions.
The TC is doing this for the good of the project (I hope) and not to push
their own agenda and it's very clear in this case there is no personal
agenda.

Regards,
Kieran Kunhya
Nicolas George Feb. 20, 2024, 10:07 p.m. UTC | #73
Kieran Kunhya (12024-02-20):
> This isn't the same thing. The TC is more like a jury where a juror can
> have an opinion and their opinion can be swayed by arguments during private

In a jury trial, the defense can recuse any juror they want.
diff mbox series

Patch

diff --git a/configure b/configure
index c8ae0a061d..8db3fa3f4b 100755
--- a/configure
+++ b/configure
@@ -2979,6 +2979,7 @@  rv20_decoder_select="h263_decoder"
 rv20_encoder_select="h263_encoder"
 rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
 rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
+s302m_decoder_select="dolby_e_decoder"
 screenpresso_decoder_deps="zlib"
 shorten_decoder_select="bswapdsp"
 sipr_decoder_select="lsp"
diff --git a/doc/decoders.texi b/doc/decoders.texi
index 293c82c2ba..9f85c876bf 100644
--- a/doc/decoders.texi
+++ b/doc/decoders.texi
@@ -347,6 +347,46 @@  configuration. You need to explicitly configure the build with
 An FFmpeg native decoder for Opus exists, so users can decode Opus
 without this library.
 
+@section s302m
+
+SMPTE ST 302 decoder.
+
+SMPTE ST 302 is a method for storing AES3 data format within an MPEG Transport
+Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8 channels with a
+bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
+They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
+
+Decoding non-PCM streams directly requires that the necessary stream decoder be
+present in the build. At present, only Dolby-E decoding is supported.
+
+@subsection Options
+
+The following options are supported by the s302m decoder.
+
+@table @option
+@item non_pcm_mode @var{mode}
+Specify how to process non-PCM streams
+
+@table @samp
+@item copy
+Treat data as if it were LPCM.
+@item drop
+Discard the stream.
+@item decode_copy
+Decode if possible eise treat the same as @code{copy}.
+@item decode_drop
+Decode if possible eise treat the same as @code{drop}.
+@end table
+
+The default value is @code{decode_drop}. This option does not affect the processing of
+LPCM streams.
+
+@item non_pcm_options @var{options}
+Set options for non-PCM decoder using a list of key=value pairs separated by ":".
+Consult the docs for the non-PCM decoder for its options.
+
+@end table
+
 @c man end AUDIO DECODERS
 
 @chapter Subtitles Decoders
diff --git a/libavcodec/s302m.c b/libavcodec/s302m.c
index f1b41608f3..d6a75cfa73 100644
--- a/libavcodec/s302m.c
+++ b/libavcodec/s302m.c
@@ -24,21 +24,264 @@ 
 #include "libavutil/intreadwrite.h"
 #include "libavutil/opt.h"
 #include "libavutil/log.h"
+#include "libavutil/dict.h"
 #include "libavutil/reverse.h"
 #include "avcodec.h"
 #include "codec_internal.h"
+#include "get_bits.h"
 #include "decode.h"
 
 #define AES3_HEADER_LEN 4
 
+#define NONPCMSYNC_16MARKER      0x4E1F0F8720
+#define NONPCMSYNC_20MARKER      0x4E1F60F872A0
+#define NONPCMSYNC_24MARKER      0x7E1F690F872A50
+
+#define NONPCMSYNC_16_IN_20MARKER      0x04E1F00F8720
+#define NONPCMSYNC_20_IN_24MARKER      0x04E1F600F872A0
+
+#define IS_NONPCMSYNC_16(state)   ((state & 0xFFFF0FFFF0)     == NONPCMSYNC_16MARKER)
+#define IS_NONPCMSYNC_20(state)   ((state & 0xFFFFF0FFFFF0)   == NONPCMSYNC_20MARKER)
+#define IS_NONPCMSYNC_24(state)   ((state & 0xFFFFFF0FFFFFF0) == NONPCMSYNC_24MARKER)
+
+#define IS_NONPCMSYNC_16_IN_20(state)   ((state & 0x0FFFF00FFFF0)   == NONPCMSYNC_16_IN_20MARKER)
+#define IS_NONPCMSYNC_20_IN_24(state)   ((state & 0x0FFFFF00FFFFF0) == NONPCMSYNC_20_IN_24MARKER)
+
+#define IS_NONPCMSYNC(bit,state)  ( ((bit == 16) &&  IS_NONPCMSYNC_16(state)) || \
+                                    ((bit == 20) && (IS_NONPCMSYNC_20(state) || IS_NONPCMSYNC_16_IN_20(state))) || \
+                                    ((bit == 24) && (IS_NONPCMSYNC_24(state) || IS_NONPCMSYNC_20_IN_24(state))) \
+                                  )
+
+enum non_pcm_modes {
+    NON_PCM_COPY,
+    NON_PCM_DROP,
+    NON_PCM_DEC_ELSE_COPY,
+    NON_PCM_DEC_ELSE_DROP,
+};
+
 typedef struct S302Context {
     AVClass *class;
+
+    int avctx_props_set;
+
+    int channels;
+    int bits;
+
     int non_pcm_mode;
+    int non_pcm_data_type;
+    int non_pcm_bits;
+    int non_pcm_dec;
+
+    AVCodecContext *non_pcm_ctx;
+    AVDictionary   *non_pcm_opts;
+    AVPacket *packet;
+    AVFrame  *frame;
 } S302Context;
 
+static av_cold int s302m_init(AVCodecContext *avctx)
+{
+    S302Context *s = avctx->priv_data;
+
+    s->non_pcm_data_type = -1;
+
+    return 0;
+}
+
+static int s302m_non_pcm_inspect(AVCodecContext *avctx, const uint8_t *buf, int buf_size,
+                                  int *offset, int *length)
+{
+    S302Context *s = avctx->priv_data;
+    GetBitContext gb;
+    int ret, aes_frm_size, data_type, length_code = 0;
+    uint64_t state = 0;
+    uint32_t size;
+
+    if (s->channels != 2) {
+        goto end;
+    }
+
+    ret = init_get_bits8(&gb, buf, buf_size);
+    if (ret < 0)
+        return ret;
+
+    aes_frm_size = (s->bits + 4) * 2 / 8;
+    if (buf_size < aes_frm_size * 2)  // not enough to contain data_type & length_code
+        return AVERROR_INVALIDDATA;
+
+    state = get_bits64(&gb, aes_frm_size * 8);
+
+    while (!IS_NONPCMSYNC(s->bits,state) && (get_bits_left(&gb) >= 8))
+        state = (state << 8) | get_bits(&gb, 8);
+
+    if (IS_NONPCMSYNC(s->bits,state)) {
+        if (get_bits_left(&gb) < aes_frm_size * 8) {
+            av_log(avctx, AV_LOG_ERROR, "truncated non-pcm frame detected\n");
+            return AVERROR_INVALIDDATA;
+        }
+
+        if (s->bits == 16) {
+            s->non_pcm_bits = 16;
+        } else if (s->bits == 20) {
+            if (IS_NONPCMSYNC_16_IN_20(state))
+                s->non_pcm_bits = 16;
+            else
+                s->non_pcm_bits = 20;
+        } else if (s->bits == 24) {
+            if (IS_NONPCMSYNC_20_IN_24(state))
+                s->non_pcm_bits = 20;
+            else
+                s->non_pcm_bits = 24;
+        }
+
+        skip_bits(&gb, s->bits - 16);
+
+        data_type = ff_reverse[(uint8_t)get_bits(&gb, 5)] >> 3;
+
+        if (s->non_pcm_data_type == -1) {
+            s->non_pcm_data_type = data_type;
+            av_log(avctx, AV_LOG_INFO, "stream has non-pcm data of type %d with %d-bit words in aes3 payload of size %d bits\n", data_type, s->non_pcm_bits, s->bits);
+        } else if (s->non_pcm_data_type != data_type) {
+            av_log(avctx, AV_LOG_DEBUG, "type mismatch of non-pcm type in packet %d vs stream %d. Dropping.\n", data_type, s->non_pcm_data_type);
+            return AVERROR_INVALIDDATA;
+        }
+
+        if (s->non_pcm_mode == NON_PCM_COPY)
+            return 0;
+        else if (s->non_pcm_mode == NON_PCM_DROP)
+            return AVERROR_INVALIDDATA;
+
+        skip_bits(&gb, 15);
+
+        size = get_bits(&gb, s->bits);
+
+        length_code = ((ff_reverse[(uint8_t)((size & 0xFF)          )] << 16) |
+                       (ff_reverse[(uint8_t)((size & 0xFF00)   >> 8 )] << 8 ) |
+                       (ff_reverse[(uint8_t)((size & 0xFF0000) >> 16)]      ) ) >> (24 - s->non_pcm_bits);
+
+        skip_bits(&gb, 4);
+
+        *offset = get_bits_count(&gb)/8;
+        *length = length_code;
+
+        av_log(avctx, AV_LOG_TRACE, "located non-pcm packet at offset %d length code %d.\n", AES3_HEADER_LEN + *offset, length_code);
+
+        return data_type;
+    }
+
+end:
+    if (s->non_pcm_data_type == -1) {
+        s->non_pcm_data_type = 32;  // indicate stream should be treated as LPCM
+        return 0;
+    } else
+        return AVERROR_INVALIDDATA;
+}
+
+static int s302m_setup_non_pcm_handling(AVCodecContext *avctx, const uint8_t *buf, int buf_size)
+{
+    S302Context *s = avctx->priv_data;
+    const AVCodec *codec;
+    enum AVCodecID codec_id;
+    AVDictionary *dec_opts = NULL;
+    int ret;
+
+    if (s->non_pcm_mode > NON_PCM_DROP) {
+        switch (s->non_pcm_data_type) {
+        case 0x1C:
+            codec_id = AV_CODEC_ID_DOLBY_E;
+            break;
+        default:
+            avpriv_report_missing_feature(avctx, "decode of non-pcm data type %d", s->non_pcm_data_type);
+            ret = AVERROR_PATCHWELCOME;
+            goto fail;
+        }
+
+        codec = avcodec_find_decoder(codec_id);
+        if (!codec) {
+            ret = AVERROR_DECODER_NOT_FOUND;
+            goto fail;
+        }
+
+        s->non_pcm_ctx = avcodec_alloc_context3(codec);
+        if (!s->non_pcm_ctx) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        av_dict_copy(&dec_opts, s->non_pcm_opts, 0);
+
+        ret = avcodec_open2(s->non_pcm_ctx, codec, &dec_opts);
+        av_dict_free(&dec_opts);
+        if (ret < 0)
+            goto fail;
+
+        s->packet = av_packet_alloc();
+        if (!s->packet) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        s->frame = av_frame_alloc();
+        if (!s->frame) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+    } else
+        return 0;
+
+    s->non_pcm_dec = 1;
+    return 0;
+
+fail:
+    avcodec_free_context(&s->non_pcm_ctx);
+    av_packet_free(&s->packet);
+    av_frame_free(&s->frame);
+
+    if (s->non_pcm_mode == NON_PCM_DEC_ELSE_COPY)
+        s->non_pcm_mode = NON_PCM_COPY;
+    else if (s->non_pcm_mode == NON_PCM_DEC_ELSE_DROP)
+        s->non_pcm_mode = NON_PCM_DROP;
+
+    return ret;
+}
+
+static int s302m_get_non_pcm_pkt_size(AVCodecContext *avctx, int buf_size, int offset,
+                                      int length_code, int *dec_pkt_size)
+{
+    S302Context *s = avctx->priv_data;
+    int nb_words, word_size, aesframe_size, s302m_read_size;
+
+    if (offset < 0 || offset >= buf_size)
+        return AVERROR_INVALIDDATA;
+
+    switch (s->non_pcm_data_type) {
+    case 0x1C:
+        goto dolby_e;
+    default:
+        return AVERROR_INVALIDDATA;
+    }
+
+dolby_e:
+    {
+    nb_words = length_code / s->non_pcm_bits;
+    nb_words += nb_words & 1;
+
+    word_size = s->non_pcm_bits + 7 >> 3;
+    aesframe_size = (s->bits + 4) * 2 / 8;  // 2 subframes, each with payload + VUCF bits
+
+    *dec_pkt_size = nb_words * word_size;
+    s302m_read_size = aesframe_size * nb_words/2;
+
+    if (offset + s302m_read_size > buf_size)
+        return AVERROR_INVALIDDATA;
+
+    return s302m_read_size;
+    }
+}
+
 static int s302m_parse_frame_header(AVCodecContext *avctx, const uint8_t *buf,
                                     int buf_size)
 {
+    S302Context *s = avctx->priv_data;
     uint32_t h;
     int frame_size, channels, bits;
 
@@ -66,33 +309,15 @@  static int s302m_parse_frame_header(AVCodecContext *avctx, const uint8_t *buf,
         return AVERROR_INVALIDDATA;
     }
 
-    /* Set output properties */
-    avctx->bits_per_raw_sample = bits;
-    if (bits > 16)
-        avctx->sample_fmt = AV_SAMPLE_FMT_S32;
-    else
-        avctx->sample_fmt = AV_SAMPLE_FMT_S16;
-
-    av_channel_layout_uninit(&avctx->ch_layout);
-    switch(channels) {
-        case 2:
-            avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO;
-            break;
-        case 4:
-            avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD;
-            break;
-        case 6:
-            avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT1_BACK;
-            break;
-        case 8:
-            av_channel_layout_from_mask(&avctx->ch_layout,
-                                        AV_CH_LAYOUT_5POINT1_BACK | AV_CH_LAYOUT_STEREO_DOWNMIX);
-            break;
-        default:
-            avctx->ch_layout.order       = AV_CHANNEL_ORDER_UNSPEC;
-            avctx->ch_layout.nb_channels = channels;
-            break;
-    }
+    if (!s->channels)
+        s->channels = channels;
+    else if (s->channels != channels)
+        return AVERROR_INVALIDDATA;
+
+    if (!s->bits)
+        s->bits = bits;
+    else if (s->bits != bits)
+        return AVERROR_INVALIDDATA;
 
     return frame_size;
 }
@@ -103,119 +328,286 @@  static int s302m_decode_frame(AVCodecContext *avctx, AVFrame *frame,
     S302Context *s = avctx->priv_data;
     const uint8_t *buf = avpkt->data;
     int buf_size       = avpkt->size;
-    int block_size, ret, channels;
-    int i;
-    int non_pcm_data_type = -1;
+    int block_size, ret, channels, frame_size;
+    int non_pcm_offset = -1, non_pcm_length = 0;
+    int dec_pkt_size = 0;
+
+    if (s->non_pcm_mode == NON_PCM_DROP && s->non_pcm_data_type != -1 && s->non_pcm_data_type != 32)
+        return avpkt->size;
 
-    int frame_size = s302m_parse_frame_header(avctx, buf, buf_size);
+    frame_size = s302m_parse_frame_header(avctx, buf, buf_size);
     if (frame_size < 0)
         return frame_size;
 
     buf_size -= AES3_HEADER_LEN;
     buf      += AES3_HEADER_LEN;
 
-    /* get output buffer */
-    block_size = (avctx->bits_per_raw_sample + 4) / 4;
-    channels = avctx->ch_layout.nb_channels;
-    frame->nb_samples = 2 * (buf_size / block_size) / channels;
-    if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
-        return ret;
+    // set non-pcm status if not determined
+    // else extract offset and length if non-pcm can be decoded
+    if (s->non_pcm_data_type == -1 || s->non_pcm_dec) {
+        ret = s302m_non_pcm_inspect(avctx, buf, buf_size, &non_pcm_offset, &non_pcm_length);
+        if (ret >= 0 && s->non_pcm_data_type != 32 && !s->non_pcm_dec)
+            ret = s302m_setup_non_pcm_handling(avctx, buf, buf_size);
+        else if (ret < 0)
+            return avpkt->size;
+    }
+
+    if (s->non_pcm_data_type == -1)
+        return AVERROR_INVALIDDATA;  // we should know data type in order to proceed with output
+
+    if (!s->non_pcm_dec && !s->avctx_props_set) {
+        /* Set output properties */
+        avctx->bits_per_raw_sample = s->non_pcm_bits ? s->non_pcm_bits : s->bits;
+        if (avctx->bits_per_raw_sample > 16)
+            avctx->sample_fmt = AV_SAMPLE_FMT_S32;
+        else
+            avctx->sample_fmt = AV_SAMPLE_FMT_S16;
 
-    avctx->bit_rate = 48000 * channels * (avctx->bits_per_raw_sample + 4) +
-                      32 * 48000 / frame->nb_samples;
-    buf_size = (frame->nb_samples * channels / 2) * block_size;
+        av_channel_layout_uninit(&avctx->ch_layout);
+        switch(s->channels) {
+            case 2:
+                avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO;
+                break;
+            case 4:
+                avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD;
+                break;
+            case 6:
+                avctx->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT1_BACK;
+                break;
+            case 8:
+                av_channel_layout_from_mask(&avctx->ch_layout,
+                                            AV_CH_LAYOUT_5POINT1_BACK | AV_CH_LAYOUT_STEREO_DOWNMIX);
+                break;
+            default:
+                avctx->ch_layout.order       = AV_CHANNEL_ORDER_UNSPEC;
+                avctx->ch_layout.nb_channels = channels;
+                break;
+        }
+
+        avctx->sample_rate = 48000;
+        s->avctx_props_set = 1;
+    }
+
+    if (s->non_pcm_dec) {
+        buf_size = s302m_get_non_pcm_pkt_size(avctx, buf_size, non_pcm_offset, non_pcm_length, &dec_pkt_size);
+        if (buf_size < 0)
+            return AVERROR_INVALIDDATA;
+        buf += non_pcm_offset;
+
+        if (dec_pkt_size > s->packet->size) {
+            ret = av_grow_packet(s->packet, dec_pkt_size - s->packet->size);
+            if (ret < 0)
+                return ret;
+        }
+        ret = av_packet_make_writable(s->packet);
+        if (ret < 0)
+            return ret;
+        memset(s->packet->data, 0, s->packet->size);
+
+        ret = av_packet_copy_props(s->packet, avpkt);
+        if (ret < 0)
+            return ret;
+    } else {
+        /* get output buffer */
+        block_size = (s->bits + 4) / 4;
+        channels = s->channels;
+        frame->nb_samples = 2 * (buf_size / block_size) / channels;
+        if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
+            return ret;
+
+        buf_size = (frame->nb_samples * channels / 2) * block_size;
+
+        avctx->bit_rate = 48000 * s->channels * ((s->non_pcm_bits ? s->non_pcm_bits : s->bits) + 4) +
+                          32 * 48000 / frame->nb_samples;
+    }
+
+    if (s->bits == 24) {
+        uint8_t  *p   = NULL;
+        uint32_t *f32 = NULL;
+
+        if (s->non_pcm_dec)
+            p = (uint8_t *)s->packet->data;
+        else
+            f32 = (uint32_t *)frame->data[0];
 
-    if (avctx->bits_per_raw_sample == 24) {
-        uint32_t *o = (uint32_t *)frame->data[0];
         for (; buf_size > 6; buf_size -= 7) {
-            *o++ = ((unsigned)ff_reverse[buf[2]]        << 24) |
-                   (ff_reverse[buf[1]]        << 16) |
-                   (ff_reverse[buf[0]]        <<  8);
-            *o++ = ((unsigned)ff_reverse[buf[6] & 0xf0] << 28) |
-                   (ff_reverse[buf[5]]        << 20) |
-                   (ff_reverse[buf[4]]        << 12) |
-                   (ff_reverse[buf[3] & 0x0f] <<  4);
+            uint8_t b[6];
+
+            b[0] = (ff_reverse[buf[2]       ]      ) ;
+            b[1] = (ff_reverse[buf[1]       ]      ) ;
+            b[2] = (ff_reverse[buf[0]       ]      ) ;
+            b[3] = (ff_reverse[buf[6] & 0xf0] <<  4) |
+                   (ff_reverse[buf[5] & 0x0f] >>  4) ;
+            b[4] = (ff_reverse[buf[5] & 0xf0] <<  4) |
+                   (ff_reverse[buf[4] & 0x0f] >>  4) ;
+            b[5] = (ff_reverse[buf[4] & 0xf0] <<  4) |
+                   (ff_reverse[buf[3] & 0x0f] >>  4) ;
+
+            if (s->non_pcm_bits == 20) {
+                b[2] &= 0xf0;
+                b[5] &= 0xf0;
+            }
+
+            if (s->non_pcm_dec)
+                for (int i = 0; i < 6; i++)
+                    *p++ = b[i];
+            else {
+                *f32++ = (b[0] << 24) |
+                         (b[1] << 16) |
+                         (b[2] <<  8) ;
+                *f32++ = (b[3] << 24) |
+                         (b[4] << 16) |
+                         (b[5] <<  8) ;
+            }
             buf += 7;
         }
-        o = (uint32_t *)frame->data[0];
-        if (channels == 2)
-            for (i=0; i<frame->nb_samples * 2 - 6; i+=2) {
-                if (o[i] || o[i+1] || o[i+2] || o[i+3])
-                    break;
-                if (o[i+4] == 0x96F87200U && o[i+5] == 0xA54E1F00) {
-                    non_pcm_data_type = (o[i+6] >> 16) & 0x1F;
-                    break;
+    } else if (s->bits == 20) {
+        uint8_t  *p   = NULL;
+        uint16_t *f16 = NULL;
+        uint32_t *f32 = NULL;
+
+        if (s->non_pcm_dec)
+            p = (uint8_t *)s->packet->data;
+        else if (s->non_pcm_bits == 16)
+            f16 = (uint16_t *)frame->data[0];
+        else
+            f32 = (uint32_t *)frame->data[0];
+
+        for (; buf_size > 5; buf_size -= 6) {
+            uint8_t b[6];
+
+            b[0] = (ff_reverse[buf[2] & 0xf0] <<  4) |
+                   (ff_reverse[buf[1] & 0x0f] >>  4) ;
+            b[1] = (ff_reverse[buf[1] & 0xf0] <<  4) |
+                   (ff_reverse[buf[0] & 0x0f] >>  4) ;
+            b[2] = (ff_reverse[buf[0] & 0xf0] <<  4) ;
+            b[3] = (ff_reverse[buf[5] & 0xf0] <<  4) |
+                   (ff_reverse[buf[4] & 0x0f] >>  4) ;
+            b[4] = (ff_reverse[buf[4] & 0xf0] <<  4) |
+                   (ff_reverse[buf[3] & 0x0f] >>  4) ;
+            b[5] = (ff_reverse[buf[3] & 0xf0] <<  4) ;
+
+            if (s->non_pcm_dec) {
+                for (int i = 0; i < 6; i++) {
+                    if (s->non_pcm_bits == 16 && (i % 3 == 2))
+                        continue;
+                    *p++ = b[i];
                 }
+            } else if (s->non_pcm_bits == 16) {
+                *f16++ = (b[0] << 8) |
+                         (b[1]     ) ;
+                *f16++ = (b[3] << 8) |
+                         (b[4]     ) ;
+            } else {
+                *f32++ = (b[0] << 24) |
+                         (b[1] << 16) |
+                         (b[2] <<  8) ;
+                *f32++ = (b[3] << 24) |
+                         (b[4] << 16) |
+                         (b[5] <<  8) ;
             }
-    } else if (avctx->bits_per_raw_sample == 20) {
-        uint32_t *o = (uint32_t *)frame->data[0];
-        for (; buf_size > 5; buf_size -= 6) {
-            *o++ = ((unsigned)ff_reverse[buf[2] & 0xf0] << 28) |
-                   (ff_reverse[buf[1]]        << 20) |
-                   (ff_reverse[buf[0]]        << 12);
-            *o++ = ((unsigned)ff_reverse[buf[5] & 0xf0] << 28) |
-                   (ff_reverse[buf[4]]        << 20) |
-                   (ff_reverse[buf[3]]        << 12);
             buf += 6;
         }
-        o = (uint32_t *)frame->data[0];
-        if (channels == 2)
-            for (i=0; i<frame->nb_samples * 2 - 6; i+=2) {
-                if (o[i] || o[i+1] || o[i+2] || o[i+3])
-                    break;
-                if (o[i+4] == 0x6F872000U && o[i+5] == 0x54E1F000) {
-                    non_pcm_data_type = (o[i+6] >> 16) & 0x1F;
-                    break;
-                }
-            }
     } else {
-        uint16_t *o = (uint16_t *)frame->data[0];
+        uint8_t  *p   = NULL;
+        uint16_t *f16 = NULL;
+
+        if (s->non_pcm_dec)
+            p = (uint8_t *)s->packet->data;
+        else
+            f16 = (uint16_t *)frame->data[0];
+
         for (; buf_size > 4; buf_size -= 5) {
-            *o++ = (ff_reverse[buf[1]]        <<  8) |
-                    ff_reverse[buf[0]];
-            *o++ = (ff_reverse[buf[4] & 0xf0] << 12) |
-                   (ff_reverse[buf[3]]        <<  4) |
-                   (ff_reverse[buf[2]]        >>  4);
+            uint8_t b[4];
+
+            b[0] = (ff_reverse[buf[1]       ]      ) ;
+            b[1] = (ff_reverse[buf[0]       ]      ) ;
+            b[2] = (ff_reverse[buf[4] & 0xf0] <<  4) |
+                   (ff_reverse[buf[3] & 0x0f] >>  4) ;
+            b[3] = (ff_reverse[buf[3] & 0xf0] <<  4) |
+                   (ff_reverse[buf[2] & 0x0f] >>  4) ;
+
+            if (s->non_pcm_dec)
+                for (int i = 0; i < 4; i++)
+                    *p++ = b[i];
+            else {
+                *f16++ = (b[0] << 8) |
+                         (b[1]     ) ;
+                *f16++ = (b[2] << 8) |
+                         (b[3]     ) ;
+            }
             buf += 5;
         }
-        o = (uint16_t *)frame->data[0];
-        if (channels == 2)
-            for (i=0; i<frame->nb_samples * 2 - 6; i+=2) {
-                if (o[i] || o[i+1] || o[i+2] || o[i+3])
-                    break;
-                if (o[i+4] == 0xF872U && o[i+5] == 0x4E1F) {
-                    non_pcm_data_type = (o[i+6] & 0x1F);
-                    break;
-                }
-            }
     }
 
-    if (non_pcm_data_type != -1) {
-        if (s->non_pcm_mode == 3) {
-            av_log(avctx, AV_LOG_ERROR,
-                   "S302 non PCM mode with data type %d not supported\n",
-                   non_pcm_data_type);
-            return AVERROR_PATCHWELCOME;
+    if (s->non_pcm_dec) {
+        ret = avcodec_send_packet(s->non_pcm_ctx, s->packet);
+        if (ret < 0) {
+            av_log(avctx, AV_LOG_ERROR, "error %d submitting non-pcm packet with pts %"PRId64" for decoding\n", ret, s->packet->pts);
+            return ret;
         }
-        if (s->non_pcm_mode & 1) {
-            return avpkt->size;
+        ret = avcodec_receive_frame(s->non_pcm_ctx, s->frame);
+        if (ret < 0) {
+            av_log(avctx, AV_LOG_ERROR, "error %d receiving non-pcm decoded frame for packet with pts %"PRId64"\n", ret, s->packet->pts);
+            return ret;
         }
-    }
 
-    avctx->sample_rate = 48000;
+        if (!s->avctx_props_set) {
+            avctx->sample_fmt  = s->non_pcm_ctx->sample_fmt;
+            avctx->sample_rate = s->non_pcm_ctx->sample_rate;
+
+            av_channel_layout_uninit(&avctx->ch_layout);
+            ret = av_channel_layout_copy(&avctx->ch_layout, &s->non_pcm_ctx->ch_layout);
+            if (ret < 0) {
+                av_log(avctx, AV_LOG_ERROR, "error %d when copying channel layout from non-pcm decoder context to parent context.\n", ret);
+                return ret;
+            }
+            s->avctx_props_set = 1;
+        }
+
+        frame->nb_samples = s->frame->nb_samples;
+        ret = ff_get_buffer(avctx, frame, 0);
+        if (ret < 0)
+            return ret;
+
+        for (int ch = 0; ch < s->frame->ch_layout.nb_channels; ch++)
+            memcpy(frame->extended_data[ch], s->frame->extended_data[ch],
+                   av_get_bytes_per_sample(s->non_pcm_ctx->sample_fmt) * s->frame->nb_samples);
+    }
 
     *got_frame_ptr = 1;
 
     return avpkt->size;
 }
 
+static void s302m_flush(AVCodecContext *avctx)
+{
+    S302Context *s = avctx->priv_data;
+
+    if (s->non_pcm_dec && s->non_pcm_ctx)
+        avcodec_flush_buffers(s->non_pcm_ctx);
+}
+
+static av_cold int s302m_close(AVCodecContext *avctx)
+{
+    S302Context *s = avctx->priv_data;
+
+    avcodec_free_context(&s->non_pcm_ctx);
+    av_packet_free(&s->packet);
+    av_frame_free(&s->frame);
+    av_dict_free(&s->non_pcm_opts);
+
+    return 0;
+}
+
 #define FLAGS AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_DECODING_PARAM
 static const AVOption s302m_options[] = {
-    {"non_pcm_mode", "Chooses what to do with NON-PCM", offsetof(S302Context, non_pcm_mode), AV_OPT_TYPE_INT, {.i64 = 3}, 0, 3, FLAGS, "non_pcm_mode"},
-    {"copy"        , "Pass NON-PCM through unchanged"     , 0, AV_OPT_TYPE_CONST, {.i64 = 0}, 0, 3, FLAGS, "non_pcm_mode"},
-    {"drop"        , "Drop NON-PCM"                       , 0, AV_OPT_TYPE_CONST, {.i64 = 1}, 0, 3, FLAGS, "non_pcm_mode"},
-    {"decode_copy" , "Decode if possible else passthrough", 0, AV_OPT_TYPE_CONST, {.i64 = 2}, 0, 3, FLAGS, "non_pcm_mode"},
-    {"decode_drop" , "Decode if possible else drop"       , 0, AV_OPT_TYPE_CONST, {.i64 = 3}, 0, 3, FLAGS, "non_pcm_mode"},
+    {"non_pcm_mode", "Chooses what to do with NON-PCM", offsetof(S302Context, non_pcm_mode), AV_OPT_TYPE_INT, {.i64 = NON_PCM_DEC_ELSE_DROP}, NON_PCM_COPY, NON_PCM_DEC_ELSE_DROP, FLAGS, "non_pcm_mode"},
+    {"copy"        , "Pass NON-PCM through unchanged"     , 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_COPY},          0, 3, FLAGS, "non_pcm_mode"},
+    {"drop"        , "Drop NON-PCM"                       , 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_DROP},          0, 3, FLAGS, "non_pcm_mode"},
+    {"decode_copy" , "Decode if possible else passthrough", 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_DEC_ELSE_COPY}, 0, 3, FLAGS, "non_pcm_mode"},
+    {"decode_drop" , "Decode if possible else drop"       , 0, AV_OPT_TYPE_CONST, {.i64 = NON_PCM_DEC_ELSE_DROP}, 0, 3, FLAGS, "non_pcm_mode"},
+    {"non_pcm_options", "Set options for non-pcm decoder",  offsetof(S302Context, non_pcm_opts), AV_OPT_TYPE_DICT, {.str = NULL}, 0, 0, FLAGS},
     {NULL}
 };
 
@@ -231,6 +623,9 @@  const FFCodec ff_s302m_decoder = {
     CODEC_LONG_NAME("SMPTE 302M"),
     .p.type         = AVMEDIA_TYPE_AUDIO,
     .p.id           = AV_CODEC_ID_S302M,
+    .init           = s302m_init,
+    .close          = s302m_close,
+    .flush          = s302m_flush,
     .p.priv_class   = &s302m_class,
     .priv_data_size = sizeof(S302Context),
     FF_CODEC_DECODE_CB(s302m_decode_frame),