[FFmpeg-devel] lavc: export flag for MPEG audio dual channel

Message ID 20220921192611.3241-1-scott.the.elm@gmail.com
State New

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Scott Theisen Sept. 21, 2022, 7:26 p.m. UTC
From: ulmus-scott <scott.the.elm@gmail.com>

The flag identifies two independent mono channels recorded as stereo.

This change has been kicking around in the MythTV modifications since 2006.
See https://github.com/MythTV/mythtv/commit/435540c9e8ac245ceca968791c67431f37c8d617
I have changed the names and comment.  For the current MythTV modification see
https://github.com/ulmus-scott/FFmpeg/commit/645fa6f9a61d23bac0665851d211bbeb3686deb0
---
 libavcodec/audiotoolboxdec.c    |  2 +-
 libavcodec/avcodec.h            | 11 +++++++++++
 libavcodec/mpegaudio_parser.c   |  2 +-
 libavcodec/mpegaudiodecheader.c |  4 +++-
 libavcodec/mpegaudiodecheader.h |  2 +-
 5 files changed, 17 insertions(+), 4 deletions(-)
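
For readers unfamiliar with the header field behind this flag: an MPEG audio
frame header carries a two-bit channel mode field in bits 7..6 of its last
byte, and the value 2 means dual channel. The following is only a minimal
standalone sketch of that check. The demo_ names are made up for this
illustration and are not FFmpeg symbols; the patch itself tests
s->mode == MPA_DUAL inside ff_mpa_decode_header.

```c
#include <assert.h>
#include <stdint.h>

/* Values of the 2-bit channel mode field in an MPEG audio frame header.
 * Hypothetical names, used only in this sketch. */
enum DemoMpaMode {
    DEMO_MPA_STEREO  = 0,
    DEMO_MPA_JSTEREO = 1,
    DEMO_MPA_DUAL    = 2, /* dual channel: two independent mono programs */
    DEMO_MPA_MONO    = 3
};

/* The enum values must match the bitstream encoding of the field. */
static_assert(DEMO_MPA_DUAL == 2, "dual channel is encoded as binary 10");

/* Extract the channel mode from a 32-bit frame header word
 * (header bytes read in big-endian order). */
static int demo_mpa_get_mode(uint32_t header)
{
    return (int)((header >> 6) & 3);
}

/* Return 1 when the header signals dual channel mode, 0 otherwise. */
static int demo_mpa_is_dual_mono(uint32_t header)
{
    return demo_mpa_get_mode(header) == DEMO_MPA_DUAL;
}
```

The sketch does not validate the sync word or any other header field; it
assumes the caller has already checked that the word is a valid header.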

Comments

Rémi Denis-Courmont Sept. 21, 2022, 7:44 p.m. UTC | #1
On Wednesday, 21 September 2022, 22:26:11 EEST, Scott Theisen wrote:
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index 7db5d1b1c5..bcf3a845a8 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -2076,6 +2076,17 @@ typedef struct AVCodecContext {
>       *             The decoder can then override during decoding as needed.
>       */
>      AVChannelLayout ch_layout;
> +
> +    /**
> +     * Audio only.  This flag is set when MPEG audio mode dual channel has been detected.
> +     * This signals that the audio is two independent mono channels.
> +     *
> +     * 0 normally, 1 if dual channel flag is set.
> +     *
> +     * - encoding: currently unused (functionally equivalent to stereo, patch welcome)
> +     * - decoding: set by lavc
> +     */
> +    int mpeg_audio_mode_dual_channel;
>  } AVCodecContext;

I agree that the dual mono flag should be exposed to the application somehow,
but isn't this a silent ABI break?
James Almer Sept. 21, 2022, 7:51 p.m. UTC | #2
On 9/21/2022 4:44 PM, Rémi Denis-Courmont wrote:
> I agree that the dual mono flag should be exposed to the application somehow,
> but isn't this a silent ABI break?

It's not a break, but it's overkill for what's essentially a flag.

The proper way to do this would be to signal such a layout as an actual 
channel layout, using a custom order one where both channels are set as 
Front Center. But I don't know if, until the old channel layout API
fields are gone, we should have decoders setting something only new API
users will understand. Old API field users would look at channels and 
see a 2, and channel_layout and see it's empty, but then old and new API 
values would technically conflict, so I'd like to hear some opinions.

An alternative could be signaling this using an AVFormatContext.metadata 
entry called "dualmono", and ensuring the channel layout is 2 channels 
of UNSPEC order.
Andreas Rheinhardt Sept. 21, 2022, 7:51 p.m. UTC | #3
Rémi Denis-Courmont:
> I agree that the dual mono flag should be exposed to the application somehow,
> but isn't this a silent ABI break?
> 

No. sizeof(AVCodecContext) is not part of the public ABI (we have a
custom allocator for it: avcodec_alloc_context3()).

- Andreas
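
To illustrate the point above: when only the library allocates a struct,
appending a member at the end leaves the offsets of all existing members
unchanged, so code compiled against the older layout keeps working. The
following is a minimal standalone sketch of that idea. DemoContextV1,
DemoContextV2 and the demo_ functions are hypothetical stand-ins, not
FFmpeg types, and the argument only holds as long as applications never
allocate or embed the struct themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Layout an application was compiled against (hypothetical old header). */
typedef struct DemoContextV1 {
    int sample_rate;
    int nb_channels;
} DemoContextV1;

/* Layout inside a newer library build: one member appended at the end. */
typedef struct DemoContextV2 {
    int sample_rate;
    int nb_channels;
    int dual_mono;
} DemoContextV2;

/* Existing members keep their offsets, so old accessors still work. */
static_assert(offsetof(DemoContextV1, sample_rate) ==
              offsetof(DemoContextV2, sample_rate), "offset changed");
static_assert(offsetof(DemoContextV1, nb_channels) ==
              offsetof(DemoContextV2, nb_channels), "offset changed");

/* The library, not the caller, decides how many bytes to allocate. */
static void *demo_alloc_context(void)
{
    DemoContextV2 *ctx = calloc(1, sizeof(*ctx));
    if (ctx)
        ctx->nb_channels = 2; /* pretend the library filled this in */
    return ctx;
}

/* An "old" application only touches the members it knows about. */
static int demo_old_app_reads_channels(void)
{
    DemoContextV1 *ctx = demo_alloc_context();
    int channels;

    if (!ctx)
        return -1;
    channels = ctx->nb_channels;
    free(ctx);
    return channels;
}
```

The old application never sees the extra member, but it also never
computes the struct size, which is why the larger allocation is harmless.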
Scott Theisen Sept. 22, 2022, 1:04 a.m. UTC | #4
On 9/21/22 15:51, James Almer wrote:
> On 9/21/2022 4:44 PM, Rémi Denis-Courmont wrote:
>> I agree that the dual mono flag should be exposed to the application
>> somehow, but isn't this a silent ABI break?
>
> It's not a break, but it's overkill for what's essentially a flag.

This is how MythTV customized FFmpeg (in 2006) for this and this way was 
probably the easiest.  It /is/ a flag, so maybe adding an int as a 
bitset instead of as a bool so other flags could be added if necessary?

>
> The proper way to do this would be to signal such a layout as an 
> actual channel layout, using a custom order one where both channels 
> are set as Front Center. But I don't know if until the old channel 
> layout API fields are gone we should have decoders setting something 
> only new API users will understand. Old API field users would look at 
> channels and see a 2, and channel_layout and see it's empty, but then 
> old and new API values would technically conflict, so I'd like to hear 
> some opinions.

I'm not very familiar with either channel layout API, but the audio is 
encoded identically to stereo other than the flag, so could we add 
another entry, e.g. AV_CHANNEL_ORDER_MONO, to the `enum AVChannelOrder` 
to signify that each channel is independent, but is otherwise identical 
to AV_CHANNEL_ORDER_NATIVE?

>
> An alternative could be signaling this using an 
> AVFormatContext.metadata entry called "dualmono", and ensuring the 
> channel layout is 2 channels of UNSPEC order.
>

AVFormatContext sounds like the wrong place, since there could, in
theory, be multiple audio streams, some using dual mono and some not,
and a stream could also switch back and forth between dual mono and
other modes.

Regards,

Scott
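
The bitset idea mentioned in this message could look roughly like the
sketch below: one unsigned field per stream, with dual mono as a single
bit that is set or cleared whenever a frame header is parsed, so a stream
that switches modes is tracked and other bits stay untouched. All names
here (DemoStreamInfo, DEMO_AUDIO_FLAG_*, the demo_ functions) are
hypothetical and are not part of the patch or of any FFmpeg API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-stream audio property bits. */
#define DEMO_AUDIO_FLAG_DUAL_MONO (1u << 0)
#define DEMO_AUDIO_FLAG_RESERVED1 (1u << 1) /* placeholder for a future flag */

/* Hypothetical per-stream state; one instance per audio stream. */
typedef struct DemoStreamInfo {
    unsigned audio_flags;
} DemoStreamInfo;

/* Update the dual mono bit from the latest parsed header, leaving
 * every other bit as it was. */
static void demo_flags_update(DemoStreamInfo *st, int is_dual_mono)
{
    assert(st != NULL);
    if (is_dual_mono)
        st->audio_flags |= DEMO_AUDIO_FLAG_DUAL_MONO;
    else
        st->audio_flags &= ~DEMO_AUDIO_FLAG_DUAL_MONO;
}

/* Return 1 if the stream is currently flagged as dual mono. */
static int demo_flags_has_dual_mono(const DemoStreamInfo *st)
{
    assert(st != NULL);
    return (st->audio_flags & DEMO_AUDIO_FLAG_DUAL_MONO) != 0;
}
```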
Anton Khirnov Sept. 22, 2022, 12:43 p.m. UTC | #5
Quoting Scott Theisen (2022-09-22 03:04:16)
> On 9/21/22 15:51, James Almer wrote:
> > On 9/21/2022 4:44 PM, Rémi Denis-Courmont wrote:
> >> I agree that the dual mono flag should be exposed to the application
> >> somehow, but isn't this a silent ABI break?
> >
> > It's not a break, but it's overkill for what's essentially a flag.
> 
> This is how MythTV customized FFmpeg (in 2006) for this and this way was 
> probably the easiest.  It /is/ a flag, so maybe adding an int as a 
> bitset instead of as a bool so other flags could be added if necessary?
> 
> >
> > The proper way to do this would be to signal such a layout as an 
> > actual channel layout, using a custom order one where both channels 
> > are set as Front Center. But I don't know if until the old channel 
> > layout API fields are gone we should have decoders setting something 
> > only new API users will understand. Old API field users would look at 
> > channels and see a 2, and channel_layout and see it's empty, but then 
> > old and new API values would technically conflict, so I'd like to hear 
> > some opinions.
> 
> I'm not very familiar with either channel layout API, but the audio is 
> encoded identically to stereo other than the flag, so could we add 
> another entry, e.g. AV_CHANNEL_ORDER_MONO, to the `enum AVChannelOrder` 
> to signify that each channel is independent, but is otherwise identical 
> to AV_CHANNEL_ORDER_NATIVE?

The whole point is that it's NOT identical to AV_CHANNEL_ORDER_NATIVE.
ORDER_CUSTOM with two FC channels is exactly the right way to handle
this IMO.
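
As a rough illustration of the suggestion above, the sketch below models a
layout in custom order whose two channels are both Front Center, and shows
how an application could tell it apart from ordinary stereo. The Demo
types are simplified stand-ins written for this example, NOT the real
libavutil definitions (for instance, the real native order is described by
a bitmask rather than a per-channel map), so treat this as a picture of the
idea rather than as code against the actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for channel identifiers and layout orders. */
enum DemoChannel {
    DEMO_CHAN_FRONT_LEFT,
    DEMO_CHAN_FRONT_RIGHT,
    DEMO_CHAN_FRONT_CENTER
};

enum DemoChannelOrder {
    DEMO_ORDER_UNSPEC,
    DEMO_ORDER_NATIVE,
    DEMO_ORDER_CUSTOM
};

#define DEMO_MAX_CHANNELS 8

/* Simplified layout: an order plus an explicit per-channel map. */
typedef struct DemoChannelLayout {
    enum DemoChannelOrder order;
    int nb_channels;
    enum DemoChannel map[DEMO_MAX_CHANNELS];
} DemoChannelLayout;

/* Describe dual mono as two independent Front Center channels. */
static void demo_set_dual_mono(DemoChannelLayout *layout)
{
    assert(layout != NULL);
    layout->order       = DEMO_ORDER_CUSTOM;
    layout->nb_channels = 2;
    layout->map[0]      = DEMO_CHAN_FRONT_CENTER;
    layout->map[1]      = DEMO_CHAN_FRONT_CENTER;
}

/* Describe ordinary stereo (left and right) for comparison. */
static void demo_set_stereo(DemoChannelLayout *layout)
{
    assert(layout != NULL);
    layout->order       = DEMO_ORDER_NATIVE;
    layout->nb_channels = 2;
    layout->map[0]      = DEMO_CHAN_FRONT_LEFT;
    layout->map[1]      = DEMO_CHAN_FRONT_RIGHT;
}

/* Return 1 if the layout is exactly two Front Center channels. */
static int demo_layout_is_dual_mono(const DemoChannelLayout *layout)
{
    assert(layout != NULL);
    return layout->order == DEMO_ORDER_CUSTOM &&
           layout->nb_channels == 2 &&
           layout->map[0] == DEMO_CHAN_FRONT_CENTER &&
           layout->map[1] == DEMO_CHAN_FRONT_CENTER;
}
```

With this representation no extra flag is needed: the channel count stays
at 2 and the layout itself carries the dual mono information.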
Scott Theisen Sept. 22, 2022, 8:02 p.m. UTC | #6
On 9/22/22 08:43, Anton Khirnov wrote:
> Quoting Scott Theisen (2022-09-22 03:04:16)
>> On 9/21/22 15:51, James Almer wrote:
>>> On 9/21/2022 4:44 PM, Rémi Denis-Courmont wrote:
>>>> I agree that the dual mono flag should be exposed to the application
>>>> somehow, but isn't this a silent ABI break?
>>> It's not a break, but it's overkill for what's essentially a flag.
>> This is how MythTV customized FFmpeg (in 2006) for this and this way was
>> probably the easiest.  It /is/ a flag, so maybe adding an int as a
>> bitset instead of as a bool so other flags could be added if necessary?
>>
>>> The proper way to do this would be to signal such a layout as an
>>> actual channel layout, using a custom order one where both channels
>>> are set as Front Center. But I don't know if until the old channel
>>> layout API fields are gone we should have decoders setting something
>>> only new API users will understand. Old API field users would look at
>>> channels and see a 2, and channel_layout and see it's empty, but then
>>> old and new API values would technically conflict, so I'd like to hear
>>> some opinions.
>> I'm not very familiar with either channel layout API, but the audio is
>> encoded identically to stereo other than the flag, so could we add
>> another entry, e.g. AV_CHANNEL_ORDER_MONO, to the `enum AVChannelOrder`
>> to signify that each channel is independent, but is otherwise identical
>> to AV_CHANNEL_ORDER_NATIVE?
> The whole point is that it's NOT identical to AV_CHANNEL_ORDER_NATIVE.
> ORDER_CUSTOM with two FC channels is exactly the right way to handle
> this IMO.
>

I don't really disagree that AV_CHANNEL_ORDER_CUSTOM with two Front
Center channels is one way to do it; I was just hoping for an easier way.

I think that would require modification to the code calling 
ff_mpa_decode_header.  But I'm not sure how to create that custom order 
layout and I would like consensus that the custom order is the way to go 
before doing it.

libavcodec/audiotoolboxdec.c just sets an unspecified order. 
libavcodec/mpegaudio_parser.c uses av_channel_layout_default.

Also, should we concern ourselves with the old API?  It didn't look like 
either of the above files did.

Regards,

Scott Theisen

Patch

diff --git a/libavcodec/audiotoolboxdec.c b/libavcodec/audiotoolboxdec.c
index 82babe3d31..9d11844142 100644
--- a/libavcodec/audiotoolboxdec.c
+++ b/libavcodec/audiotoolboxdec.c
@@ -346,7 +346,7 @@  static av_cold int ffat_create_decoder(AVCodecContext *avctx,
         int bit_rate;
         if (ff_mpa_decode_header(AV_RB32(pkt->data), &avctx->sample_rate,
                                  &in_format.mChannelsPerFrame, &avctx->frame_size,
-                                 &bit_rate, &codec_id) < 0)
+                                 &bit_rate, &codec_id, &avctx->mpeg_audio_mode_dual_channel) < 0)
             return AVERROR_INVALIDDATA;
         avctx->bit_rate = bit_rate;
         in_format.mSampleRate = avctx->sample_rate;
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 7db5d1b1c5..bcf3a845a8 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -2076,6 +2076,17 @@  typedef struct AVCodecContext {
      *             The decoder can then override during decoding as needed.
      */
     AVChannelLayout ch_layout;
+
+    /**
+     * Audio only.  This flag is set when MPEG audio mode dual channel has been detected.
+     * This signals that the audio is two independent mono channels.
+     *
+     * 0 normally, 1 if dual channel flag is set.
+     *
+     * - encoding: currently unused (functionally equivalent to stereo, patch welcome)
+     * - decoding: set by lavc
+     */
+    int mpeg_audio_mode_dual_channel;
 } AVCodecContext;
 
 /**
diff --git a/libavcodec/mpegaudio_parser.c b/libavcodec/mpegaudio_parser.c
index d54366f10a..d957cf467f 100644
--- a/libavcodec/mpegaudio_parser.c
+++ b/libavcodec/mpegaudio_parser.c
@@ -70,7 +70,7 @@  static int mpegaudio_parse(AVCodecParserContext *s1,
 
                 state= (state<<8) + buf[i++];
 
-                ret = ff_mpa_decode_header(state, &sr, &channels, &frame_size, &bit_rate, &codec_id);
+                ret = ff_mpa_decode_header(state, &sr, &channels, &frame_size, &bit_rate, &codec_id, &avctx->mpeg_audio_mode_dual_channel);
                 if (ret < 4) {
                     if (i > 4)
                         s->header_count = -2;
diff --git a/libavcodec/mpegaudiodecheader.c b/libavcodec/mpegaudiodecheader.c
index ef63befbf4..6c9a641906 100644
--- a/libavcodec/mpegaudiodecheader.c
+++ b/libavcodec/mpegaudiodecheader.c
@@ -117,7 +117,7 @@  int avpriv_mpegaudio_decode_header(MPADecodeHeader *s, uint32_t header)
     return 0;
 }
 
-int ff_mpa_decode_header(uint32_t head, int *sample_rate, int *channels, int *frame_size, int *bit_rate, enum AVCodecID *codec_id)
+int ff_mpa_decode_header(uint32_t head, int *sample_rate, int *channels, int *frame_size, int *bit_rate, enum AVCodecID *codec_id, int *dual_mono)
 {
     MPADecodeHeader s1, *s = &s1;
 
@@ -148,5 +148,7 @@  int ff_mpa_decode_header(uint32_t head, int *sample_rate, int *channels, int *fr
     *sample_rate = s->sample_rate;
     *channels = s->nb_channels;
     *bit_rate = s->bit_rate;
+    *dual_mono = (s->mode == MPA_DUAL) ? 1 : 0;
+
     return s->frame_size;
 }
diff --git a/libavcodec/mpegaudiodecheader.h b/libavcodec/mpegaudiodecheader.h
index ed5d1f3b33..e599d287f7 100644
--- a/libavcodec/mpegaudiodecheader.h
+++ b/libavcodec/mpegaudiodecheader.h
@@ -56,7 +56,7 @@  int avpriv_mpegaudio_decode_header(MPADecodeHeader *s, uint32_t header);
 /* useful helper to get MPEG audio stream info. Return -1 if error in
    header, otherwise the coded frame size in bytes */
 int ff_mpa_decode_header(uint32_t head, int *sample_rate,
-                         int *channels, int *frame_size, int *bitrate, enum AVCodecID *codec_id);
+                         int *channels, int *frame_size, int *bitrate, enum AVCodecID *codec_id, int *dual_mono);
 
 /* fast header check for resync */
 static inline int ff_mpa_check_header(uint32_t header){