[FFmpeg-devel] Set native order for wav channel layouts up until 8 channels.

Message ID 20240223204105.95953-1-toots@rastageeks.org
State New
Series [FFmpeg-devel] Set native order for wav channel layouts up until 8 channels.

Checks

Context                           Check    Description
andriy/commit_msg_x86             warning  The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
yinshiyou/commit_msg_loongarch64  warning  The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
andriy/make_x86                   success  Make finished
andriy/make_fate_x86              fail     Make fate failed

Commit Message

Romain Beauxis Feb. 23, 2024, 8:41 p.m. UTC
The new default channel layout for the various RIFF/WAV decoders is not
backward compatible.

Historically, most decoders expect the channel layouts to follow
the native layout up to a reasonable number of channels.

Additionally, non-native layouts cause trouble with filter
chaining.

This PR changes the channel layout reported by RIFF/WAV decoders
to default to the native layout when the number of channels is up to 8.

The logic for these changes is the same as the logic for the vorbis/opus
decoders.

Romain

---
 libavformat/riff.h    |  1 +
 libavformat/riffdec.c | 31 ++++++++++++++++++++++++++++---
 libavformat/wavdec.c  |  4 +---
 3 files changed, 30 insertions(+), 6 deletions(-)

Comments

Marton Balint Feb. 23, 2024, 9:11 p.m. UTC | #1
On Fri, 23 Feb 2024, Romain Beauxis wrote:

> The new default channel layout for the various RIFF/WAV decoders is not
> backward compatible.
>
> Historically, most decoders will expect the channel layouts to follow
> the native layout up-to a reasonable number of channels.
>
> Additionally, non-native layouts are causing troubles with filters
> chaining.
>
> This PR changes the default channel layout reported by RIFF/WAV decoders
> to default to the native layout when the number of channels is up-to 8.
>
> The logic for these changes is the same as the logic for the vorbis/opus
> decoders.

For Vorbis the channel layout is in the actual Vorbis specification. So 
you should follow the specification, simple guessing in the demuxer likely 
won't be acceptable.

Regards,
Marton
Romain Beauxis Feb. 23, 2024, 10:32 p.m. UTC | #2
On Fri, Feb 23, 2024 at 15:11, Marton Balint <cus@passwd.hu> wrote:
>
>
>
> On Fri, 23 Feb 2024, Romain Beauxis wrote:
>
> > The new default channel layout for the various RIFF/WAV decoders is not
> > backward compatible.
> >
> > Historically, most decoders will expect the channel layouts to follow
> > the native layout up-to a reasonable number of channels.
> >
> > Additionally, non-native layouts are causing troubles with filters
> > chaining.
> >
> > This PR changes the default channel layout reported by RIFF/WAV decoders
> > to default to the native layout when the number of channels is up-to 8.
> >
> > The logic for these changes is the same as the logic for the vorbis/opus
> > decoders.
>
> For Vorbis the channel layout is in the actual Vorbis specification. So
> you should follow the specification, simple guessing in the demuxer likely
> won't be acceptable.

I would argue that even though there is no official specification on
channel layout for wav/riff, the de-facto assumption that _most_ users
of the library would expect is the native layout.

Typically, 1 and 2 channels would be assumed to be mono and stereo by
most users.

It's great that the API does provide flexibility but the default
should be set to satisfy most users and, in that regard, it seems that
assuming a native layout is what the vast majority of the library's
users will expect.

This choice also implies at least two ABI breakages for applications
using the deprecated API but running on the new ABI:

1. With the updated API, the library is not able to provide a backward
compatible `channel_layout` for AVFrame when the new channel order is
AV_CHANNEL_ORDER_UNSPEC, which breaks ABI compatibility as the field is
reported as `0`.

2. The AV_CHANNEL_ORDER_UNSPEC channel order also breaks filter chaining.
The issue appears to be that the unspec channel order implicitly sets
the filter to accept all channel orders, which breaks compatibility
here:

        if (link->incfg.channel_layouts->all_layouts) {
            av_log(link->src, AV_LOG_ERROR, "Cannot select channel layout for"
                   " the link between filters %s and %s.\n", link->src->name,
                   link->dst->name);
            if (!link->incfg.channel_layouts->all_counts)
                av_log(link->src, AV_LOG_ERROR, "Unknown channel layouts not "
                       "supported, try specifying a channel layout using "
                       "'aformat=channel_layouts=something'.\n");
            return AVERROR(EINVAL);
        }

-- Romain
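
Point 1 above can be illustrated with a small model. This is only a sketch under stated assumptions: the `compat_*` names and types are hypothetical stand-ins, not the libavutil definitions, and the function merely mirrors the behaviour described in the message (a legacy bitmask exists only for native-order layouts, and is `0` otherwise).

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the libavutil types. */
enum compat_order { COMPAT_ORDER_UNSPEC, COMPAT_ORDER_NATIVE };

struct compat_layout {
    enum compat_order order;
    int               nb_channels;
    uint64_t          mask;   /* meaningful only for COMPAT_ORDER_NATIVE */
};

/* A legacy bitmask field can only describe native-order layouts.
 * Any other order collapses to 0, which older callers read as
 * "unknown layout". */
static uint64_t compat_legacy_mask(const struct compat_layout *l)
{
    return l->order == COMPAT_ORDER_NATIVE ? l->mask : 0;
}
```

With this model, a two-channel stream reported with unspecified order yields a legacy mask of `0` even though the channel count is known.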
Marton Balint Feb. 23, 2024, 10:57 p.m. UTC | #3
On Fri, 23 Feb 2024, Romain Beauxis wrote:

> On Fri, Feb 23, 2024 at 15:11, Marton Balint <cus@passwd.hu> wrote:
>>
>>
>>
>> On Fri, 23 Feb 2024, Romain Beauxis wrote:
>>
>> > The new default channel layout for the various RIFF/WAV decoders is not
>> > backward compatible.
>> >
>> > Historically, most decoders will expect the channel layouts to follow
>> > the native layout up-to a reasonable number of channels.
>> >
>> > Additionally, non-native layouts are causing troubles with filters
>> > chaining.
>> >
>> > This PR changes the default channel layout reported by RIFF/WAV decoders
>> > to default to the native layout when the number of channels is up-to 8.
>> >
>> > The logic for these changes is the same as the logic for the vorbis/opus
>> > decoders.
>>
>> For Vorbis the channel layout is in the actual Vorbis specification. So
>> you should follow the specification, simple guessing in the demuxer likely
>> won't be acceptable.
>
> I would argue that even though there is no official specification on
> channel layout for wav/riff, the de-facto assumption that _most_ users
> of the library would expect is the native layout.

I think this is a good starting point:

https://www.mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html

>
> Typically, 1 and 2 channels would be assumed to be mono and stereo by
> most users.
>
> It's great that the API does provide flexibility but the default
> should be set to satisfy most users and, in that regard, it seems that
> assuming a native layout is what the vast majority of library's users
> will expect.

Actually ffmpeg.c used to have some channel layout guessing code for 
exactly this purpose. Maybe you should check why that does not kick in.

>
> This choice also implies at least two ABI breakages for applications
> using the deprecated API but running on the new ABI:
>
> 1. With the updated API, the library is not able to provide a backward
> compatible `channel_layout` for AVFrame when the new channel order is
> AV_CHANNEL_ORDER_UNSPEC, which breaks ABI compatibility as the field is
> reported as `0`.

I believe this was valid even before the new channel layout API for 
unknown channel layouts, so I don't quite see the API break.

>
> 2. The AV_CHANNEL_ORDER_UNSPEC channel order also breaks filter chaining.
> The issue appears to be that the unspec channel order implicitly sets
> the filter to accept all channel orders, which breaks compatibility
> here:
>
>        if (link->incfg.channel_layouts->all_layouts) {
>            av_log(link->src, AV_LOG_ERROR, "Cannot select channel layout for"
>                   " the link between filters %s and %s.\n", link->src->name,
>                   link->dst->name);
>            if (!link->incfg.channel_layouts->all_counts)
>                av_log(link->src, AV_LOG_ERROR, "Unknown channel layouts not "
>                       "supported, try specifying a channel layout using "
>                       "'aformat=channel_layouts=something'.\n");
>            return AVERROR(EINVAL);
>        }

A specific ffmpeg command line which shows this error would be useful. 
Maybe this can be fixed, if this is a regression.

Regards,
Marton
James Almer Feb. 24, 2024, 12:01 a.m. UTC | #4
On 2/23/2024 7:32 PM, Romain Beauxis wrote:
> On Fri, Feb 23, 2024 at 15:11, Marton Balint <cus@passwd.hu> wrote:
>>
>>
>>
>> On Fri, 23 Feb 2024, Romain Beauxis wrote:
>>
>>> The new default channel layout for the various RIFF/WAV decoders is not
>>> backward compatible.
>>>
>>> Historically, most decoders will expect the channel layouts to follow
>>> the native layout up-to a reasonable number of channels.
>>>
>>> Additionally, non-native layouts are causing troubles with filters
>>> chaining.
>>>
>>> This PR changes the default channel layout reported by RIFF/WAV decoders
>>> to default to the native layout when the number of channels is up-to 8.
>>>
>>> The logic for these changes is the same as the logic for the vorbis/opus
>>> decoders.
>>
>> For Vorbis the channel layout is in the actual Vorbis specification. So
>> you should follow the specification, simple guessing in the demuxer likely
>> won't be acceptable.
> 
> I would argue that even though there is no official specification on
> channel layout for wav/riff, the de-facto assumption that _most_ users
> of the library would expect is the native layout.
> 
> Typically, 1 and 2 channels would be assumed to be mono and stereo by
> most users.
> 
> It's great that the API does provide flexibility but the default
> should be set to satisfy most users and, in that regard, it seems that
> assuming a native layout is what the vast majority of library's users
> will expect.
> 
> This choice also implies at least two ABI breakages for applications
> using the deprecated API but running on the new ABI:
> 
> 1. With the updated API, the library is not able to provide a backward
> compatible `channel_layout` for AVFrame when the new channel order is
> AV_CHANNEL_ORDER_UNSPEC, which breaks ABI compatibility as the field is
> reported as `0`.
> 
> 2. The AV_CHANNEL_ORDER_UNSPEC channel order also breaks filter chaining.
> The issue appears to be that the unspec channel order implicitly sets
> the filter to accept all channel orders, which breaks compatibility
> here:
> 
>          if (link->incfg.channel_layouts->all_layouts) {
>              av_log(link->src, AV_LOG_ERROR, "Cannot select channel layout for"
>                     " the link between filters %s and %s.\n", link->src->name,
>                     link->dst->name);
>              if (!link->incfg.channel_layouts->all_counts)
>                  av_log(link->src, AV_LOG_ERROR, "Unknown channel layouts not "
>                         "supported, try specifying a channel layout using "
>                         "'aformat=channel_layouts=something'.\n");
>              return AVERROR(EINVAL);
>          }

It's not the demuxer's job to guess what the caller would like or 
expect. It's meant to export what the input contains.
If you want to fix the CLI behavior, then you need to do it in the CLI. 
Remember that ffmpeg.c is not the only lavf user.
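
A caller-side fallback along the lines suggested here might look roughly like the sketch below. It assumes hypothetical `cli_*` types and bit values chosen for the example; it is not code from ffmpeg.c. A real application would more likely rely on libavutil's `av_channel_layout_default()` rather than a hand-written switch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the libavutil types. */
enum cli_order { CLI_ORDER_UNSPEC, CLI_ORDER_NATIVE };

struct cli_layout {
    enum cli_order order;
    int            nb_channels;
    uint64_t       mask;
};

#define CLI_CH_FRONT_LEFT   (1ULL << 0)  /* assumed bit assignment */
#define CLI_CH_FRONT_RIGHT  (1ULL << 1)
#define CLI_CH_FRONT_CENTER (1ULL << 2)

/* Guess a layout in the application, only when the demuxer reported
 * none. Returns 1 if a guess was applied, 0 if the layout was left
 * exactly as the demuxer exported it. */
static int cli_guess_layout(struct cli_layout *l)
{
    if (l->order != CLI_ORDER_UNSPEC)
        return 0;                 /* the input already carries a layout */

    switch (l->nb_channels) {
    case 1:  l->mask = CLI_CH_FRONT_CENTER;                     break;
    case 2:  l->mask = CLI_CH_FRONT_LEFT | CLI_CH_FRONT_RIGHT;  break;
    default: return 0;            /* no safe guess; stay unspecified */
    }
    l->order = CLI_ORDER_NATIVE;
    return 1;
}
```

Keeping the guess in the application means other lavf users still see exactly what the file contains and can apply their own policy.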
Michael Niedermayer Feb. 25, 2024, 1:27 a.m. UTC | #5
On Fri, Feb 23, 2024 at 02:41:06PM -0600, Romain Beauxis wrote:
> The new default channel layout for the various RIFF/WAV decoders is not
> backward compatible.
> 
> Historically, most decoders will expect the channel layouts to follow
> the native layout up-to a reasonable number of channels.
> 
> Additionally, non-native layouts are causing troubles with filters
> chaining.
> 
> This PR changes the default channel layout reported by RIFF/WAV decoders
> to default to the native layout when the number of channels is up-to 8.
> 
> The logic for these changes is the same as the logic for the vorbis/opus
> decoders.
> 
> Romain

breaks fate
make -j32 fate-flcl1905
TEST    flcl1905
--- ./tests/ref/fate/flcl1905	2024-02-09 03:32:32.540199565 +0100
+++ tests/data/fate/flcl1905	2024-02-25 02:26:51.079111678 +0100
@@ -1,192 +1,192 @@
 packet|codec_type=audio|stream_index=0|pts=0|pts_time=0.000000|dts=0|dts_time=0.000000|duration=22528|duration_time=0.510839|size=4092|pos=56|flags=K__
-frame|media_type=audio|stream_index=0|key_frame=1|pts=N/A|pts_time=N/A|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=N/A|best_effort_timestamp_time=N/A|pkt_duration=22528|pkt_duration_time=0.510839|duration=22528|duration_time=0.510839|pkt_pos=56|pkt_size=4092|sample_fmt=fltp|nb_samples=2048|channels=2|channel_layout=unknown
-frame|media_type=audio|stream_index=0|key_frame=1|pts=N/A|pts_time=N/A|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=N/A|best_effort_timestamp_time=N/A|pkt_duration=22528|pkt_duration_time=0.510839|duration=22528|duration_time=0.510839|pkt_pos=56|pkt_size=4092|sample_fmt=fltp|nb_samples=2048|channels=2|channel_layout=unknown
-frame|media_type=audio|stream_index=0|key_frame=1|pts=N/A|pts_time=N/A|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=N/A|best_effort_timestamp_time=N/A|pkt_duration=22528|pkt_duration_time=0.510839|duration=22528|duration_time=0.510839|pkt_pos=56|pkt_size=4092|sample_fmt=fltp|nb_samples=2048|channels=2|channel_layout=unknown
...


[...]
Romain Beauxis Feb. 28, 2024, 2:29 a.m. UTC | #6
On Sat, Feb 24, 2024 at 19:27, Michael Niedermayer
<michael@niedermayer.cc> wrote:
>
> On Fri, Feb 23, 2024 at 02:41:06PM -0600, Romain Beauxis wrote:
> > The new default channel layout for the various RIFF/WAV decoders is not
> > backward compatible.
> >
> > Historically, most decoders will expect the channel layouts to follow
> > the native layout up-to a reasonable number of channels.
> >
> > Additionally, non-native layouts are causing troubles with filters
> > chaining.
> >
> > This PR changes the default channel layout reported by RIFF/WAV decoders
> > to default to the native layout when the number of channels is up-to 8.
> >
> > The logic for these changes is the same as the logic for the vorbis/opus
> > decoders.
> >
> > Romain
>
> breaks fate
> make -j32 fate-flcl1905
> TEST    flcl1905
> --- ./tests/ref/fate/flcl1905   2024-02-09 03:32:32.540199565 +0100
> +++ tests/data/fate/flcl1905    2024-02-25 02:26:51.079111678 +0100
> @@ -1,192 +1,192 @@
>  packet|codec_type=audio|stream_index=0|pts=0|pts_time=0.000000|dts=0|dts_time=0.000000|duration=22528|duration_time=0.510839|size=4092|pos=56|flags=K__
> -frame|media_type=audio|stream_index=0|key_frame=1|pts=N/A|pts_time=N/A|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=N/A|best_effort_timestamp_time=N/A|pkt_duration=22528|pkt_duration_time=0.510839|duration=22528|duration_time=0.510839|pkt_pos=56|pkt_size=4092|sample_fmt=fltp|nb_samples=2048|channels=2|channel_layout=unknown
> -frame|media_type=audio|stream_index=0|key_frame=1|pts=N/A|pts_time=N/A|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=N/A|best_effort_timestamp_time=N/A|pkt_duration=22528|pkt_duration_time=0.510839|duration=22528|duration_time=0.510839|pkt_pos=56|pkt_size=4092|sample_fmt=fltp|nb_samples=2048|channels=2|channel_layout=unknown
> -frame|media_type=audio|stream_index=0|key_frame=1|pts=N/A|pts_time=N/A|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=N/A|best_effort_timestamp_time=N/A|pkt_duration=22528|pkt_duration_time=0.510839|duration=22528|duration_time=0.510839|pkt_pos=56|pkt_size=4092|sample_fmt=fltp|nb_samples=2048|channels=2|channel_layout=unknown
> ...

Thanks.

I did more backward compatibility tests and I might have jumped the
gun on it as well.

I'll check again and will come back to it if I can confirm anything.

-- Romain

Patch

diff --git a/libavformat/riff.h b/libavformat/riff.h
index a93eadfeca..f474efdce9 100644
--- a/libavformat/riff.h
+++ b/libavformat/riff.h
@@ -67,6 +67,7 @@  void ff_put_bmp_header(AVIOContext *pb, AVCodecParameters *par, int for_asf, int
 int ff_put_wav_header(AVFormatContext *s, AVIOContext *pb, AVCodecParameters *par, int flags);
 
 enum AVCodecID ff_wav_codec_get_id(unsigned int tag, int bps);
+void ff_get_wav_ch_layout(AVChannelLayout *ch_layout, int channels);
 int ff_get_wav_header(void *logctx, AVIOContext *pb, AVCodecParameters *par,
                       int size, int big_endian);
 
diff --git a/libavformat/riffdec.c b/libavformat/riffdec.c
index 0fe4e02b7b..bb7c01c5f5 100644
--- a/libavformat/riffdec.c
+++ b/libavformat/riffdec.c
@@ -90,6 +90,33 @@  static void parse_waveformatex(void *logctx, AVIOContext *pb, AVCodecParameters
     }
 }
 
+const AVChannelLayout ff_wav_demux_ch_layouts[9] = {
+    AV_CHANNEL_LAYOUT_MONO,
+    AV_CHANNEL_LAYOUT_STEREO,
+    AV_CHANNEL_LAYOUT_SURROUND,
+    AV_CHANNEL_LAYOUT_QUAD,
+    AV_CHANNEL_LAYOUT_5POINT0_BACK,
+    AV_CHANNEL_LAYOUT_5POINT1_BACK,
+    {
+        .nb_channels = 7,
+        .order       = AV_CHANNEL_ORDER_NATIVE,
+        .u.mask      = AV_CH_LAYOUT_5POINT1 | AV_CH_BACK_CENTER,
+    },
+    AV_CHANNEL_LAYOUT_7POINT1,
+    { 0 }
+};
+
+void ff_get_wav_ch_layout(AVChannelLayout *ch_layout, int channels)
+{
+    av_channel_layout_uninit(ch_layout);
+    if (channels < 1 || channels > 8) {
+        ch_layout->order       = AV_CHANNEL_ORDER_UNSPEC;
+        ch_layout->nb_channels = channels;
+    } else {
+        av_channel_layout_copy(ch_layout, &ff_wav_demux_ch_layouts[channels - 1]);
+    }
+}
+
 /* "big_endian" values are needed for RIFX file format */
 int ff_get_wav_header(void *logctx, AVIOContext *pb,
                       AVCodecParameters *par, int size, int big_endian)
@@ -195,9 +222,7 @@  int ff_get_wav_header(void *logctx, AVIOContext *pb,
 
     /* ignore WAVEFORMATEXTENSIBLE layout if different from channel count */
     if (channels != par->ch_layout.nb_channels) {
-        av_channel_layout_uninit(&par->ch_layout);
-        par->ch_layout.order       = AV_CHANNEL_ORDER_UNSPEC;
-        par->ch_layout.nb_channels = channels;
+        ff_get_wav_ch_layout(&par->ch_layout, channels);
     }
 
     return 0;
diff --git a/libavformat/wavdec.c b/libavformat/wavdec.c
index 0c6629b157..282e1ae017 100644
--- a/libavformat/wavdec.c
+++ b/libavformat/wavdec.c
@@ -222,9 +222,7 @@  static int wav_parse_xma2_tag(AVFormatContext *s, int64_t size, AVStream *st)
         channels += avio_r8(pb);
         avio_skip(pb, 3);
     }
-    av_channel_layout_uninit(&st->codecpar->ch_layout);
-    st->codecpar->ch_layout.order       = AV_CHANNEL_ORDER_UNSPEC;
-    st->codecpar->ch_layout.nb_channels = channels;
+    ff_get_wav_ch_layout(&st->codecpar->ch_layout, channels);
 
     if (st->codecpar->ch_layout.nb_channels <= 0 || st->codecpar->sample_rate <= 0)
         return AVERROR_INVALIDDATA;
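
The table lookup in ff_get_wav_ch_layout() indexes by `channels - 1`, and callers such as wav_parse_xma2_tag() only reject non-positive channel counts after the call. The standalone sketch below shows one way to guard both ends of the range. The `tbl_*` names are hypothetical, and the bit values are assumed to mirror the channel bits in libavutil's channel_layout.h; treat them as illustrative only.

```c
#include <stdint.h>

/* Channel bits, assumed to match libavutil's native-order masks. */
#define TBL_FL  (1ULL << 0)   /* front left   */
#define TBL_FR  (1ULL << 1)   /* front right  */
#define TBL_FC  (1ULL << 2)   /* front center */
#define TBL_LFE (1ULL << 3)   /* low frequency */
#define TBL_BL  (1ULL << 4)   /* back left    */
#define TBL_BR  (1ULL << 5)   /* back right   */
#define TBL_BC  (1ULL << 8)   /* back center  */
#define TBL_SL  (1ULL << 9)   /* side left    */
#define TBL_SR  (1ULL << 10)  /* side right   */

/* Default masks for 1..8 channels, in the same order as the patch. */
static const uint64_t tbl_masks[8] = {
    TBL_FC,                                            /* mono         */
    TBL_FL | TBL_FR,                                   /* stereo       */
    TBL_FL | TBL_FR | TBL_FC,                          /* surround     */
    TBL_FL | TBL_FR | TBL_BL | TBL_BR,                 /* quad         */
    TBL_FL | TBL_FR | TBL_FC | TBL_BL | TBL_BR,        /* 5.0 (back)   */
    TBL_FL | TBL_FR | TBL_FC | TBL_LFE | TBL_BL | TBL_BR, /* 5.1 (back) */
    TBL_FL | TBL_FR | TBL_FC | TBL_LFE | TBL_SL | TBL_SR | TBL_BC,
    TBL_FL | TBL_FR | TBL_FC | TBL_LFE | TBL_SL | TBL_SR | TBL_BL | TBL_BR,
};

/* Returns the default mask for 1..8 channels, or 0 when no default
 * applies (including zero or negative counts read from a bad file). */
static uint64_t tbl_default_mask(int channels)
{
    if (channels < 1 || channels > 8)
        return 0;
    return tbl_masks[channels - 1];
}

/* Counts set bits; used to check that each mask matches its count. */
static int tbl_popcount(uint64_t v)
{
    int n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}
```

Checking that every entry has exactly as many bits set as its channel count is a cheap sanity test for a table like this.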