diff mbox series

[FFmpeg-devel,2/2] avformat/riffdec: follow the MS docs more strictly for setting wav channel layouts

Message ID 20240317195729.11755-2-cus@passwd.hu
State New
Headers show
Series [FFmpeg-devel,1/2] avformat/riff: pass an AVFormatContext explicitly to ff_get_wav_header | expand

Checks

Context Check Description
yinshiyou/commit_msg_loongarch64 warning Please wrap lines in the body of the commit message between 60 and 72 characters.
andriy/commit_msg_x86 warning Please wrap lines in the body of the commit message between 60 and 72 characters.
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Marton Balint March 17, 2024, 7:57 p.m. UTC
- Only parse the defined masks in dwChannelMask, unless strict_std_compliance
  is less than normal. This matches with the behaviour of the wav muxer.
- Ignore additional bits in dwChannelMasks as the MS documentation suggests [1]
- Assume UNKNOWN channels for missing bits as the MS documentation suggests [1]

[1] https://learn.microsoft.com/en-us/previous-versions/windows/hardware/design/dn653308(v=vs.85)#details-about-dwchannelmask

Signed-off-by: Marton Balint <cus@passwd.hu>
---
 libavformat/riffdec.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

Comments

Michael Niedermayer March 19, 2024, 12:15 a.m. UTC | #1
On Sun, Mar 17, 2024 at 08:57:29PM +0100, Marton Balint wrote:
> - Only parse the defined masks in dwChannelMask, unless strict_std_compliance
>   is less than normal. This matches with the behaviour of the wav muxer.
> - Ignore additional bits in dwChannelMasks as the MS documentation suggests [1]
> - Assume UNKNOWN channels for missing bits as the MS documentation suggests [1]
> 
> [1] https://learn.microsoft.com/en-us/previous-versions/windows/hardware/design/dn653308(v=vs.85)#details-about-dwchannelmask
> 
> Signed-off-by: Marton Balint <cus@passwd.hu>
> ---
>  libavformat/riffdec.c | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)

breaks:
./ffmpeg  -i ~/tickets/2859/5.1plusdownmix.wav -ac 2 -t 100 -bitexact -c:a aac -y /tmp/2859-frenchspeack-nolibfaac.mp4

thx

[...]
Marton Balint March 19, 2024, 7:14 p.m. UTC | #2
On Tue, 19 Mar 2024, Michael Niedermayer wrote:

> On Sun, Mar 17, 2024 at 08:57:29PM +0100, Marton Balint wrote:
>> - Only parse the defined masks in dwChannelMask, unless strict_std_compliance
>>   is less than normal. This matches with the behaviour of the wav muxer.
>> - Ignore additional bits in dwChannelMasks as the MS documentation suggests [1]
>> - Assume UNKNOWN channels for missing bits as the MS documentation suggests [1]
>>
>> [1] https://learn.microsoft.com/en-us/previous-versions/windows/hardware/design/dn653308(v=vs.85)#details-about-dwchannelmask
>>
>> Signed-off-by: Marton Balint <cus@passwd.hu>
>> ---
>>  libavformat/riffdec.c | 28 +++++++++++++++++++++++++---
>>  1 file changed, 25 insertions(+), 3 deletions(-)
>
> breaks:
> ./ffmpeg  -i ~/tickets/2859/5.1plusdownmix.wav -ac 2 -t 100 -bitexact -c:a aac -y /tmp/2859-frenchspeack-nolibfaac.mp4

After the patch this is file will need -strict unofficial to work, since 
the downmix channels are not officially recognized in the dwChannelMask.

Regards,
Marton
Tobias Rapp March 21, 2024, 8:23 a.m. UTC | #3
On 19/03/2024 20:14, Marton Balint wrote:

>
>
> On Tue, 19 Mar 2024, Michael Niedermayer wrote:
>
>> On Sun, Mar 17, 2024 at 08:57:29PM +0100, Marton Balint wrote:
>>> - Only parse the defined masks in dwChannelMask, unless 
>>> strict_std_compliance
>>>   is less than normal. This matches with the behaviour of the wav 
>>> muxer.
>>> - Ignore additional bits in dwChannelMasks as the MS documentation 
>>> suggests [1]
>>> - Assume UNKNOWN channels for missing bits as the MS documentation 
>>> suggests [1]
>>>
>>> [1] 
>>> https://learn.microsoft.com/en-us/previous-versions/windows/hardware/design/dn653308(v=vs.85)#details-about-dwchannelmask
>>>
>>> Signed-off-by: Marton Balint <cus@passwd.hu>
>>> ---
>>>  libavformat/riffdec.c | 28 +++++++++++++++++++++++++---
>>>  1 file changed, 25 insertions(+), 3 deletions(-)
>>
>> breaks:
>> ./ffmpeg  -i ~/tickets/2859/5.1plusdownmix.wav -ac 2 -t 100 -bitexact 
>> -c:a aac -y /tmp/2859-frenchspeack-nolibfaac.mp4
>
> After the patch this is file will need -strict unofficial to work, 
> since the downmix channels are not officially recognized in the 
> dwChannelMask.
>
I think downmix channels are part of the RF64 specification, see EBU 
Tech 3306 section 3.1:

"""
3.1 Enhancement for a PCM stereo down mix

No PCM stereo signal is included in the basic Wave Format Extensible.

To include a stereo channel the following is added:

#define SPEAKER_STEREO_LEFT    0x20000000
#define SPEAKER_STEREO_RIGHT    0x40000000
"""

Regards, Tobias
Marton Balint March 21, 2024, 9:40 a.m. UTC | #4
On Thu, 21 Mar 2024, Tobias Rapp wrote:

> On 19/03/2024 20:14, Marton Balint wrote:
>
>> 
>>
>>  On Tue, 19 Mar 2024, Michael Niedermayer wrote:
>>
>>>  On Sun, Mar 17, 2024 at 08:57:29PM +0100, Marton Balint wrote:
>>>>  - Only parse the defined masks in dwChannelMask, unless
>>>>  strict_std_compliance
>>>>    is less than normal. This matches with the behaviour of the wav muxer.
>>>>  - Ignore additional bits in dwChannelMasks as the MS documentation
>>>>  suggests [1]
>>>>  - Assume UNKNOWN channels for missing bits as the MS documentation
>>>>  suggests [1]
>>>>
>>>>  [1]
>>>>  https://learn.microsoft.com/en-us/previous-versions/windows/hardware/design/dn653308(v=vs.85)#details-about-dwchannelmask
>>>>
>>>>  Signed-off-by: Marton Balint <cus@passwd.hu>
>>>>  ---
>>>>   libavformat/riffdec.c | 28 +++++++++++++++++++++++++---
>>>>   1 file changed, 25 insertions(+), 3 deletions(-)
>>>
>>>  breaks:
>>>  ./ffmpeg  -i ~/tickets/2859/5.1plusdownmix.wav -ac 2 -t 100 -bitexact
>>>  -c:a aac -y /tmp/2859-frenchspeack-nolibfaac.mp4
>>
>>  After the patch this is file will need -strict unofficial to work, since
>>  the downmix channels are not officially recognized in the dwChannelMask.
>> 
> I think downmix channels are part of the RF64 specification, see EBU Tech 
> 3306 section 3.1:
>
> """
> 3.1 Enhancement for a PCM stereo down mix
>
> No PCM stereo signal is included in the basic Wave Format Extensible.
>
> To include a stereo channel the following is added:
>
> #define SPEAKER_STEREO_LEFT    0x20000000
> #define SPEAKER_STEREO_RIGHT    0x40000000
> """
>

I was reluctant to add these, because the recommendation which superseded 
this, ITU BS.2088 does not mention these masks.

Nevertheless, you are right, these should be recognized to support 
historical RF64 files. Or we should not even make the distinction of RF64?

Regards,
Marton
Tobias Rapp March 21, 2024, 10:26 a.m. UTC | #5
On 21/03/2024 10:40, Marton Balint wrote:

>
>
> On Thu, 21 Mar 2024, Tobias Rapp wrote:
>
>> On 19/03/2024 20:14, Marton Balint wrote:
>>
>>>
>>>
>>>  On Tue, 19 Mar 2024, Michael Niedermayer wrote:
>>>
>>>>  On Sun, Mar 17, 2024 at 08:57:29PM +0100, Marton Balint wrote:
>>>>>  - Only parse the defined masks in dwChannelMask, unless
>>>>>  strict_std_compliance
>>>>>    is less than normal. This matches with the behaviour of the wav 
>>>>> muxer.
>>>>>  - Ignore additional bits in dwChannelMasks as the MS documentation
>>>>>  suggests [1]
>>>>>  - Assume UNKNOWN channels for missing bits as the MS documentation
>>>>>  suggests [1]
>>>>>
>>>>>  [1]
>>>>>  https://learn.microsoft.com/en-us/previous-versions/windows/hardware/design/dn653308(v=vs.85)#details-about-dwchannelmask 
>>>>>
>>>>>
>>>>>  Signed-off-by: Marton Balint <cus@passwd.hu>
>>>>>  ---
>>>>>   libavformat/riffdec.c | 28 +++++++++++++++++++++++++---
>>>>>   1 file changed, 25 insertions(+), 3 deletions(-)
>>>>
>>>>  breaks:
>>>>  ./ffmpeg  -i ~/tickets/2859/5.1plusdownmix.wav -ac 2 -t 100 -bitexact
>>>>  -c:a aac -y /tmp/2859-frenchspeack-nolibfaac.mp4
>>>
>>>  After the patch this is file will need -strict unofficial to work, 
>>> since
>>>  the downmix channels are not officially recognized in the 
>>> dwChannelMask.
>>>
>> I think downmix channels are part of the RF64 specification, see EBU 
>> Tech 3306 section 3.1:
>>
>> """
>> 3.1 Enhancement for a PCM stereo down mix
>>
>> No PCM stereo signal is included in the basic Wave Format Extensible.
>>
>> To include a stereo channel the following is added:
>>
>> #define SPEAKER_STEREO_LEFT    0x20000000
>> #define SPEAKER_STEREO_RIGHT    0x40000000
>> """
>>
>
> I was reluctant to add these, because the recommendation which 
> superseded this, ITU BS.2088 does not mention these masks.
>
> Nevertheless, you are right, these should be recognized to support 
> historical RF64 files. Or we should not even make the distinction of 
> RF64?
>
WAV files smaller than 2/4GiB could start with the usual "RIFF" bytes 
but still contain these mask bits. From a quick glance at ITU BS.2088 it 
seems it could use the "chna" chunk to achieve a mapping like "5.1 + 
downmix".

Having said that, I have not stumbled over "X plus downmix" WAVE files 
in production yet. So no strong opinion from my side on whether this 
should work without "-strict unofficial", or not.

Regards, Tobias
diff mbox series

Patch

diff --git a/libavformat/riffdec.c b/libavformat/riffdec.c
index 6cc912133c..0c7b2c6bdb 100644
--- a/libavformat/riffdec.c
+++ b/libavformat/riffdec.c
@@ -58,7 +58,7 @@  enum AVCodecID ff_codec_guid_get_id(const AVCodecGuid *guids, ff_asf_guid guid)
  * an openended structure.
  */
 
-static void parse_waveformatex(AVFormatContext *ctx, AVIOContext *pb, AVCodecParameters *par)
+static int parse_waveformatex(AVFormatContext *ctx, AVIOContext *pb, AVCodecParameters *par, int channels)
 {
     ff_asf_guid subformat;
     int bps;
@@ -69,7 +69,26 @@  static void parse_waveformatex(AVFormatContext *ctx, AVIOContext *pb, AVCodecPar
         par->bits_per_coded_sample = bps;
 
     mask = avio_rl32(pb); /* dwChannelMask */
-    av_channel_layout_from_mask(&par->ch_layout, mask);
+    if (ctx->strict_std_compliance >= FF_COMPLIANCE_NORMAL)
+        mask &= 0x3ffff;
+
+    /* According to MS docs, most significant bits should be ignored if mask
+     * has more bits than the number of channels */
+    while (av_popcount64(mask) > channels)
+        mask &= ~(1LL << av_log2(mask));
+
+    /* If the mask has less bits than channels, then we assign the remaining
+     * channels as UNKNOWN. */
+    if (mask && av_popcount64(mask) < channels) {
+        int ret = av_channel_layout_custom_init(&par->ch_layout, channels);
+        if (ret < 0)
+            return ret;
+        for (int i = 0, idx = 0; mask; i++, mask >>= 1)
+            if (mask & 1)
+                par->ch_layout.u.map[idx++].id = i;
+    } else {
+        av_channel_layout_from_mask(&par->ch_layout, mask);
+    }
 
     ff_get_guid(pb, &subformat);
     if (!memcmp(subformat + 4,
@@ -88,6 +107,7 @@  static void parse_waveformatex(AVFormatContext *ctx, AVIOContext *pb, AVCodecPar
                    "unknown subformat:"FF_PRI_GUID"\n",
                    FF_ARG_GUID(subformat));
     }
+    return 0;
 }
 
 /* "big_endian" values are needed for RIFX file format */
@@ -145,7 +165,9 @@  int ff_get_wav_header(AVFormatContext *ctx, AVIOContext *pb,
         size  -= 18;
         cbSize = FFMIN(size, cbSize);
         if (cbSize >= 22 && id == 0xfffe) { /* WAVEFORMATEXTENSIBLE */
-            parse_waveformatex(ctx, pb, par);
+            ret = parse_waveformatex(ctx, pb, par, channels);
+            if (ret < 0)
+                return ret;
             cbSize -= 22;
             size   -= 22;
         }