diff mbox series

[FFmpeg-devel,v2] avcodec/atrac3plus: reorder channels to match the output layout

Message ID 20221031202845.1860-1-jamrial@gmail.com
State New
Headers show
Series [FFmpeg-devel,v2] avcodec/atrac3plus: reorder channels to match the output layout | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

James Almer Oct. 31, 2022, 8:28 p.m. UTC
The order in which the channels are coded in the bitstream do not always follow
the native, bitmask-based order of channels both signaled by the WAV container
and forced by this same decoder. This is the case with layouts containing an
LFE channel, as it's always coded last.

Fixes ticket #9964.

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavcodec/atrac3plusdec.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Comments

Andreas Rheinhardt Oct. 31, 2022, 10:13 p.m. UTC | #1
James Almer:
> The order in which the channels are coded in the bitstream do not always follow
> the native, bitmask-based order of channels both signaled by the WAV container
> and forced by this same decoder. This is the case with layouts containing an
> LFE channel, as it's always coded last.
> 
> Fixes ticket #9964.
> 
> Signed-off-by: James Almer <jamrial@gmail.com>
> ---
>  libavcodec/atrac3plusdec.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/libavcodec/atrac3plusdec.c b/libavcodec/atrac3plusdec.c
> index ee71645a3c..9e12f21930 100644
> --- a/libavcodec/atrac3plusdec.c
> +++ b/libavcodec/atrac3plusdec.c
> @@ -48,6 +48,17 @@
>  #include "atrac.h"
>  #include "atrac3plus.h"
>  
> +static uint8_t channel_map[8][8] = {
> +    { 0, },
> +    { 0, 1, },
> +    { 0, 1, 2, },
> +    { 0, 1, 2, 3, },
> +    { 0, },
> +    { 0, 1, 2, 4, 5, 3, },
> +    { 0, 1, 2, 4, 5, 8, 3, },
> +    { 0, 1, 2, 4, 5, 9, 10, 3, },
> +};
> +
>  typedef struct ATRAC3PContext {
>      GetBitContext gb;
>      AVFloatDSPContext *fdsp;
> @@ -378,7 +389,7 @@ static int atrac3p_decode_frame(AVCodecContext *avctx, AVFrame *frame,
>                            channels_to_process, avctx);
>  
>          for (i = 0; i < channels_to_process; i++)
> -            memcpy(samples_p[out_ch_index + i], ctx->outp_buf[i],
> +            memcpy(samples_p[channel_map[frame->ch_layout.nb_channels - 1][out_ch_index + i]], ctx->outp_buf[i],
>                     ATRAC3P_FRAME_SAMPLES * sizeof(**samples_p));
>  
>          ch_block++;

Looking at the last two entries, it seems to me that you simply used the
numerical values of the AV_CHAN_* constants, i.e. as if you wanted to
write { AV_CHAN_FRONT_LEFT, AV_CHAN_FRONT_RIGHT, AV_CHAN_FRONT_CENTER,
AV_CHAN_BACK_LEFT, AV_CHAN_BACK_RIGHT, AV_CHAN_SIDE_LEFT,
AV_CHAN_SIDE_RIGHT, AV_CHAN_LOW_FREQUENCY } for the last entry. This is
wrong, as it conflates the enum AVChannel domain with the index domain;
it will segfault for 6.1 and 7.1, because there are no data planes with
indices 8, 9 and 10 in the output frame.

The array for 6.1 is { 0, 1, 2, 4, 5, 6, 3 }, for 7.1 it is { 0, 1, 2,
4, 5, 6, 7, 3, } (presuming that your first patch was right). The
derivation for 6.1 is as follows: The first channel in atrac is FL,
which is also the first channel in AV_CHANNEL_LAYOUT_6POINT1_BACK. So
the first entry is 0; the next channel is FR which is the second channel
in AV_CHANNEL_LAYOUT_6POINT1_BACK, so the second entry is 1. The next
atrac entry is FC, which is also the next entry in
AV_CHANNEL_LAYOUT_6POINT1_BACK, so the next entry is 2. The next atrac
entry is BL, which is not the next channel in
AV_CHANNEL_LAYOUT_6POINT1_BACK (which is lfe), but the one after that.
So the next entry is four. Similarly, the next entry is five. The atrac
entry after that is BC, which is the next (and last) entry of
AV_CHANNEL_LAYOUT_6POINT1_BACK and has index six. It doesn't matter for
this that there are several channels in enum AVChannel between BR and
BC, as these channels are not present in AV_CHANNEL_LAYOUT_6POINT1_BACK
(it would also not matter if there were any gaps in the values of the
AV_CHAN_* constants in between). The next (and last) atrac entry is LFE,
which is the fourth channel in AV_CHANNEL_LAYOUT_6POINT1_BACK and
therefore has index 3.

Is it really absolutely guaranteed that atrac only has one channel
layout per channel count? It seems to me that adding a const uint8_t
*channel_map to the context that gets set like ctx->channel_map = (const
uint8_t[]){ 0, 1, 2, 4, 5, 6, 3 } (for 6.1) would be simpler.

- Andreas
James Almer Oct. 31, 2022, 10:30 p.m. UTC | #2
On 10/31/2022 7:13 PM, Andreas Rheinhardt wrote:
> James Almer:
>> The order in which the channels are coded in the bitstream do not always follow
>> the native, bitmask-based order of channels both signaled by the WAV container
>> and forced by this same decoder. This is the case with layouts containing an
>> LFE channel, as it's always coded last.
>>
>> Fixes ticket #9964.
>>
>> Signed-off-by: James Almer <jamrial@gmail.com>
>> ---
>>   libavcodec/atrac3plusdec.c | 13 ++++++++++++-
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/atrac3plusdec.c b/libavcodec/atrac3plusdec.c
>> index ee71645a3c..9e12f21930 100644
>> --- a/libavcodec/atrac3plusdec.c
>> +++ b/libavcodec/atrac3plusdec.c
>> @@ -48,6 +48,17 @@
>>   #include "atrac.h"
>>   #include "atrac3plus.h"
>>   
>> +static uint8_t channel_map[8][8] = {
>> +    { 0, },
>> +    { 0, 1, },
>> +    { 0, 1, 2, },
>> +    { 0, 1, 2, 3, },
>> +    { 0, },
>> +    { 0, 1, 2, 4, 5, 3, },
>> +    { 0, 1, 2, 4, 5, 8, 3, },
>> +    { 0, 1, 2, 4, 5, 9, 10, 3, },
>> +};
>> +
>>   typedef struct ATRAC3PContext {
>>       GetBitContext gb;
>>       AVFloatDSPContext *fdsp;
>> @@ -378,7 +389,7 @@ static int atrac3p_decode_frame(AVCodecContext *avctx, AVFrame *frame,
>>                             channels_to_process, avctx);
>>   
>>           for (i = 0; i < channels_to_process; i++)
>> -            memcpy(samples_p[out_ch_index + i], ctx->outp_buf[i],
>> +            memcpy(samples_p[channel_map[frame->ch_layout.nb_channels - 1][out_ch_index + i]], ctx->outp_buf[i],
>>                      ATRAC3P_FRAME_SAMPLES * sizeof(**samples_p));
>>   
>>           ch_block++;
> 
> Looking at the last two entries, it seems to me that you simply used the
> numerical values of the AV_CHAN_* constants, i.e. as if you wanted to
> write { AV_CHAN_FRONT_LEFT, AV_CHAN_FRONT_RIGHT, AV_CHAN_FRONT_CENTER,
> AV_CHAN_BACK_LEFT, AV_CHAN_BACK_RIGHT, AV_CHAN_SIDE_LEFT,
> AV_CHAN_SIDE_RIGHT, AV_CHAN_LOW_FREQUENCY } for the last entry. This is
> wrong, as it conflates the enum AVChannel domain with the index domain;
> it will segfault for 6.1 and 7.1, because there are no data planes with
> indices 8, 9 and 10 in the output frame.

Yeah, i mixed ids and indexes. Will fix.

> 
> The array for 6.1 is { 0, 1, 2, 4, 5, 6, 3 }, for 7.1 it is { 0, 1, 2,
> 4, 5, 6, 7, 3, } (presuming that your first patch was right). The
> derivation for 6.1 is as follows: The first channel in atrac is FL,
> which is also the first channel in AV_CHANNEL_LAYOUT_6POINT1_BACK. So
> the first entry is 0; the next channel is FR which is the second channel
> in AV_CHANNEL_LAYOUT_6POINT1_BACK, so the second entry is 1. The next
> atrac entry is FC, which is also the next entry in
> AV_CHANNEL_LAYOUT_6POINT1_BACK, so the next entry is 2. The next atrac
> entry is BL, which is not the next channel in
> AV_CHANNEL_LAYOUT_6POINT1_BACK (which is lfe), but the one after that.
> So the next entry is four. Similarly, the next entry is five. The atrac
> entry after that is BC, which is the next (and last) entry of
> AV_CHANNEL_LAYOUT_6POINT1_BACK and has index six. It doesn't matter for
> this that there are several channels in enum AVChannel between BR and
> BC, as these channels are not present in AV_CHANNEL_LAYOUT_6POINT1_BACK
> (it would also not matter if there were any gaps in the values of the
> AV_CHAN_* constants in between). The next (and last) atrac entry is LFE,
> which is the fourth channel in AV_CHANNEL_LAYOUT_6POINT1_BACK and
> therefore has index 3.
> 
> Is it really absolutely guaranteed that atrac only has one channel
> layout per channel count? It seems to me that adding a const uint8_t
> *channel_map to the context that gets set like ctx->channel_map = (const
> uint8_t[]){ 0, 1, 2, 4, 5, 6, 3 } (for 6.1) would be simpler.

I don't know if it's guaranteed, but that's what this decoder assumes 
directly during init with user set values (so nothing read from the 
bitstream for it), so I'll not assume the opposite.

> 
> - Andreas
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
James Almer Oct. 31, 2022, 11:18 p.m. UTC | #3
On 10/31/2022 7:13 PM, Andreas Rheinhardt wrote:
> James Almer:
>> The order in which the channels are coded in the bitstream do not always follow
>> the native, bitmask-based order of channels both signaled by the WAV container
>> and forced by this same decoder. This is the case with layouts containing an
>> LFE channel, as it's always coded last.
>>
>> Fixes ticket #9964.
>>
>> Signed-off-by: James Almer <jamrial@gmail.com>
>> ---
>>   libavcodec/atrac3plusdec.c | 13 ++++++++++++-
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/atrac3plusdec.c b/libavcodec/atrac3plusdec.c
>> index ee71645a3c..9e12f21930 100644
>> --- a/libavcodec/atrac3plusdec.c
>> +++ b/libavcodec/atrac3plusdec.c
>> @@ -48,6 +48,17 @@
>>   #include "atrac.h"
>>   #include "atrac3plus.h"
>>   
>> +static uint8_t channel_map[8][8] = {
>> +    { 0, },
>> +    { 0, 1, },
>> +    { 0, 1, 2, },
>> +    { 0, 1, 2, 3, },
>> +    { 0, },
>> +    { 0, 1, 2, 4, 5, 3, },
>> +    { 0, 1, 2, 4, 5, 8, 3, },
>> +    { 0, 1, 2, 4, 5, 9, 10, 3, },
>> +};
>> +
>>   typedef struct ATRAC3PContext {
>>       GetBitContext gb;
>>       AVFloatDSPContext *fdsp;
>> @@ -378,7 +389,7 @@ static int atrac3p_decode_frame(AVCodecContext *avctx, AVFrame *frame,
>>                             channels_to_process, avctx);
>>   
>>           for (i = 0; i < channels_to_process; i++)
>> -            memcpy(samples_p[out_ch_index + i], ctx->outp_buf[i],
>> +            memcpy(samples_p[channel_map[frame->ch_layout.nb_channels - 1][out_ch_index + i]], ctx->outp_buf[i],
>>                      ATRAC3P_FRAME_SAMPLES * sizeof(**samples_p));
>>   
>>           ch_block++;
> 
> Looking at the last two entries, it seems to me that you simply used the
> numerical values of the AV_CHAN_* constants, i.e. as if you wanted to
> write { AV_CHAN_FRONT_LEFT, AV_CHAN_FRONT_RIGHT, AV_CHAN_FRONT_CENTER,
> AV_CHAN_BACK_LEFT, AV_CHAN_BACK_RIGHT, AV_CHAN_SIDE_LEFT,
> AV_CHAN_SIDE_RIGHT, AV_CHAN_LOW_FREQUENCY } for the last entry. This is
> wrong, as it conflates the enum AVChannel domain with the index domain;
> it will segfault for 6.1 and 7.1, because there are no data planes with
> indices 8, 9 and 10 in the output frame.
> 
> The array for 6.1 is { 0, 1, 2, 4, 5, 6, 3 }, for 7.1 it is { 0, 1, 2,
> 4, 5, 6, 7, 3, } (presuming that your first patch was right). The
> derivation for 6.1 is as follows: The first channel in atrac is FL,
> which is also the first channel in AV_CHANNEL_LAYOUT_6POINT1_BACK. So
> the first entry is 0; the next channel is FR which is the second channel
> in AV_CHANNEL_LAYOUT_6POINT1_BACK, so the second entry is 1. The next
> atrac entry is FC, which is also the next entry in
> AV_CHANNEL_LAYOUT_6POINT1_BACK, so the next entry is 2. The next atrac
> entry is BL, which is not the next channel in
> AV_CHANNEL_LAYOUT_6POINT1_BACK (which is lfe), but the one after that.
> So the next entry is four. Similarly, the next entry is five. The atrac
> entry after that is BC, which is the next (and last) entry of
> AV_CHANNEL_LAYOUT_6POINT1_BACK and has index six. It doesn't matter for
> this that there are several channels in enum AVChannel between BR and
> BC, as these channels are not present in AV_CHANNEL_LAYOUT_6POINT1_BACK
> (it would also not matter if there were any gaps in the values of the
> AV_CHAN_* constants in between). The next (and last) atrac entry is LFE,
> which is the fourth channel in AV_CHANNEL_LAYOUT_6POINT1_BACK and
> therefore has index 3.
> 
> Is it really absolutely guaranteed that atrac only has one channel
> layout per channel count? It seems to me that adding a const uint8_t
> *channel_map to the context that gets set like ctx->channel_map = (const
> uint8_t[]){ 0, 1, 2, 4, 5, 6, 3 } (for 6.1) would be simpler.

Unless i'm missing something, this would be local to set_channel_params().

> 
> - Andreas
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Andreas Rheinhardt Oct. 31, 2022, 11:44 p.m. UTC | #4
James Almer:
> On 10/31/2022 7:13 PM, Andreas Rheinhardt wrote:
>> James Almer:
>>> The order in which the channels are coded in the bitstream do not
>>> always follow
>>> the native, bitmask-based order of channels both signaled by the WAV
>>> container
>>> and forced by this same decoder. This is the case with layouts
>>> containing an
>>> LFE channel, as it's always coded last.
>>>
>>> Fixes ticket #9964.
>>>
>>> Signed-off-by: James Almer <jamrial@gmail.com>
>>> ---
>>>   libavcodec/atrac3plusdec.c | 13 ++++++++++++-
>>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/libavcodec/atrac3plusdec.c b/libavcodec/atrac3plusdec.c
>>> index ee71645a3c..9e12f21930 100644
>>> --- a/libavcodec/atrac3plusdec.c
>>> +++ b/libavcodec/atrac3plusdec.c
>>> @@ -48,6 +48,17 @@
>>>   #include "atrac.h"
>>>   #include "atrac3plus.h"
>>>   +static uint8_t channel_map[8][8] = {
>>> +    { 0, },
>>> +    { 0, 1, },
>>> +    { 0, 1, 2, },
>>> +    { 0, 1, 2, 3, },
>>> +    { 0, },
>>> +    { 0, 1, 2, 4, 5, 3, },
>>> +    { 0, 1, 2, 4, 5, 8, 3, },
>>> +    { 0, 1, 2, 4, 5, 9, 10, 3, },
>>> +};
>>> +
>>>   typedef struct ATRAC3PContext {
>>>       GetBitContext gb;
>>>       AVFloatDSPContext *fdsp;
>>> @@ -378,7 +389,7 @@ static int atrac3p_decode_frame(AVCodecContext
>>> *avctx, AVFrame *frame,
>>>                             channels_to_process, avctx);
>>>             for (i = 0; i < channels_to_process; i++)
>>> -            memcpy(samples_p[out_ch_index + i], ctx->outp_buf[i],
>>> +           
>>> memcpy(samples_p[channel_map[frame->ch_layout.nb_channels -
>>> 1][out_ch_index + i]], ctx->outp_buf[i],
>>>                      ATRAC3P_FRAME_SAMPLES * sizeof(**samples_p));
>>>             ch_block++;
>>
>> Looking at the last two entries, it seems to me that you simply used the
>> numerical values of the AV_CHAN_* constants, i.e. as if you wanted to
>> write { AV_CHAN_FRONT_LEFT, AV_CHAN_FRONT_RIGHT, AV_CHAN_FRONT_CENTER,
>> AV_CHAN_BACK_LEFT, AV_CHAN_BACK_RIGHT, AV_CHAN_SIDE_LEFT,
>> AV_CHAN_SIDE_RIGHT, AV_CHAN_LOW_FREQUENCY } for the last entry. This is
>> wrong, as it conflates the enum AVChannel domain with the index domain;
>> it will segfault for 6.1 and 7.1, because there are no data planes with
>> indices 8, 9 and 10 in the output frame.
>>
>> The array for 6.1 is { 0, 1, 2, 4, 5, 6, 3 }, for 7.1 it is { 0, 1, 2,
>> 4, 5, 6, 7, 3, } (presuming that your first patch was right). The
>> derivation for 6.1 is as follows: The first channel in atrac is FL,
>> which is also the first channel in AV_CHANNEL_LAYOUT_6POINT1_BACK. So
>> the first entry is 0; the next channel is FR which is the second channel
>> in AV_CHANNEL_LAYOUT_6POINT1_BACK, so the second entry is 1. The next
>> atrac entry is FC, which is also the next entry in
>> AV_CHANNEL_LAYOUT_6POINT1_BACK, so the next entry is 2. The next atrac
>> entry is BL, which is not the next channel in
>> AV_CHANNEL_LAYOUT_6POINT1_BACK (which is lfe), but the one after that.
>> So the next entry is four. Similarly, the next entry is five. The atrac
>> entry after that is BC, which is the next (and last) entry of
>> AV_CHANNEL_LAYOUT_6POINT1_BACK and has index six. It doesn't matter for
>> this that there are several channels in enum AVChannel between BR and
>> BC, as these channels are not present in AV_CHANNEL_LAYOUT_6POINT1_BACK
>> (it would also not matter if there were any gaps in the values of the
>> AV_CHAN_* constants in between). The next (and last) atrac entry is LFE,
>> which is the fourth channel in AV_CHANNEL_LAYOUT_6POINT1_BACK and
>> therefore has index 3.
>>
>> Is it really absolutely guaranteed that atrac only has one channel
>> layout per channel count? It seems to me that adding a const uint8_t
>> *channel_map to the context that gets set like ctx->channel_map = (const
>> uint8_t[]){ 0, 1, 2, 4, 5, 6, 3 } (for 6.1) would be simpler.
> 
> Unless i'm missing something, this would be local to set_channel_params().
> 

Correct.

- Andreas
diff mbox series

Patch

diff --git a/libavcodec/atrac3plusdec.c b/libavcodec/atrac3plusdec.c
index ee71645a3c..9e12f21930 100644
--- a/libavcodec/atrac3plusdec.c
+++ b/libavcodec/atrac3plusdec.c
@@ -48,6 +48,17 @@ 
 #include "atrac.h"
 #include "atrac3plus.h"
 
+static uint8_t channel_map[8][8] = {
+    { 0, },
+    { 0, 1, },
+    { 0, 1, 2, },
+    { 0, 1, 2, 3, },
+    { 0, },
+    { 0, 1, 2, 4, 5, 3, },
+    { 0, 1, 2, 4, 5, 8, 3, },
+    { 0, 1, 2, 4, 5, 9, 10, 3, },
+};
+
 typedef struct ATRAC3PContext {
     GetBitContext gb;
     AVFloatDSPContext *fdsp;
@@ -378,7 +389,7 @@  static int atrac3p_decode_frame(AVCodecContext *avctx, AVFrame *frame,
                           channels_to_process, avctx);
 
         for (i = 0; i < channels_to_process; i++)
-            memcpy(samples_p[out_ch_index + i], ctx->outp_buf[i],
+            memcpy(samples_p[channel_map[frame->ch_layout.nb_channels - 1][out_ch_index + i]], ctx->outp_buf[i],
                    ATRAC3P_FRAME_SAMPLES * sizeof(**samples_p));
 
         ch_block++;