diff mbox series

[FFmpeg-devel,v2] avcodec/jpegxl_parser: check ANS cluster alphabet size vs bundle size

Message ID 20231225170417.153992-1-leo.izen@gmail.com
State New
Headers show
Series [FFmpeg-devel,v2] avcodec/jpegxl_parser: check ANS cluster alphabet size vs bundle size | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Leo Izen Dec. 25, 2023, 5:04 p.m. UTC
The specification doesn't mention that clusters cannot have alphabet
sizes greater than 1 << bundle->log_alphabet_size, but the reference
implementation rejects these entropy streams as invalid, so we should
too. Refusing to do so can overflow a stack variable on line 556 that
should be large enough otherwise.

Fixes #10738.

Found-by: Zeng Yunxiang and Li Zeyuan
Signed-off-by: Leo Izen <leo.izen@gmail.com>
---
 libavcodec/jpegxl_parser.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

Comments

Michael Niedermayer Dec. 25, 2023, 9:09 p.m. UTC | #1
On Mon, Dec 25, 2023 at 12:04:17PM -0500, Leo Izen wrote:
> The specification doesn't mention that clusters cannot have alphabet
> sizes greater than 1 << bundle->log_alphabet_size, but the reference
> implementation rejects these entropy streams as invalid, so we should
> too. Refusing to do so can overflow a stack variable on line 556 that
> should be large enough otherwise.
> 
> Fixes #10738.
> 
> Found-by: Zeng Yunxiang and Li Zeyuan
> Signed-off-by: Leo Izen <leo.izen@gmail.com>
> ---
>  libavcodec/jpegxl_parser.c | 28 +++++++++++++++++++---------
>  1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/libavcodec/jpegxl_parser.c b/libavcodec/jpegxl_parser.c
> index 006eb6b295..f026fda9ac 100644
> --- a/libavcodec/jpegxl_parser.c
> +++ b/libavcodec/jpegxl_parser.c
> @@ -64,26 +64,26 @@ typedef struct JXLSymbolDistribution {
>      int log_bucket_size;
>      /* this is the actual size of the alphabet */
>      int alphabet_size;
> -    /* ceil(log(alphabet_size)) */
> -    int log_alphabet_size;
>  
>      /* for prefix code distributions */
>      VLC vlc;
>      /* in case bits == 0 */
>      uint32_t default_symbol;
> +    /* ceil(log(alphabet_size)) */
> +    int log_alphabet_size;
>  

that seems unneeded

thx

[...]
Leo Izen Dec. 26, 2023, 2:23 p.m. UTC | #2
On 12/25/23 15:09, Michael Niedermayer wrote:
> On Mon, Dec 25, 2023 at 12:04:17PM -0500, Leo Izen wrote:
>> The specification doesn't mention that clusters cannot have alphabet
>> sizes greater than 1 << bundle->log_alphabet_size, but the reference
>> implementation rejects these entropy streams as invalid, so we should
>> too. Refusing to do so can overflow a stack variable on line 556 that
>> should be large enough otherwise.
>>
>> Fixes #10738.
>>
>> Found-by: Zeng Yunxiang and Li Zeyuan
>> Signed-off-by: Leo Izen <leo.izen@gmail.com>
>> ---
>>   libavcodec/jpegxl_parser.c | 28 +++++++++++++++++++---------
>>   1 file changed, 19 insertions(+), 9 deletions(-)
>>
>> diff --git a/libavcodec/jpegxl_parser.c b/libavcodec/jpegxl_parser.c
>> index 006eb6b295..f026fda9ac 100644
>> --- a/libavcodec/jpegxl_parser.c
>> +++ b/libavcodec/jpegxl_parser.c
>> @@ -64,26 +64,26 @@ typedef struct JXLSymbolDistribution {
>>       int log_bucket_size;
>>       /* this is the actual size of the alphabet */
>>       int alphabet_size;
>> -    /* ceil(log(alphabet_size)) */
>> -    int log_alphabet_size;
>>   
>>       /* for prefix code distributions */
>>       VLC vlc;
>>       /* in case bits == 0 */
>>       uint32_t default_symbol;
>> +    /* ceil(log(alphabet_size)) */
>> +    int log_alphabet_size;
>>   
> 
> that seems unneeded
> 

dist->log_alphaebet_size is only used for prefix code distributions so I 
moved it for clarity. I can also remove this change from this commit if 
you think it's off-topic.

In either case, is the commit okay, apart from this one change? If so 
I'm going to merge it (after I remove this one change from the diff).

- Leo Izen (Traneptora)
Michael Niedermayer Dec. 26, 2023, 6:37 p.m. UTC | #3
On Tue, Dec 26, 2023 at 08:23:35AM -0600, Leo Izen wrote:
> On 12/25/23 15:09, Michael Niedermayer wrote:
> > On Mon, Dec 25, 2023 at 12:04:17PM -0500, Leo Izen wrote:
> > > The specification doesn't mention that clusters cannot have alphabet
> > > sizes greater than 1 << bundle->log_alphabet_size, but the reference
> > > implementation rejects these entropy streams as invalid, so we should
> > > too. Refusing to do so can overflow a stack variable on line 556 that
> > > should be large enough otherwise.
> > > 
> > > Fixes #10738.
> > > 
> > > Found-by: Zeng Yunxiang and Li Zeyuan
> > > Signed-off-by: Leo Izen <leo.izen@gmail.com>
> > > ---
> > >   libavcodec/jpegxl_parser.c | 28 +++++++++++++++++++---------
> > >   1 file changed, 19 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/libavcodec/jpegxl_parser.c b/libavcodec/jpegxl_parser.c
> > > index 006eb6b295..f026fda9ac 100644
> > > --- a/libavcodec/jpegxl_parser.c
> > > +++ b/libavcodec/jpegxl_parser.c
> > > @@ -64,26 +64,26 @@ typedef struct JXLSymbolDistribution {
> > >       int log_bucket_size;
> > >       /* this is the actual size of the alphabet */
> > >       int alphabet_size;
> > > -    /* ceil(log(alphabet_size)) */
> > > -    int log_alphabet_size;
> > >       /* for prefix code distributions */
> > >       VLC vlc;
> > >       /* in case bits == 0 */
> > >       uint32_t default_symbol;
> > > +    /* ceil(log(alphabet_size)) */
> > > +    int log_alphabet_size;
> > 
> > that seems unneeded
> > 
> 
> dist->log_alphaebet_size is only used for prefix code distributions so I
> moved it for clarity. I can also remove this change from this commit if you
> think it's off-topic.

it belongs in a seperate patch

also the 258 -> 256 should be a seperate patch IMHO
Its not part of fixing the ticket


> 
> In either case, is the commit okay, apart from this one change? If so I'm
> going to merge it (after I remove this one change from the diff).

yes

thx

[...]
diff mbox series

Patch

diff --git a/libavcodec/jpegxl_parser.c b/libavcodec/jpegxl_parser.c
index 006eb6b295..f026fda9ac 100644
--- a/libavcodec/jpegxl_parser.c
+++ b/libavcodec/jpegxl_parser.c
@@ -64,26 +64,26 @@  typedef struct JXLSymbolDistribution {
     int log_bucket_size;
     /* this is the actual size of the alphabet */
     int alphabet_size;
-    /* ceil(log(alphabet_size)) */
-    int log_alphabet_size;
 
     /* for prefix code distributions */
     VLC vlc;
     /* in case bits == 0 */
     uint32_t default_symbol;
+    /* ceil(log(alphabet_size)) */
+    int log_alphabet_size;
 
     /*
      * each (1 << log_alphabet_size) length
      * with log_alphabet_size <= 8
      */
     /* frequencies associated with this Distribution */
-    uint32_t freq[258];
+    uint32_t freq[256];
     /* cutoffs for using the symbol table */
-    uint16_t cutoffs[258];
+    uint16_t cutoffs[256];
     /* the symbol table for this distribution */
-    uint16_t symbols[258];
+    uint16_t symbols[256];
     /* the offset for symbols */
-    uint16_t offsets[258];
+    uint16_t offsets[256];
 
     /* if this distribution contains only one symbol this is its index */
     int uniq_pos;
@@ -382,13 +382,13 @@  static int populate_distribution(GetBitContext *gb, JXLSymbolDistribution *dist,
     int len = 0, shift, omit_log = -1, omit_pos = -1;
     int prev = 0, num_same = 0;
     uint32_t total_count = 0;
-    uint8_t logcounts[258] = { 0 };
-    uint8_t same[258] = { 0 };
+    uint8_t logcounts[256] = { 0 };
+    uint8_t same[256] = { 0 };
+    const int table_size = 1 << log_alphabet_size;
     dist->uniq_pos = -1;
 
     if (get_bits1(gb)) {
         /* simple code */
-        dist->alphabet_size = 256;
         if (get_bits1(gb)) {
             uint8_t v1 = jxl_u8(gb);
             uint8_t v2 = jxl_u8(gb);
@@ -398,17 +398,24 @@  static int populate_distribution(GetBitContext *gb, JXLSymbolDistribution *dist,
             dist->freq[v2] = (1 << 12) - dist->freq[v1];
             if (!dist->freq[v1])
                 dist->uniq_pos = v2;
+            dist->alphabet_size = 1 + FFMAX(v1, v2);
         } else {
             uint8_t x = jxl_u8(gb);
             dist->freq[x] = 1 << 12;
             dist->uniq_pos = x;
+            dist->alphabet_size = 1 + x;
         }
+        if (dist->alphabet_size > table_size)
+            return AVERROR_INVALIDDATA;
+
         return 0;
     }
 
     if (get_bits1(gb)) {
         /* flat code */
         dist->alphabet_size = jxl_u8(gb) + 1;
+        if (dist->alphabet_size > table_size)
+            return AVERROR_INVALIDDATA;
         for (int i = 0; i < dist->alphabet_size; i++)
             dist->freq[i] = (1 << 12) / dist->alphabet_size;
         for (int i = 0; i < (1 << 12) % dist->alphabet_size; i++)
@@ -426,6 +433,9 @@  static int populate_distribution(GetBitContext *gb, JXLSymbolDistribution *dist,
         return AVERROR_INVALIDDATA;
 
     dist->alphabet_size = jxl_u8(gb) + 3;
+    if (dist->alphabet_size > table_size)
+        return AVERROR_INVALIDDATA;
+
     for (int i = 0; i < dist->alphabet_size; i++) {
         logcounts[i] = get_vlc2(gb, dist_prefix_table, 7, 1);
         if (logcounts[i] == 13) {