diff mbox series

[FFmpeg-devel] lavc/ffv1: Properly check that the 4th and 5th quant tables are zeroes

Message ID 20200102152225.182072-1-derek.buitenhuis@gmail.com
State Superseded
Headers show
Series [FFmpeg-devel] lavc/ffv1: Properly check that the 4th and 5th quant tables are zeroes
Related show

Checks

Context Check Description
andriy/ffmpeg-patchwork pending
andriy/ffmpeg-patchwork success Applied patch
andriy/ffmpeg-patchwork success Configure finished
andriy/ffmpeg-patchwork success Make finished
andriy/ffmpeg-patchwork success Make fate finished

Commit Message

Derek Buitenhuis Jan. 2, 2020, 3:22 p.m. UTC
Currently, the decoder checks the 128th value of the 4th quant table during
while deriving the context on each sample, in order to speed itself up. This
is due to relying on the behavior of FFmpeg's FFV1 encoder, in which if that
value is zero, the entire 4th and 5th quant tables are assumed to be entirely
zero.

This does not match the FFV1 spec, which has no such restriction, and after
some discussion, it was decided to fix FFmpeg to abide by the spec, rather
than change the spec.

We will now check whether the 4th and 5th quant tables are zero properly,
using a full table compare, once, when they are read in. There should be
no speed loss with this method.

For further context, the FFV1 issue in question is located at:

    https://github.com/FFmpeg/FFV1/issues/169

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
---
 libavcodec/ffv1.h          | 15 +++++++++++++++
 libavcodec/ffv1_template.c |  2 +-
 libavcodec/ffv1dec.c       |  5 +++++
 libavcodec/ffv1enc.c       |  6 ++++++
 4 files changed, 27 insertions(+), 1 deletion(-)

Comments

Jerome Martinez Jan. 2, 2020, 4:17 p.m. UTC | #1
On 02/01/2020 16:22, Derek Buitenhuis wrote:
> Currently, the decoder checks the 128th value of the 4th quant table during
> while deriving the context on each sample, in order to speed itself up. This
> is due to relying on the behavior of FFmpeg's FFV1 encoder, in which if that
> value is zero, the entire 4th and 5th quant tables are assumed to be entirely
> zero.
>
> This does not match the FFV1 spec, which has no such restriction, and after
> some discussion, it was decided to fix FFmpeg to abide by the spec, rather
> than change the spec.

Thanks for having taken care of this issue.

> [...]
> @@ -735,6 +737,7 @@ static int read_header(FFV1Context *f)
>               f->chroma_h_shift, f->chroma_v_shift, f->avctx->pix_fmt);
>       if (f->version < 2) {
>           context_count = read_quant_tables(c, f->quant_table);
> +        f->three_quant_table = is_three_quant_tables(f->quant_table);
>           if (context_count < 0) {
>               av_log(f->avctx, AV_LOG_ERROR, "read_quant_table error\n");
>               return AVERROR_INVALIDDATA;
> @@ -797,9 +800,11 @@ static int read_header(FFV1Context *f)
>                   p->quant_table_index = idx;
>                   memcpy(p->quant_table, f->quant_tables[idx],
>                          sizeof(p->quant_table));
> +                p->three_quant_tables = f->three_quant_tables[idx];
>                   context_count = f->context_count[idx];
>               } else {
>                   memcpy(p->quant_table, f->quant_table, sizeof(p->quant_table));
> +                p->three_quant_tables = f->three_quant_table;
>               }

Couldn't three_quant_table be a local value instead of being stored in 
the context?


> diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c

I may have missed something, but I don't get what the additional code in 
the encoding part does, value seems not used.
And I am reluctant to change anything in the encoder in this patch as we 
consider that the issue is "only" a too aggressive optimization in the 
decoder.

Jérôme
Derek Buitenhuis Jan. 2, 2020, 4:27 p.m. UTC | #2
On 02/01/2020 16:17, Jerome Martinez wrote:
>> @@ -797,9 +800,11 @@ static int read_header(FFV1Context *f)
>>                   p->quant_table_index = idx;
>>                   memcpy(p->quant_table, f->quant_tables[idx],
>>                          sizeof(p->quant_table));
>> +                p->three_quant_tables = f->three_quant_tables[idx];
>>                   context_count = f->context_count[idx];
>>               } else {
>>                   memcpy(p->quant_table, f->quant_table, sizeof(p->quant_table));
>> +                p->three_quant_tables = f->three_quant_table;
>>               }
> 
> Couldn't three_quant_table be a local value instead of being stored in 
> the context?

I'm not sure that is possible - there is a subtle ifference between the
->quant_table and ->quant_tables (i.e. one is for version 2, which does
not have extradata). I followed what is done for the the quant table itself.

Michael can likely comment on it.

>> diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
> 
> I may have missed something, but I don't get what the additional code in 
> the encoding part does, value seems not used.
> And I am reluctant to change anything in the encoder in this patch as we 
> consider that the issue is "only" a too aggressive optimization in the 
> decoder.

The encoder does use it, because the encoder calls get_context, of which
the template is shared between the encoder and decoder, so it has to be
changed. The optimization is/was for both encoding and decoding.

- Derek
Michael Niedermayer Jan. 2, 2020, 11:09 p.m. UTC | #3
On Thu, Jan 02, 2020 at 03:22:25PM +0000, Derek Buitenhuis wrote:
> Currently, the decoder checks the 128th value of the 4th quant table during
> while deriving the context on each sample, in order to speed itself up. This
> is due to relying on the behavior of FFmpeg's FFV1 encoder, in which if that
> value is zero, the entire 4th and 5th quant tables are assumed to be entirely
> zero.

I think if entry 128 is 0 then the whole table must be 0.
If that is the case, checking the entry 128 of table 4 and 5 would be enough
and caching the entry comparission is maybe not needed.

thx

[...]
Derek Buitenhuis Jan. 3, 2020, 9:59 p.m. UTC | #4
On 02/01/2020 23:09, Michael Niedermayer wrote:
> I think if entry 128 is 0 then the whole table must be 0.
> If that is the case, checking the entry 128 of table 4 and 5 would be enough
> and caching the entry comparission is maybe not needed.

Is this guaranteed somehow? It isn't mentioned in the spec.

- Derek
Jerome Martinez Jan. 3, 2020, 10:16 p.m. UTC | #5
On 03/01/2020 22:59, Derek Buitenhuis wrote:
> On 02/01/2020 23:09, Michael Niedermayer wrote:
>> I think if entry 128 is 0 then the whole table must be 0.
>> If that is the case, checking the entry 128 of table 4 and 5 would be enough
>> and caching the entry comparission is maybe not needed.
> Is this guaranteed somehow? It isn't mentioned in the spec.

Checking better the spec about how the quant table is stored in 
bitstream, we may have missed in our past discussion that it is actually 
not stored as 256 independent values, "QuantizationTable(i, j, scale)" 
permits only increasing numbers from index 0 to 127 (so 128, as index 
128 is the negative of index 127), so it seems that by design of how the 
quant table is stored it is impossible to have 0 at index 128 without 
having 0 at indexes 0-127 (one value of "len - 1", 127, seems to be the 
only way to have 0 at index 128).

The need to check the 5th quant table is still valid.

Jérôme
Derek Buitenhuis Jan. 4, 2020, 4:06 p.m. UTC | #6
On 03/01/2020 21:59, Derek Buitenhuis wrote:
> Is this guaranteed somehow? It isn't mentioned in the spec.

Looks like it's inherent to how it's stored.

I'll send a v2 with this.

Thanks,
- Derek
diff mbox series

Patch

diff --git a/libavcodec/ffv1.h b/libavcodec/ffv1.h
index f0bb19350a..115ea24120 100644
--- a/libavcodec/ffv1.h
+++ b/libavcodec/ffv1.h
@@ -67,6 +67,7 @@  typedef struct VlcState {
 
 typedef struct PlaneContext {
     int16_t quant_table[MAX_CONTEXT_INPUTS][256];
+    int three_quant_tables;
     int quant_table_index;
     int context_count;
     uint8_t (*state)[CONTEXT_SIZE];
@@ -102,7 +103,9 @@  typedef struct FFV1Context {
     int ac_byte_count;                   ///< number of bytes used for AC coding
     PlaneContext plane[MAX_PLANES];
     int16_t quant_table[MAX_CONTEXT_INPUTS][256];
+    int three_quant_table;
     int16_t quant_tables[MAX_QUANT_TABLES][MAX_CONTEXT_INPUTS][256];
+    int three_quant_tables[MAX_QUANT_TABLES];
     int context_count[MAX_QUANT_TABLES];
     uint8_t state_transition[256];
     uint8_t (*initial_states[MAX_QUANT_TABLES])[32];
@@ -148,6 +151,18 @@  int ff_ffv1_allocate_initial_states(FFV1Context *f);
 void ff_ffv1_clear_slice_state(FFV1Context *f, FFV1Context *fs);
 int ff_ffv1_close(AVCodecContext *avctx);
 
+/*
+ * The last two quant tables are often zeroes, so we can do a single check
+ * here, to speed up every get_context call on such files.
+ */
+static av_always_inline int is_three_quant_tables(const int16_t quant_table[MAX_CONTEXT_INPUTS][256])
+{
+    const int16_t zero_quant_table[256] = { 0 };
+
+    return !memcmp(zero_quant_table, quant_table[3], sizeof(zero_quant_table)) &&
+           !memcmp(zero_quant_table, quant_table[4], sizeof(zero_quant_table));
+}
+
 static av_always_inline int fold(int diff, int bits)
 {
     if (bits == 8)
diff --git a/libavcodec/ffv1_template.c b/libavcodec/ffv1_template.c
index f2ab93313a..6304b54ab1 100644
--- a/libavcodec/ffv1_template.c
+++ b/libavcodec/ffv1_template.c
@@ -37,7 +37,7 @@  static inline int RENAME(get_context)(PlaneContext *p, TYPE *src,
     const int RT = last[1];
     const int L  = src[-1];
 
-    if (p->quant_table[3][127]) {
+    if (!p->three_quant_tables) {
         const int TT = last2[0];
         const int LL = src[-2];
         return p->quant_table[0][(L - LT) & 0xFF] +
diff --git a/libavcodec/ffv1dec.c b/libavcodec/ffv1dec.c
index e465ed49d7..5ac67daaa8 100644
--- a/libavcodec/ffv1dec.c
+++ b/libavcodec/ffv1dec.c
@@ -197,6 +197,7 @@  static int decode_slice_header(FFV1Context *f, FFV1Context *fs)
         }
         p->quant_table_index = idx;
         memcpy(p->quant_table, f->quant_tables[idx], sizeof(p->quant_table));
+        p->three_quant_tables = f->three_quant_tables[idx];
         context_count = f->context_count[idx];
 
         if (p->context_count < context_count) {
@@ -474,6 +475,7 @@  static int read_extra_header(FFV1Context *f)
 
     for (i = 0; i < f->quant_table_count; i++) {
         f->context_count[i] = read_quant_tables(c, f->quant_tables[i]);
+        f->three_quant_tables[i] = is_three_quant_tables(f->quant_tables[i]);
         if (f->context_count[i] < 0) {
             av_log(f->avctx, AV_LOG_ERROR, "read_quant_table error\n");
             return AVERROR_INVALIDDATA;
@@ -735,6 +737,7 @@  static int read_header(FFV1Context *f)
             f->chroma_h_shift, f->chroma_v_shift, f->avctx->pix_fmt);
     if (f->version < 2) {
         context_count = read_quant_tables(c, f->quant_table);
+        f->three_quant_table = is_three_quant_tables(f->quant_table);
         if (context_count < 0) {
             av_log(f->avctx, AV_LOG_ERROR, "read_quant_table error\n");
             return AVERROR_INVALIDDATA;
@@ -797,9 +800,11 @@  static int read_header(FFV1Context *f)
                 p->quant_table_index = idx;
                 memcpy(p->quant_table, f->quant_tables[idx],
                        sizeof(p->quant_table));
+                p->three_quant_tables = f->three_quant_tables[idx];
                 context_count = f->context_count[idx];
             } else {
                 memcpy(p->quant_table, f->quant_table, sizeof(p->quant_table));
+                p->three_quant_tables = f->three_quant_table;
             }
 
             if (f->version <= 2) {
diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index c521b7d445..e6e5171c8f 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -733,31 +733,37 @@  FF_ENABLE_DEPRECATION_WARNINGS
             s->quant_tables[0][0][i]=           quant11[i];
             s->quant_tables[0][1][i]=        11*quant11[i];
             s->quant_tables[0][2][i]=     11*11*quant11[i];
+            s->three_quant_tables[0]=                    1;
             s->quant_tables[1][0][i]=           quant11[i];
             s->quant_tables[1][1][i]=        11*quant11[i];
             s->quant_tables[1][2][i]=     11*11*quant5 [i];
             s->quant_tables[1][3][i]=   5*11*11*quant5 [i];
             s->quant_tables[1][4][i]= 5*5*11*11*quant5 [i];
+            s->three_quant_tables[1]=                    0;
         } else {
             s->quant_tables[0][0][i]=           quant9_10bit[i];
             s->quant_tables[0][1][i]=        11*quant9_10bit[i];
             s->quant_tables[0][2][i]=     11*11*quant9_10bit[i];
+            s->three_quant_tables[0]=                         1;
             s->quant_tables[1][0][i]=           quant9_10bit[i];
             s->quant_tables[1][1][i]=        11*quant9_10bit[i];
             s->quant_tables[1][2][i]=     11*11*quant5_10bit[i];
             s->quant_tables[1][3][i]=   5*11*11*quant5_10bit[i];
             s->quant_tables[1][4][i]= 5*5*11*11*quant5_10bit[i];
+            s->three_quant_tables[1]=                         0;
         }
     }
     s->context_count[0] = (11 * 11 * 11        + 1) / 2;
     s->context_count[1] = (11 * 11 * 5 * 5 * 5 + 1) / 2;
     memcpy(s->quant_table, s->quant_tables[s->context_model],
            sizeof(s->quant_table));
+    s->three_quant_table = s->three_quant_tables[s->context_model];
 
     for (i = 0; i < s->plane_count; i++) {
         PlaneContext *const p = &s->plane[i];
 
         memcpy(p->quant_table, s->quant_table, sizeof(p->quant_table));
+        p->three_quant_tables= s->three_quant_table;
         p->quant_table_index = s->context_model;
         p->context_count     = s->context_count[p->quant_table_index];
     }