[FFmpeg-devel,PATCH/RFC] libavcodec/cinepak: Separate decoding from parsing

Submitted by Tomas Härdin on Sept. 1, 2019, 8:58 p.m.

Details

Message ID 153c1256f94ff72829f35fb0de84fcec5e405d94.camel@acc.umu.se
State New
Headers show

Commit Message

Tomas Härdin Sept. 1, 2019, 8:58 p.m.
Hi

Attached patch separates parsing from decoding in the Cinepak decoder.
It puts in some rather strict checks which are in line with how I've
figured the VfW 1.1 decoder works. Parsing is still intermixed with
validation, but the code should be much easier to read this way
compared to before. Some avpriv_request_sample()s should probably be
switched to outright rejection, I haven't decided yet

I've tested it on all samples on samples.mplayerhq.hu, both under 
http://samples.mplayerhq.hu/V-codecs/CVID and 
http://samples.mplayerhq.hu/game-formats/film-cpk/

I've discovered some bugs in the existing decoder. It considers frames
with any strip being intra as being a keyframe, when in fact mixing
intra and inter strips is allowed. A similar issue exists with files
that use 0x00 and 0x01 as strip IDs, like 1984.apple_ad.mov

FATE references updated since this now rejects invalid frames

There is still some work to do, like properly inspecting what type of
mode is used in each chunk. The larger picture here is that the project
at large might benefit from a general parsing and validation framework.
I don't like that we keep having these KLV-type parsers reimplemented
all over the place. Even with this simple decoder I'm going to have to
implement a second layer of KLV parsing to validate the chunks inside
each strip. Tedious and error-prone.

/Tomas

Comments

Carl Eugen Hoyos Sept. 1, 2019, 9:07 p.m.
Am So., 1. Sept. 2019 um 22:58 Uhr schrieb Tomas Härdin <tjoppen@acc.umu.se>:

> Attached patch separates parsing from decoding in the Cinepak decoder.
> It puts in some rather strict checks which are in line with how I've
> figured the VfW 1.1 decoder works. Parsing is still intermixed with
> validation, but the code should be much easier to read this way
> compared to before. Some avpriv_request_sample()s should probably be
> switched to outright rejection, I haven't decided yet

Your patch looks to me as if it would break decoding broken samples
instead of returning as much decoded data as is possible with the
input.

Carl Eugen
Tomas Härdin Sept. 2, 2019, 9:17 a.m.
sön 2019-09-01 klockan 23:07 +0200 skrev Carl Eugen Hoyos:
> Am So., 1. Sept. 2019 um 22:58 Uhr schrieb Tomas Härdin <
> tjoppen@acc.umu.se>:
> 
> > Attached patch separates parsing from decoding in the Cinepak decoder.
> > It puts in some rather strict checks which are in line with how I've
> > figured the VfW 1.1 decoder works. Parsing is still intermixed with
> > validation, but the code should be much easier to read this way
> > compared to before. Some avpriv_request_sample()s should probably be
> > switched to outright rejection, I haven't decided yet
> 
> Your patch looks to me as if it would break decoding broken samples
> instead of returning as much decoded data as is possible with the
> input.

For some definitions of broken, yes. There is a contradiction between
safe, efficient parsing and just how broken bitstreams a particular
decoder will accept.

Until recently the definition of "not-broken" was "has enough bytes to
decode to a few lines of garbage"

Michael's recently suggested patch would define broken as "has too few
bytes to decode as an entire skip frame". This would change the
decoder's behavior for some class of files (gasp!)

My suggestion is that we should be very specific in what we accept,
backed up by samples in FATE. We should explicitly not worry about
files that someone somewhere might be able to generate which we
successfully reject. Else we cannot refactor with confidence.

/Tomas
Michael Niedermayer Sept. 3, 2019, 11:53 a.m.
On Sun, Sep 01, 2019 at 10:58:27PM +0200, Tomas Härdin wrote:
>  libavcodec/cinepak.c        |  337 ++++++++++++++++++++++++++++----------------
>  tests/ref/fate/cvid-palette |    1 
>  tests/ref/fate/cvid-partial |    1 
>  3 files changed, 215 insertions(+), 124 deletions(-)
> 197f986bfb297c54ee0f208449d4053fcb948645  0001-libavcodec-cinepak-Separate-decoding-from-parsing.patch
> From d4bda2edbecb93fa5c5910b8b91a866210368a95 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <tjoppen@acc.umu.se>
> Date: Sun, 1 Sep 2019 15:37:13 +0200
> Subject: [PATCH] libavcodec/cinepak: Separate decoding from parsing
> 
> This is rather strict and replaces the previous hacks with checks that should hopefully placate the fuzzers for the forseeable future
> ---
>  libavcodec/cinepak.c        | 337 +++++++++++++++++++++++-------------
>  tests/ref/fate/cvid-palette |   1 -
>  tests/ref/fate/cvid-partial |   1 -
>  3 files changed, 215 insertions(+), 124 deletions(-)

this does not seem to apply automatically here

Applying: libavcodec/cinepak: Separate decoding from parsing
error: sha1 information is lacking or useless (libavcodec/cinepak.c).
error: could not build fake ancestor
Patch failed at 0001 libavcodec/cinepak: Separate decoding from parsing


[...]

Patch hide | download patch | download mbox

From d4bda2edbecb93fa5c5910b8b91a866210368a95 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <tjoppen@acc.umu.se>
Date: Sun, 1 Sep 2019 15:37:13 +0200
Subject: [PATCH] libavcodec/cinepak: Separate decoding from parsing

This is rather strict and replaces the previous hacks with checks that should hopefully placate the fuzzers for the forseeable future
---
 libavcodec/cinepak.c        | 337 +++++++++++++++++++++++-------------
 tests/ref/fate/cvid-palette |   1 -
 tests/ref/fate/cvid-partial |   1 -
 3 files changed, 215 insertions(+), 124 deletions(-)

diff --git a/libavcodec/cinepak.c b/libavcodec/cinepak.c
index 0d8147b247..05d1820de1 100644
--- a/libavcodec/cinepak.c
+++ b/libavcodec/cinepak.c
@@ -46,6 +46,8 @@ 
 typedef uint8_t cvid_codebook[12];
 
 #define MAX_STRIPS      32
+#define MAX_HEIGHT      UINT16_MAX
+#define MB_AREA         16
 
 typedef struct cvid_strip {
     uint16_t          id;
@@ -53,6 +55,8 @@  typedef struct cvid_strip {
     uint16_t          x2, y2;
     cvid_codebook     v4_codebook[256];
     cvid_codebook     v1_codebook[256];
+    const uint8_t     *data;    // points to strip data, excluding header
+    int               size;
 } cvid_strip;
 
 typedef struct CinepakContext {
@@ -60,9 +64,6 @@  typedef struct CinepakContext {
     AVCodecContext *avctx;
     AVFrame *frame;
 
-    const unsigned char *data;
-    int size;
-
     int width, height;
 
     int palette_video;
@@ -71,6 +72,10 @@  typedef struct CinepakContext {
     int sega_film_skip_bytes;
 
     uint32_t pal[256];
+
+    int frame_flags;
+    int num_strips;
+    uint8_t strip_cover[MAX_HEIGHT];
 } CinepakContext;
 
 static void cinepak_decode_codebook (cvid_codebook *codebook,
@@ -269,12 +274,6 @@  static int cinepak_decode_strip (CinepakContext *s,
     const uint8_t *eod = (data + size);
     int      chunk_id, chunk_size;
 
-    /* coordinate sanity checks */
-    if (strip->x2 > s->width   ||
-        strip->y2 > s->height  ||
-        strip->x1 >= strip->x2 || strip->y1 >= strip->y2)
-        return AVERROR_INVALIDDATA;
-
     while ((data + 4) <= eod) {
         chunk_id   = data[0];
         chunk_size = AV_RB24 (&data[1]) - 4;
@@ -315,16 +314,78 @@  static int cinepak_decode_strip (CinepakContext *s,
     return AVERROR_INVALIDDATA;
 }
 
-static int cinepak_predecode_check (CinepakContext *s)
+static int cinepak_decode (CinepakContext *s)
 {
-    int           num_strips;
-    int           encoded_buf_size;
+    int i, result;
+
+    for (i = 0; i < s->num_strips; i++) {
+        // cinepak.txt claims that codebooks should be copied from the last strip of the previous frame
+        // I am not sure what to do in case we have just seeked to a keyframe
+        // this also requires keeping track of which strip was the last one,
+        // which I'm not doing quite yet
+        // the samples we have seem to decode fine the way this is right now, no need to mess with it unless a user complains
+        if ((i > 0) && !(s->frame_flags & 0x01)) {
+            memcpy (s->strips[i].v4_codebook, s->strips[i-1].v4_codebook,
+                sizeof(s->strips[i].v4_codebook));
+            memcpy (s->strips[i].v1_codebook, s->strips[i-1].v1_codebook,
+                sizeof(s->strips[i].v1_codebook));
+        }
 
-    num_strips  = AV_RB16 (&s->data[8]);
-    encoded_buf_size = AV_RB24(&s->data[1]);
+        result = cinepak_decode_strip (s, &s->strips[i], s->strips[i].data, s->strips[i].size);
+
+        if (result != 0)
+            return result;
+    }
+    return 0;
+}
 
-    if (s->size < encoded_buf_size * (int64_t)(100 - s->avctx->discard_damaged_percentage) / 100)
+static av_cold int cinepak_decode_init(AVCodecContext *avctx)
+{
+    CinepakContext *s = avctx->priv_data;
+
+    s->avctx = avctx;
+    s->width = (avctx->width + 3) & ~3;
+    s->height = (avctx->height + 3) & ~3;
+
+    // dimensions are limited to 16-bit in the format,
+    // so it would be exceedingly unlikely that this would ever be hit outside of fuzzing
+    if (s->height > MAX_HEIGHT) {
+        avpriv_request_sample(s->avctx, "height > UINT16_MAX");
+        return AVERROR_PATCHWELCOME;
+    }
+
+    s->sega_film_skip_bytes = -1;  /* uninitialized state */
+
+    // check for paletted data
+    if (avctx->bits_per_coded_sample != 8) {
+        s->palette_video = 0;
+        avctx->pix_fmt = AV_PIX_FMT_RGB24;
+    } else {
+        s->palette_video = 1;
+        avctx->pix_fmt = AV_PIX_FMT_PAL8;
+    }
+
+    s->frame = av_frame_alloc();
+    if (!s->frame)
+        return AVERROR(ENOMEM);
+
+    return 0;
+}
+
+static int cinepak_parse_header(AVCodecContext *avctx, const AVPacket *avpkt)
+{
+    CinepakContext *s = avctx->priv_data;
+    int w, h, encoded_buf_size;
+
+    if (avpkt->size < 10) {
         return AVERROR_INVALIDDATA;
+    }
+
+    s->frame_flags   = avpkt->data[0];
+    encoded_buf_size = AV_RB24(&avpkt->data[1]);
+    w                = AV_RB16(&avpkt->data[4]);
+    h                = AV_RB16(&avpkt->data[6]);
+    s->num_strips    = AV_RB16(&avpkt->data[8]);
 
     /* if this is the first frame, check for deviant Sega FILM data */
     if (s->sega_film_skip_bytes == -1) {
@@ -332,20 +393,20 @@  static int cinepak_predecode_check (CinepakContext *s)
             avpriv_request_sample(s->avctx, "encoded_buf_size 0");
             return AVERROR_PATCHWELCOME;
         }
-        if (encoded_buf_size != s->size && (s->size % encoded_buf_size) != 0) {
+        if (encoded_buf_size != avpkt->size && (avpkt->size % encoded_buf_size) != 0) {
             /* If the encoded frame size differs from the frame size as indicated
              * by the container file, this data likely comes from a Sega FILM/CPK file.
              * If the frame header is followed by the bytes FE 00 00 06 00 00 then
              * this is probably one of the two known files that have 6 extra bytes
              * after the frame header. Else, assume 2 extra bytes. The container
              * size also cannot be a multiple of the encoded size. */
-            if (s->size >= 16 &&
-                (s->data[10] == 0xFE) &&
-                (s->data[11] == 0x00) &&
-                (s->data[12] == 0x00) &&
-                (s->data[13] == 0x06) &&
-                (s->data[14] == 0x00) &&
-                (s->data[15] == 0x00))
+            if (avpkt->size >= 16 &&
+                (avpkt->data[10] == 0xFE) &&
+                (avpkt->data[11] == 0x00) &&
+                (avpkt->data[12] == 0x00) &&
+                (avpkt->data[13] == 0x06) &&
+                (avpkt->data[14] == 0x00) &&
+                (avpkt->data[15] == 0x00))
                 s->sega_film_skip_bytes = 6;
             else
                 s->sega_film_skip_bytes = 2;
@@ -353,117 +414,162 @@  static int cinepak_predecode_check (CinepakContext *s)
             s->sega_film_skip_bytes = 0;
     }
 
-    if (s->size < 10 + s->sega_film_skip_bytes + num_strips * 12)
+    // compare w and h to the width and height in CinepakContext, not AVCodecContext
+    // the former has been appropriately rounded up
+    if (w != s->width || h != s->height) {
+        av_log(avctx, AV_LOG_ERROR, "dimension mismatch (%ix%i vs %ix%i)\n", w, h, avctx->width, avctx->height);
         return AVERROR_INVALIDDATA;
+    }
 
-    if (num_strips) {
-        const uint8_t *data = s->data + 10 + s->sega_film_skip_bytes;
-        int strip_size = AV_RB24 (data + 1);
-        if (strip_size < 12 || strip_size > encoded_buf_size)
-            return AVERROR_INVALIDDATA;
+    if (s->num_strips == 0) {
+        // the old code would accept num_strips == 0 if there was at least a palette change
+        // I know of no sample that requires this, and the VfW 1.1 decoder
+        // will produce garbage if the entire frame isn't covered by strips
+        av_log(avctx, AV_LOG_ERROR, "num_strips == 0\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    // unfortunately we can't demand that the sizes match exactly
+    // example: http://samples.mplayerhq.hu/V-codecs/CVID/bad_cinepak_frame_size.mov
+    // at least require that the header value is no larger than the size of the packet
+    if (encoded_buf_size + s->sega_film_skip_bytes > avpkt->size) {
+        av_log(avctx, AV_LOG_ERROR, "CVID size larger than actual packet: %i vs %i\n", encoded_buf_size + s->sega_film_skip_bytes, avpkt->size);
+        return AVERROR_INVALIDDATA;
+    }
+
+    if (s->sega_film_skip_bytes) {
+        int skip_diff = avpkt->size - encoded_buf_size - s->sega_film_skip_bytes;
+        if (skip_diff != 2 && skip_diff != 6) {
+            // there doesn't seem to be much pattern to these 2 or 6 last bytes
+            // getting samples with different skip_diffs might be useful for further RE work
+            avpriv_request_sample(s->avctx, "SEGA FILM file with skip_diff %i not equal to 2 or 6", skip_diff);
+            return AVERROR_PATCHWELCOME;
+        }
     }
 
     return 0;
 }
 
-static int cinepak_decode (CinepakContext *s)
+static int cinepak_parse_strip_headers(AVCodecContext *avctx, const AVPacket *avpkt)
 {
-    const uint8_t  *eod = (s->data + s->size);
-    int           i, result, strip_size, frame_flags, num_strips;
-    int           y0 = 0;
+    CinepakContext *s = avctx->priv_data;
+    const uint8_t  *ptr = avpkt->data + 10 + s->sega_film_skip_bytes;
+    int i, y0 = 0, bytes_left = avpkt->size, htot = 0;
 
-    frame_flags = s->data[0];
-    num_strips  = AV_RB16 (&s->data[8]);
+    if (s->num_strips > MAX_STRIPS) {
+        avpriv_request_sample(avctx, "num_strips %i > %i", s->num_strips, MAX_STRIPS);
+        return AVERROR_PATCHWELCOME;
+    }
 
-    s->data += 10 + s->sega_film_skip_bytes;
+    // the frame is a keyframe only if none of the strips are inter coded
+    s->frame->key_frame = 1;
 
-    num_strips = FFMIN(num_strips, MAX_STRIPS);
+    memset(s->strip_cover, 0, s->height);
 
-    s->frame->key_frame = 0;
+    for (i = 0; i < s->num_strips; i++) {
+        int id, w, h, pixels;
 
-    for (i=0; i < num_strips; i++) {
-        if ((s->data + 12) > eod)
-            return AVERROR_INVALIDDATA;
+        if (bytes_left < 12) {
+            avpriv_request_sample(avctx, "only %i bytes left for strip %i header", bytes_left, i);
+            return AVERROR_PATCHWELCOME;
+        }
 
-        s->strips[i].id = s->data[0];
-/* zero y1 means "relative to the previous stripe" */
-        if (!(s->strips[i].y1 = AV_RB16 (&s->data[4])))
-            s->strips[i].y2 = (s->strips[i].y1 = y0) + AV_RB16 (&s->data[8]);
-        else
-            s->strips[i].y2 = AV_RB16 (&s->data[8]);
-        s->strips[i].x1 = AV_RB16 (&s->data[6]);
-        s->strips[i].x2 = AV_RB16 (&s->data[10]);
+        id = s->strips[i].id = ptr[0];
 
-        if (s->strips[i].id == 0x10)
-            s->frame->key_frame = 1;
+        if (id != 0x10 && id != 0x11 &&
+            // http://samples.mplayerhq.hu/V-codecs/CVID/1984.apple_ad.mov
+            id != 0x00 && id != 0x01) {
+            avpriv_request_sample(s->avctx, "unexpected strip ID %02x", s->strips[i].id);
+            return AVERROR_PATCHWELCOME;
+        }
 
-        strip_size = AV_RB24 (&s->data[1]) - 12;
-        if (strip_size < 0)
+/* zero y1 means "relative to the previous stripe" */
+        if (!(s->strips[i].y1 = AV_RB16 (&ptr[4])))
+            s->strips[i].y2 = (s->strips[i].y1 = y0) + AV_RB16 (&ptr[8]);
+        else
+            s->strips[i].y2 = AV_RB16 (&ptr[8]);
+        s->strips[i].x1 = AV_RB16 (&ptr[6]);
+        s->strips[i].x2 = AV_RB16 (&ptr[10]);
+
+        /* coordinate sanity checks */
+        if (s->strips[i].x2 > s->width   ||
+            s->strips[i].y2 > s->height  ||
+            s->strips[i].x1 >= s->strips[i].x2 ||
+            s->strips[i].y1 >= s->strips[i].y2) {
+            av_log(avctx, AV_LOG_ERROR, "strip %i with bad coordinates (%i,%i %i,%i)", i,
+                   (int)s->strips[i].x1, (int)s->strips[i].y1,
+                   (int)s->strips[i].x2, (int)s->strips[i].y2);
             return AVERROR_INVALIDDATA;
-        s->data   += 12;
-        strip_size = ((s->data + strip_size) > eod) ? (eod - s->data) : strip_size;
-
-        if ((i > 0) && !(frame_flags & 0x01)) {
-            memcpy (s->strips[i].v4_codebook, s->strips[i-1].v4_codebook,
-                sizeof(s->strips[i].v4_codebook));
-            memcpy (s->strips[i].v1_codebook, s->strips[i-1].v1_codebook,
-                sizeof(s->strips[i].v1_codebook));
         }
 
-        result = cinepak_decode_strip (s, &s->strips[i], s->data, strip_size);
+        // cinepak.txt claims size is 16-bit, but
+        // http://samples.mplayerhq.hu/V-codecs/CVID/tn.avi
+        // has strips which are bigger than that
+        s->strips[i].size = AV_RB24(&ptr[1]);
 
-        if (result != 0)
-            return result;
+        if (s->strips[i].size < 12 || s->strips[i].size > bytes_left) {
+            av_log(avctx, AV_LOG_ERROR, "invalid strip %i size: %i vs %i\n", i, s->strips[i].size, bytes_left);
+            return AVERROR_INVALIDDATA;
+        }
 
-        s->data += strip_size;
-        y0    = s->strips[i].y2;
-    }
-    return 0;
-}
+        w = s->strips[i].x2 - s->strips[i].x1;
+        h = s->strips[i].y2 - s->strips[i].y1;
+        htot += h;
+        pixels = w*h;
 
-static av_cold int cinepak_decode_init(AVCodecContext *avctx)
-{
-    CinepakContext *s = avctx->priv_data;
+        // I'm not sure if vertical splitting is allowed, all samples available use only horizontal splitting
+        // disallow it for now, since allowing it would make the strip coverage check further down more expensive
+        // ideally we should check what both the VfW 1.1 and MacOS decoders do in this case
+        if (w != s->width) {
+            avpriv_request_sample(s->avctx, "strip with %i != frame width %i", w, s->width);
+            return AVERROR_PATCHWELCOME;
+        }
 
-    s->avctx = avctx;
-    s->width = (avctx->width + 3) & ~3;
-    s->height = (avctx->height + 3) & ~3;
+        // id == 0x11 vs id != 0x10? 1984.apple_ad.mov makes this tricky
+        if (id != 0x10) {
+            // inter-coded -> not a keyframe
+            // there can be a mix of intra and inter strips in a single frame
+            s->frame->key_frame = 0;
+
+            // inter strips can be MODE_MC, which are at least 1 bit per MB
+            if (s->strips[i].size < pixels/(MB_AREA*8)) {
+                avpriv_request_sample(s->avctx, "strip %i too small to be inter (%i B vs %i pixels)",
+                    i, s->strips[i].size, pixels);
+                return AVERROR_PATCHWELCOME;
+            }
+        } else if (s->strips[i].size < pixels/MB_AREA) {
+            // intra strips are at least 8 bits per MB (MODE_V1_ONLY)
+            // MODE_V1_V4 is 9 bits per MB, but we don't know if the strip is that mode at this stage
+            avpriv_request_sample(s->avctx, "strip %i too small to be intra (%i B vs %i pixels)",
+                i, s->strips[i].size, pixels);
+            return AVERROR_PATCHWELCOME;
+        }
 
-    s->sega_film_skip_bytes = -1;  /* uninitialized state */
+        // mark coded lines
+        for (int y = s->strips[i].y1; y < s->strips[i].y2; y++) {
+            s->strip_cover[y] = 1;
+        }
 
-    // check for paletted data
-    if (avctx->bits_per_coded_sample != 8) {
-        s->palette_video = 0;
-        avctx->pix_fmt = AV_PIX_FMT_RGB24;
-    } else {
-        s->palette_video = 1;
-        avctx->pix_fmt = AV_PIX_FMT_PAL8;
+        // point into the strip data
+        s->strips[i].data  = ptr + 12;
+        ptr               += s->strips[i].size;
+        bytes_left        -= s->strips[i].size;
+        s->strips[i].size -= 12;
+        y0                 = s->strips[i].y2;
     }
 
-    s->frame = av_frame_alloc();
-    if (!s->frame)
-        return AVERROR(ENOMEM);
-
-    return 0;
-}
-
-static int cinepak_parse_header(AVCodecContext *avctx, int *keyframe, int *num_strips)
-{
-    CinepakContext *s = avctx->priv_data;
-    int sz, w, h;
-
-    if (s->size < 10) {
-        return AVERROR_INVALIDDATA;
+    // require that every single line be coded at least once
+    for (int y = 0; y < s->height; y++) {
+        if (!s->strip_cover[y]) {
+            avpriv_request_sample(s->avctx, "line %i not covered by a strip", y);
+            return AVERROR_PATCHWELCOME;
+        }
     }
 
-    *keyframe   = s->data[0];
-    sz          = AV_RB24(&s->data[1]);
-    w           = AV_RB16(&s->data[4]);
-    h           = AV_RB16(&s->data[6]);
-    *num_strips = AV_RB16 (&s->data[8]);
-
-    if (sz - 10 != s->size || w != avctx->width || h != avctx->height || *num_strips == 0) {
-        return AVERROR_INVALIDDATA;
+    // require that every line be coded *exactly* once
+    if (htot != s->height) {
+        avpriv_request_sample(s->avctx, "strip heights not adding up to frame height");
+        return AVERROR_PATCHWELCOME;
     }
 
     return 0;
@@ -473,24 +579,11 @@  static int cinepak_decode_frame(AVCodecContext *avctx,
                                 void *data, int *got_frame,
                                 AVPacket *avpkt)
 {
-    const uint8_t *buf = avpkt->data;
-    int ret = 0, buf_size = avpkt->size;
     CinepakContext *s = avctx->priv_data;
-    int num_strips, keyframe;
-
-    s->data = buf;
-    s->size = buf_size;
-
-    if ((ret = cinepak_parse_header(avctx, &keyframe, &num_strips)) < 0) {
-        return buf_size;
-    }
-
-    //Empty frame, do not waste time
-    if (!num_strips && (!s->palette_video || !av_packet_get_side_data(avpkt, AV_PKT_DATA_PALETTE, NULL)))
-        return buf_size;
+    int ret = 0;
 
-    if ((ret = cinepak_predecode_check(s)) < 0) {
-        av_log(avctx, AV_LOG_ERROR, "cinepak_predecode_check failed\n");
+    if ((ret = cinepak_parse_header(avctx, avpkt)) < 0 ||
+        (ret = cinepak_parse_strip_headers(avctx, avpkt)) < 0) {
         return ret;
     }
 
@@ -521,7 +614,7 @@  static int cinepak_decode_frame(AVCodecContext *avctx,
     *got_frame = 1;
 
     /* report that the buffer was completely consumed */
-    return buf_size;
+    return avpkt->size;
 }
 
 static av_cold int cinepak_decode_end(AVCodecContext *avctx)
diff --git a/tests/ref/fate/cvid-palette b/tests/ref/fate/cvid-palette
index eae41618b5..975778bafb 100644
--- a/tests/ref/fate/cvid-palette
+++ b/tests/ref/fate/cvid-palette
@@ -58,4 +58,3 @@ 
 0,         52,         52,        1,    57600, 0xdae585c7
 0,         53,         53,        1,    57600, 0xf99baa60
 0,         54,         54,        1,    57600, 0x28a8b1ee
-0,         55,         55,        1,    57600, 0xcd5587f8
diff --git a/tests/ref/fate/cvid-partial b/tests/ref/fate/cvid-partial
index 990939db1c..38be650ed5 100644
--- a/tests/ref/fate/cvid-partial
+++ b/tests/ref/fate/cvid-partial
@@ -81,4 +81,3 @@ 
 0,         75,         75,        1,   224400, 0x920ee561
 0,         76,         76,        1,   224400, 0x8a2c1bbf
 0,         77,         77,        1,   224400, 0x6150c072
-0,         78,         78,        1,   224400, 0x499dc869
-- 
2.20.1