[FFmpeg-devel] mpeg12dec fix up DVD caption handling

Message ID 57D72C08.4020808@impactstudiopro.com
State Superseded

Commit Message

Jonathan Campbell Sept. 12, 2016, 10:28 p.m. UTC
These patches fix up the DVD caption handling in mpeg12dec.c to better handle odd cases.
It's based on code I've written elsewhere to handle captions.
While it's common for these packets to contain 15 frames' worth of captions and start on the odd field, there are also DVDs that start on the even field, or even encode extra fields and switch starting fields.
Part of the patch is to document comprehensively the format of the DVD caption packet.

Jonathan Campbell
From 9213012c7d8ceef2af43fe3c218b1b50728e8f80 Mon Sep 17 00:00:00 2001
From: Jonathan Campbell <jonathan@castus.tv>
Date: Mon, 12 Sep 2016 12:34:48 -0700
Subject: [PATCH 1/2] add comments documenting the format of the DVD CC
 user-data packet. this is to aid development and maintenance of that code.

---
 libavcodec/mpeg12dec.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)
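
The packet layout that the patch documents in its comment can be sketched as a small standalone parser. This is only an illustration under stated assumptions, not the code from the patch or from FFmpeg: `DvdCcWord`, `DVD_CC_MAX_WORDS` and `dvd_cc_parse` are hypothetical names, and the buffer is assumed to begin at the "CC" identifier, just after the 0x000001B2 start code. Following the patch's comment, the field of each caption word is derived from caption_odd_field_first and then alternated, and the per-word caption_field_odd bit is deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names, for illustration only (not FFmpeg API). */
typedef struct DvdCcWord {
    int     odd_field; /* 1 = odd field (CC1/CC2), 0 = even field (CC3/CC4) */
    uint8_t byte1;     /* caption_first_byte */
    uint8_t byte2;     /* caption_second_byte */
} DvdCcWord;

/* caption_block_count is 5 bits wide: at most 31 * 2 + 1 caption words. */
#define DVD_CC_MAX_WORDS 63

/* buf is assumed to point at the 'C','C' identifier.
 * Returns the number of caption words stored in out,
 * or -1 if the buffer does not look like a DVD CC packet. */
static int dvd_cc_parse(const uint8_t *buf, size_t size,
                        DvdCcWord out[DVD_CC_MAX_WORDS])
{
    if (size < 5 || buf[0] != 'C' || buf[1] != 'C' ||
        buf[2] != 0x01 || buf[3] != 0xF8)
        return -1;

    int odd_first = (buf[4] >> 7) & 1;    /* caption_odd_field_first */
    int blocks    = (buf[4] >> 1) & 0x1F; /* caption_block_count */
    int extra     =  buf[4]       & 1;    /* caption_extra_field_added */
    int words     = blocks * 2 + extra;

    /* Never read past the end of a truncated packet. */
    size_t avail = (size - 5) / 3;
    if ((size_t)words > avail)
        words = (int)avail;

    int odd = odd_first;
    for (int i = 0; i < words; i++) {
        const uint8_t *w = buf + 5 + 3 * i;
        /* w[0] is the marker byte; its low bit (caption_field_odd) is
         * ignored because some DVDs set it to 1 for both fields. */
        out[i].odd_field = odd;
        out[i].byte1     = w[1];
        out[i].byte2     = w[2];
        odd = !odd;
    }
    return words;
}
```

A caller would pass the user-data payload and its size, then route each returned word to the odd or even field according to `odd_field`. Clamping the word count to the available bytes is a choice made for this sketch; a real decoder might instead reject a truncated packet.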

Comments

Michael Niedermayer Sept. 12, 2016, 11:56 p.m. UTC | #1
On Mon, Sep 12, 2016 at 03:28:24PM -0700, Jonathan Campbell wrote:
> These patches fix up the DVD caption handling in mpeg12dec.c to better handle odd cases.
> It's based on code I've written elsewhere to handle captions.
> While it's common for these packets to contain 15 frames' worth of captions and start on the odd field, there are also DVDs that start on the even field, or even encode extra fields and switch starting fields.
> Part of the patch is to document comprehensively the format of the DVD caption packet.
> 
> Jonathan Campbell

>  mpeg12dec.c |   27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> a839a0d0e9000ab140f6aef9dee9577f242462bf  0001-add-comments-documenting-the-format-of-the-DVD-CC-us.patch
> From 9213012c7d8ceef2af43fe3c218b1b50728e8f80 Mon Sep 17 00:00:00 2001
> From: Jonathan Campbell <jonathan@castus.tv>
> Date: Mon, 12 Sep 2016 12:34:48 -0700
> Subject: [PATCH 1/2] add comments documenting the format of the DVD CC
>  user-data packet. this is to aid development and maintenance of that code.
> 
> ---
>  libavcodec/mpeg12dec.c | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c
> index 204a578..522621a 100644
> --- a/libavcodec/mpeg12dec.c
> +++ b/libavcodec/mpeg12dec.c
> @@ -2262,7 +2262,32 @@ static int mpeg_decode_a53_cc(AVCodecContext *avctx,
>          return 1;
>      } else if (buf_size >= 11 &&
>                 p[0] == 'C' && p[1] == 'C' && p[2] == 0x01 && p[3] == 0xf8) {
> -        /* extract DVD CC data */
> +        /* extract DVD CC data

> +         * for more information see: [https://en.wikipedia.org/wiki/EIA-608#DVD_GOP_User_Data_Insertion]

Wikipedia is not a good reference; in fact, it's not even a
stable reference without a revision. Wikipedia can change massively
and may at times, especially in niche areas, be just wrong. The link
itself may also not always work.

Please use the specification itself. H.262 is public; it's the 4th
link when searching for H.262 with Google, for example (Wikipedia
refers to H.262, IIUC).

[...]
Patch

diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c
index 204a578..522621a 100644
--- a/libavcodec/mpeg12dec.c
+++ b/libavcodec/mpeg12dec.c
@@ -2262,7 +2262,32 @@  static int mpeg_decode_a53_cc(AVCodecContext *avctx,
         return 1;
     } else if (buf_size >= 11 &&
                p[0] == 'C' && p[1] == 'C' && p[2] == 0x01 && p[3] == 0xf8) {
-        /* extract DVD CC data */
+        /* extract DVD CC data
+         * for more information see: [https://en.wikipedia.org/wiki/EIA-608#DVD_GOP_User_Data_Insertion]
+         *
+         * uint32_t   user_data_start_code        0x000001B2    (big endian)
+         * uint16_t   user_identifier             0x4343 "CC"
+         * uint8_t    user_data_type_code         0x01
+         * uint8_t    caption_block_size          0xF8
+         * uint8_t
+         *   bit 7    caption_odd_field_first     1=odd field (CC1/CC2) first  0=even field (CC3/CC4) first
+         *   bit 6    caption_filler              0
+         *   bit 5:1  caption_block_count         number of caption blocks (pairs of caption words = frames). Most DVDs use 15 per start of GOP.
+         *   bit 0    caption_extra_field_added   1=one additional caption word
+         *
+         * struct caption_field_block {
+         *   uint8_t
+         *     bit 7:1 caption_filler             0x7F (all 1s)
+         *     bit 0   caption_field_odd          1=odd field (this is CC1/CC2)  0=even field (this is CC3/CC4)
+         *   uint8_t   caption_first_byte
+         *   uint8_t   caption_second_byte
+         * } caption_block[(caption_block_count * 2) + caption_extra_field_added];
+         *
+         * Some DVDs encode caption data for both fields with caption_field_odd=1. The only way to decode the fields
+         * correctly is to start on the field indicated by caption_odd_field_first and count between odd/even fields.
+         * Don't assume that the first caption word is the odd field. There do exist MPEG files in the wild that start
+         * on the even field. There also exist DVDs in the wild that encode an odd field count and the
+         * caption_extra_field_added/caption_odd_field_first bits change per packet to allow that. */
         int cc_count = 0;
         int i;
         // There is a caption count field in the data, but it is often