diff mbox

[FFmpeg-devel] mpeg12dec fix up DVD caption handling

Message ID 57D7542F.5060103@impactstudiopro.com
State Superseded
Headers show

Commit Message

Jonathan Campbell Sept. 13, 2016, 1:19 a.m. UTC
On 09/12/2016 04:56 PM, Michael Niedermayer wrote:
> On Mon, Sep 12, 2016 at 03:28:24PM -0700, Jonathan Campbell wrote:
>> These patches fix up the DVD caption handling in mpeg12dec.c to better handle odd cases.
>> It's based on code I've written elsewhere to handle captions.
>> While it's common for these packets to contain 15 frames worth and start on the odd field there are also DVDs that start on even field or even encode extra fields and switch starting fields.
>> Part of the patch is to document comprehensively the format of the DVD caption packet.
>>
>> Jonathan Campbell
> 
>>  mpeg12dec.c |   27 ++++++++++++++++++++++++++-
>>  1 file changed, 26 insertions(+), 1 deletion(-)
>> a839a0d0e9000ab140f6aef9dee9577f242462bf  0001-add-comments-documenting-the-format-of-the-DVD-CC-us.patch
>> From 9213012c7d8ceef2af43fe3c218b1b50728e8f80 Mon Sep 17 00:00:00 2001
>> From: Jonathan Campbell <jonathan@castus.tv>
>> Date: Mon, 12 Sep 2016 12:34:48 -0700
>> Subject: [PATCH 1/2] add comments documenting the format of the DVD CC
>>  user-data packet. this is to aid development and maintenance of that code.
>>
>> ---
>>  libavcodec/mpeg12dec.c | 27 ++++++++++++++++++++++++++-
>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c
>> index 204a578..522621a 100644
>> --- a/libavcodec/mpeg12dec.c
>> +++ b/libavcodec/mpeg12dec.c
>> @@ -2262,7 +2262,32 @@ static int mpeg_decode_a53_cc(AVCodecContext *avctx,
>>          return 1;
>>      } else if (buf_size >= 11 &&
>>                 p[0] == 'C' && p[1] == 'C' && p[2] == 0x01 && p[3] == 0xf8) {
>> -        /* extract DVD CC data */
>> +        /* extract DVD CC data
> 
>> +         * for more information see: [https://en.wikipedia.org/wiki/EIA-608#DVD_GOP_User_Data_Insertion]
> 
> wikipedia is not a good reference, in fact its not even a
> constant reference without a revission. wikipedia can massivly change
> and may at times, especially with niche areas be just wrong, the link
> itself also wont always work possibly
> 
> Please use the specifications itself, H.262 is public, its the 4th
> link when searching for H.262 with google for example (wikipedia
> refers to H.262 IIUC)
> 
> [...]
> 
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
Updated patch set, removes Wikipedia link.

I see that H.262 specs are free on the web, but I can't find the part that describes the DVD-style H.262 user data packets that this code handles.

But the structure as described has been reliable when tested against my DVD library.

Jonathan Campbell
From 9213012c7d8ceef2af43fe3c218b1b50728e8f80 Mon Sep 17 00:00:00 2001
From: Jonathan Campbell <jonathan@castus.tv>
Date: Mon, 12 Sep 2016 12:34:48 -0700
Subject: [PATCH 1/7] add comments documenting the format of the DVD CC
 user-data packet. this is to aid development and maintenance of that code.

---
 libavcodec/mpeg12dec.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

Comments

Michael Niedermayer Sept. 13, 2016, 2:58 a.m. UTC | #1
On Mon, Sep 12, 2016 at 06:19:43PM -0700, Jonathan Campbell wrote:
> 
> 
> On 09/12/2016 04:56 PM, Michael Niedermayer wrote:
> > On Mon, Sep 12, 2016 at 03:28:24PM -0700, Jonathan Campbell wrote:
> >> These patches fix up the DVD caption handling in mpeg12dec.c to better handle odd cases.
> >> It's based on code I've written elsewhere to handle captions.
> >> While it's common for these packets to contain 15 frames worth and start on the odd field there are also DVDs that start on even field or even encode extra fields and switch starting fields.
> >> Part of the patch is to document comprehensively the format of the DVD caption packet.
> >>
> >> Jonathan Campbell
> > 
> >>  mpeg12dec.c |   27 ++++++++++++++++++++++++++-
> >>  1 file changed, 26 insertions(+), 1 deletion(-)
> >> a839a0d0e9000ab140f6aef9dee9577f242462bf  0001-add-comments-documenting-the-format-of-the-DVD-CC-us.patch
> >> From 9213012c7d8ceef2af43fe3c218b1b50728e8f80 Mon Sep 17 00:00:00 2001
> >> From: Jonathan Campbell <jonathan@castus.tv>
> >> Date: Mon, 12 Sep 2016 12:34:48 -0700
> >> Subject: [PATCH 1/2] add comments documenting the format of the DVD CC
> >>  user-data packet. this is to aid development and maintenance of that code.
> >>
> >> ---
> >>  libavcodec/mpeg12dec.c | 27 ++++++++++++++++++++++++++-
> >>  1 file changed, 26 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c
> >> index 204a578..522621a 100644
> >> --- a/libavcodec/mpeg12dec.c
> >> +++ b/libavcodec/mpeg12dec.c
> >> @@ -2262,7 +2262,32 @@ static int mpeg_decode_a53_cc(AVCodecContext *avctx,
> >>          return 1;
> >>      } else if (buf_size >= 11 &&
> >>                 p[0] == 'C' && p[1] == 'C' && p[2] == 0x01 && p[3] == 0xf8) {
> >> -        /* extract DVD CC data */
> >> +        /* extract DVD CC data
> > 
> >> +         * for more information see: [https://en.wikipedia.org/wiki/EIA-608#DVD_GOP_User_Data_Insertion]
> > 
> > wikipedia is not a good reference, in fact its not even a
> > constant reference without a revission. wikipedia can massivly change
> > and may at times, especially with niche areas be just wrong, the link
> > itself also wont always work possibly
> > 
> > Please use the specifications itself, H.262 is public, its the 4th
> > link when searching for H.262 with google for example (wikipedia
> > refers to H.262 IIUC)
> > 
> > [...]
> > 
> > 
> > 
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > 
> Updated patch set, removes Wikipedia link.
> 
> I see that H.262 specs are free on the web, but I can't find the part that describes the DVD-style H.262 user data packets that this code handles.
> 
> But the structure as described has been reliable when tested against my DVD library.

applied the docs and wiki removial stashed

please dont send multiple patches per mail it makes keeping track of
what needs a review hard both with MUAs as well as patchwork

[...]
Jonathan Campbell Sept. 13, 2016, 5:08 p.m. UTC | #2
On 09/12/2016 07:58 PM, Michael Niedermayer wrote:
> On Mon, Sep 12, 2016 at 06:19:43PM -0700, Jonathan Campbell wrote:
>>
>>
>> On 09/12/2016 04:56 PM, Michael Niedermayer wrote:
>>> On Mon, Sep 12, 2016 at 03:28:24PM -0700, Jonathan Campbell wrote:
>>>> These patches fix up the DVD caption handling in mpeg12dec.c to better handle odd cases.
>>>> It's based on code I've written elsewhere to handle captions.
>>>> While it's common for these packets to contain 15 frames worth and start on the odd field there are also DVDs that start on even field or even encode extra fields and switch starting fields.
>>>> Part of the patch is to document comprehensively the format of the DVD caption packet.
>>>>
>>>> Jonathan Campbell
>>>
>>>>  mpeg12dec.c |   27 ++++++++++++++++++++++++++-
>>>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>>> a839a0d0e9000ab140f6aef9dee9577f242462bf  0001-add-comments-documenting-the-format-of-the-DVD-CC-us.patch
>>>> From 9213012c7d8ceef2af43fe3c218b1b50728e8f80 Mon Sep 17 00:00:00 2001
>>>> From: Jonathan Campbell <jonathan@castus.tv>
>>>> Date: Mon, 12 Sep 2016 12:34:48 -0700
>>>> Subject: [PATCH 1/2] add comments documenting the format of the DVD CC
>>>>  user-data packet. this is to aid development and maintenance of that code.
>>>>
>>>> ---
>>>>  libavcodec/mpeg12dec.c | 27 ++++++++++++++++++++++++++-
>>>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c
>>>> index 204a578..522621a 100644
>>>> --- a/libavcodec/mpeg12dec.c
>>>> +++ b/libavcodec/mpeg12dec.c
>>>> @@ -2262,7 +2262,32 @@ static int mpeg_decode_a53_cc(AVCodecContext *avctx,
>>>>          return 1;
>>>>      } else if (buf_size >= 11 &&
>>>>                 p[0] == 'C' && p[1] == 'C' && p[2] == 0x01 && p[3] == 0xf8) {
>>>> -        /* extract DVD CC data */
>>>> +        /* extract DVD CC data
>>>
>>>> +         * for more information see: [https://en.wikipedia.org/wiki/EIA-608#DVD_GOP_User_Data_Insertion]
>>>
>>> wikipedia is not a good reference, in fact its not even a
>>> constant reference without a revission. wikipedia can massivly change
>>> and may at times, especially with niche areas be just wrong, the link
>>> itself also wont always work possibly
>>>
>>> Please use the specifications itself, H.262 is public, its the 4th
>>> link when searching for H.262 with google for example (wikipedia
>>> refers to H.262 IIUC)
>>>
>>> [...]
>>>
>>>
>>>
>>> _______________________________________________
>>> ffmpeg-devel mailing list
>>> ffmpeg-devel@ffmpeg.org
>>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>>
>> Updated patch set, removes Wikipedia link.
>>
>> I see that H.262 specs are free on the web, but I can't find the part that describes the DVD-style H.262 user data packets that this code handles.
>>
>> But the structure as described has been reliable when tested against my DVD library.
> 
> applied the docs and wiki removial stashed
> 
> please dont send multiple patches per mail it makes keeping track of
> what needs a review hard both with MUAs as well as patchwork
> 
> [...]
> 
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
Makes sense.

Would running git format-patch >patchfile work to submit it as one patch or do I need to squash it into one patch?

Do I need to resubmit anything?

Jonathan Campbell
Moritz Barsnick Sept. 13, 2016, 7:42 p.m. UTC | #3
On Mon, Sep 12, 2016 at 18:19:43 -0700, Jonathan Campbell wrote:
> Subject: [PATCH 2/7] read caption words field-wise, count properly and limit
>  to cc_count. transfer each CC word taking into consideration immediate CC
>  field bit or for DVDs that don't use it, keep track according to first field
>  specified at start of DVD caption packet.

We're apparently missing patches 3 to 6 of 7?

Anyways, commit messages should consist of a first line as summary (not
too long) with a prefix such as "lavc/mpeg12dec: ", then an empty line
separating the following optional, but possibly verbose commit
explanation or continuation. Don't try to squeeze mulitple sentences
into the commit line. (Check other commits for reference.)

> +            if ((p[i] & 0xfe) != 0xfe) {
> +                av_log(avctx, AV_LOG_DEBUG, "cc_count is too large (%u > %u) or junk data in DVD caption packet",(unsigned int)caption_block_count,(unsigned int)cc_count);

I don't see the need to cast ints to unsigned ints for printing. Either
just print with %d, or use unsigned ints in the first place if you
never (ever) need negative values.

> +                    /* if the source actually uses the caption_odd_field bit, then use that to determine the field.
> +                     * else, toggle between fields to keep track for DVDs where p[0] == 0xFF at all times. */

Very nitpicky: If you use full sentences ending in a period, then also
begin them with capitals, like writing proper text. Else my eyes hurt.
(Or: Else it's hard finding the beginning.)
;-)

Moritz
Jonathan Campbell Sept. 13, 2016, 9:37 p.m. UTC | #4
On 09/13/2016 12:42 PM, Moritz Barsnick wrote:
> On Mon, Sep 12, 2016 at 18:19:43 -0700, Jonathan Campbell wrote:
>> Subject: [PATCH 2/7] read caption words field-wise, count properly and limit
>>  to cc_count. transfer each CC word taking into consideration immediate CC
>>  field bit or for DVDs that don't use it, keep track according to first field
>>  specified at start of DVD caption packet.
> 
> We're apparently missing patches 3 to 6 of 7?

In the branch I wrote these patches, I wrote a quick proof-of-concept program to read, dump, and decode the A53 caption side data from the AVFrame to help confirm that the data was read correctly from various MPEG and VOB files on my hard drive. That code was patches 3 to 6 and not relevant to what I posted therefore not included.

You can look at that program on Github if you like: https://github.com/joncampbell123/FFmpeg/commit/98ce62db24d0d537d9ae99610c67d7caa09a0a0c.patch

> 
> Anyways, commit messages should consist of a first line as summary (not
> too long) with a prefix such as "lavc/mpeg12dec: ", then an empty line
> separating the following optional, but possibly verbose commit
> explanation or continuation. Don't try to squeeze mulitple sentences
> into the commit line. (Check other commits for reference.)

Good idea. Will do.

> 
>> +            if ((p[i] & 0xfe) != 0xfe) {
>> +                av_log(avctx, AV_LOG_DEBUG, "cc_count is too large (%u > %u) or junk data in DVD caption packet",(unsigned int)caption_block_count,(unsigned int)cc_count);
> 
> I don't see the need to cast ints to unsigned ints for printing. Either
> just print with %d, or use unsigned ints in the first place if you
> never (ever) need negative values.

That's true. I'm used to using printf() against library structures that may or may not use unsigned int or unsigned long.

> 
>> +                    /* if the source actually uses the caption_odd_field bit, then use that to determine the field.
>> +                     * else, toggle between fields to keep track for DVDs where p[0] == 0xFF at all times. */
> 
> Very nitpicky: If you use full sentences ending in a period, then also
> begin them with capitals, like writing proper text. Else my eyes hurt.
> (Or: Else it's hard finding the beginning.)

Will fix.

> ;-)
> 
> Moritz
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 

Should I make these changes and resubmit them? (except for the documentation patch which I saw)

Jonathan Campbell
Michael Niedermayer Sept. 14, 2016, 1:01 a.m. UTC | #5
On Tue, Sep 13, 2016 at 10:08:06AM -0700, Jonathan Campbell wrote:
> 
> 
> On 09/12/2016 07:58 PM, Michael Niedermayer wrote:
> > On Mon, Sep 12, 2016 at 06:19:43PM -0700, Jonathan Campbell wrote:
> >>
> >>
> >> On 09/12/2016 04:56 PM, Michael Niedermayer wrote:
> >>> On Mon, Sep 12, 2016 at 03:28:24PM -0700, Jonathan Campbell wrote:
> >>>> These patches fix up the DVD caption handling in mpeg12dec.c to better handle odd cases.
> >>>> It's based on code I've written elsewhere to handle captions.
> >>>> While it's common for these packets to contain 15 frames worth and start on the odd field there are also DVDs that start on even field or even encode extra fields and switch starting fields.
> >>>> Part of the patch is to document comprehensively the format of the DVD caption packet.
> >>>>
> >>>> Jonathan Campbell
> >>>
> >>>>  mpeg12dec.c |   27 ++++++++++++++++++++++++++-
> >>>>  1 file changed, 26 insertions(+), 1 deletion(-)
> >>>> a839a0d0e9000ab140f6aef9dee9577f242462bf  0001-add-comments-documenting-the-format-of-the-DVD-CC-us.patch
> >>>> From 9213012c7d8ceef2af43fe3c218b1b50728e8f80 Mon Sep 17 00:00:00 2001
> >>>> From: Jonathan Campbell <jonathan@castus.tv>
> >>>> Date: Mon, 12 Sep 2016 12:34:48 -0700
> >>>> Subject: [PATCH 1/2] add comments documenting the format of the DVD CC
> >>>>  user-data packet. this is to aid development and maintenance of that code.
> >>>>
> >>>> ---
> >>>>  libavcodec/mpeg12dec.c | 27 ++++++++++++++++++++++++++-
> >>>>  1 file changed, 26 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c
> >>>> index 204a578..522621a 100644
> >>>> --- a/libavcodec/mpeg12dec.c
> >>>> +++ b/libavcodec/mpeg12dec.c
> >>>> @@ -2262,7 +2262,32 @@ static int mpeg_decode_a53_cc(AVCodecContext *avctx,
> >>>>          return 1;
> >>>>      } else if (buf_size >= 11 &&
> >>>>                 p[0] == 'C' && p[1] == 'C' && p[2] == 0x01 && p[3] == 0xf8) {
> >>>> -        /* extract DVD CC data */
> >>>> +        /* extract DVD CC data
> >>>
> >>>> +         * for more information see: [https://en.wikipedia.org/wiki/EIA-608#DVD_GOP_User_Data_Insertion]
> >>>
> >>> wikipedia is not a good reference, in fact its not even a
> >>> constant reference without a revission. wikipedia can massivly change
> >>> and may at times, especially with niche areas be just wrong, the link
> >>> itself also wont always work possibly
> >>>
> >>> Please use the specifications itself, H.262 is public, its the 4th
> >>> link when searching for H.262 with google for example (wikipedia
> >>> refers to H.262 IIUC)
> >>>
> >>> [...]
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> ffmpeg-devel mailing list
> >>> ffmpeg-devel@ffmpeg.org
> >>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >>>
> >> Updated patch set, removes Wikipedia link.
> >>
> >> I see that H.262 specs are free on the web, but I can't find the part that describes the DVD-style H.262 user data packets that this code handles.
> >>
> >> But the structure as described has been reliable when tested against my DVD library.
> > 
> > applied the docs and wiki removial stashed
> > 
> > please dont send multiple patches per mail it makes keeping track of
> > what needs a review hard both with MUAs as well as patchwork
> > 
> > [...]
> > 
> > 
> > 
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > 
> Makes sense.
> 
> Would running git format-patch >patchfile work to submit it as one patch or do I need to squash it into one patch?
> 
> Do I need to resubmit anything?

yes, please resbmit the remaining patch/changes with any comments
from everyone taken care of or with explanation why not

thx

[...]
Jonathan Campbell Sept. 14, 2016, 6:06 p.m. UTC | #6
>> Would running git format-patch >patchfile work to submit it as one patch or do I need to squash it into one patch?
>>
>> Do I need to resubmit anything?
> 
> yes, please resbmit the remaining patch/changes with any comments
> from everyone taken care of or with explanation why not
> 
> thx
> 
> [...]

Do you need me to resubmit the patches for the AC-3 consistent noise generation as well?

Jonathan Campbell
diff mbox

Patch

diff --git a/libavcodec/mpeg12dec.c b/libavcodec/mpeg12dec.c
index 204a578..522621a 100644
--- a/libavcodec/mpeg12dec.c
+++ b/libavcodec/mpeg12dec.c
@@ -2262,7 +2262,32 @@  static int mpeg_decode_a53_cc(AVCodecContext *avctx,
         return 1;
     } else if (buf_size >= 11 &&
                p[0] == 'C' && p[1] == 'C' && p[2] == 0x01 && p[3] == 0xf8) {
-        /* extract DVD CC data */
+        /* extract DVD CC data
+         * for more information see: [https://en.wikipedia.org/wiki/EIA-608#DVD_GOP_User_Data_Insertion]
+         *
+         * uint32_t   user_data_start_code        0x000001B2    (big endian)
+         * uint16_t   user_identifier             0x4343 "CC"
+         * uint8_t    user_data_type_code         0x01
+         * uint8_t    caption_block_size          0xF8
+         * uint8_t
+         *   bit 7    caption_odd_field_first     1=odd field (CC1/CC2) first  0=even field (CC3/CC4) first
+         *   bit 6    caption_filler              0
+         *   bit 5:1  caption_block_count         number of caption blocks (pairs of caption words = frames). Most DVDs use 15 per start of GOP.
+         *   bit 0    caption_extra_field_added   1=one additional caption word
+         *
+         * struct caption_field_block {
+         *   uint8_t
+         *     bit 7:1 caption_filler             0x7F (all 1s)
+         *     bit 0   caption_field_odd          1=odd field (this is CC1/CC2)  0=even field (this is CC3/CC4)
+         *   uint8_t   caption_first_byte
+         *   uint8_t   caption_second_byte
+         * } caption_block[(caption_block_count * 2) + caption_extra_field_added];
+         *
+         * Some DVDs encode caption data for both fields with caption_field_odd=1. The only way to decode the fields
+         * correctly is to start on the field indicated by caption_odd_field_first and count between odd/even fields.
+         * Don't assume that the first caption word is the odd field. There do exist MPEG files in the wild that start
+         * on the even field. There also exist DVDs in the wild that encode an odd field count and the
+         * caption_extra_field_added/caption_odd_field_first bits change per packet to allow that. */
         int cc_count = 0;
         int i;
         // There is a caption count field in the data, but it is often