[FFmpeg-devel] Add A53 Closed Captions to MPEG header if they are available.

Submitted by John P Poet on June 6, 2017, 9:06 p.m.

Details

Message ID 1496783180-6128-1-git-send-email-jppoet@digital-nirvana.com
State New
Headers show

Commit Message

John P Poet June 6, 2017, 9:06 p.m.
ff_mpeg1_encode_picture_header: Add support for AV_FRAME_DATA_A53_CC
frame: Add av_frame_get_side_data_at() to allow retrival of multiple side
    data of the same type.
---
 libavcodec/mpeg12enc.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++
 libavutil/frame.c      |  8 ++++++++
 libavutil/frame.h      | 38 +++++++++++++++++++++++++++++++++++-
 3 files changed, 97 insertions(+), 1 deletion(-)

Comments

Marton Balint June 6, 2017, 9:45 p.m.
On Tue, 6 Jun 2017, John Poet wrote:

> ff_mpeg1_encode_picture_header: Add support for AV_FRAME_DATA_A53_CC
> frame: Add av_frame_get_side_data_at() to allow retrival of multiple side
>    data of the same type.

As far as I remember multiple side data of the same type is not something 
we wanted to support. Why do you need it? Can't a single 
AV_FRAME_DATA_A53_CC side data packet contain many CC entries?

In general, split API additions and ordinary features into separate 
patches.

Regards,
Marton
Hendrik Leppkes June 6, 2017, 9:59 p.m.
On Tue, Jun 6, 2017 at 11:45 PM, Marton Balint <cus@passwd.hu> wrote:
>
> On Tue, 6 Jun 2017, John Poet wrote:
>
>> ff_mpeg1_encode_picture_header: Add support for AV_FRAME_DATA_A53_CC
>> frame: Add av_frame_get_side_data_at() to allow retrival of multiple side
>>    data of the same type.
>
>
> As far as I remember multiple side data of the same type is not something we
> wanted to support. Why do you need it? Can't a single AV_FRAME_DATA_A53_CC
> side data packet contain many CC entries?
>

Indeed, multiple entries of the same type are really a bad idea and we
basically made them impossible with stream sidedata, although maybe
not with frame side data yet. We should not add API for them or
encourage their use.
If there is a real need for multiple of the same type, maybe the type
should be expanded to hold more information.

- Hendrik
John P Poet June 6, 2017, 10:10 p.m.
On Tue, Jun 6, 2017 at 3:59 PM Hendrik Leppkes <h.leppkes@gmail.com> wrote:

> On Tue, Jun 6, 2017 at 11:45 PM, Marton Balint <cus@passwd.hu> wrote:
> >
> > On Tue, 6 Jun 2017, John Poet wrote:
> >
> >> ff_mpeg1_encode_picture_header: Add support for AV_FRAME_DATA_A53_CC
> >> frame: Add av_frame_get_side_data_at() to allow retrival of multiple
> side
> >>    data of the same type.
> >
> >
> > As far as I remember multiple side data of the same type is not
> something we
> > wanted to support. Why do you need it? Can't a single
> AV_FRAME_DATA_A53_CC
> > side data packet contain many CC entries?
> >
>
> Indeed, multiple entries of the same type are really a bad idea and we
> basically made them impossible with stream sidedata, although maybe
> not with frame side data yet. We should not add API for them or
> encourage their use.
> If there is a real need for multiple of the same type, maybe the type
> should be expanded to hold more information.
>
>
The cc_count is only 5 bits, which mean that only 31 3-byte "closed caption
constructs" will fit in a "block".    Testing this with 1080i60 material, I
found that 2 or 3 blocks was often necessary to hold all of the CC data.

I tried ignoring that limit of 31 "constructs" per block, and ended up with
corrupt captions.   By preserving the 2 or 3 separate blocks I observed
from the original source, the captions are perfect.

If you would like me to go about this a different way, please give me some
direction.

Thank you,

John
Brian Matherly June 6, 2017, 10:36 p.m.
>> Indeed, multiple entries of the same type are really a bad idea and we
>> basically made them impossible with stream sidedata, although maybe
>> not with frame side data yet. We should not add API for them or
>> encourage their use.
>> If there is a real need for multiple of the same type, maybe the type
>> should be expanded to hold more information.
>>
>>
> The cc_count is only 5 bits, which mean that only 31 3-byte "closed caption
> constructs" will fit in a "block".    Testing this with 1080i60 material, I
> found that 2 or 3 blocks was often necessary to hold all of the CC data.
>
> I tried ignoring that limit of 31 "constructs" per block, and ended up with
> corrupt captions.   By preserving the 2 or 3 separate blocks I observed
> from the original source, the captions are perfect.
>
According to CEA-708, in the case of 1080i60, the correct number for 
cc_count is 10. The highest number that cc_count is allowed to be is 30 
in the case of repeating a frame three times for film mode. Exceeding 
the correct cc_count for the frame rate would cause the CC channel data 
rate to exceed 9,600bps.

~Brian
Kieran Kunhya June 6, 2017, 10:40 p.m.
>
> The cc_count is only 5 bits, which mean that only 31 3-byte "closed caption
> constructs" will fit in a "block".    Testing this with 1080i60 material, I
> found that 2 or 3 blocks was often necessary to hold all of the CC data.
>
> I tried ignoring that limit of 31 "constructs" per block, and ended up with
> corrupt captions.   By preserving the 2 or 3 separate blocks I observed
> from the original source, the captions are perfect.
>

Odd, ATSC specifies specific bitrate requirements in this area. Are you
sure your insertion process isn't  bursting?

Kieran
Brian Matherly June 6, 2017, 10:42 p.m.
>> Indeed, multiple entries of the same type are really a bad idea and we
>> basically made them impossible with stream sidedata, although maybe
>> not with frame side data yet. We should not add API for them or
>> encourage their use.
>> If there is a real need for multiple of the same type, maybe the type
>> should be expanded to hold more information.
>>
>>
> The cc_count is only 5 bits, which mean that only 31 3-byte "closed caption
> constructs" will fit in a "block".    Testing this with 1080i60 material, I
> found that 2 or 3 blocks was often necessary to hold all of the CC data.
>
> I tried ignoring that limit of 31 "constructs" per block, and ended up with
> corrupt captions.   By preserving the 2 or 3 separate blocks I observed
> from the original source, the captions are perfect.
>
According to CEA-708, in the case of 1080i60, the correct number for 
cc_count is 10. The highest number that cc_count is allowed to be is 30 
in the case of repeating a frame three times for film mode. Exceeding 
the correct cc_count for the frame rate would cause the CC channel data 
rate to exceed 9,600bps.

~Brian
John P Poet June 6, 2017, 10:48 p.m.
On Tue, Jun 6, 2017 at 4:40 PM Kieran Kunhya <kierank@obe.tv> wrote:

> >
> > The cc_count is only 5 bits, which mean that only 31 3-byte "closed
> caption
> > constructs" will fit in a "block".    Testing this with 1080i60
> material, I
> > found that 2 or 3 blocks was often necessary to hold all of the CC data.
> >
> > I tried ignoring that limit of 31 "constructs" per block, and ended up
> with
> > corrupt captions.   By preserving the 2 or 3 separate blocks I observed
> > from the original source, the captions are perfect.
> >
>
> Odd, ATSC specifies specific bitrate requirements in this area. Are you
> sure your insertion process isn't  bursting?
>
> Kieran
>

The source is SDI with embedded 708 captions.  I supposed there may be an
issue there.  I have not tried this with any other source.

John
John P Poet June 7, 2017, 9:46 p.m.
On Tue, Jun 6, 2017 at 4:48 PM John P Poet <jppoet@gmail.com> wrote:

> On Tue, Jun 6, 2017 at 4:40 PM Kieran Kunhya <kierank@obe.tv> wrote:
>
>> >
>> > The cc_count is only 5 bits, which mean that only 31 3-byte "closed
>> caption
>> > constructs" will fit in a "block".    Testing this with 1080i60
>> material, I
>> > found that 2 or 3 blocks was often necessary to hold all of the CC data.
>> >
>> > I tried ignoring that limit of 31 "constructs" per block, and ended up
>> with
>> > corrupt captions.   By preserving the 2 or 3 separate blocks I observed
>> > from the original source, the captions are perfect.
>> >
>>
>> Odd, ATSC specifies specific bitrate requirements in this area. Are you
>> sure your insertion process isn't  bursting?
>>
>> Kieran
>>
>
> The source is SDI with embedded 708 captions.  I supposed there may be an
> issue there.  I have not tried this with any other source.
>
> John
>

From my SDI source, most frames have cc_count=20.  Then there can be two or
three frames without any CC packets, which are often followed by a single
frame with with two or three VBI lines of CC packets:  e.g. two or three
groups of cc_count=20 in a single frame.

When you asked if it is "bursting", I assume this is what you mean?  If so,
what would you suggest I do about it?

Thanks,

John
Devin Heitmueller June 9, 2017, 5:34 p.m.
Hello Marton,

On Tue, Jun 6, 2017 at 5:45 PM, Marton Balint <cus@passwd.hu> wrote:

> As far as I remember multiple side data of the same type is not something we
> wanted to support. Why do you need it? Can't a single AV_FRAME_DATA_A53_CC
> side data packet contain many CC entries?

Could you please expand on where this was discussed?  Is there any
design documentation for side data infrastructure that's been
introduced into ffmpeg?  Is there some list of other known design
constraints/limitations?

While I agree it would be great to simply say that you should never
have multiple side data items of the same type, I'm not sure how
realistic that is.  It would be helpful if I could better understand
the rationale in that thinking.

I'm starting a rather large project which will likely stretch the
design for side data in order to support a variety of other ancillary
data protocols (e.g. SCTE-104, SMPTE 2038, etc), and it would be great
to better understand where the constraints are (so I can either plan
to address those issues, or if too significant then choose a different
multimedia framework to work with before making a significant
investment building out a bunch of features in ffmpeg).

Thanks,

Devin
John P Poet July 5, 2017, 8:07 p.m.
On Fri, Jun 9, 2017 at 11:41 AM Devin Heitmueller <
dheitmueller@kernellabs.com> wrote:

> Hello Marton,
>
> On Tue, Jun 6, 2017 at 5:45 PM, Marton Balint <cus@passwd.hu> wrote:
>
> > As far as I remember multiple side data of the same type is not
> something we
> > wanted to support. Why do you need it? Can't a single
> AV_FRAME_DATA_A53_CC
> > side data packet contain many CC entries?
>
> Could you please expand on where this was discussed?  Is there any
> design documentation for side data infrastructure that's been
> introduced into ffmpeg?  Is there some list of other known design
> constraints/limitations?
>
> While I agree it would be great to simply say that you should never
> have multiple side data items of the same type, I'm not sure how
> realistic that is.  It would be helpful if I could better understand
> the rationale in that thinking.
>
> I'm starting a rather large project which will likely stretch the
> design for side data in order to support a variety of other ancillary
> data protocols (e.g. SCTE-104, SMPTE 2038, etc), and it would be great
> to better understand where the constraints are (so I can either plan
> to address those issues, or if too significant then choose a different
> multimedia framework to work with before making a significant
> investment building out a bunch of features in ffmpeg).
>
> Thanks,
>
> Devin
>

Is there anything I need to be doing, to get this committed?

Thanks,

John
Marton Balint July 5, 2017, 9:30 p.m.
On Fri, 9 Jun 2017, Devin Heitmueller wrote:

> Hello Marton,
>
> On Tue, Jun 6, 2017 at 5:45 PM, Marton Balint <cus@passwd.hu> wrote:
>
>> As far as I remember multiple side data of the same type is not something we
>> wanted to support. Why do you need it? Can't a single AV_FRAME_DATA_A53_CC
>> side data packet contain many CC entries?
>
> Could you please expand on where this was discussed?  Is there any
> design documentation for side data infrastructure that's been
> introduced into ffmpeg?  Is there some list of other known design
> constraints/limitations?
>
> While I agree it would be great to simply say that you should never
> have multiple side data items of the same type, I'm not sure how
> realistic that is.  It would be helpful if I could better understand
> the rationale in that thinking.

I guess there are two reasons for prohibiting same type side data:

- existing API implicitly gave the impression that is is not supported

Av_frame_get_side_data gives you no possibility to retrieve multiple side 
data of the same type. API users might assumed that the reason for that is 
because it is not possible/permitted.

- consistency with packet/stream side data types

For stream side data is is already explicitly disallowed, API would be 
inconsistent if multiple side data entries of the same types were possible 
for one kind of side data (frame), but not for the other (stream).

> I'm starting a rather large project which will likely stretch the
> design for side data in order to support a variety of other ancillary
> data protocols (e.g. SCTE-104, SMPTE 2038, etc), and it would be great
> to better understand where the constraints are (so I can either plan
> to address those issues, or if too significant then choose a different
> multimedia framework to work with before making a significant
> investment building out a bunch of features in ffmpeg).

Yeah, you definitely have to think about this, depending on your actual 
goals libavcodec/format might not be the best choice for complex ancillary 
data handling.

It seems multiple side data of the same type is not going to pick up 
support from other developers, so you have to come up with something else 
to move forward.

Regards,
Marton
John P Poet July 7, 2017, 9:01 p.m.
On Wed, Jul 5, 2017 at 3:30 PM Marton Balint <cus@passwd.hu> wrote:

>
> On Fri, 9 Jun 2017, Devin Heitmueller wrote:
>
> > Hello Marton,
> >
> > On Tue, Jun 6, 2017 at 5:45 PM, Marton Balint <cus@passwd.hu> wrote:
> >
> >> As far as I remember multiple side data of the same type is not
> something we
> >> wanted to support. Why do you need it? Can't a single
> AV_FRAME_DATA_A53_CC
> >> side data packet contain many CC entries?
> >
> > Could you please expand on where this was discussed?  Is there any
> > design documentation for side data infrastructure that's been
> > introduced into ffmpeg?  Is there some list of other known design
> > constraints/limitations?
> >
> > While I agree it would be great to simply say that you should never
> > have multiple side data items of the same type, I'm not sure how
> > realistic that is.  It would be helpful if I could better understand
> > the rationale in that thinking.
>
> I guess there are two reasons for prohibiting same type side data:
>
> - existing API implicitly gave the impression that is is not supported
>
> Av_frame_get_side_data gives you no possibility to retrieve multiple side
> data of the same type. API users might assumed that the reason for that is
> because it is not possible/permitted.
>
> - consistency with packet/stream side data types
>
> For stream side data is is already explicitly disallowed, API would be
> inconsistent if multiple side data entries of the same types were possible
> for one kind of side data (frame), but not for the other (stream).
>
> > I'm starting a rather large project which will likely stretch the
> > design for side data in order to support a variety of other ancillary
> > data protocols (e.g. SCTE-104, SMPTE 2038, etc), and it would be great
> > to better understand where the constraints are (so I can either plan
> > to address those issues, or if too significant then choose a different
> > multimedia framework to work with before making a significant
> > investment building out a bunch of features in ffmpeg).
>
> Yeah, you definitely have to think about this, depending on your actual
> goals libavcodec/format might not be the best choice for complex ancillary
> data handling.
>
> It seems multiple side data of the same type is not going to pick up
> support from other developers, so you have to come up with something else
> to move forward.
>

Hi Marton,

If I understand what you and others are saying, AVFrameSideDataType should
not be used to hold "MPEG user data", such as CEA-708, as described here
https://www.atsc.org/wp-content/uploads/2015/03/a_53-Part-4-2009.pdf in
section 6.2.

What I am unclear about, is the proper way to handle MPEG user data.  Are
you saying FFmpeg has no desire to handle this type of data, and that I
need to look to another project?  Or, are you just saying I need to define
a new structure (AVFrameUserDataType or AVFrameMPEGUserDataType) for this
type of data?

Thank you,

John

Patch hide | download patch | download mbox

diff --git a/libavcodec/mpeg12enc.c b/libavcodec/mpeg12enc.c
index f45598a..01fda83 100644
--- a/libavcodec/mpeg12enc.c
+++ b/libavcodec/mpeg12enc.c
@@ -543,6 +543,58 @@  void ff_mpeg1_encode_picture_header(MpegEncContext *s, int picture_number)
         }
     }
 
+    /* MPEG closed caption user data is limited to 31 3-byte "closed
+     * caption constructs" per user data block.  There can be serveral
+     * such user data blocks per frame.
+     */
+    for (int i = 0;; ++i) {
+        side_data = av_frame_get_side_data_at(s->current_picture_ptr->f, i);
+        if (!side_data)
+            break;
+
+        if (side_data->type == AV_FRAME_DATA_A53_CC) {
+            avpriv_align_put_bits(&s->pb);
+            put_header(s, USER_START_CODE);
+
+            /* ATSC user data identifier for CC or BAR data */
+            put_bits(&s->pb, 8, 'G');
+            put_bits(&s->pb, 8, 'A');
+            put_bits(&s->pb, 8, '9');
+            put_bits(&s->pb, 8, '4');
+
+            /* MPEG CC data type code */
+            put_bits(&s->pb, 8, 0x03);
+
+            /* cc_data() {
+             *     reserved (1 bits) ’1’
+             *     process_cc_data_flag  (1 bits) bslbf
+             *     additional_data_flag  (1 bits) bslbf
+             *     cc_count              (5 bits) uimsbf
+             *     reserved              (8 bits) ‘1111 1111’
+             *     for (i=0 ; i < cc_count ; ++i) {
+             *         marker_bits (5 bits) ‘1111 1’
+             *         cc_valid    (1 bits) bslbf
+             *         cc_type     (2 bits) bslbf
+             *         cc_data_1   (8 bits) bslbf
+             *         cc_data_2   (8 bits) bslbf
+             *     }
+             *     marker_bits     (8 bits) ‘1111 1111’
+             *     if (additional_data_flag) {
+             *         while (nextbits() != ‘0000 0000 0000 0000 0000 0001’) {
+             *             additional_cc_data
+             *         }
+             *     }
+             * }
+             */
+            for(int j = 0; j < side_data->size; ++j) {
+                put_bits(&s->pb, 8, side_data->data[j]);
+            }
+
+            /* Marker bits */
+            put_bits(&s->pb, 8, 0xFF);
+        }
+    }
+
     s->mb_y = 0;
     ff_mpeg1_encode_slice_header(s);
 }
diff --git a/libavutil/frame.c b/libavutil/frame.c
index 24d5d5f..8912f52 100644
--- a/libavutil/frame.c
+++ b/libavutil/frame.c
@@ -688,6 +688,14 @@  AVFrameSideData *av_frame_get_side_data(const AVFrame *frame,
     return NULL;
 }
 
+AVFrameSideData *av_frame_get_side_data_at(const AVFrame *frame, int idx)
+{
+    if (idx < frame->nb_side_data)
+        return frame->side_data[idx];
+
+  return NULL;
+}
+
 static int frame_copy_video(AVFrame *dst, const AVFrame *src)
 {
     const uint8_t *src_data[4];
diff --git a/libavutil/frame.h b/libavutil/frame.h
index 26261d7..5503106 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -52,8 +52,39 @@  enum AVFrameSideDataType {
     AV_FRAME_DATA_PANSCAN,
     /**
      * ATSC A53 Part 4 Closed Captions.
-     * A53 CC bitstream is stored as uint8_t in AVFrameSideData.data.
+     * A53 CC bitstream (cc_data) is stored as uint8_t in AVFrameSideData.data.
      * The number of bytes of CC data is AVFrameSideData.size.
+     *
+     * Data format:
+     *
+     * bslbf -- Bit string, left bit first, where “left” is the order in
+     * which bit strings are written in the Standard. Bit strings are
+     * written as a string of 1s and 0s within single quote marks,
+     * e.g. ‘1000 0001’. Blanks within a bit string are for ease of
+     * reading and have no significance
+     *
+     * uimsbf -- Unsigned integer, most significant bit first.
+     *
+     * cc_data() {
+     *     reserved (1 bits) ’1’
+     *     process_cc_data_flag  (1 bits) bslbf
+     *     additional_data_flag  (1 bits) bslbf
+     *     cc_count              (5 bits) uimsbf
+     *     reserved              (8 bits) ‘1111 1111’
+     *     for (i=0 ; i < cc_count ; ++i) {
+     *         marker_bits (5 bits) ‘1111 1’
+     *         cc_valid    (1 bits) bslbf
+     *         cc_type     (2 bits) bslbf
+     *         cc_data_1   (8 bits) bslbf
+     *         cc_data_2   (8 bits) bslbf
+     *     }
+     *     marker_bits     (8 bits) ‘1111 1111’
+     *     if (additional_data_flag) {
+     *         while (nextbits() != ‘0000 0000 0000 0000 0000 0001’) {
+     *             additional_cc_data
+     *         }
+     *     }
+     * }
      */
     AV_FRAME_DATA_A53_CC,
     /**
@@ -759,6 +790,11 @@  AVFrameSideData *av_frame_new_side_data(AVFrame *frame,
  */
 AVFrameSideData *av_frame_get_side_data(const AVFrame *frame,
                                         enum AVFrameSideDataType type);
+/**
+ * @return a pointer to the side data at the given index on success,
+ * NULL if the index is out-of-bounds.
+ */
+AVFrameSideData *av_frame_get_side_data_at(const AVFrame *frame, int idx);
 
 /**
  * If side data of the supplied type exists in the frame, free it and remove it