diff mbox series

[FFmpeg-devel,v2,1/2] libavformat/mpegts: Extract arib_caption descriptor data into codecpar->extradata

Message ID 20210410031017.1271-1-xqq@xqq.im
State New
Headers show
Series [FFmpeg-devel,v2,1/2] libavformat/mpegts: Extract arib_caption descriptor data into codecpar->extradata | expand

Checks

Context Check Description
andriy/x86_make_warn warning New warnings during build
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished
andriy/PPC64_make success Make finished
andriy/PPC64_make_fate success Make fate finished

Commit Message

zheng qian April 10, 2021, 3:10 a.m. UTC
Changes since v1:
  If desc_len < 3, break the switch statement instead of return AVERROR_INVALIDDATA
  to make it more robust.
  For arib_caption, data_component_descriptor should contains at least 3 bytes:
  data_component_id: uint16, additional_arib_caption_info: uint8.

The recognization of ARIB STD-B24 caption has been introduced in commit a03885b,
which is used as closed caption in Japanese / Brazilian Digital Television.

But whenever copying arib_caption into mpegts output, arib_caption inside the
outputted stream cannot be recgonized as a arib_caption subtitle track once again,
which is caused by the missing of descriptors in the PMT table.

These patches are intended to fix broken stream copying of arib_caption subtitle.

This patch extracts necessary fields into codecpar->extradata for future remuxing,
which describe the stream_identifier, the profile and other additional information.

Signed-off-by: zheng qian <xqq@xqq.im>
---
 libavformat/mpegts.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

Marton Balint April 11, 2021, 10:41 a.m. UTC | #1
On Sat, 10 Apr 2021, zheng qian wrote:

> Changes since v1:
>  If desc_len < 3, break the switch statement instead of return AVERROR_INVALIDDATA
>  to make it more robust.
>  For arib_caption, data_component_descriptor should contains at least 3 bytes:
>  data_component_id: uint16, additional_arib_caption_info: uint8.
>
> The recognization of ARIB STD-B24 caption has been introduced in commit a03885b,
> which is used as closed caption in Japanese / Brazilian Digital Television.
>
> But whenever copying arib_caption into mpegts output, arib_caption inside the
> outputted stream cannot be recgonized as a arib_caption subtitle track once again,
> which is caused by the missing of descriptors in the PMT table.
>
> These patches are intended to fix broken stream copying of arib_caption subtitle.
>
> This patch extracts necessary fields into codecpar->extradata for future remuxing,
> which describe the stream_identifier, the profile and other additional information.
>
> Signed-off-by: zheng qian <xqq@xqq.im>
> ---
> libavformat/mpegts.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> index 6e0d9d7496..fd7ea1f504 100644
> --- a/libavformat/mpegts.c
> +++ b/libavformat/mpegts.c
> @@ -2107,6 +2107,9 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
>         // STD-B24, fascicle 3, chapter 4 defines private_stream_1
>         // for captions
>         if (stream_type == STREAM_TYPE_PRIVATE_DATA) {
> +            if (desc_len < 3)
> +                break;
> +
>             // This structure is defined in STD-B10, part 1, listing 5.4 and
>             // part 2, 6.2.20).
>             // Listing of data_component_ids is in STD-B10, part 2, Annex J.
> @@ -2145,6 +2148,28 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
>             st->codecpar->codec_id   = AV_CODEC_ID_ARIB_CAPTION;
>             st->codecpar->profile    = picked_profile;
>             st->internal->request_probe        = 0;
> +
> +            // Store stream_identifier and payload of data_component_descriptor
> +            // (data_component_id & additional_arib_caption_info) as extradata.
> +            // data_component_descriptor is defined in ARIB STD-B10, part 2, 6.2.20
> +            // These will be useful for remuxing arib_caption into mpegts output.
> +            if (!st->codecpar->extradata) {
> +                st->codecpar->extradata = av_mallocz(4 + AV_INPUT_BUFFER_PADDING_SIZE);
> +                if (!st->codecpar->extradata)
> +                    return AVERROR(ENOMEM);
> +
> +                st->codecpar->extradata_size = 4;
> +
> +                // stream_identifier (component_tag of stream_identifier_descriptor)
> +                st->codecpar->extradata[0] = (uint8_t)(st->stream_identifier - 1);

Storing this in extradata seems redundant. AVStream->stream_identifier 
already exists, and you should be able to use that for muxing as well.

Regards,
Marton

> +
> +                // payload of data_component_descriptor structure
> +                // data_component_id
> +                st->codecpar->extradata[1] = (uint8_t)((data_component_id & 0xF0) >> 8);
> +                st->codecpar->extradata[2] = (uint8_t)(data_component_id & 0x0F);
> +                // additional_arib_caption_info, defined in ARIB STD-B24, fascicle 1, Part 3, 9.6.1
> +                st->codecpar->extradata[3] = get8(pp, desc_end);
> +            }
>         }
>         break;
>     case 0xb0: /* DOVI video stream descriptor */
> -- 
> 2.29.2
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
zheng qian April 11, 2021, 1:18 p.m. UTC | #2
I have already tried to use st->stream_identifier directly in mpegtsenc.c
while handling arib_caption stream in mpegts_write_pmt() function.

But in mpegtsenc.c, st->stream_identifier always provides
an incorrect value of 0, which should be 0x30 for arib_caption A profile.

st->stream_identifier seems hasn't been copied correctly while remuxing into
another mpegts stream. So I manually copied it.


On Sun, Apr 11, 2021 at 7:41 PM Marton Balint <cus@passwd.hu> wrote:
>
>
>
> On Sat, 10 Apr 2021, zheng qian wrote:
>
> > Changes since v1:
> >  If desc_len < 3, break the switch statement instead of return AVERROR_INVALIDDATA
> >  to make it more robust.
> >  For arib_caption, data_component_descriptor should contains at least 3 bytes:
> >  data_component_id: uint16, additional_arib_caption_info: uint8.
> >
> > The recognization of ARIB STD-B24 caption has been introduced in commit a03885b,
> > which is used as closed caption in Japanese / Brazilian Digital Television.
> >
> > But whenever copying arib_caption into mpegts output, arib_caption inside the
> > outputted stream cannot be recgonized as a arib_caption subtitle track once again,
> > which is caused by the missing of descriptors in the PMT table.
> >
> > These patches are intended to fix broken stream copying of arib_caption subtitle.
> >
> > This patch extracts necessary fields into codecpar->extradata for future remuxing,
> > which describe the stream_identifier, the profile and other additional information.
> >
> > Signed-off-by: zheng qian <xqq@xqq.im>
> > ---
> > libavformat/mpegts.c | 25 +++++++++++++++++++++++++
> > 1 file changed, 25 insertions(+)
> >
> > diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> > index 6e0d9d7496..fd7ea1f504 100644
> > --- a/libavformat/mpegts.c
> > +++ b/libavformat/mpegts.c
> > @@ -2107,6 +2107,9 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
> >         // STD-B24, fascicle 3, chapter 4 defines private_stream_1
> >         // for captions
> >         if (stream_type == STREAM_TYPE_PRIVATE_DATA) {
> > +            if (desc_len < 3)
> > +                break;
> > +
> >             // This structure is defined in STD-B10, part 1, listing 5.4 and
> >             // part 2, 6.2.20).
> >             // Listing of data_component_ids is in STD-B10, part 2, Annex J.
> > @@ -2145,6 +2148,28 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
> >             st->codecpar->codec_id   = AV_CODEC_ID_ARIB_CAPTION;
> >             st->codecpar->profile    = picked_profile;
> >             st->internal->request_probe        = 0;
> > +
> > +            // Store stream_identifier and payload of data_component_descriptor
> > +            // (data_component_id & additional_arib_caption_info) as extradata.
> > +            // data_component_descriptor is defined in ARIB STD-B10, part 2, 6.2.20
> > +            // These will be useful for remuxing arib_caption into mpegts output.
> > +            if (!st->codecpar->extradata) {
> > +                st->codecpar->extradata = av_mallocz(4 + AV_INPUT_BUFFER_PADDING_SIZE);
> > +                if (!st->codecpar->extradata)
> > +                    return AVERROR(ENOMEM);
> > +
> > +                st->codecpar->extradata_size = 4;
> > +
> > +                // stream_identifier (component_tag of stream_identifier_descriptor)
> > +                st->codecpar->extradata[0] = (uint8_t)(st->stream_identifier - 1);
>
> Storing this in extradata seems redundant. AVStream->stream_identifier
> already exists, and you should be able to use that for muxing as well.
>
> Regards,
> Marton
>
> > +
> > +                // payload of data_component_descriptor structure
> > +                // data_component_id
> > +                st->codecpar->extradata[1] = (uint8_t)((data_component_id & 0xF0) >> 8);
> > +                st->codecpar->extradata[2] = (uint8_t)(data_component_id & 0x0F);
> > +                // additional_arib_caption_info, defined in ARIB STD-B24, fascicle 1, Part 3, 9.6.1
> > +                st->codecpar->extradata[3] = get8(pp, desc_end);
> > +            }
> >         }
> >         break;
> >     case 0xb0: /* DOVI video stream descriptor */
> > --
> > 2.29.2
> >
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Jan Ekström April 11, 2021, 1:58 p.m. UTC | #3
On Sun, Apr 11, 2021 at 4:25 PM magic xqq <xqq@xqq.im> wrote:
>
> I have already tried to use st->stream_identifier directly in mpegtsenc.c
> while handling arib_caption stream in mpegts_write_pmt() function.
>
> But in mpegtsenc.c, st->stream_identifier always provides
> an incorrect value of 0, which should be 0x30 for arib_caption A profile.
>
> st->stream_identifier seems hasn't been copied correctly while remuxing into
> another mpegts stream. So I manually copied it.
>

Yes, AVStream::stream_identifier is not currently handled by ffmpeg.c.
Not sure its addition was ever really thought thoroughly, and mostly
added so that MPEG-TS reading clients could gain this value when
reading data. Not even ffprobe.c is currently poking a stick at it.

But quickly looking at things, you can map component_tag and
component_id according to the ARIB caption profile, since your
possibilities are:

FF_PROFILE_ARIB_PROFILE_A => component tag 0x30-0x37, data_component_id = 0x0008
FF_PROFILE_ARIB_PROFILE_C => component_tag 0x87, data_component_id = 0x0012

The only issue is if you want to keep the exact same component tag,
but that can be improved in ffmpeg.c by just passing it during stream
copy into the output AVStream if it's nonzero (that is why it's
identifier plus 1). Also probably we need an option to force it to
something else if needed. In any case, I consider this somewhat
separate from the remux case since I don't think you need to have the
exact same id for the remux to be workable? Please note if I am
mistaken.

Then there's the extra info, I'll have to check the specs later today
whether that contains anything dynamic.

Best regards,
Jan
Jan Ekström April 11, 2021, 2:51 p.m. UTC | #4
On Sun, Apr 11, 2021 at 4:58 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> On Sun, Apr 11, 2021 at 4:25 PM magic xqq <xqq@xqq.im> wrote:
> >
> > I have already tried to use st->stream_identifier directly in mpegtsenc.c
> > while handling arib_caption stream in mpegts_write_pmt() function.
> >
> > But in mpegtsenc.c, st->stream_identifier always provides
> > an incorrect value of 0, which should be 0x30 for arib_caption A profile.
> >
> > st->stream_identifier seems hasn't been copied correctly while remuxing into
> > another mpegts stream. So I manually copied it.
> >
>
> Yes, AVStream::stream_identifier is not currently handled by ffmpeg.c.
> Not sure its addition was ever really thought thoroughly, and mostly
> added so that MPEG-TS reading clients could gain this value when
> reading data. Not even ffprobe.c is currently poking a stick at it.
>
> But quickly looking at things, you can map component_tag and
> component_id according to the ARIB caption profile, since your
> possibilities are:
>
> FF_PROFILE_ARIB_PROFILE_A => component tag 0x30-0x37, data_component_id = 0x0008
> FF_PROFILE_ARIB_PROFILE_C => component_tag 0x87, data_component_id = 0x0012
>
> The only issue is if you want to keep the exact same component tag,
> but that can be improved in ffmpeg.c by just passing it during stream
> copy into the output AVStream if it's nonzero (that is why it's
> identifier plus 1). Also probably we need an option to force it to
> something else if needed. In any case, I consider this somewhat
> separate from the remux case since I don't think you need to have the
> exact same id for the remux to be workable? Please note if I am
> mistaken.
>
> Then there's the extra info, I'll have to check the specs later today
> whether that contains anything dynamic.
>
> Best regards,
> Jan

Alright, so that additional data structure is one byte large and
contains the following things:

additional_arib_caption_info(){
  DMF       bslbf(4)
  Reserved  bslbf(2)
  Timing    bslbf(2)
}

DMF - (Display mode flag), with b1111 being the most common value "dynamic"
Timing - async|program sync|real time sync

So I guess at the very least this would make sense to add into the
extradata. I do find it interesting though that libaribb24 doesn't
seem to require this data for valid decoding at all :D (although I
guess mostly the output timing and automated subtitle sub-stream
selection is affected by these values).

Jan
Jan Ekström April 11, 2021, 3:13 p.m. UTC | #5
On Sun, Apr 11, 2021 at 5:51 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> On Sun, Apr 11, 2021 at 4:58 PM Jan Ekström <jeebjp@gmail.com> wrote:
> >
> > On Sun, Apr 11, 2021 at 4:25 PM magic xqq <xqq@xqq.im> wrote:
> > >
> > > I have already tried to use st->stream_identifier directly in mpegtsenc.c
> > > while handling arib_caption stream in mpegts_write_pmt() function.
> > >
> > > But in mpegtsenc.c, st->stream_identifier always provides
> > > an incorrect value of 0, which should be 0x30 for arib_caption A profile.
> > >
> > > st->stream_identifier seems hasn't been copied correctly while remuxing into
> > > another mpegts stream. So I manually copied it.
> > >
> >
> > Yes, AVStream::stream_identifier is not currently handled by ffmpeg.c.
> > Not sure its addition was ever really thought thoroughly, and mostly
> > added so that MPEG-TS reading clients could gain this value when
> > reading data. Not even ffprobe.c is currently poking a stick at it.
> >
> > But quickly looking at things, you can map component_tag and
> > component_id according to the ARIB caption profile, since your
> > possibilities are:
> >
> > FF_PROFILE_ARIB_PROFILE_A => component tag 0x30-0x37, data_component_id = 0x0008
> > FF_PROFILE_ARIB_PROFILE_C => component_tag 0x87, data_component_id = 0x0012
> >
> > The only issue is if you want to keep the exact same component tag,
> > but that can be improved in ffmpeg.c by just passing it during stream
> > copy into the output AVStream if it's nonzero (that is why it's
> > identifier plus 1). Also probably we need an option to force it to
> > something else if needed. In any case, I consider this somewhat
> > separate from the remux case since I don't think you need to have the
> > exact same id for the remux to be workable? Please note if I am
> > mistaken.
> >
> > Then there's the extra info, I'll have to check the specs later today
> > whether that contains anything dynamic.
> >
> > Best regards,
> > Jan
>
> Alright, so that additional data structure is one byte large and
> contains the following things:
>
> additional_arib_caption_info(){
>   DMF       bslbf(4)
>   Reserved  bslbf(2)
>   Timing    bslbf(2)
> }
>
> DMF - (Display mode flag), with b1111 being the most common value "dynamic"
> Timing - async|program sync|real time sync
>
> So I guess at the very least this would make sense to add into the
> extradata. I do find it interesting though that libaribb24 doesn't
> seem to require this data for valid decoding at all :D (although I
> guess mostly the output timing and automated subtitle sub-stream
> selection is affected by these values).

For context, just checked some random sample I had on hand:

ARIB caption info: DMF: 0xa, reserved: 0x3, timing: 0x1

So "both caption streams are selectable/showable when both received
and playing back a recording" (0b1010), reserved bits are nonzero,
timing is PTS based ("program based" seems to mean that).

Jan
zheng qian April 11, 2021, 3:16 p.m. UTC | #6
Consider the remuxed stream could be handled by other clients rather
than FFmpeg,
e.g. I'm working on HTML5 playback for ISDB television with ARIB
captions through
mpegts.js + b24.js/aribb24.js. Other ISDB TV clients may also have
possibilities to
check these fields, probably.

It will become more robust to discover whether there's an arib_caption
stream through
the descriptors in PMT. Obviously, the descriptors should match the
ARIB standard.

Best regards,
zheng

On Sun, Apr 11, 2021 at 11:52 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> On Sun, Apr 11, 2021 at 4:58 PM Jan Ekström <jeebjp@gmail.com> wrote:
> >
> > On Sun, Apr 11, 2021 at 4:25 PM magic xqq <xqq@xqq.im> wrote:
> > >
> > > I have already tried to use st->stream_identifier directly in mpegtsenc.c
> > > while handling arib_caption stream in mpegts_write_pmt() function.
> > >
> > > But in mpegtsenc.c, st->stream_identifier always provides
> > > an incorrect value of 0, which should be 0x30 for arib_caption A profile.
> > >
> > > st->stream_identifier seems hasn't been copied correctly while remuxing into
> > > another mpegts stream. So I manually copied it.
> > >
> >
> > Yes, AVStream::stream_identifier is not currently handled by ffmpeg.c.
> > Not sure its addition was ever really thought thoroughly, and mostly
> > added so that MPEG-TS reading clients could gain this value when
> > reading data. Not even ffprobe.c is currently poking a stick at it.
> >
> > But quickly looking at things, you can map component_tag and
> > component_id according to the ARIB caption profile, since your
> > possibilities are:
> >
> > FF_PROFILE_ARIB_PROFILE_A => component tag 0x30-0x37, data_component_id = 0x0008
> > FF_PROFILE_ARIB_PROFILE_C => component_tag 0x87, data_component_id = 0x0012
> >
> > The only issue is if you want to keep the exact same component tag,
> > but that can be improved in ffmpeg.c by just passing it during stream
> > copy into the output AVStream if it's nonzero (that is why it's
> > identifier plus 1). Also probably we need an option to force it to
> > something else if needed. In any case, I consider this somewhat
> > separate from the remux case since I don't think you need to have the
> > exact same id for the remux to be workable? Please note if I am
> > mistaken.
> >
> > Then there's the extra info, I'll have to check the specs later today
> > whether that contains anything dynamic.
> >
> > Best regards,
> > Jan
>
> Alright, so that additional data structure is one byte large and
> contains the following things:
>
> additional_arib_caption_info(){
>   DMF       bslbf(4)
>   Reserved  bslbf(2)
>   Timing    bslbf(2)
> }
>
> DMF - (Display mode flag), with b1111 being the most common value "dynamic"
> Timing - async|program sync|real time sync
>
> So I guess at the very least this would make sense to add into the
> extradata. I do find it interesting though that libaribb24 doesn't
> seem to require this data for valid decoding at all :D (although I
> guess mostly the output timing and automated subtitle sub-stream
> selection is affected by these values).
>
> Jan
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Jan Ekström April 11, 2021, 3:30 p.m. UTC | #7
On Sun, Apr 11, 2021 at 6:17 PM zheng qian <xqq@xqq.im> wrote:
>
> Consider the remuxed stream could be handled by other clients rather
> than FFmpeg,
> e.g. I'm working on HTML5 playback for ISDB television with ARIB
> captions through
> mpegts.js + b24.js/aribb24.js. Other ISDB TV clients may also have
> possibilities to
> check these fields, probably.
>

I don't think anything I have noted so far has made the remuxed
streams any more invalid than what your patch ideas have been? I mean,
that is the base idea - to make sure what we are generating is valid.
Just that you can already grasp the component tag and
data_component_id ranges/values from the ARIB caption profile already,
without having to copy the structure wholesale. component tag happens
to be what is exposed by AVStream::stream_identifier already at this
point, so if re-use of this value is wanted, that can be attained
through other purposes.

That is, unless there are other structures in the stream which have
references to the component_tag bit (which as far as I can tell is the
only changing bit in addition to additional_arib_caption_info, which I
went through). That said, if there are references to this value in
other streams or data structures, we are *not* copying those around,
so those would have to be implemented if that is so.

In other words, at this point the only thing explicitly required by
the muxer (albeit dummy values can be inserted if really needed) seems
to be the stuff within additional_arib_caption_info.

Jan

P.S.
Please do not top-post on this mailing list.
zheng qian April 12, 2021, 1:41 a.m. UTC | #8
On Mon, Apr 12, 2021 at 1:10 AM Jan Ekström <jeebjp@gmail.com> wrote:
> For context, just checked some random sample I had on hand:
>
> ARIB caption info: DMF: 0xa, reserved: 0x3, timing: 0x1
>
> So "both caption streams are selectable/showable when both received
> and playing back a recording" (0b1010), reserved bits are nonzero,
> timing is PTS based ("program based" seems to mean that).

I have checked several samples dumped from Japanese and
Brazilian TV shows that they all have a value of 0x3d for
additional_arib_caption_info that means they have value 0x3
for DMF field.

Regards,
zheng
diff mbox series

Patch

diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
index 6e0d9d7496..fd7ea1f504 100644
--- a/libavformat/mpegts.c
+++ b/libavformat/mpegts.c
@@ -2107,6 +2107,9 @@  int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
         // STD-B24, fascicle 3, chapter 4 defines private_stream_1
         // for captions
         if (stream_type == STREAM_TYPE_PRIVATE_DATA) {
+            if (desc_len < 3)
+                break;
+
             // This structure is defined in STD-B10, part 1, listing 5.4 and
             // part 2, 6.2.20).
             // Listing of data_component_ids is in STD-B10, part 2, Annex J.
@@ -2145,6 +2148,28 @@  int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
             st->codecpar->codec_id   = AV_CODEC_ID_ARIB_CAPTION;
             st->codecpar->profile    = picked_profile;
             st->internal->request_probe        = 0;
+
+            // Store stream_identifier and payload of data_component_descriptor
+            // (data_component_id & additional_arib_caption_info) as extradata.
+            // data_component_descriptor is defined in ARIB STD-B10, part 2, 6.2.20
+            // These will be useful for remuxing arib_caption into mpegts output.
+            if (!st->codecpar->extradata) {
+                st->codecpar->extradata = av_mallocz(4 + AV_INPUT_BUFFER_PADDING_SIZE);
+                if (!st->codecpar->extradata)
+                    return AVERROR(ENOMEM);
+
+                st->codecpar->extradata_size = 4;
+
+                // stream_identifier (component_tag of stream_identifier_descriptor)
+                st->codecpar->extradata[0] = (uint8_t)(st->stream_identifier - 1);
+
+                // payload of data_component_descriptor structure
+                // data_component_id
+                st->codecpar->extradata[1] = (uint8_t)((data_component_id & 0xF0) >> 8);
+                st->codecpar->extradata[2] = (uint8_t)(data_component_id & 0x0F);
+                // additional_arib_caption_info, defined in ARIB STD-B24, fascicle 1, Part 3, 9.6.1
+                st->codecpar->extradata[3] = get8(pp, desc_end);
+            }
         }
         break;
     case 0xb0: /* DOVI video stream descriptor */