[FFmpeg-devel,v3,1/1] avformat/mpegtsenc: Write necessary descriptors into PMT for arib_caption muxing

Message ID 20210415152133.85408-1-xqq@xqq.im
State New
Series [FFmpeg-devel,v3,1/1] avformat/mpegtsenc: Write necessary descriptors into PMT for arib_caption muxing

Checks

Context                  Check    Description
andriy/x86_make          success  Make finished
andriy/x86_make_fate     success  Make fate finished
andriy/PPC64_make        success  Make finished
andriy/PPC64_make_fate   success  Make fate finished

Commit Message

zheng qian April 15, 2021, 3:21 p.m. UTC
Changes since v2:
  Generate stream_identifier and data_component_id from profile

Recognition of ARIB STD-B24 captions, which are used as closed
captions in Japanese / Brazilian Digital Television, was
introduced in commit a03885b.

However, arib_caption stream copy does not work correctly because
the required descriptors are missing from the PMT. Without them,
ARIB caption data inside a remuxed mpegts stream cannot be
recognized as an arib_caption subtitle track again.

This patch writes stream_identifier_descriptor and
data_component_descriptor by generating stream_identifier and
data_component_id from the ARIB profile.

With this patch, arib_caption remuxing works correctly.

Signed-off-by: zheng qian <xqq@xqq.im>
---
 libavformat/mpegtsenc.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

Comments

zheng qian April 15, 2021, 3:30 p.m. UTC | #1
On Fri, Apr 16, 2021 at 12:22 AM zheng qian <xqq@xqq.im> wrote:
> +
> +                // data_component_descriptor, defined in ARIB STD-B10, part 2, 6.2.20
> +                *q++ = 0xFD;  // descriptor_tag: ARIB data coding type descriptor
> +                *q++ = 3;     // descriptor_length
> +                put16(&q, data_component_id);  // data_component_id
> +                // additional_arib_caption_info: defined in ARIB STD-B24, fascicle 1, Part 3, 9.6.1
> +                // Use most commonly used value 0x3D: DMF(0x3), Reserved(0x3), Timing(0x1)
> +                *q++ = 0x3D;

I have tested lots of TV streams in Japan from terrestrial channels
near Tokyo, and plenty of BS/CS satellite channels. All of them have
a value of 0x3d for the additional_arib_caption_info field so I used it.

Regards,
zheng

>              }
>          break;
>          case AVMEDIA_TYPE_VIDEO:
> --
> 2.29.2
>
zheng qian April 19, 2021, 2:38 a.m. UTC | #2
Is there anyone who could review this patch?

Best regards,
zheng
Marton Balint April 19, 2021, 9:10 p.m. UTC | #3
On Mon, 19 Apr 2021, zheng qian wrote:

> Is there anyone who could review this patch?

Jan was interested in this, so preferably he should also comment, but it 
looks fine to me.

Thanks,
Marton
Jan Ekström April 19, 2021, 9:45 p.m. UTC | #4
On Tue, Apr 20, 2021 at 12:11 AM Marton Balint <cus@passwd.hu> wrote:
>
>
>
> On Mon, 19 Apr 2021, zheng qian wrote:
>
> > Is there anyone who could review this patch?
>
> Jan was interested in this, so preferably he should also comment, but it
> looks fine to me.
>

OK, this explains why I didn't see my response on patchwork.
Apparently he had CC'd me and thus the "reply" button in gmail sent an
e-mail directly to him and I was hurrying due to being on a lunch
break -_- (and thus didn't notice).

In any case, I made some comments and am now waiting for a second
opinion regarding the usage of stream_identifiers in the ARIB context.
After all, the specifications do let one utilize 0x30-0x37 for profile
A/full-seg ARIB captions, so there must be a reason for them to not be
as limited as profile C/1seg to a single identifier :)

Jan
zheng qian April 20, 2021, 12:02 p.m. UTC | #5
On Tue, Apr 20, 2021 at 6:46 AM Jan Ekström <jeebjp@gmail.com> wrote:
>
> OK, this explains why I didn't see my response on patchwork.
> Apparently he had CC'd me and thus the "reply" button in gmail sent an
> e-mail directly to him and I was hurrying due to being on a lunch
> break -_- (and thus didn't notice).
>
> In any case, I made some comments and am now waiting for a second
> opinion regarding the usage of stream_identifiers in the ARIB context.
> After all, the specifications do let one utilize 0x30-0x37 for profile
> A/full-seg ARIB captions, so there must be a reason for them to not be
> as limited as profile C/1seg to a single identifier :)
>

I've found related definitions in ARIB TR-B14, Fascicle 1, 4.2.8.1
and you can find it in
http://web.archive.org/web/20160319090430/http://arib.or.jp/english/html/overview/doc/8-TR-B14v2_8-1p3-2-E2.pdf

4.2.8.1 section says:
"However, for component tag values of default ES of caption,
set 0x30 or 0x87, for component tag value of default ES of
superimpose, set 0x38 or 0x88."

That means 0x30 is considered the default value for the
Profile A caption ES. The section doesn't describe how to
use the other values in the 0x30~0x37 range, and since a
second-language caption is designed to be multiplexed into
the same ES, it seems to be assumed that there will usually
be only one ARIB caption ES within a program.

Anyway, I have never seen a TS program that carries 2 or more
arib_caption streams among Japanese TV channels.
Even if we manually try to remux 2 or more arib_caption streams
into a TS program and both use the component tag of 0x30,
it shouldn't cause any playback problems.

Best regards,
zheng qian

> Jan
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Jan Ekström April 20, 2021, 5:16 p.m. UTC | #6
On Tue, Apr 20, 2021 at 3:03 PM zheng qian <xqq@xqq.im> wrote:
>
> I've found related definitions in ARIB TR-B14, Fascicle 1, 4.2.8.1
> and you can find it in
> http://web.archive.org/web/20160319090430/http://arib.or.jp/english/html/overview/doc/8-TR-B14v2_8-1p3-2-E2.pdf
>
> 4.2.8.1 section says:
> "However, for component tag values of default ES of caption,
> set 0x30 or 0x87, for component tag value of default ES of
> superimpose, set 0x38 or 0x88."
>
> That means 0x30 is considered the default value for the
> Profile A caption ES. The section doesn't describe how to
> use the other values in the 0x30~0x37 range, and since a
> second-language caption is designed to be multiplexed into
> the same ES, it seems to be assumed that there will usually
> be only one ARIB caption ES within a program.
>

Alright, so I took another look through these Technical
Recommendations as well:

TR B14 fascicle 4
- http://www.arib.or.jp/english/html/overview/doc/8-TR-B14v6_5-4p5-E1.pdf

See:
14  Use of “component_tag”

> Table 14-1   Assignment of “component_tag” Values
> Others
> 0x30 to 0x7F
> Please note that 0x40 is assigned to the default ES for data
> broadcasting. 0x30, 0x31 to 0x37, 0x38 and 0x39 to 0x3F are assigned to subtitle
> main, subtitle sub, teletext main and teletext sub respectively.

Then in TR B14 fascicle 2:
- http://www.arib.or.jp/english/html/overview/doc/8-TR-B14v6_5-2p5-E1.pdf

> 4.2.1  Specification for composition and transmission
> (3) Number of ES
> The maximum number of ESs that can be transmitted simultaneously for the same service is
> 1  for  captions  and  1  for  superimpositions  when  the  component  group  descriptor  is  not
> transmitted. When the component group descriptor is transmitted, the maximum number of ESs
> for captions is 1 and the maximum number of ESs for superimpositions is 1 for each component
> group.

So to recap my understanding:
1. STD side (specs): It seems like these values have to be unique at
least at the level of a PMT (thus, per-program) - discussed this with
a person working in the ARIB area of operations :) .
2. STD side: We can indeed have multiple caption/subtitle ES in a
single program (PMT) for Profile A. Profile C is always limited to one
ES per program since only one component_tag is possible.
3. TR side (tech. rec.): 0x30/0x87 should be the default for Profile A/C
4. TR side: If we do not code Component Group Descriptor (0xD9 in the
EIT), we should stick to a single ES (the default one) on both Profile
A and C.
5. TR side: If we code a Component Group Descriptor in the EIT, we can
have one ES for each component group.

So basically:
1. STD side lets us go for it
2. TR side says "nope, you stick to one - unless you're defining
component groups". Which is why you do not see multi-subtitle
streams in the wild, since almost nobody utilizes Component Group
Descriptors.

We don't support component groups, and I'm not sure we ever will, so
the alternatives are:
1. Follow TR side, and be more strict than the specification. Limit in
`mpegts_init` each program to a single ARIB caption stream (be it
Profile A or Profile C). Set component_tag in descriptor to the
default one (0x30 or 0x87 respectively).
2. Follow STD side, and utilize what the specification enables. Limit
in `mpegts_init` each program to either 8 Profile A ARIB caption
streams, or 1 Profile C ARIB caption stream. Set component_tag
iteratively in order.

In both cases during counting the profile should be checked, since we
require the profile info to be there for writing of the descriptor.
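
As a rough illustration of the two alternatives, the sketch below picks a component_tag for the n-th ARIB caption stream of a program under either policy. All names here are hypothetical and it does not touch `mpegts_init`. It also does not check for a mix of Profile A and Profile C streams in one program, which a real implementation would have to handle while counting.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the profile and the policy under discussion. */
enum arib_profile    { ARIB_PROFILE_A, ARIB_PROFILE_C };
enum arib_tag_policy { ARIB_POLICY_TR, ARIB_POLICY_STD };

/*
 * Return the component_tag for the ARIB caption stream at zero-based
 * position `index` among the caption streams of one program, or -1 if
 * the chosen policy does not allow another stream.
 */
static int arib_caption_component_tag(enum arib_profile profile,
                                      enum arib_tag_policy policy,
                                      unsigned index)
{
    /* Profile C has a single identifier under both alternatives. */
    if (profile == ARIB_PROFILE_C)
        return index == 0 ? 0x87 : -1;

    /* Alternative 1: follow the TR, only the default ES is allowed. */
    if (policy == ARIB_POLICY_TR)
        return index == 0 ? 0x30 : -1;

    /* Alternative 2: follow the STD, hand out 0x30..0x37 in order. */
    return index < 8 ? 0x30 + (int)index : -1;
}
```

A caller would reject the stream (or the whole program) when -1 comes back, which corresponds to the limit check both alternatives place in `mpegts_init`.
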

Jan
Jan Ekström April 20, 2021, 10:03 p.m. UTC | #7
On Tue, Apr 20, 2021 at 8:16 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> So to recap my understanding:
> 1. STD side (specs): It seems like these values have to be unique at
> least at the level of a PMT (thus, per-program) - discussed this with
> a person working in the ARIB area of operations :) .
> 2. STD side: We can indeed have multiple caption/subtitle ES in a
> single program (PMT) for Profile A. Profile C is always limited to one
> ES per program since only one component_tag is possible.
> 3. TR side (tech. rec.): 0x30/0x87 should be the default for Profile A/C
> 4. TR side: If we do not code Component Group Descriptor (0xD9 in the
> EIT), we should stick to a single ES (the default one) on both Profile
> A and C.
> 5. TR side: If we code a Component Group Descriptor in the EIT, we can
> have one ES for each component group.
>
> So basically:
> 1. STD side lets us go for it
> 2. TR side says "nope, you stick to one - unless you're defining
> component groups". Which is why you do not see multi-subtitle
> streams in the wild, since almost nobody utilizes Component Group
> Descriptors.
>
> We don't support component groups, and I'm not sure we ever will, so
> the alternatives are:
> 1. Follow TR side, and be more strict than the specification. Limit in
> `mpegts_init` each program to a single ARIB caption stream (be it
> Profile A or Profile C). Set component_tag in descriptor to the
> default one (0x30 or 0x87 respectively).
> 2. Follow STD side, and utilize what the specification enables. Limit
> in `mpegts_init` each program to either 8 Profile A ARIB caption
> streams, or 1 Profile C ARIB caption stream. Set component_tag
> iteratively in order.
>

I think in general I prefer the "follow the TR" way, since I
just heard that another implementer also generally ignores
everything other than 0x30/0x87.

Did not yet add the per-program limitation but poked a bit at it with
https://github.com/jeeb/ffmpeg/commits/mpegts_arib_caption_muxing .

1. Split the writing of the ARIB caption descriptor as the PMT writing
function is already way too long. Was not sure if exit or plain break
would be sufficient, but nothing else around in that function seems to
return with an error... so break it is.
2. Added logging in case an unset/unknown profile was utilized.
3. Re-added extraction of additional_arib_caption_info into extradata,
and writing it out in the muxer if available.
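
For the extraction mentioned in item 3, a demuxer-side parse of the data_component_descriptor could look roughly like the sketch below. It is not the code from the linked branch. The struct and function names are hypothetical, and the layout (tag 0xFD, a 16-bit data_component_id, then one byte of additional info) is taken from the descriptor as written by the patch in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields of interest. */
struct arib_data_component {
    uint16_t data_component_id;
    int      has_info;         /* non-zero if an info byte was present */
    uint8_t  additional_info;  /* additional_arib_caption_info, if any */
};

/*
 * Parse a descriptor starting at its descriptor_tag byte.
 * Returns 0 on success and -1 if the buffer is not a complete
 * data_component_descriptor.
 */
static int parse_data_component_descriptor(const uint8_t *buf, size_t size,
                                           struct arib_data_component *out)
{
    size_t len;

    if (size < 2 || buf[0] != 0xFD)
        return -1;
    len = buf[1];
    /* Need at least the 16-bit id, and the payload must fit the buffer. */
    if (len < 2 || size < 2 + len)
        return -1;

    out->data_component_id = (uint16_t)((buf[2] << 8) | buf[3]);
    out->has_info          = len >= 3;
    out->additional_info   = out->has_info ? buf[4] : 0;
    return 0;
}
```
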

Feel free to note how you like these changes :)

Jan
zheng qian April 21, 2021, 1 a.m. UTC | #8
On Wed, Apr 21, 2021 at 7:04 AM Jan Ekström <jeebjp@gmail.com> wrote:
>
> I think in general I prefer the "follow the TR" way, since I
> just heard that another implementer also generally ignores
> everything other than 0x30/0x87.
>
> Did not yet add the per-program limitation but poked a bit at it with
> https://github.com/jeeb/ffmpeg/commits/mpegts_arib_caption_muxing .
>
> 1. Split the writing of the ARIB caption descriptor as the PMT writing
> function is already way too long. Was not sure if exit or plain break
> would be sufficient, but nothing else around in that function seems to
> return with an error... so break it is.
> 2. Added logging in case an unset/unknown profile was utilized.
> 3. Re-added extraction of additional_arib_caption_info into extradata,
> and writing it out in the muxer if available.
>

I doubt whether it's necessary to extract additional_arib_caption_info.
According to ARIB TR-B24, Fascicle 1, 4.2.8.4 - 4.2.8.5:

> Data Encoding Method Descriptor required in PMT:
>     DMF is always '0011', aka 0x3
>     Timing is always '01' for caption, aka 0x1

Reserved is filled as '11', aka 0x3, as you know, so for arib_caption,
additional_arib_caption_info can be determined to be 0x3d.

ARIB superimpose should use Timing(00) since superimpose
uses Asynchronous_PES, but that is a separate topic.

In other words, additional_arib_caption_info can safely be generated
from the profile information and does not need to be copied via extradata.
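
The arithmetic behind 0x3d can be checked with a few lines of C. This is a minimal sketch with a hypothetical helper name. It assumes the byte is laid out as a 4-bit DMF, 2 reserved bits set to '11', and a 2-bit Timing, as described above.

```c
#include <stdint.h>

/*
 * Pack additional_arib_caption_info from its fields:
 *   bits 7..4  DMF
 *   bits 3..2  reserved, always '11'
 *   bits 1..0  Timing
 */
static uint8_t arib_pack_additional_info(uint8_t dmf, uint8_t timing)
{
    return (uint8_t)(((dmf & 0x0F) << 4) | (0x3 << 2) | (timing & 0x03));
}
```

With DMF '0011' and Timing '01' this gives 0x30 | 0x0C | 0x01 = 0x3D, and the same DMF with Timing '00' gives 0x3C.
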

Regards,
zheng

> Feel free to note how you like these changes :)
>
> Jan
zheng qian April 21, 2021, 6:49 a.m. UTC | #9
On Wed, Apr 21, 2021 at 10:00 AM zheng qian <xqq@xqq.im> wrote:
>
> ARIB superimpose should use Timing(00) since superimpose
> uses Asynchronous_PES, but that is a separate topic.
>
> In other words, additional_arib_caption_info can safely be generated
> from the profile information and does not need to be copied via extradata.
>

P.S.: AFAIK the DMF and Timing fields do not actually affect the
behavior of ARIB B24 subtitle decoders/renderers, since they always
follow the TMD and DMF fields in caption_management_data(), which is
periodically transmitted in the arib_caption ES.

In TR-B24, the Timing field for Superimpose can be
'00' (Non-synchronization) or '01' (Time synchronization),
and AFAIK Superimpose as used in Japanese DTV has never used a
Timing value other than '00', since Superimpose is designed to
be displayed immediately once received and is usually used for
urgent disaster alerts and news flashes (like NHK速報).

Later I'd like to introduce an arib_superimpose codec for superimpose
recognition and remuxing, and I believe that for arib_superimpose,
using the fixed value 00 for Timing won't cause problems in practice.

Thus, I'm waiting for your opinion.

Regards,
zheng qian

zheng qian April 27, 2021, 11:39 a.m. UTC | #10
On Wed, Apr 21, 2021 at 7:04 AM Jan Ekström <jeebjp@gmail.com> wrote:
>
> I think in general I prefer the "follow the TR" way, since I
> just heard that another implementer also generally ignores
> everything other than 0x30/0x87.
>
> Did not yet add the per-program limitation but poked a bit at it with
> https://github.com/jeeb/ffmpeg/commits/mpegts_arib_caption_muxing .
>
> 1. Split the writing of the ARIB caption descriptor as the PMT writing
> function is already way too long. Was not sure if exit or plain break
> would be sufficient, but nothing else around in that function seems to
> return with an error... so break it is.
> 2. Added logging in case an unset/unknown profile was utilized.
> 3. Re-added extraction of additional_arib_caption_info into extradata,
> and writing it out in the muxer if available.
>
> Feel free to note how you like these changes :)
>
> Jan

Any update on this thread?

Please let me know if there is any new information.

Thanks,
zheng
Patch

diff --git a/libavformat/mpegtsenc.c b/libavformat/mpegtsenc.c
index a357f3a6aa..f302af84ff 100644
--- a/libavformat/mpegtsenc.c
+++ b/libavformat/mpegtsenc.c
@@ -357,6 +357,7 @@  static int get_dvb_stream_type(AVFormatContext *s, AVStream *st)
         break;
     case AV_CODEC_ID_DVB_SUBTITLE:
     case AV_CODEC_ID_DVB_TELETEXT:
+    case AV_CODEC_ID_ARIB_CAPTION:
         stream_type = STREAM_TYPE_PRIVATE_DATA;
         break;
     case AV_CODEC_ID_SMPTE_KLV:
@@ -714,6 +715,34 @@  static int mpegts_write_pmt(AVFormatContext *s, MpegTSService *service)
                }
 
                *len_ptr = q - len_ptr - 1;
+            } else if (codec_id == AV_CODEC_ID_ARIB_CAPTION) {
+                uint8_t stream_identifier;
+                uint16_t data_component_id;
+
+                if (st->codecpar->profile == FF_PROFILE_ARIB_PROFILE_A) {
+                    // non-mobile captioning service ("profile A")
+                    stream_identifier = 0x30;
+                    data_component_id = 0x0008;
+                } else if (st->codecpar->profile == FF_PROFILE_ARIB_PROFILE_C) {
+                    // (1seg) captioning service ("profile C")
+                    stream_identifier = 0x87;
+                    data_component_id = 0x0012;
+                } else {
+                    break;
+                }
+
+                // stream_identifier_descriptor
+                *q++ = 0x52;  // descriptor_tag
+                *q++ = 1;     // descriptor_length
+                *q++ = stream_identifier;  // component_tag: stream_identifier
+
+                // data_component_descriptor, defined in ARIB STD-B10, part 2, 6.2.20
+                *q++ = 0xFD;  // descriptor_tag: ARIB data coding type descriptor
+                *q++ = 3;     // descriptor_length
+                put16(&q, data_component_id);  // data_component_id
+                // additional_arib_caption_info: defined in ARIB STD-B24, fascicle 1, Part 3, 9.6.1
+                // Use most commonly used value 0x3D: DMF(0x3), Reserved(0x3), Timing(0x1)
+                *q++ = 0x3D;
             }
         break;
         case AVMEDIA_TYPE_VIDEO:
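
To make the byte layout produced by the hunk above easier to inspect outside of FFmpeg, here is a standalone sketch that writes the same two descriptors into a caller-supplied buffer. The function name is hypothetical and the caller is assumed to provide at least 8 bytes. Unlike the patch it does not use `put16` or any FFmpeg types.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write stream_identifier_descriptor followed by
 * data_component_descriptor, mirroring the bytes emitted by the patch.
 * Returns the number of bytes written (always 8).
 */
static size_t write_arib_caption_descriptors(uint8_t *q,
                                             uint8_t stream_identifier,
                                             uint16_t data_component_id)
{
    uint8_t *start = q;

    /* stream_identifier_descriptor */
    *q++ = 0x52;               /* descriptor_tag */
    *q++ = 1;                  /* descriptor_length */
    *q++ = stream_identifier;  /* component_tag */

    /* data_component_descriptor */
    *q++ = 0xFD;                                /* descriptor_tag */
    *q++ = 3;                                   /* descriptor_length */
    *q++ = (uint8_t)(data_component_id >> 8);   /* data_component_id, high byte */
    *q++ = (uint8_t)(data_component_id & 0xFF); /* data_component_id, low byte */
    *q++ = 0x3D;  /* additional_arib_caption_info: DMF 0x3, reserved 0x3, Timing 0x1 */

    return (size_t)(q - start);
}
```

For Profile A (stream_identifier 0x30, data_component_id 0x0008) the resulting bytes are 52 01 30 FD 03 00 08 3D, and for Profile C (0x87, 0x0012) they are 52 01 87 FD 03 00 12 3D.
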