[FFmpeg-devel,2/2,v2] lavf/mpegts: add reading of ARIB data coding descriptor

Submitted by Jan Ekström on Feb. 3, 2019, 12:39 a.m.

Details

Message ID 20190203003906.16781-1-jeebjp@gmail.com
State New
Headers show

Commit Message

Jan Ekström Feb. 3, 2019, 12:39 a.m.
This enables us to read the data coding type utilized for
a specific private data stream, of which we currently are
interested in ARIB caption streams.

The component tag limitations are according to ARIB TR-B14,
and the component IDs are defined in ARIB STD-B10.
---
 libavformat/mpegts.c  | 44 +++++++++++++++++++++++++++++++++++++++++++
 libavformat/version.h |  2 +-
 2 files changed, 45 insertions(+), 1 deletion(-)

Comments

Carl Eugen Hoyos Feb. 3, 2019, 1:12 p.m.
2019-02-03 1:39 GMT+01:00, Jan Ekström <jeebjp@gmail.com>:
> This enables us to read the data coding type utilized for
> a specific private data stream, of which we currently are
> interested in ARIB caption streams.
>
> The component tag limitations are according to ARIB TR-B14,
> and the component IDs are defined in ARIB STD-B10.
> ---
>  libavformat/mpegts.c  | 44 +++++++++++++++++++++++++++++++++++++++++++
>  libavformat/version.h |  2 +-
>  2 files changed, 45 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> index 300db110d4..f9dc04eb52 100644
> --- a/libavformat/mpegts.c
> +++ b/libavformat/mpegts.c
> @@ -2013,6 +2013,50 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc,
> AVStream *st, int stream_type
>              }
>          }
>          break;
> +    case 0xfd: /* ARIB data coding type descriptor */
> +        // STD-B24, fascicle 3, chapter 4 defines private_stream_1
> +        // for captions
> +        if (stream_type == STREAM_TYPE_PRIVATE_DATA) {
> +            // This structure is defined in STD-B10, part 1, listing 5.4
> and
> +            // part 2, 6.2.20).
> +            // Listing of data_component_ids is in STD-B10, part 2, Annex
> J.
> +            // Component tag limits are documented in TR-B14, fascicle 2,
> +            // Vol. 3, Section 2, 4.2.8.1
> +            int actual_component_tag = st->stream_identifier - 1;
> +            int picked_profile = FF_PROFILE_UNKNOWN;
> +            int data_component_id = get16(pp, desc_end);

> +            if (data_component_id < 0)
> +                return AVERROR_INVALIDDATA;

What is the effect of this return if it happens?
Why isn't this just a "break" like below?

Carl Eugen
Jan Ekström Feb. 3, 2019, 1:33 p.m.
On Sun, Feb 3, 2019 at 3:12 PM Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>
> 2019-02-03 1:39 GMT+01:00, Jan Ekström <jeebjp@gmail.com>:
> > This enables us to read the data coding type utilized for
> > a specific private data stream, of which we currently are
> > interested in ARIB caption streams.
> >
> > The component tag limitations are according to ARIB TR-B14,
> > and the component IDs are defined in ARIB STD-B10.
> > ---
> >  libavformat/mpegts.c  | 44 +++++++++++++++++++++++++++++++++++++++++++
> >  libavformat/version.h |  2 +-
> >  2 files changed, 45 insertions(+), 1 deletion(-)
> >
> > diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> > index 300db110d4..f9dc04eb52 100644
> > --- a/libavformat/mpegts.c
> > +++ b/libavformat/mpegts.c
> > @@ -2013,6 +2013,50 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc,
> > AVStream *st, int stream_type
> >              }
> >          }
> >          break;
> > +    case 0xfd: /* ARIB data coding type descriptor */
> > +        // STD-B24, fascicle 3, chapter 4 defines private_stream_1
> > +        // for captions
> > +        if (stream_type == STREAM_TYPE_PRIVATE_DATA) {
> > +            // This structure is defined in STD-B10, part 1, listing 5.4
> > and
> > +            // part 2, 6.2.20).
> > +            // Listing of data_component_ids is in STD-B10, part 2, Annex
> > J.
> > +            // Component tag limits are documented in TR-B14, fascicle 2,
> > +            // Vol. 3, Section 2, 4.2.8.1
> > +            int actual_component_tag = st->stream_identifier - 1;
> > +            int picked_profile = FF_PROFILE_UNKNOWN;
> > +            int data_component_id = get16(pp, desc_end);
>
> > +            if (data_component_id < 0)
> > +                return AVERROR_INVALIDDATA;
>
> What is the effect of this return if it happens?
> Why isn't this just a "break" like below?
>

I think this might have been a half-automatic thing on my behalf after
seeing that get16 does indeed return an error value if it can't read
that amount of data.

Although, it seems like there is no concise way of handling errors
even in this function (`ff_parse_mpeg2_descriptor`):
1. 0x1E breaks.
2. 0x56, 0x59 set AVERROR_INVALIDDATA if the descriptor is not long
enough (pointer check instead of getXX though).
3. 0x26 does an exact equality check on a specific value and ignores
result otherwise.
4. 0x7f does a getXX and then returns AVERROR_INVALIDDATA if it fails
(this is getting similar to what I did).

Personally I don't have heavy feelings about how I exit this function,
but I feel like if there's not enough data and we've gotten as far as
we've got, returning "invalid data" is not incorrect (and has a
precedent).

Jan
Jan Ekström Feb. 3, 2019, 1:39 p.m.
On Sun, Feb 3, 2019 at 3:33 PM Jan Ekström <jeebjp@gmail.com> wrote:
>
> On Sun, Feb 3, 2019 at 3:12 PM Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
> >
> > 2019-02-03 1:39 GMT+01:00, Jan Ekström <jeebjp@gmail.com>:
> > > This enables us to read the data coding type utilized for
> > > a specific private data stream, of which we currently are
> > > interested in ARIB caption streams.
> > >
> > > The component tag limitations are according to ARIB TR-B14,
> > > and the component IDs are defined in ARIB STD-B10.
> > > ---
> > >  libavformat/mpegts.c  | 44 +++++++++++++++++++++++++++++++++++++++++++
> > >  libavformat/version.h |  2 +-
> > >  2 files changed, 45 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> > > index 300db110d4..f9dc04eb52 100644
> > > --- a/libavformat/mpegts.c
> > > +++ b/libavformat/mpegts.c
> > > @@ -2013,6 +2013,50 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc,
> > > AVStream *st, int stream_type
> > >              }
> > >          }
> > >          break;
> > > +    case 0xfd: /* ARIB data coding type descriptor */
> > > +        // STD-B24, fascicle 3, chapter 4 defines private_stream_1
> > > +        // for captions
> > > +        if (stream_type == STREAM_TYPE_PRIVATE_DATA) {
> > > +            // This structure is defined in STD-B10, part 1, listing 5.4
> > > and
> > > +            // part 2, 6.2.20).
> > > +            // Listing of data_component_ids is in STD-B10, part 2, Annex
> > > J.
> > > +            // Component tag limits are documented in TR-B14, fascicle 2,
> > > +            // Vol. 3, Section 2, 4.2.8.1
> > > +            int actual_component_tag = st->stream_identifier - 1;
> > > +            int picked_profile = FF_PROFILE_UNKNOWN;
> > > +            int data_component_id = get16(pp, desc_end);
> >
> > > +            if (data_component_id < 0)
> > > +                return AVERROR_INVALIDDATA;
> >
> > What is the effect of this return if it happens?
> > Why isn't this just a "break" like below?
> >
>
> I think this might have been a half-automatic thing on my behalf after
> seeing that get16 does indeed return an error value if it can't read
> that amount of data.
>
> Although, it seems like there is no concise way of handling errors
> even in this function (`ff_parse_mpeg2_descriptor`):
> 1. 0x1E breaks.
> 2. 0x56, 0x59 set AVERROR_INVALIDDATA if the descriptor is not long
> enough (pointer check instead of getXX though).
> 3. 0x26 does an exact equality check on a specific value and ignores
> result otherwise.
> 4. 0x7f does a getXX and then returns AVERROR_INVALIDDATA if it fails
> (this is getting similar to what I did).
>
> Personally I don't have heavy feelings about how I exit this function,
> but I feel like if there's not enough data and we've gotten as far as
> we've got, returning "invalid data" is not incorrect (and has a
> precedent).
>

Also sorry, I just noticed the two different types of exit. I guess I
did just a break there since we seem to still have a valid structure,
but we're just not interested in that specific coding type.

(a list of them can be seen in the specs I noted).

Jan

Patch hide | download patch | download mbox

diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
index 300db110d4..f9dc04eb52 100644
--- a/libavformat/mpegts.c
+++ b/libavformat/mpegts.c
@@ -2013,6 +2013,50 @@  int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
             }
         }
         break;
+    case 0xfd: /* ARIB data coding type descriptor */
+        // STD-B24, fascicle 3, chapter 4 defines private_stream_1
+        // for captions
+        if (stream_type == STREAM_TYPE_PRIVATE_DATA) {
+            // This structure is defined in STD-B10, part 1, listing 5.4 and
+            // part 2, 6.2.20).
+            // Listing of data_component_ids is in STD-B10, part 2, Annex J.
+            // Component tag limits are documented in TR-B14, fascicle 2,
+            // Vol. 3, Section 2, 4.2.8.1
+            int actual_component_tag = st->stream_identifier - 1;
+            int picked_profile = FF_PROFILE_UNKNOWN;
+            int data_component_id = get16(pp, desc_end);
+            if (data_component_id < 0)
+                return AVERROR_INVALIDDATA;
+
+            switch (data_component_id) {
+            case 0x0008:
+                // [0x30..0x37] are component tags utilized for
+                // non-mobile captioning service ("profile A").
+                if (actual_component_tag >= 0x30 &&
+                    actual_component_tag <= 0x37) {
+                    picked_profile = FF_PROFILE_ARIB_PROFILE_A;
+                }
+                break;
+            case 0x0012:
+                // component tag 0x87 signifies a mobile/partial reception
+                // (1seg) captioning service ("profile C").
+                if (actual_component_tag == 0x87) {
+                    picked_profile = FF_PROFILE_ARIB_PROFILE_C;
+                }
+                break;
+            default:
+                break;
+            }
+
+            if (picked_profile == FF_PROFILE_UNKNOWN)
+                break;
+
+            st->codecpar->codec_type = AVMEDIA_TYPE_SUBTITLE;
+            st->codecpar->codec_id   = AV_CODEC_ID_ARIB_CAPTION;
+            st->codecpar->profile    = picked_profile;
+            st->request_probe        = 0;
+        }
+        break;
     default:
         break;
     }
diff --git a/libavformat/version.h b/libavformat/version.h
index 4408ecaa06..2e83eb4f23 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -33,7 +33,7 @@ 
 // Also please add any ticket numbers that you believe might be affected here
 #define LIBAVFORMAT_VERSION_MAJOR  58
 #define LIBAVFORMAT_VERSION_MINOR  26
-#define LIBAVFORMAT_VERSION_MICRO 100
+#define LIBAVFORMAT_VERSION_MICRO 101
 
 #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR, \
                                                LIBAVFORMAT_VERSION_MINOR, \