Message ID | 20230209044058.2872534-1-marth64@proxyid.net |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel] ffprobe/eac3/mlp/dca: add detection of spatial audio extensions | expand |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | fail | Make fate failed |
On Thu, Feb 9, 2023 at 5:42 AM Marth64 <marth64@proxyid.net> wrote: > > Signed-off-by: Marth64 <marth64@proxyid.net> > --- > Adds detection of spatial/object-based audio extensions in E-AC-3, > TrueHD, and DCA XLL (DTS). This includes Atmos, DTS:X, and IMAX formats. > Please let me know what I could improve, I'm learning still. > Thank you. > The detection itself seems fine to me, however we should talk about how the presence is communicated back to the user. A new flag in AVCodecContext goes against a variety of designs we try to avoid - namely having codec-specific things in a global struct, as well as having only one value, rather then per-frame values. So options that present themself to me: (a) Use "profile". At least for DTS that would fit quite nicely, as it already has profiles, and it seems like a logical extension. TrueHD and eac3 do not have profiles, but it might still be sensible to put it there. The advantage here is that it also automatically is conveyed in AVCodecParameters after avformat opens a stream, so the information is available early and lets players decide how to handle the stream. (b) Use per-frame side data. The early-availability advantage is not present here, so its not my favorite. side-data could be used in the future to transport the actual object metadata, if needed. So from where I'm standing we should maybe define profiles to use for these. In the past profiles were at least suggested for TrueHD Atmos before, but there were some objections, so maybe a good time to revisit and see where we go from here. - Hendrik
On Wed, Feb 08, 2023 at 10:41:00PM -0600, Marth64 wrote: > Signed-off-by: Marth64 <marth64@proxyid.net> > --- > Adds detection of spatial/object-based audio extensions in E-AC-3, > TrueHD, and DCA XLL (DTS). This includes Atmos, DTS:X, and IMAX formats. > Please let me know what I could improve, I'm learning still. > Thank you. [...] > @@ -1054,10 +1055,23 @@ static int parse_frame(DCAXllDecoder *s, const uint8_t *data, int size, DCAExssA > return ret; > if ((ret = parse_band_data(s)) < 0) > return ret; > + > + extradata_peek_pos = (get_bits_count(&s->gb) + 31) & ~31; > + if (s->frame_size * 8 > extradata_peek_pos) { > + unsigned int extradata_syncword = show_bits(&s->gb, 32); show_bits_long() [..]
Hi, Thank you for your time and thoughts. Some of this I had wondered about the same. Re: Hendrik, Using profile > This was an original intention of mine but I changed course. I'm happy to do it, but felt too unsure for a first pass. My reasoning being that I'm not sure if the presence of extension metadata itself qualifies as a discrete profile. For DCA in particular, I was worried since DCA already expands to profiles (ES, XLL, etc.). I did not want to clutter those distinctions with a "somewhat profile of a profile, based on an educated guess without the reference docs" and break any existing integrations. Likewise, EAC3 and TrueHD didn't have profiles, so it felt tacked on for this case. So I settled with "extension" as the marker. That said, I wasn't too thrilled about adding to AVCodecContext either. I discovered and considered priv_data but then realized that this is a pattern across 3 codecs, maybe more in the future. So definitely open to guidance here. Profile is probably the next best bet. I had gone down the frame-level inspection road at some point, but came to a similar conclusion as you, it makes this less useful as a feature. I am open to other's interpretation. Will ponder this a little more. Re: Michael, show_bits_long > Will fix. I am trying to procure another IMAX DTS material to test the syncword better, so will push any of those changes together in the next 2 days. Thank you! On Thu, Feb 9, 2023 at 2:12 PM Hendrik Leppkes <h.leppkes@gmail.com> wrote: > On Thu, Feb 9, 2023 at 5:42 AM Marth64 <marth64@proxyid.net> wrote: > > > > Signed-off-by: Marth64 <marth64@proxyid.net> > > --- > > Adds detection of spatial/object-based audio extensions in E-AC-3, > > TrueHD, and DCA XLL (DTS). This includes Atmos, DTS:X, and IMAX formats. > > Please let me know what I could improve, I'm learning still. > > Thank you. > > > > The detection itself seems fine to me, however we should talk about > how the presence is communicated back to the user. > > A new flag in AVCodecContext goes against a variety of designs we try > to avoid - namely having codec-specific things in a global struct, as > well as having only one value, rather then per-frame values. > > So options that present themself to me: > (a) Use "profile". At least for DTS that would fit quite nicely, as it > already has profiles, and it seems like a logical extension. TrueHD > and eac3 do not have profiles, but it might still be sensible to put > it there. The advantage here is that it also automatically is conveyed > in AVCodecParameters after avformat opens a stream, so the information > is available early and lets players decide how to handle the stream. > (b) Use per-frame side data. The early-availability advantage is not > present here, so its not my favorite. side-data could be used in the > future to transport the actual object metadata, if needed. > > So from where I'm standing we should maybe define profiles to use for > these. In the past profiles were at least suggested for TrueHD Atmos > before, but there were some objections, so maybe a good time to > revisit and see where we go from here. > > - Hendrik > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >
Ack'd. Nice catch. Thank you! On Thu, Feb 9, 2023 at 4:35 PM Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Feb 08, 2023 at 10:41:00PM -0600, Marth64 wrote: > > Signed-off-by: Marth64 <marth64@proxyid.net> > > --- > > Adds detection of spatial/object-based audio extensions in E-AC-3, > > TrueHD, and DCA XLL (DTS). This includes Atmos, DTS:X, and IMAX formats. > > Please let me know what I could improve, I'm learning still. > > Thank you. > [...] > > @@ -1054,10 +1055,23 @@ static int parse_frame(DCAXllDecoder *s, const > uint8_t *data, int size, DCAExssA > > return ret; > > if ((ret = parse_band_data(s)) < 0) > > return ret; > > + > > + extradata_peek_pos = (get_bits_count(&s->gb) + 31) & ~31; > > + if (s->frame_size * 8 > extradata_peek_pos) { > > + unsigned int extradata_syncword = show_bits(&s->gb, 32); > > show_bits_long() > > > [..] > -- > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB > > The misfortune of the wise is better than the prosperity of the fool. > -- Epicurus > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >
diff --git a/doc/ffprobe.xsd b/doc/ffprobe.xsd index 0920380108..a01a4359dc 100644 --- a/doc/ffprobe.xsd +++ b/doc/ffprobe.xsd @@ -247,6 +247,7 @@ <xsd:attribute name="channel_layout" type="xsd:string"/> <xsd:attribute name="bits_per_sample" type="xsd:int"/> <xsd:attribute name="initial_padding" type="xsd:int"/> + <xsd:attribute name="spatial_ext" type="xsd:boolean"/> <xsd:attribute name="id" type="xsd:string"/> <xsd:attribute name="r_frame_rate" type="xsd:string" use="required"/> diff --git a/fftools/ffprobe.c b/fftools/ffprobe.c index dfa7ff1b24..7e81088c56 100644 --- a/fftools/ffprobe.c +++ b/fftools/ffprobe.c @@ -3071,6 +3071,9 @@ static int show_stream(WriterContext *w, AVFormatContext *fmt_ctx, int stream_id print_int("bits_per_sample", av_get_bits_per_sample(par->codec_id)); print_int("initial_padding", par->initial_padding); + if (par->spatial_ext > 0) { + print_int("spatial_ext", par->spatial_ext); + } break; case AVMEDIA_TYPE_SUBTITLE: diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c index 0b120e6140..009acd9ff6 100644 --- a/libavcodec/ac3dec.c +++ b/libavcodec/ac3dec.c @@ -1714,6 +1714,7 @@ skip: if (!err) { avctx->sample_rate = s->sample_rate; avctx->bit_rate = s->bit_rate + s->prev_bit_rate; + avctx->spatial_ext = s->eac3_extension_type_a == 1; } if (!avctx->sample_rate) { diff --git a/libavcodec/ac3dec.h b/libavcodec/ac3dec.h index 138b462abb..0829f4b40d 100644 --- a/libavcodec/ac3dec.h +++ b/libavcodec/ac3dec.h @@ -102,6 +102,7 @@ typedef struct AC3DecodeContext { int eac3; ///< indicates if current frame is E-AC-3 int eac3_frame_dependent_found; ///< bitstream has E-AC-3 dependent frame(s) int eac3_subsbtreamid_found; ///< bitstream has E-AC-3 additional substream(s) + int eac3_extension_type_a; ///< bitstream has E-AC-3 extension type A enabled frame(s) int dolby_surround_mode; ///< dolby surround mode (dsurmod) int dolby_surround_ex_mode; ///< dolby surround ex mode (dsurexmod) int dolby_headphone_mode; ///< dolby headphone mode (dheadphonmod) diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h index 755e543fac..8b54a99313 100644 --- a/libavcodec/avcodec.h +++ b/libavcodec/avcodec.h @@ -2104,6 +2104,12 @@ typedef struct AVCodecContext { * The decoder can then override during decoding as needed. */ AVChannelLayout ch_layout; + + /** + * Audio only. Whether or not a spatial audio extension is + * detected in the stream (object-based surround). + */ + int spatial_ext; } AVCodecContext; /** diff --git a/libavcodec/codec_par.c b/libavcodec/codec_par.c index abda649aa8..02f6da5059 100644 --- a/libavcodec/codec_par.c +++ b/libavcodec/codec_par.c @@ -161,6 +161,7 @@ FF_ENABLE_DEPRECATION_WARNINGS par->initial_padding = codec->initial_padding; par->trailing_padding = codec->trailing_padding; par->seek_preroll = codec->seek_preroll; + par->spatial_ext = codec->spatial_ext; break; case AVMEDIA_TYPE_SUBTITLE: par->width = codec->width; @@ -243,6 +244,7 @@ FF_ENABLE_DEPRECATION_WARNINGS codec->initial_padding = par->initial_padding; codec->trailing_padding = par->trailing_padding; codec->seek_preroll = par->seek_preroll; + codec->spatial_ext = par->spatial_ext; break; case AVMEDIA_TYPE_SUBTITLE: codec->width = par->width; diff --git a/libavcodec/codec_par.h b/libavcodec/codec_par.h index f51d27c590..287b138b6b 100644 --- a/libavcodec/codec_par.h +++ b/libavcodec/codec_par.h @@ -211,6 +211,12 @@ typedef struct AVCodecParameters { * Audio only. The channel layout and number of channels. */ AVChannelLayout ch_layout; + + /** + * Audio only. Whether or not a spatial audio extension is + * detected in the stream (object-based surround). + */ + int spatial_ext; } AVCodecParameters; /** diff --git a/libavcodec/dca_syncwords.h b/libavcodec/dca_syncwords.h index 4d2cd5f56d..ccbc38bb38 100644 --- a/libavcodec/dca_syncwords.h +++ b/libavcodec/dca_syncwords.h @@ -33,4 +33,7 @@ #define DCA_SYNCWORD_SUBSTREAM_CORE 0x02B09261U #define DCA_SYNCWORD_REV1AUX 0x9A1105A0U +#define DCA_SYNCWORD_XLL_X 0x20008 +#define DCA_SYNCWORD_XLL_IMAX (0xF14000D0 >> 1) + #endif /* AVCODEC_DCA_SYNCWORDS_H */ diff --git a/libavcodec/dca_xll.c b/libavcodec/dca_xll.c index fe2c766d98..6b64f907cc 100644 --- a/libavcodec/dca_xll.c +++ b/libavcodec/dca_xll.c @@ -1043,6 +1043,7 @@ static int parse_band_data(DCAXllDecoder *s) static int parse_frame(DCAXllDecoder *s, const uint8_t *data, int size, DCAExssAsset *asset) { int ret; + int extradata_peek_pos; if ((ret = init_get_bits8(&s->gb, data, size)) < 0) return ret; @@ -1054,10 +1055,23 @@ static int parse_frame(DCAXllDecoder *s, const uint8_t *data, int size, DCAExssA return ret; if ((ret = parse_band_data(s)) < 0) return ret; + + extradata_peek_pos = (get_bits_count(&s->gb) + 31) & ~31; + if (s->frame_size * 8 > extradata_peek_pos) { + unsigned int extradata_syncword = show_bits(&s->gb, 32); + + if (extradata_syncword == DCA_SYNCWORD_XLL_X) { + s->avctx->spatial_ext = 1; + } else if ((extradata_syncword >> 1) == DCA_SYNCWORD_XLL_IMAX) { + s->avctx->spatial_ext = 1; + } + } + if (ff_dca_seek_bits(&s->gb, s->frame_size * 8)) { av_log(s->avctx, AV_LOG_ERROR, "Read past end of XLL frame\n"); return AVERROR_INVALIDDATA; } + return ret; } diff --git a/libavcodec/eac3dec.c b/libavcodec/eac3dec.c index deca51dd3d..5c71751a0c 100644 --- a/libavcodec/eac3dec.c +++ b/libavcodec/eac3dec.c @@ -464,7 +464,16 @@ static int ff_eac3_parse_header(AC3DecodeContext *s) if (get_bits1(gbc)) { int addbsil = get_bits(gbc, 6); for (i = 0; i < addbsil + 1; i++) { - skip_bits(gbc, 8); // skip additional bit stream info + if (i == 0) { + /* In this 8 bit chunk, the LSB is equal to flag_ec3_extension_type_a + which can be used to detect Atmos presence */ + skip_bits(gbc, 7); + if (get_bits1(gbc)) { + s->eac3_extension_type_a = 1; + } + } else { + skip_bits(gbc, 8); // skip additional bit stream info + } } } diff --git a/libavcodec/mlpdec.c b/libavcodec/mlpdec.c index 0ee1f0982c..547e68342e 100644 --- a/libavcodec/mlpdec.c +++ b/libavcodec/mlpdec.c @@ -392,6 +392,15 @@ static int read_major_sync(MLPDecodeContext *m, GetBitContext *gb) m->num_substreams = mh.num_substreams; m->substream_info = mh.substream_info; + /* If there is a 4th substream and the MSB of substream_info is set, + * there is a 16-channel spatial presentation (Atmos in TrueHD). + */ + if (m->avctx->codec_id == AV_CODEC_ID_TRUEHD + && m->num_substreams == 4 + && m->substream_info >> 7 == 1) { + m->avctx->spatial_ext = 1; + } + /* limit to decoding 3 substreams, as the 4th is used by Dolby Atmos for non-audio data */ m->max_decoded_substream = FFMIN(m->num_substreams - 1, 2);
Signed-off-by: Marth64 <marth64@proxyid.net> --- Adds detection of spatial/object-based audio extensions in E-AC-3, TrueHD, and DCA XLL (DTS). This includes Atmos, DTS:X, and IMAX formats. Please let me know what I could improve, I'm learning still. Thank you. doc/ffprobe.xsd | 1 + fftools/ffprobe.c | 3 +++ libavcodec/ac3dec.c | 1 + libavcodec/ac3dec.h | 1 + libavcodec/avcodec.h | 6 ++++++ libavcodec/codec_par.c | 2 ++ libavcodec/codec_par.h | 6 ++++++ libavcodec/dca_syncwords.h | 3 +++ libavcodec/dca_xll.c | 14 ++++++++++++++ libavcodec/eac3dec.c | 11 ++++++++++- libavcodec/mlpdec.c | 9 +++++++++ 11 files changed, 56 insertions(+), 1 deletion(-)