Message ID | 1572585384-12000-1-git-send-email-Yuki.Tsuchiya@sony.com |
---|---|
State | Superseded |
On Fri, Nov 1, 2019 at 06:24, Yuki Tsuchiya <Yuki.Tsuchiya@sony.com> wrote:
>
> Implemented according to the specification at https://www.iso.org/standard/69561.html
> The "mhm1" sample entry is registered with MP4RA and is defined as MHAS-encapsulated single-stream MPEG-H 3D Audio.
> "MHAS" stands for MPEG-H audio stream, which contains the encoded audio data and the corresponding metadata for decoding.
> This patch enables extracting the MHAS bitstream from MP4.

I will push this if there are no objections.

Carl Eugen
On Fri, Nov 1, 2019 at 7:24 AM Yuki Tsuchiya <Yuki.Tsuchiya@sony.com> wrote:
>
> Implemented according to the specification at https://www.iso.org/standard/69561.html
> The "mhm1" sample entry is registered with MP4RA and is defined as MHAS-encapsulated single-stream MPEG-H 3D Audio.
> "MHAS" stands for MPEG-H audio stream, which contains the encoded audio data and the corresponding metadata for decoding.
> This patch enables extracting the MHAS bitstream from MP4.
>
> Signed-off-by: Yuki Tsuchiya <Yuki.Tsuchiya@sony.com>
> ---

Sorry for the late response; various things have come up recently :)

All of the samples I've seen in the wild (well, on the DASH-IF test vector list, which is the only place I've seen both AC-4 and MPEG-H Audio so far) seem to use mha1, for example
https://dash.akamaized.net/dash264/TestCasesMCA/fraunhofer/MPEGH_714_lc_mha1/1/Sintel/Sintel.2010_1080p_incl_Credits_new_cicp19_16bit-eng-893s-12-mpegh-256000bps_seg.mp4

So my first question is whether there is any reason why 'mha1' is not added as well. Was it removed from the MP4 container specification afterwards?

Additionally, are there any MPEG-H Audio-specific configuration boxes (or similar) that the spec requires to be read or written for valid decoding or for a valid mux, and that should therefore be handled?

Best regards,
Jan
Hi Jan,

Thank you for the comment.

> All of the samples I've seen in the wild (well, on the DASH-IF test
> vector list, which is the only place I've seen both AC-4 and MPEG-H
> Audio so far) seem to use mha1, for example
> https://dash.akamaized.net/dash264/TestCasesMCA/fraunhofer/MPEGH_714_lc_mha1/1/Sintel/Sintel.2010_1080p_incl_Credits_new_cicp19_16bit-eng-893s-12-mpegh-256000bps_seg.mp4
> So my first question is whether there is any reason why 'mha1' is not
> added as well. Was it removed from the MP4 container specification
> afterwards?

'mha1' is still documented by ISO, but as of v4.3 the DASH-IF IOP (https://dashif.org/docs/DASH-IF-IOP-v4.3.pdf) specifies that only mhm1 be used.
So it seems likely that mhm1 will become the majority format for MPEG-H 3D Audio in MP4.
This is why this patch prioritizes mhm1 support.

> Additionally, are there any MPEG-H Audio-specific configuration boxes
> (or similar) that the spec requires to be read or written for valid
> decoding or for a valid mux, and that should therefore be handled?

In the mha1 case, the 'mhaC' box, which contains the configuration for decoding, must be handled.
In the mhm1 case (this patch), the MHAS bitstream in mdat carries the configuration, so the 'mhaC' box does not need to be handled.

Regards,
Yuki Tsuchiya
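The distinction described above (configuration in an 'mhaC' box for mha1, configuration in-band for mhm1) could be sketched roughly as follows. This is only an illustration under stated assumptions: `SKETCH_MKTAG` and `sketch_needs_mhac` are hypothetical names that exist neither in FFmpeg nor in this patch, and the macro merely reproduces the byte order of FFmpeg's `MKTAG`.

```c
#include <stdint.h>

/* Same byte layout as FFmpeg's MKTAG: first character in the lowest byte.
 * The SKETCH_ prefix marks this as an illustration, not FFmpeg code. */
#define SKETCH_MKTAG(a, b, c, d)                         \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) |              \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical helper, not part of FFmpeg or of this patch.
 * Returns  1 if the sample entry relies on an 'mhaC' box for its
 *            decoder configuration (the mha1 case),
 *          0 if the configuration travels in-band in the MHAS
 *            bitstream stored in mdat (the mhm1 case),
 *         -1 for any sample entry this sketch does not know. */
static int sketch_needs_mhac(uint32_t sample_entry)
{
    switch (sample_entry) {
    case SKETCH_MKTAG('m', 'h', 'a', '1'):
        return 1;
    case SKETCH_MKTAG('m', 'h', 'm', '1'):
        return 0;
    default:
        return -1;
    }
}
```

A demuxer written along these lines would only need to look for 'mhaC' when the helper returns 1, which matches why the patch can skip that box for mhm1.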
I will rebase against current master and send the new patch.

Hi Jan,

Do you have any comments on my answer?

Regards,
Yuki Tsuchiya
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index bcb931f..8c1a85d 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -654,6 +654,7 @@ enum AVCodecID {
     AV_CODEC_ID_ATRAC9,
     AV_CODEC_ID_HCOM,
     AV_CODEC_ID_ACELP_KELVIN,
+    AV_CODEC_ID_MPEGH_3D_AUDIO,
 
     /* subtitle codecs */
     AV_CODEC_ID_FIRST_SUBTITLE = 0x17000, ///< A dummy ID pointing at the start of subtitle codecs.
diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index 0602ecb..a970fae 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -2998,6 +2998,13 @@ static const AVCodecDescriptor codec_descriptors[] = {
         .long_name = NULL_IF_CONFIG_SMALL("Sipro ACELP.KELVIN"),
         .props     = AV_CODEC_PROP_LOSSY,
     },
+    {
+        .id        = AV_CODEC_ID_MPEGH_3D_AUDIO,
+        .type      = AVMEDIA_TYPE_AUDIO,
+        .name      = "mpegh_3d_audio",
+        .long_name = NULL_IF_CONFIG_SMALL("MPEG-H 3D Audio"),
+        .props     = AV_CODEC_PROP_LOSSY,
+    },
 
     /* subtitle codecs */
     {
diff --git a/libavcodec/version.h b/libavcodec/version.h
index 27c126e..b36f331 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,7 +28,7 @@
 #include "libavutil/version.h"
 
 #define LIBAVCODEC_VERSION_MAJOR 58
-#define LIBAVCODEC_VERSION_MINOR 60
+#define LIBAVCODEC_VERSION_MINOR 61
 #define LIBAVCODEC_VERSION_MICRO 100
 
 #define LIBAVCODEC_VERSION_INT AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
diff --git a/libavformat/isom.c b/libavformat/isom.c
index edd0d81..824e811 100644
--- a/libavformat/isom.c
+++ b/libavformat/isom.c
@@ -371,6 +371,7 @@ const AVCodecTag ff_codec_movaudio_tags[] = {
     { AV_CODEC_ID_FLAC, MKTAG('f', 'L', 'a', 'C') }, /* nonstandard */
     { AV_CODEC_ID_TRUEHD, MKTAG('m', 'l', 'p', 'a') }, /* mp4ra.org */
     { AV_CODEC_ID_OPUS, MKTAG('O', 'p', 'u', 's') }, /* mp4ra.org */
+    { AV_CODEC_ID_MPEGH_3D_AUDIO, MKTAG('m', 'h', 'm', '1') }, /* MPEG-H 3D Audio bitstream */
     { AV_CODEC_ID_NONE, 0 },
 };
 
diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index 715bec1..ff234d9 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -2411,7 +2411,7 @@ static int mov_preroll_write_stbl_atoms(AVIOContext *pb, MOVTrack *track)
     if (!sgpd_entries)
         return AVERROR(ENOMEM);
 
-    av_assert0(track->par->codec_id == AV_CODEC_ID_OPUS || track->par->codec_id == AV_CODEC_ID_AAC);
+    av_assert0(track->par->codec_id == AV_CODEC_ID_OPUS || track->par->codec_id == AV_CODEC_ID_AAC || track->par->codec_id == AV_CODEC_ID_MPEGH_3D_AUDIO);
 
     if (track->par->codec_id == AV_CODEC_ID_OPUS) {
         for (i = 0; i < track->entry; i++) {
@@ -2493,6 +2493,7 @@ static int mov_write_stbl_tag(AVFormatContext *s, AVIOContext *pb, MOVMuxContext
     mov_write_stts_tag(pb, track);
     if ((track->par->codec_type == AVMEDIA_TYPE_VIDEO ||
          track->par->codec_id == AV_CODEC_ID_TRUEHD ||
+         track->par->codec_id == AV_CODEC_ID_MPEGH_3D_AUDIO ||
          track->par->codec_tag == MKTAG('r','t','p',' ')) &&
         track->has_keyframes && track->has_keyframes < track->entry)
         mov_write_stss_tag(pb, track, MOV_SYNC_SAMPLE);
@@ -2512,7 +2513,7 @@ static int mov_write_stbl_tag(AVFormatContext *s, AVIOContext *pb, MOVMuxContext
     if (track->cenc.aes_ctr) {
         ff_mov_cenc_write_stbl_atoms(&track->cenc, pb);
     }
-    if (track->par->codec_id == AV_CODEC_ID_OPUS || track->par->codec_id == AV_CODEC_ID_AAC) {
+    if (track->par->codec_id == AV_CODEC_ID_OPUS || track->par->codec_id == AV_CODEC_ID_AAC || track->par->codec_id == AV_CODEC_ID_MPEGH_3D_AUDIO) {
         mov_preroll_write_stbl_atoms(pb, track);
     }
     return update_size(pb, pos);
@@ -6877,6 +6878,7 @@ const AVCodecTag codec_mp4_tags[] = {
     { AV_CODEC_ID_DVD_SUBTITLE, MKTAG('m', 'p', '4', 's') },
     { AV_CODEC_ID_MOV_TEXT , MKTAG('t', 'x', '3', 'g') },
     { AV_CODEC_ID_BIN_DATA , MKTAG('g', 'p', 'm', 'd') },
+    { AV_CODEC_ID_MPEGH_3D_AUDIO, MKTAG('m', 'h', 'm', '1') },
     { AV_CODEC_ID_NONE , 0 },
 };
 
diff --git a/libavformat/utils.c b/libavformat/utils.c
index cfb6d03..d271251 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -1021,7 +1021,8 @@ static int is_intra_only(enum AVCodecID id)
     const AVCodecDescriptor *d = avcodec_descriptor_get(id);
     if (!d)
         return 0;
-    if (d->type == AVMEDIA_TYPE_VIDEO && !(d->props & AV_CODEC_PROP_INTRA_ONLY))
+    if ((d->type == AVMEDIA_TYPE_VIDEO && !(d->props & AV_CODEC_PROP_INTRA_ONLY)) ||
+        id == AV_CODEC_ID_MPEGH_3D_AUDIO)
         return 0;
     return 1;
 }
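The keyframe-related hunks in utils.c and movenc.c can be read together: MPEG-H 3D Audio is taken out of the "intra only" set, and the muxer writes a sync sample table for it when only some samples are keyframes. The following is a reduced model of those two conditions, not the FFmpeg code itself. All `sketch_` names are hypothetical, the codec list is cut down to four entries, and the TrueHD and 'rtp ' cases of the real stss condition are left out.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for FFmpeg's codec descriptor data. */
enum sketch_media_type { SKETCH_VIDEO, SKETCH_AUDIO };
enum sketch_codec {
    SKETCH_H264, SKETCH_MJPEG, SKETCH_AAC, SKETCH_MPEGH_3D_AUDIO
};

struct sketch_desc {
    enum sketch_media_type type;
    bool intra_only_prop; /* models AV_CODEC_PROP_INTRA_ONLY */
};

/* Toy descriptor lookup: H.264 is inter-coded video, MJPEG is
 * intra-only video, everything else here is audio. */
static struct sketch_desc sketch_describe(enum sketch_codec id)
{
    struct sketch_desc d = { SKETCH_AUDIO, false };
    if (id == SKETCH_H264)
        d.type = SKETCH_VIDEO;
    if (id == SKETCH_MJPEG) {
        d.type = SKETCH_VIDEO;
        d.intra_only_prop = true;
    }
    return d;
}

/* Models the patched is_intra_only() condition: non-intra video and
 * MPEG-H 3D Audio are not intra-only, everything else is. */
static bool sketch_is_intra_only(enum sketch_codec id)
{
    struct sketch_desc d = sketch_describe(id);
    if ((d.type == SKETCH_VIDEO && !d.intra_only_prop) ||
        id == SKETCH_MPEGH_3D_AUDIO)
        return false;
    return true;
}

/* Models the reduced stss condition: a sync sample table is written
 * only when some, but not all, samples are keyframes. */
static bool sketch_writes_stss(enum sketch_codec id, int keyframes, int entries)
{
    bool eligible = sketch_describe(id).type == SKETCH_VIDEO ||
                    id == SKETCH_MPEGH_3D_AUDIO;
    return eligible && keyframes > 0 && keyframes < entries;
}
```

Under this model an MPEG-H track with 2 keyframes out of 10 samples would get a sync sample table, while an AAC track would not, which appears to be the intent of the patch.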
Implemented according to the specification at https://www.iso.org/standard/69561.html
The "mhm1" sample entry is registered with MP4RA and is defined as MHAS-encapsulated single-stream MPEG-H 3D Audio.
"MHAS" stands for MPEG-H audio stream, which contains the encoded audio data and the corresponding metadata for decoding.
This patch enables extracting the MHAS bitstream from MP4.

Signed-off-by: Yuki Tsuchiya <Yuki.Tsuchiya@sony.com>
---
 libavcodec/avcodec.h    | 1 +
 libavcodec/codec_desc.c | 7 +++++++
 libavcodec/version.h    | 2 +-
 libavformat/isom.c      | 1 +
 libavformat/movenc.c    | 6 ++++--
 libavformat/utils.c     | 3 ++-
 6 files changed, 16 insertions(+), 4 deletions(-)
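For readers unfamiliar with how a sample entry name such as "mhm1" ends up selecting a codec, the tag tables touched by this patch work roughly like the sketch below. It is a simplified, self-contained model: the `sketch_` identifiers, the two-entry table and the lookup function are hypothetical and only imitate the shape of FFmpeg's `AVCodecTag` tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Same byte layout as FFmpeg's MKTAG: first character in the lowest byte. */
#define SKETCH_MKTAG(a, b, c, d)                         \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) |              \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical codec IDs, standing in for enum AVCodecID. */
enum sketch_codec_id {
    SKETCH_ID_NONE,
    SKETCH_ID_OPUS,
    SKETCH_ID_MPEGH_3D_AUDIO
};

struct sketch_codec_tag {
    enum sketch_codec_id id;
    uint32_t tag;
};

/* Reduced tag table, terminated by SKETCH_ID_NONE like the real tables. */
static const struct sketch_codec_tag sketch_audio_tags[] = {
    { SKETCH_ID_OPUS,           SKETCH_MKTAG('O', 'p', 'u', 's') },
    { SKETCH_ID_MPEGH_3D_AUDIO, SKETCH_MKTAG('m', 'h', 'm', '1') },
    { SKETCH_ID_NONE,           0 },
};

/* Maps the four bytes of a sample entry type to a codec ID, or to
 * SKETCH_ID_NONE when the table has no matching entry. */
static enum sketch_codec_id sketch_lookup(const uint8_t fourcc[4])
{
    uint32_t tag = SKETCH_MKTAG(fourcc[0], fourcc[1], fourcc[2], fourcc[3]);
    for (size_t i = 0; sketch_audio_tags[i].id != SKETCH_ID_NONE; i++) {
        if (sketch_audio_tags[i].tag == tag)
            return sketch_audio_tags[i].id;
    }
    return SKETCH_ID_NONE;
}
```

With only 'mhm1' in the table, a file that uses the 'mha1' sample entry would not be matched by this lookup, which mirrors the limitation discussed in the thread.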