diff mbox series

[FFmpeg-devel,v2,7/7] avformat/audiointerleave: use a fixed frame duration for non-audio packets

Message ID 20200305215628.19514-7-cus@passwd.hu
State Superseded
Headers show
Series [FFmpeg-devel,v2,1/7] avformat/audiointerleave: disallow using a samples_per_frame array | expand

Checks

Context Check Description
andriy/ffmpeg-patchwork success Make fate finished

Commit Message

Marton Balint March 5, 2020, 9:56 p.m. UTC
The packet durations might not be set properly which can cause the MXF muxer
to write more than one packet of a stream to an edit unit messing up the
constant byte per element index...

Also use nb_samples directly to calculate dts of audio packets because adding
packet durations might not be precise.

Signed-off-by: Marton Balint <cus@passwd.hu>
---
 libavformat/audiointerleave.c | 12 +++++++++---
 libavformat/audiointerleave.h |  3 ++-
 libavformat/gxfenc.c          |  2 +-
 libavformat/mxfenc.c          |  2 +-
 4 files changed, 13 insertions(+), 6 deletions(-)

Comments

Andreas Rheinhardt March 6, 2020, 3:19 a.m. UTC | #1
Marton Balint:
> The packet durations might not be set properly which can cause the MXF muxer
> to write more than one packet of a stream to an edit unit messing up the
> constant byte per element index...
> 
> Also use nb_samples directly to calculate dts of audio packets because adding
> packet durations might not be precise.
> 
> Signed-off-by: Marton Balint <cus@passwd.hu>
> ---
>  libavformat/audiointerleave.c | 12 +++++++++---
>  libavformat/audiointerleave.h |  3 ++-
>  libavformat/gxfenc.c          |  2 +-
>  libavformat/mxfenc.c          |  2 +-
>  4 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/libavformat/audiointerleave.c b/libavformat/audiointerleave.c
> index 2e83031bd6..0643159770 100644
> --- a/libavformat/audiointerleave.c
> +++ b/libavformat/audiointerleave.c
> @@ -40,6 +40,7 @@ void ff_audio_interleave_close(AVFormatContext *s)
>  
>  int ff_audio_interleave_init(AVFormatContext *s,
>                               const int samples_per_frame,
> +                             const int frame_duration,
>                               AVRational time_base)
>  {
>      int i;
> @@ -48,6 +49,10 @@ int ff_audio_interleave_init(AVFormatContext *s,
>          av_log(s, AV_LOG_ERROR, "timebase not set for audio interleave\n");
>          return AVERROR(EINVAL);
>      }
> +    if (!frame_duration) {
> +        av_log(s, AV_LOG_ERROR, "frame_duration not set for audio interleave\n");
> +        return AVERROR(EINVAL);
> +    }

Shouldn't this be an assert given that we know that it is known at
compiletime that this error can't be triggered?

- Andreas
Marton Balint March 6, 2020, 9:06 p.m. UTC | #2
On Fri, 6 Mar 2020, Andreas Rheinhardt wrote:

> Marton Balint:
>> The packet durations might not be set properly which can cause the MXF muxer
>> to write more than one packet of a stream to an edit unit messing up the
>> constant byte per element index...
>> 
>> Also use nb_samples directly to calculate dts of audio packets because adding
>> packet durations might not be precise.
>> 
>> Signed-off-by: Marton Balint <cus@passwd.hu>
>> ---
>>  libavformat/audiointerleave.c | 12 +++++++++---
>>  libavformat/audiointerleave.h |  3 ++-
>>  libavformat/gxfenc.c          |  2 +-
>>  libavformat/mxfenc.c          |  2 +-
>>  4 files changed, 13 insertions(+), 6 deletions(-)
>> 
>> diff --git a/libavformat/audiointerleave.c b/libavformat/audiointerleave.c
>> index 2e83031bd6..0643159770 100644
>> --- a/libavformat/audiointerleave.c
>> +++ b/libavformat/audiointerleave.c
>> @@ -40,6 +40,7 @@ void ff_audio_interleave_close(AVFormatContext *s)
>>
>>  int ff_audio_interleave_init(AVFormatContext *s,
>>                               const int samples_per_frame,
>> +                             const int frame_duration,
>>                               AVRational time_base)
>>  {
>>      int i;
>> @@ -48,6 +49,10 @@ int ff_audio_interleave_init(AVFormatContext *s,
>>          av_log(s, AV_LOG_ERROR, "timebase not set for audio interleave\n");
>>          return AVERROR(EINVAL);
>>      }
>> +    if (!frame_duration) {
>> +        av_log(s, AV_LOG_ERROR, "frame_duration not set for audio interleave\n");
>> +        return AVERROR(EINVAL);
>> +    }
>
> Shouldn't this be an assert given that we know that it is known at
> compiletime that this error can't be triggered?

It depends on how you _define_ the audiointerleave API. Just because it is 
impossible to trigger in current code does not mean that it should be 
an assert.

The main reason I prefer it this way is that there are similar checks for 
other parameters of the init function.

Regards,
Marton
Michael Niedermayer March 6, 2020, 11 p.m. UTC | #3
On Thu, Mar 05, 2020 at 10:56:28PM +0100, Marton Balint wrote:
> The packet durations might not be set properly which can cause the MXF muxer
> to write more than one packet of a stream to an edit unit messing up the
> constant byte per element index...
> 
> Also use nb_samples directly to calculate dts of audio packets because adding
> packet durations might not be precise.
> 
> Signed-off-by: Marton Balint <cus@passwd.hu>
> ---
>  libavformat/audiointerleave.c | 12 +++++++++---
>  libavformat/audiointerleave.h |  3 ++-
>  libavformat/gxfenc.c          |  2 +-
>  libavformat/mxfenc.c          |  2 +-
>  4 files changed, 13 insertions(+), 6 deletions(-)

This doesnt feel correct

Either case
A. The durations are correct but not what the muxer wants
then changing them at the muxer level could lead to AV sync issues and
wrong timestamps, something which the application should have dealt with
by frame duplication / frame drops or other

B. The durations are just wrong
then changing them at the muxer level will leave the calling application
with incorrect values potentially causing all kinds of problems.

Maybe iam missing something but isnt this just fixing the issue when the
durtations are wrong in a very special way ?

thx

[...]
Marton Balint March 7, 2020, 9:21 a.m. UTC | #4
On Sat, 7 Mar 2020, Michael Niedermayer wrote:

> On Thu, Mar 05, 2020 at 10:56:28PM +0100, Marton Balint wrote:
>> The packet durations might not be set properly which can cause the MXF muxer
>> to write more than one packet of a stream to an edit unit messing up the
>> constant byte per element index...
>>
>> Also use nb_samples directly to calculate dts of audio packets because adding
>> packet durations might not be precise.
>>
>> Signed-off-by: Marton Balint <cus@passwd.hu>
>> ---
>>  libavformat/audiointerleave.c | 12 +++++++++---
>>  libavformat/audiointerleave.h |  3 ++-
>>  libavformat/gxfenc.c          |  2 +-
>>  libavformat/mxfenc.c          |  2 +-
>>  4 files changed, 13 insertions(+), 6 deletions(-)
>
> This doesnt feel correct
>
> Either case
> A. The durations are correct but not what the muxer wants
> then changing them at the muxer level could lead to AV sync issues and
> wrong timestamps, something which the application should have dealt with
> by frame duplication / frame drops or other
>
> B. The durations are just wrong
> then changing them at the muxer level will leave the calling application
> with incorrect values potentially causing all kinds of problems.
>
> Maybe iam missing something but isnt this just fixing the issue when the
> durtations are wrong in a very special way ?

It is the same "special" way that is used for audio. We ignore incoming 
DTS and duration, and make up our own.

Yes, it can cause A-V sync issues is somebody wants to push VFR video or 
sparse audio data, that is not allowed in the MXF muxer.

Maybe we should hard reject streams if there is a difference between the 
calculated and the incoming DTS? I have a feeling that such measure would 
broke a lot of existing command lines...

Regards,
Marton
Michael Niedermayer March 7, 2020, 11:42 p.m. UTC | #5
On Sat, Mar 07, 2020 at 10:21:33AM +0100, Marton Balint wrote:
> 
> 
> On Sat, 7 Mar 2020, Michael Niedermayer wrote:
> 
> >On Thu, Mar 05, 2020 at 10:56:28PM +0100, Marton Balint wrote:
> >>The packet durations might not be set properly which can cause the MXF muxer
> >>to write more than one packet of a stream to an edit unit messing up the
> >>constant byte per element index...
> >>
> >>Also use nb_samples directly to calculate dts of audio packets because adding
> >>packet durations might not be precise.
> >>
> >>Signed-off-by: Marton Balint <cus@passwd.hu>
> >>---
> >> libavformat/audiointerleave.c | 12 +++++++++---
> >> libavformat/audiointerleave.h |  3 ++-
> >> libavformat/gxfenc.c          |  2 +-
> >> libavformat/mxfenc.c          |  2 +-
> >> 4 files changed, 13 insertions(+), 6 deletions(-)
> >
> >This doesnt feel correct
> >
> >Either case
> >A. The durations are correct but not what the muxer wants
> >then changing them at the muxer level could lead to AV sync issues and
> >wrong timestamps, something which the application should have dealt with
> >by frame duplication / frame drops or other
> >
> >B. The durations are just wrong
> >then changing them at the muxer level will leave the calling application
> >with incorrect values potentially causing all kinds of problems.
> >
> >Maybe iam missing something but isnt this just fixing the issue when the
> >durtations are wrong in a very special way ?
> 
> It is the same "special" way that is used for audio. We ignore incoming DTS
> and duration, and make up our own.
> 
> Yes, it can cause A-V sync issues is somebody wants to push VFR video or
> sparse audio data, that is not allowed in the MXF muxer.
> 
> Maybe we should hard reject streams if there is a difference between the
> calculated and the incoming DTS? I have a feeling that such measure would
> broke a lot of existing command lines...

iam unsure if this concept of timestamp changing in the muxer is a good idea
but if its done, there probably should be a warning if it happens and maybe
more then that if the difference becomes larger than what is "harmless"

but maybe iam missing something

Thanks


[...]
diff mbox series

Patch

diff --git a/libavformat/audiointerleave.c b/libavformat/audiointerleave.c
index 2e83031bd6..0643159770 100644
--- a/libavformat/audiointerleave.c
+++ b/libavformat/audiointerleave.c
@@ -40,6 +40,7 @@  void ff_audio_interleave_close(AVFormatContext *s)
 
 int ff_audio_interleave_init(AVFormatContext *s,
                              const int samples_per_frame,
+                             const int frame_duration,
                              AVRational time_base)
 {
     int i;
@@ -48,6 +49,10 @@  int ff_audio_interleave_init(AVFormatContext *s,
         av_log(s, AV_LOG_ERROR, "timebase not set for audio interleave\n");
         return AVERROR(EINVAL);
     }
+    if (!frame_duration) {
+        av_log(s, AV_LOG_ERROR, "frame_duration not set for audio interleave\n");
+        return AVERROR(EINVAL);
+    }
     for (i = 0; i < s->nb_streams; i++) {
         AVStream *st = s->streams[i];
         AudioInterleaveContext *aic = st->priv_data;
@@ -67,6 +72,8 @@  int ff_audio_interleave_init(AVFormatContext *s,
             if (!(aic->fifo = av_fifo_alloc_array(100, max_samples)))
                 return AVERROR(ENOMEM);
             aic->fifo_size = 100 * max_samples;
+        } else {
+            aic->frame_duration = frame_duration;
         }
     }
 
@@ -94,10 +101,9 @@  static int interleave_new_audio_packet(AVFormatContext *s, AVPacket *pkt,
     if (size < pkt->size)
         memset(pkt->data + size, 0, pkt->size - size);
 
-    pkt->dts = pkt->pts = aic->dts;
+    pkt->dts = pkt->pts = av_rescale_q(aic->nb_samples, st->time_base, aic->time_base);
     pkt->duration = av_rescale_q(nb_samples, st->time_base, aic->time_base);
     pkt->stream_index = stream_index;
-    aic->dts += pkt->duration;
     aic->nb_samples += nb_samples;
     aic->n++;
 
@@ -124,7 +130,7 @@  int ff_audio_rechunk_interleave(AVFormatContext *s, AVPacket *out, AVPacket *pkt
         } else {
             // rewrite pts and dts to be decoded time line position
             pkt->pts = pkt->dts = aic->dts;
-            aic->dts += pkt->duration;
+            aic->dts += aic->frame_duration;
             if ((ret = ff_interleave_add_packet(s, pkt, compare_ts)) < 0)
                 return ret;
         }
diff --git a/libavformat/audiointerleave.h b/libavformat/audiointerleave.h
index 0933310f4c..0c284dedb6 100644
--- a/libavformat/audiointerleave.h
+++ b/libavformat/audiointerleave.h
@@ -34,10 +34,11 @@  typedef struct AudioInterleaveContext {
     uint64_t dts;                 ///< current dts
     int sample_size;              ///< size of one sample all channels included
     int samples_per_frame;        ///< samples per frame if fixed, 0 otherwise
+    int frame_duration;           ///< frame duration for non-audio data
     AVRational time_base;         ///< time base of output audio packets
 } AudioInterleaveContext;
 
-int ff_audio_interleave_init(AVFormatContext *s, const int samples_per_frame, AVRational time_base);
+int ff_audio_interleave_init(AVFormatContext *s, const int samples_per_frame, const int frame_duration, AVRational time_base);
 void ff_audio_interleave_close(AVFormatContext *s);
 
 /**
diff --git a/libavformat/gxfenc.c b/libavformat/gxfenc.c
index e7536a6a7e..552cc57a3f 100644
--- a/libavformat/gxfenc.c
+++ b/libavformat/gxfenc.c
@@ -818,7 +818,7 @@  static int gxf_write_header(AVFormatContext *s)
         sc->order = s->nb_streams - st->index;
     }
 
-    if (ff_audio_interleave_init(s, GXF_samples_per_frame, (AVRational){ 1, 48000 }) < 0)
+    if (ff_audio_interleave_init(s, GXF_samples_per_frame, 2, (AVRational){ 1, 48000 }) < 0)
         return -1;
 
     if (tcr && vsc)
diff --git a/libavformat/mxfenc.c b/libavformat/mxfenc.c
index 6279ba9d6d..ac409d9ccf 100644
--- a/libavformat/mxfenc.c
+++ b/libavformat/mxfenc.c
@@ -2646,7 +2646,7 @@  static int mxf_write_header(AVFormatContext *s)
         return AVERROR(ENOMEM);
     mxf->timecode_track->index = -1;
 
-    if (ff_audio_interleave_init(s, 0, av_inv_q(mxf->tc.rate)) < 0)
+    if (ff_audio_interleave_init(s, 0, 1, av_inv_q(mxf->tc.rate)) < 0)
         return -1;
 
     return 0;