diff mbox

[FFmpeg-devel,v1] avformat/movenc: split empty text sample when duration overflow

Message ID 20190906231035.23139-1-junli1026@gmail.com
State Superseded
Headers show

Commit Message

Jun Li Sept. 6, 2019, 11:10 p.m. UTC
Fix #7637
One empty/end sample is created and inserted between two caption lines when there is a gap.
This patch is to split the sample into multiple ones when its duration is too long (>= INT_MAX).
---
 libavformat/movenc.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

Comments

Jun Li Sept. 7, 2019, 9:23 p.m. UTC | #1
On Fri, Sep 6, 2019 at 4:10 PM Jun Li <junli1026@gmail.com> wrote:

> Fix #7637
> One empty/end sample is created and inserted between two caption lines
> when there is a gap.
> This patch is to split the sample into multiple ones when its duration is
> too long (>= INT_MAX).
> ---
>  libavformat/movenc.c | 24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/libavformat/movenc.c b/libavformat/movenc.c
> index edddfeeb00..aeb7de351f 100644
> --- a/libavformat/movenc.c
> +++ b/libavformat/movenc.c
> @@ -5746,7 +5746,8 @@ static int mov_write_packet(AVFormatContext *s,
> AVPacket *pkt)
>           *
>           * 2) For each subtitle track, check if the current packet's
>           * dts is past the duration of the last subtitle sample. If
> -         * so, we now need to write an end sample for that subtitle.
> +         * so, we now need to write one or multiple end samples for
> +         * that subtitle.
>           *
>           * This must be done conditionally to allow for subtitles that
>           * immediately replace each other, in which case an end sample
> @@ -5760,11 +5761,22 @@ static int mov_write_packet(AVFormatContext *s,
> AVPacket *pkt)
>              int ret;
>
>              if (trk->par->codec_id == AV_CODEC_ID_MOV_TEXT &&
> -                trk->track_duration < pkt->dts &&
> -                (trk->entry == 0 || !trk->last_sample_is_subtitle_end)) {
> -                ret = mov_write_subtitle_end_packet(s, i,
> trk->track_duration);
> -                if (ret < 0) return ret;
> -                trk->last_sample_is_subtitle_end = 1;
> +                trk->track_duration < pkt->dts) {
> +                int max_duration = INT_MAX - 1;
> +                if (trk->entry == 0 || !trk->last_sample_is_subtitle_end)
> {
> +                    ret = mov_write_subtitle_end_packet(s, i,
> trk->track_duration);
> +                    if (ret < 0) return ret;
> +                    trk->last_sample_is_subtitle_end = 1;
> +                }
> +                if (trk->last_sample_is_subtitle_end &&
> +                    pkt->dts - trk->track_duration > max_duration) {
> +                    int64_t dts = trk->track_duration;
> +                    while(pkt->dts - dts > max_duration) {
> +                        dts += max_duration;
> +                        ret = mov_write_subtitle_end_packet(s, i, dts);
> +                        if (ret < 0) return ret;
> +                    }
> +                }
>              }
>          }
>
> --
> 2.17.1
>
>
Some details:
Current implementation will insert one empty sample whenever there is a gap
between two captions lines. But when the gap's duration is way too big, the
sample's duration will exceed MP4's limitation, thus mov muxer will fail on
function check_pkt.

This patch is to guarantee that, every empty samples' duration not exceed
INT_MAX-1, and insert multiple ones when the gap is bigger than INT_MAX-1.

Test results:
Fate test passed. Also tested with following SRT file with cmd:

ffmpeg -i input.mp4 -i input.srt -c copy -c:s mov_text test.mp4 -y

I used Apple's QuickTime player, confirmed that the caption shows up at the
correct time.

"input. srt" as follows
1
00:35:47.484 --> 00:35:50.000
Durations that exceed the signed
int max value break the program

2
00:40:47.484 --> 00:40:50.000
Durations2 that exceed the signed
int max value break the program

3
00:50:47.484 --> 00:50:50.000
Durations3 that exceed the signed
int max value break the program

4
02:50:47.484 --> 02:50:50.000
Durations4 that exceed the signed
int max value break the program

5
03:50:47.484 --> 03:50:50.000
Durations5 that exceed the signed
int max value break the program
Jan Ekström Sept. 7, 2019, 11:38 p.m. UTC | #2
Hi,

On Sun, Sep 8, 2019 at 12:28 AM Jun Li <junli1026@gmail.com> wrote:
> Some details:
> Current implementation will insert one empty sample whenever there is a gap
> between two captions lines. But when the gap's duration is way too big, the
> sample's duration will exceed MP4's limitation, thus mov muxer will fail on
> function check_pkt.
>
> This patch is to guarantee that, every empty samples' duration not exceed
> INT_MAX-1, and insert multiple ones when the gap is bigger than INT_MAX-1.
>

For the record, I recently hit this check as well when working on
TTML-in-MP4 (which basically requires a *single* sample for a
document/fragment - and thus getting packets with durations of a whole
~40min movie made this happen really easy). Now, I think this check of
INT_MAX is not necessarily 100% correct (but I'd have to consult both
qtff spec as well as ISOBMFF spec for that, as well as make sure that
the relevant variables are big enough), but I think the bigger issue
right now is that with ffmpeg.c for subtitle streams we currently seem
to set a time base of 1:1000000 for the AVStream. We might need to
think about setting a smaller default time base for subtitle formats
in the future.

That said, while I would be against poking at subtitle packets pushed
to us - if we're splitting just the empty samples I think that is not
a bad thing since the empty samples are in any case generated by us.

Will take a look at the patch after I've rested somewhat. I think if
we're enshrining INT_MAX as the maximum duration then I think it might
make sense to start #defining that so that it's clearer where our
definition of max packet duration is meant (and if we ever want to
change it or improve the logic of it, that could then be done in a
single place).

> Test results:
> Fate test passed. Also tested with following SRT file with cmd:
>
> ffmpeg -i input.mp4 -i input.srt -c copy -c:s mov_text test.mp4 -y
>
> I used Apple's QuickTime player, confirmed that the caption shows up at the
> correct time.
>
> "input. srt" as follows
> 1
> 00:35:47.484 --> 00:35:50.000
> Durations that exceed the signed
> int max value break the program
>
> 2
> 00:40:47.484 --> 00:40:50.000
> Durations2 that exceed the signed
> int max value break the program
>
> 3
> 00:50:47.484 --> 00:50:50.000
> Durations3 that exceed the signed
> int max value break the program
>
> 4
> 02:50:47.484 --> 02:50:50.000
> Durations4 that exceed the signed
> int max value break the program
>
> 5
> 03:50:47.484 --> 03:50:50.000
> Durations5 that exceed the signed
> int max value break the program

I think something like this could be added to FATE with an explicit
time_base and not (where one example would split and another would
not).

Best regards,
Jan
diff mbox

Patch

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index edddfeeb00..aeb7de351f 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -5746,7 +5746,8 @@  static int mov_write_packet(AVFormatContext *s, AVPacket *pkt)
          *
          * 2) For each subtitle track, check if the current packet's
          * dts is past the duration of the last subtitle sample. If
-         * so, we now need to write an end sample for that subtitle.
+         * so, we now need to write one or multiple end samples for
+         * that subtitle.
          *
          * This must be done conditionally to allow for subtitles that
          * immediately replace each other, in which case an end sample
@@ -5760,11 +5761,22 @@  static int mov_write_packet(AVFormatContext *s, AVPacket *pkt)
             int ret;
 
             if (trk->par->codec_id == AV_CODEC_ID_MOV_TEXT &&
-                trk->track_duration < pkt->dts &&
-                (trk->entry == 0 || !trk->last_sample_is_subtitle_end)) {
-                ret = mov_write_subtitle_end_packet(s, i, trk->track_duration);
-                if (ret < 0) return ret;
-                trk->last_sample_is_subtitle_end = 1;
+                trk->track_duration < pkt->dts) {
+                int max_duration = INT_MAX - 1;
+                if (trk->entry == 0 || !trk->last_sample_is_subtitle_end) {
+                    ret = mov_write_subtitle_end_packet(s, i, trk->track_duration);
+                    if (ret < 0) return ret;
+                    trk->last_sample_is_subtitle_end = 1;
+                }
+                if (trk->last_sample_is_subtitle_end &&
+                    pkt->dts - trk->track_duration > max_duration) {
+                    int64_t dts = trk->track_duration;
+                    while(pkt->dts - dts > max_duration) {
+                        dts += max_duration;
+                        ret = mov_write_subtitle_end_packet(s, i, dts);
+                        if (ret < 0) return ret;
+                    }
+                }
             }
         }