Message ID | 20200617194538.24821-1-timotej.lazar@araneo.si |
---|---|
State | Accepted |
Headers | show |
Series | [FFmpeg-devel] avformat/smacker: Support seeking to first frame | expand |
Context | Check | Description |
---|---|---|
andriy/default | pending | |
andriy/make | success | Make finished |
andriy/make_fate | success | Make fate finished |
Timotej Lazar: > Add .read_seek function to the smacker demuxer for the special case of > seeking to ts=0. This is useful because smacker – like bink, with a > similar implementation – was mostly used to encode clips in video > games, where random seeks are rare but looping media are common. > > Signed-off-by: Timotej Lazar <timotej.lazar@araneo.si> > --- > Unlike the existing bink implementation that always rewinds to start, > regardless of arguments to seek(), this will return EINVAL when seeking > to any timestamp other than 0. > > I tried to implement random seeking, but the smacker format is not > very amenable to that. While it supports keyframes, most videos I > could find do not use them. Since any video frame can contain an > incremental palette update, it might be necessary to rewind in any > case and decode all frames up to the target timestamp. > > This is the first time I dived into ffmpeg, so I’d appreciate any > feedback or suggestions on how to better approach this. Thanks! > > libavformat/smacker.c | 24 +++++++++++++++++++++++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/libavformat/smacker.c b/libavformat/smacker.c > index 8b1e185817..19df8edc67 100644 > --- a/libavformat/smacker.c > +++ b/libavformat/smacker.c > @@ -230,7 +230,7 @@ static int smacker_read_header(AVFormatContext *s) > } > > smk->curstream = -1; > - smk->nextpos = avio_tell(pb); > + smk->nextpos = s->internal->data_offset = avio_tell(pb); > This change is unnecessary, as avformat_open_input already sets it to the exact same value. > return 0; > } > @@ -373,6 +373,27 @@ static int smacker_read_close(AVFormatContext *s) > return 0; > } > > +static int smacker_read_seek(AVFormatContext *s, int stream_index, > + int64_t timestamp, int flags) > +{ > + SmackerContext *smk = s->priv_data; > + > + /* only rewinding to start is supported */ > + if (timestamp != 0) { > + av_log(s, AV_LOG_ERROR, > + "Random seeks are not supported (can only seek to start).\n"); > + return AVERROR(EINVAL); > + } > + > + smk->cur_frame = 0; > + smk->curstream = -1; > + smk->nextpos = s->internal->data_offset; You are only telling the demuxer to seek to the beginning and you are setting the other fields in a way that makes sense if the seek succeeded. But in a seeking function you should perform the seek yourself, so that you can check whether the seek was successfull or not. And you should only reset the other variables in case it was successfull. Moreover, I have a patchset for the Smacker demuxer [1] and you seem to know a lot about Smacker, so you might take a look. If you do so, you'll see that this patchset stops seeking to nextpos every time a new frame is read (after all, that is the position where already are). But the patch I'd most like to know your opinion about is patch five [2]. In particular, I don't know if Smacker supports sparse tracks where the gaps between points where a track has data is filled by data from other tracks (think of three audio tracks where track 1 contains data for a certain language, track 2 contains data for another language and track 3 contains language-independent data (i.e. noise/music when no one is speaking)). Do you know it? If this is possible, the timestamps generated will be wrong. Furthermore, do you have samples with PCM data as audio? I suspect this to be buggy atm, because it always reads the first four bytes of the audio packets and treats them as duration (for encoded audio they contain the size (in bytes) of the decoded audio data which can be used as duration of the timebase has been set appropriately; yet for PCM audio this would be redundant and I expect it to not be done (so that the audio frame size should be used as duration in this case). This is also what [3] says. > + memset(smk->pal, 0, sizeof(smk->pal)); > + memset(smk->aud_pts, 0, sizeof(smk->aud_pts)); > + > + return 0; > +} > + > AVInputFormat ff_smacker_demuxer = { > .name = "smk", > .long_name = NULL_IF_CONFIG_SMALL("Smacker"), > @@ -381,4 +402,5 @@ AVInputFormat ff_smacker_demuxer = { > .read_header = smacker_read_header, > .read_packet = smacker_read_packet, > .read_close = smacker_read_close, > + .read_seek = smacker_read_seek, > }; > - Andreas [1]: http://ffmpeg.org/pipermail/ffmpeg-devel/2020-June/265033.html [2]: http://ffmpeg.org/pipermail/ffmpeg-devel/2020-June/265030.html [3]: https://wiki.multimedia.cx/index.php?title=Smacker#Audio_Track_Chunk
Andreas Rheinhardt <andreas.rheinhardt@gmail.com> [2020-06-28 15:32:05+0200]: > Moreover, I have a patchset for the Smacker demuxer [1] and you seem to > know a lot about Smacker, so you might take a look. Thanks for the feedback! I don’t really know much about Smacker beyond what I discovered while trying to get rewinding to work. That said, your patches look good to me – while working on my patch I was wondering about several issues you addressed, such as buffering audio packets. > In particular, I don't know if Smacker supports sparse tracks where > the gaps between points where a track has data is filled by data > from other tracks The description of the “Start mixing at what frame” setting¹ leads me to believe that all audio tracks are “complete”, padded with silence if necessary – at least when created with RAD tools. The same page hints that multiple audio tracks were intended to be used simply as an alternative to having a separate video file for each language. While there might be some program-specific extensions to do what you describe (online docs also mention the possibility of using audio tracks to store other data), I don’t think ffmpeg could or should support them. > Furthermore, do you have samples with PCM data as audio? I found no such files, the games I looked at use either smacker- or bink-compressed audio. I was also not able to produce a file with uncompressed audio using RAD tools. The libsmacker decoder² does what you (and the MultimediaWiki) suggest, only reading UnpackedLength for compressed audio tracks, so we should probably do the same. > But in a seeking function you should perform the seek yourself, so > that you can check whether the seek was successfull or not. Once your patches are merged, is it OK to redo the seek patch with the changes you suggest? Not sure if supporting only seeks to beginning is generally considered OK for ffmpeg. Finally, I’m not sure how to handle ring frames – the last frame of a Smacker video can be a “ring frame”, which is visually identical to the first frame and is intended to be used with looping videos. Libsmacker automatically loops a video indefinitely when a ring frame is detected, skipping the first frame on all but the first iteration. This doesn’t seem to fit well with ffmpeg. Another option would be to seek to the second frame (only) when rewinding from the last frame. Or we could simply ignore the ring frame. ¹ http://www.radgametools.com/binkhmas.htm ² https://sourceforge.net/p/libsmacker/code/HEAD/tree/trunk/smacker.c#l1052
Timotej Lazar: > Andreas Rheinhardt <andreas.rheinhardt@gmail.com> [2020-06-28 15:32:05+0200]: >> Moreover, I have a patchset for the Smacker demuxer [1] and you seem to >> know a lot about Smacker, so you might take a look. > > Thanks for the feedback! I don’t really know much about Smacker beyond > what I discovered while trying to get rewinding to work. That said, your > patches look good to me – while working on my patch I was wondering > about several issues you addressed, such as buffering audio packets. > Thanks for the review and confirmation. >> In particular, I don't know if Smacker supports sparse tracks where >> the gaps between points where a track has data is filled by data >> from other tracks > > The description of the “Start mixing at what frame” setting¹ leads me > to believe that all audio tracks are “complete”, padded with silence > if necessary – at least when created with RAD tools. The same page > hints that multiple audio tracks were intended to be used simply as an > alternative to having a separate video file for each language. > That's good to hear. No need to worry about something that likely doesn't exist. > While there might be some program-specific extensions to do what you > describe (online docs also mention the possibility of using audio tracks > to store other data), I don’t think ffmpeg could or should support them. > >> Furthermore, do you have samples with PCM data as audio? > > I found no such files, the games I looked at use either smacker- or > bink-compressed audio. I was also not able to produce a file with > uncompressed audio using RAD tools. The libsmacker decoder² does what > you (and the MultimediaWiki) suggest, only reading UnpackedLength for > compressed audio tracks, so we should probably do the same. > I'll implement it. >> But in a seeking function you should perform the seek yourself, so >> that you can check whether the seek was successfull or not. > > Once your patches are merged, is it OK to redo the seek patch with the > changes you suggest? Not sure if supporting only seeks to beginning is > generally considered OK for ffmpeg. > It fixes something, i.e. ffmpeg's stream_loop option works with your patch, so I don't see a reason why it should not be applied. I'll update your patch (with your authorship retained) when I have implemented the PCM duration. > Finally, I’m not sure how to handle ring frames – the last frame of a > Smacker video can be a “ring frame”, which is visually identical to the > first frame and is intended to be used with looping videos. > > Libsmacker automatically loops a video indefinitely when a ring frame is > detected, skipping the first frame on all but the first iteration. This > doesn’t seem to fit well with ffmpeg. Another option would be to seek to > the second frame (only) when rewinding from the last frame. Or we could > simply ignore the ring frame. > > ¹ http://www.radgametools.com/binkhmas.htm > ² https://sourceforge.net/p/libsmacker/code/HEAD/tree/trunk/smacker.c#l1052 Not sure how (or if) to handle this. - Andreas
diff --git a/libavformat/smacker.c b/libavformat/smacker.c index 8b1e185817..19df8edc67 100644 --- a/libavformat/smacker.c +++ b/libavformat/smacker.c @@ -230,7 +230,7 @@ static int smacker_read_header(AVFormatContext *s) } smk->curstream = -1; - smk->nextpos = avio_tell(pb); + smk->nextpos = s->internal->data_offset = avio_tell(pb); return 0; } @@ -373,6 +373,27 @@ static int smacker_read_close(AVFormatContext *s) return 0; } +static int smacker_read_seek(AVFormatContext *s, int stream_index, + int64_t timestamp, int flags) +{ + SmackerContext *smk = s->priv_data; + + /* only rewinding to start is supported */ + if (timestamp != 0) { + av_log(s, AV_LOG_ERROR, + "Random seeks are not supported (can only seek to start).\n"); + return AVERROR(EINVAL); + } + + smk->cur_frame = 0; + smk->curstream = -1; + smk->nextpos = s->internal->data_offset; + memset(smk->pal, 0, sizeof(smk->pal)); + memset(smk->aud_pts, 0, sizeof(smk->aud_pts)); + + return 0; +} + AVInputFormat ff_smacker_demuxer = { .name = "smk", .long_name = NULL_IF_CONFIG_SMALL("Smacker"), @@ -381,4 +402,5 @@ AVInputFormat ff_smacker_demuxer = { .read_header = smacker_read_header, .read_packet = smacker_read_packet, .read_close = smacker_read_close, + .read_seek = smacker_read_seek, };
Add .read_seek function to the smacker demuxer for the special case of seeking to ts=0. This is useful because smacker – like bink, with a similar implementation – was mostly used to encode clips in video games, where random seeks are rare but looping media are common. Signed-off-by: Timotej Lazar <timotej.lazar@araneo.si> --- Unlike the existing bink implementation that always rewinds to start, regardless of arguments to seek(), this will return EINVAL when seeking to any timestamp other than 0. I tried to implement random seeking, but the smacker format is not very amenable to that. While it supports keyframes, most videos I could find do not use them. Since any video frame can contain an incremental palette update, it might be necessary to rewind in any case and decode all frames up to the target timestamp. This is the first time I dived into ffmpeg, so I’d appreciate any feedback or suggestions on how to better approach this. Thanks! libavformat/smacker.c | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-)