Message ID | tencent_F315AB514E5570E32D6E9AC0407A9ADF9F08@qq.com |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel,v2,1/7] avcodec/mediacodecenc: make each encoder has its own option | expand |
Context | Check | Description |
---|---|---|
andriy/configure_x86 | warning | Failed to apply patch |
yinshiyou/configure_loongarch64 | warning | Failed to apply patch |
ons 2022-12-07 klockan 17:31 +0800 skrev Zhao Zhili: > From: Zhao Zhili <zhilizhao@tencent.com> > > Use input PTS as DTS has multiple problems: > 1. If there is no reordering, it's better to just use the output > PTS as DTS, since encoder may change the timestamp value (do it > on purpose or rounding error). > > 2. If there is reordering, input PTS should be shift a few frames > as DTS to satisfy the requirement of PTS >= DTS. I can't find a > reliable way to determine how many frames to be shift. For example, > we don't known if the encoder use hierarchical B frames. The > max_num_reorder_frames can be get from VUI, but VUI is optional. > > 3. Encoder dropping frames makes the case worse. Android has an > BITRATE_MODE_CBR_FD option to allow it explicitly. Don't we already have code to parse this stuff from h.264 streams? /Tomas
> On Dec 12, 2022, at 23:28, Tomas Härdin <git@haerdin.se> wrote: > > ons 2022-12-07 klockan 17:31 +0800 skrev Zhao Zhili: >> From: Zhao Zhili <zhilizhao@tencent.com> >> >> Use input PTS as DTS has multiple problems: >> 1. If there is no reordering, it's better to just use the output >> PTS as DTS, since encoder may change the timestamp value (do it >> on purpose or rounding error). >> >> 2. If there is reordering, input PTS should be shift a few frames >> as DTS to satisfy the requirement of PTS >= DTS. I can't find a >> reliable way to determine how many frames to be shift. For example, >> we don't known if the encoder use hierarchical B frames. The >> max_num_reorder_frames can be get from VUI, but VUI is optional. >> >> 3. Encoder dropping frames makes the case worse. Android has an >> BITRATE_MODE_CBR_FD option to allow it explicitly. > > Don't we already have code to parse this stuff from h.264 streams? We have code to parse max_num_reorder_frames, if the stream contains the field. We have some code to generate/guess DTS but doesn’t work reliably for H.264/H.265. I guess it’s hard, if not impossible. We don’t know the prediction structures. Use encoder’s input PTS as DTS works for some transport protocols like RTP, and MP4 with negative_cts_offsets, otherwise it’s broken due to PTS >= DTS. If we can and decided to implement such algorithm, it should go in a higher level, so remux raw stream works better. > > /Tomas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
tis 2022-12-13 klockan 10:55 +0800 skrev zhilizhao(赵志立): > > > > On Dec 12, 2022, at 23:28, Tomas Härdin <git@haerdin.se> wrote: > > > > ons 2022-12-07 klockan 17:31 +0800 skrev Zhao Zhili: > > > From: Zhao Zhili <zhilizhao@tencent.com> > > > > > > Use input PTS as DTS has multiple problems: > > > 1. If there is no reordering, it's better to just use the output > > > PTS as DTS, since encoder may change the timestamp value (do it > > > on purpose or rounding error). > > > > > > 2. If there is reordering, input PTS should be shift a few frames > > > as DTS to satisfy the requirement of PTS >= DTS. I can't find a > > > reliable way to determine how many frames to be shift. For > > > example, > > > we don't known if the encoder use hierarchical B frames. The > > > max_num_reorder_frames can be get from VUI, but VUI is optional. > > > > > > 3. Encoder dropping frames makes the case worse. Android has an > > > BITRATE_MODE_CBR_FD option to allow it explicitly. > > > > Don't we already have code to parse this stuff from h.264 streams? > > We have code to parse max_num_reorder_frames, if the stream contains > the field. > > We have some code to generate/guess DTS but doesn’t work reliably for > H.264/H.265. I guess it’s hard, if not impossible. We don’t know the > prediction structures. Use encoder’s input PTS as DTS works for some > transport protocols like RTP, and MP4 with negative_cts_offsets, > otherwise it’s broken due to PTS >= DTS. Right, H.264 and H.265 support rather arbitrary reference structures. I guess a pessimistic estimate is good enough /Tomas
Does this mean the encoder will produce packets with dts=AV_NOPTS_VALUE?
> On Jan 4, 2023, at 18:16, Anton Khirnov <anton@khirnov.net> wrote: > > Does this mean the encoder will produce packets with dts=AV_NOPTS_VALUE? MediaCodec should not encode B frames by default, so dts = pts by default. B frames can be enabled explicitly, in that case dts is AV_NOPTS_VALUE. Android system’s MP4 muxer works very hard to recreate dts to workaround the limitation of MediaCodec API. It create ‘valid’ but almost useless files with a lot of jitters. > > -- > Anton Khirnov >
ons 2023-01-04 klockan 19:31 +0800 skrev zhilizhao(赵志立): > > > On Jan 4, 2023, at 18:16, Anton Khirnov <anton@khirnov.net> wrote: > > > > Does this mean the encoder will produce packets with > > dts=AV_NOPTS_VALUE? > > MediaCodec should not encode B frames by default, so dts = pts by > default. > B frames can be enabled explicitly, in that case dts is > AV_NOPTS_VALUE. > > Android system’s MP4 muxer works very hard to recreate dts to > workaround > the limitation of MediaCodec API. It create ‘valid’ but almost > useless > files with a lot of jitters. Maybe we should disallow B-frames entirely until the MediaCodec API is fixed so as to not add to this mess? /Tomas
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Tomas Härdin > Sent: 2023年1月4日 22:00 > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> > Subject: Re: [FFmpeg-devel] [Internet]Re: [PATCH v2 5/7] avcodec/mediacodecenc: remove the strategy to create DTS > > ons 2023-01-04 klockan 19:31 +0800 skrev zhilizhao(赵志立): > > > > > On Jan 4, 2023, at 18:16, Anton Khirnov <anton@khirnov.net> wrote: > > > > > > Does this mean the encoder will produce packets with > > > dts=AV_NOPTS_VALUE? > > > > MediaCodec should not encode B frames by default, so dts = pts by > > default. > > B frames can be enabled explicitly, in that case dts is > > AV_NOPTS_VALUE. > > > > Android system’s MP4 muxer works very hard to recreate dts to > > workaround > > the limitation of MediaCodec API. It create ‘valid’ but almost > > useless > > files with a lot of jitters. > > Maybe we should disallow B-frames entirely until the MediaCodec API is > fixed so as to not add to this mess? There are some valid usecases. For example, RTP works fine without DTS. With some restriction or precondition, use can create DTS too. So I'd like to keep the feature. > > /Tomas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Quoting Zhao Zhili (2023-01-04 15:46:52) > > > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Tomas Härdin > > Sent: 2023年1月4日 22:00 > > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> > > Subject: Re: [FFmpeg-devel] [Internet]Re: [PATCH v2 5/7] avcodec/mediacodecenc: remove the strategy to create DTS > > > > ons 2023-01-04 klockan 19:31 +0800 skrev zhilizhao(赵志立): > > > > > > > On Jan 4, 2023, at 18:16, Anton Khirnov <anton@khirnov.net> wrote: > > > > > > > > Does this mean the encoder will produce packets with > > > > dts=AV_NOPTS_VALUE? > > > > > > MediaCodec should not encode B frames by default, so dts = pts by > > > default. > > > B frames can be enabled explicitly, in that case dts is > > > AV_NOPTS_VALUE. > > > > > > Android system’s MP4 muxer works very hard to recreate dts to > > > workaround > > > the limitation of MediaCodec API. It create ‘valid’ but almost > > > useless > > > files with a lot of jitters. > > > > Maybe we should disallow B-frames entirely until the MediaCodec API is > > fixed so as to not add to this mess? > > There are some valid usecases. For example, RTP works fine without DTS. > With some restriction or precondition, use can create DTS too. So I'd like > to keep the feature. It is currently an API guarantee that all encoders return valid DTS values, so this encoder is behaving in an invalid way. If you want to keep B-frame functionality in this half-working state, I'd say it should be hidden behind something like -strict experimental.
> -----Original Message----- > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Anton Khirnov > Sent: 2023年1月4日 23:16 > To: 'FFmpeg development discussions and patches' <ffmpeg-devel@ffmpeg.org> > Subject: Re: [FFmpeg-devel] [Internet]Re: [PATCH v2 5/7] avcodec/mediacodecenc: remove the strategy to create DTS > > Quoting Zhao Zhili (2023-01-04 15:46:52) > > > > > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Tomas Härdin > > > Sent: 2023年1月4日 22:00 > > > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> > > > Subject: Re: [FFmpeg-devel] [Internet]Re: [PATCH v2 5/7] avcodec/mediacodecenc: remove the strategy to create DTS > > > > > > ons 2023-01-04 klockan 19:31 +0800 skrev zhilizhao(赵志立): > > > > > > > > > On Jan 4, 2023, at 18:16, Anton Khirnov <anton@khirnov.net> wrote: > > > > > > > > > > Does this mean the encoder will produce packets with > > > > > dts=AV_NOPTS_VALUE? > > > > > > > > MediaCodec should not encode B frames by default, so dts = pts by > > > > default. > > > > B frames can be enabled explicitly, in that case dts is > > > > AV_NOPTS_VALUE. > > > > > > > > Android system’s MP4 muxer works very hard to recreate dts to > > > > workaround > > > > the limitation of MediaCodec API. It create ‘valid’ but almost > > > > useless > > > > files with a lot of jitters. > > > > > > Maybe we should disallow B-frames entirely until the MediaCodec API is > > > fixed so as to not add to this mess? > > > > There are some valid usecases. For example, RTP works fine without DTS. > > With some restriction or precondition, use can create DTS too. So I'd like > > to keep the feature. > > It is currently an API guarantee that all encoders return valid DTS > values, so this encoder is behaving in an invalid way. It's an reasonable requirement, but could you elaborate on where is the doc explicitly says that? > > If you want to keep B-frame functionality in this half-working state, > I'd say it should be hidden behind something like -strict experimental. Will do. > > -- > Anton Khirnov > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Quoting Zhao Zhili (2023-01-04 17:12:56) > > It is currently an API guarantee that all encoders return valid DTS > > values, so this encoder is behaving in an invalid way. > > It's an reasonable requirement, but could you elaborate on where is the doc > explicitly says that? I am not aware of this being explicitly documented, but then our documentation is sadly far from complete. But this was one of the major reasons for adding avcodec_encode_video2() back in 2012 (later replaced by avcodec_send_frame()+receive_packet()). Also see 8de1ee9f725aa3c550f425bd3120bcd95d5b2ea8.
diff --git a/libavcodec/mediacodecenc.c b/libavcodec/mediacodecenc.c index 7f2ae88285..8e28a50e0d 100644 --- a/libavcodec/mediacodecenc.c +++ b/libavcodec/mediacodecenc.c @@ -64,18 +64,6 @@ typedef struct MediaCodecEncContext { uint8_t *extradata; int extradata_size; - - // Since MediaCodec doesn't output DTS, use a timestamp queue to save pts - // of AVFrame and generate DTS for AVPacket. - // - // This doesn't work when use Surface as input, in that case frames can be - // sent to encoder without our notice. One exception is frames come from - // our MediaCodec decoder wrapper, since we can control it's render by - // av_mediacodec_release_buffer. - int64_t timestamps[32]; - int ts_head; - int ts_tail; - int eof_sent; AVFrame *frame; @@ -368,11 +356,6 @@ static int mediacodec_receive(AVCodecContext *avctx, } memcpy(pkt->data + extradata_size, out_buf + out_info.offset, out_info.size); pkt->pts = av_rescale_q(out_info.presentationTimeUs, AV_TIME_BASE_Q, avctx->time_base); - if (s->ts_tail != s->ts_head) { - pkt->dts = s->timestamps[s->ts_tail]; - s->ts_tail = (s->ts_tail + 1) % FF_ARRAY_ELEMS(s->timestamps); - } - if (out_info.flags & ff_AMediaCodec_getBufferFlagKeyFrame(codec)) pkt->flags |= AV_PKT_FLAG_KEY; ret = 0; @@ -437,14 +420,8 @@ static int mediacodec_send(AVCodecContext *avctx, return ff_AMediaCodec_signalEndOfInputStream(codec); } - - if (frame->data[3]) { - pts = av_rescale_q(frame->pts, avctx->time_base, AV_TIME_BASE_Q); - s->timestamps[s->ts_head] = frame->pts; - s->ts_head = (s->ts_head + 1) % FF_ARRAY_ELEMS(s->timestamps); - + if (frame->data[3]) av_mediacodec_release_buffer((AVMediaCodecBuffer *)frame->data[3], 1); - } return 0; } @@ -463,9 +440,6 @@ static int mediacodec_send(AVCodecContext *avctx, copy_frame_to_buffer(avctx, frame, input_buf, input_size); pts = av_rescale_q(frame->pts, avctx->time_base, AV_TIME_BASE_Q); - - s->timestamps[s->ts_head] = frame->pts; - s->ts_head = (s->ts_head + 1) % FF_ARRAY_ELEMS(s->timestamps); } else { flags |= ff_AMediaCodec_getBufferFlagEndOfStream(codec); s->eof_sent = 1;
From: Zhao Zhili <zhilizhao@tencent.com> Use input PTS as DTS has multiple problems: 1. If there is no reordering, it's better to just use the output PTS as DTS, since encoder may change the timestamp value (do it on purpose or rounding error). 2. If there is reordering, input PTS should be shift a few frames as DTS to satisfy the requirement of PTS >= DTS. I can't find a reliable way to determine how many frames to be shift. For example, we don't known if the encoder use hierarchical B frames. The max_num_reorder_frames can be get from VUI, but VUI is optional. 3. Encoder dropping frames makes the case worse. Android has an BITRATE_MODE_CBR_FD option to allow it explicitly. --- libavcodec/mediacodecenc.c | 28 +--------------------------- 1 file changed, 1 insertion(+), 27 deletions(-)