Message ID | 20230507133255.20881-4-anton@khirnov.net |
---|---|
State | New |
Series | [FFmpeg-devel,01/13] lavu/frame: extend AVFrame.repeat_pict documentation |
Context | Check | Description |
---|---|---|
andriy/configure_x86 | warning | Failed to apply patch |
On Sun, May 07, 2023 at 03:32:46PM +0200, Anton Khirnov wrote:
> H.264 and mpeg12 parsers need to be adjusted at the same time to stop
> using the value of AVCodecContext.ticks_per_frame, because it is not set
> correctly unless the codec has been opened. Previously this would result
> in both the parser and lavf seeing the same incorrect value, which would
> cancel out.
> Updating lavf and not the parsers would result in correct value in lavf,
> but the wrong one in parsers, which would break some tests.
> ---
> libavcodec/h264_parser.c      |  4 ++--
> libavcodec/mpegvideo_parser.c |  2 +-
> libavformat/avformat.c        |  9 ++++++---
> libavformat/demux.c           | 29 +++++++++++++++++++----------
> libavformat/internal.h        |  3 +++
> 5 files changed, 31 insertions(+), 16 deletions(-)

breaks:
./ffmpeg -i ~/tickets/104/cartonfold.avi -bitexact -f framecrc -

fps and timestamps look strange

[...]
On Sun, May 07, 2023 at 03:32:46PM +0200, Anton Khirnov wrote:
> H.264 and mpeg12 parsers need to be adjusted at the same time to stop
> using the value of AVCodecContext.ticks_per_frame, because it is not set
> correctly unless the codec has been opened. Previously this would result
> in both the parser and lavf seeing the same incorrect value, which would
> cancel out.
> Updating lavf and not the parsers would result in correct value in lavf,
> but the wrong one in parsers, which would break some tests.
> ---
> libavcodec/h264_parser.c      |  4 ++--
> libavcodec/mpegvideo_parser.c |  2 +-
> libavformat/avformat.c        |  9 ++++++---
> libavformat/demux.c           | 29 +++++++++++++++++++----------
> libavformat/internal.h        |  3 +++
> 5 files changed, 31 insertions(+), 16 deletions(-)

Doesn't this sort of change need a major ABI bump?
It sounds like lavc and lavf interdepend here both ways.

thx

[...]
Quoting Michael Niedermayer (2023-05-08 16:15:42)
> On Sun, May 07, 2023 at 03:32:46PM +0200, Anton Khirnov wrote:
> > H.264 and mpeg12 parsers need to be adjusted at the same time to stop
> > using the value of AVCodecContext.ticks_per_frame, because it is not set
> > correctly unless the codec has been opened. Previously this would result
> > in both the parser and lavf seeing the same incorrect value, which would
> > cancel out.
> > Updating lavf and not the parsers would result in correct value in lavf,
> > but the wrong one in parsers, which would break some tests.
> > ---
> > libavcodec/h264_parser.c      |  4 ++--
> > libavcodec/mpegvideo_parser.c |  2 +-
> > libavformat/avformat.c        |  9 ++++++---
> > libavformat/demux.c           | 29 +++++++++++++++++++----------
> > libavformat/internal.h        |  3 +++
> > 5 files changed, 31 insertions(+), 16 deletions(-)
>
> Doesn't this sort of change need a major ABI bump?
> It sounds like lavc and lavf interdepend here both ways.

No, we do not guarantee bug compatibility.

Libavformat seeing ticks_per_frame=1 for codecs that set it to 2 upon
being opened is a bug. Same for the parser.

It just so happens that libavformat AND its internal parser instance see
the same incorrect value and this cancels out in cases that are tested
by FATE (it would break if we had more thorough tests for repeating
single fields).

I could split this into two patches, the first of which would fix one of
the bugs and expose the other one, breaking some tests. Then the second
patch would fix the second bug, fixing the tests again. It seems better
to do it in a single step to avoid the noise.
On Tue, May 09, 2023 at 10:44:50AM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2023-05-08 16:15:42)
> > On Sun, May 07, 2023 at 03:32:46PM +0200, Anton Khirnov wrote:
> > > H.264 and mpeg12 parsers need to be adjusted at the same time to stop
> > > using the value of AVCodecContext.ticks_per_frame, because it is not set
> > > correctly unless the codec has been opened. Previously this would result
> > > in both the parser and lavf seeing the same incorrect value, which would
> > > cancel out.
> > > Updating lavf and not the parsers would result in correct value in lavf,
> > > but the wrong one in parsers, which would break some tests.
> > > ---
> > > libavcodec/h264_parser.c      |  4 ++--
> > > libavcodec/mpegvideo_parser.c |  2 +-
> > > libavformat/avformat.c        |  9 ++++++---
> > > libavformat/demux.c           | 29 +++++++++++++++++++----------
> > > libavformat/internal.h        |  3 +++
> > > 5 files changed, 31 insertions(+), 16 deletions(-)
> >
> > Doesn't this sort of change need a major ABI bump?
> > It sounds like lavc and lavf interdepend here both ways.
>
> No, we do not guarantee bug compatibility.
>
> Libavformat seeing ticks_per_frame=1 for codecs that set it to 2 upon
> being opened is a bug. Same for the parser.
>
> It just so happens that libavformat AND its internal parser instance see
> the same incorrect value and this cancels out in cases that are tested
> by FATE (it would break if we had more thorough tests for repeating
> single fields).

This patch seems to change tbr
./ffmpeg -i fate-suite//h264/lossless.h264
  Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 640x480, 25 fps, 60 tbr, 1200k tbn
vs.
  Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 640x480, 25 fps, 120 tbr, 1200k tbn

with
./ffmpeg -i fate-suite//h264/lossless.h264 -f framecrc -

The output uses 1/60. That's odd, because if every frame can be represented in
1/60 then tbr is 1/60 or coarser.
OTOH if tbr is finer than 1/60 then not every frame can be represented in 1/60.

Maybe I am missing something, but the new value seems worse and also
not consistent with what ffmpeg actually uses.

thx

[...]
Quoting Michael Niedermayer (2023-05-15 20:59:42)
> On Tue, May 09, 2023 at 10:44:50AM +0200, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2023-05-08 16:15:42)
> > > On Sun, May 07, 2023 at 03:32:46PM +0200, Anton Khirnov wrote:
> > > > H.264 and mpeg12 parsers need to be adjusted at the same time to stop
> > > > using the value of AVCodecContext.ticks_per_frame, because it is not set
> > > > correctly unless the codec has been opened. Previously this would result
> > > > in both the parser and lavf seeing the same incorrect value, which would
> > > > cancel out.
> > > > Updating lavf and not the parsers would result in correct value in lavf,
> > > > but the wrong one in parsers, which would break some tests.
> > > > ---
> > > > libavcodec/h264_parser.c      |  4 ++--
> > > > libavcodec/mpegvideo_parser.c |  2 +-
> > > > libavformat/avformat.c        |  9 ++++++---
> > > > libavformat/demux.c           | 29 +++++++++++++++++++----------
> > > > libavformat/internal.h        |  3 +++
> > > > 5 files changed, 31 insertions(+), 16 deletions(-)
> > >
> > > Doesn't this sort of change need a major ABI bump?
> > > It sounds like lavc and lavf interdepend here both ways.
> >
> > No, we do not guarantee bug compatibility.
> >
> > Libavformat seeing ticks_per_frame=1 for codecs that set it to 2 upon
> > being opened is a bug. Same for the parser.
> >
> > It just so happens that libavformat AND its internal parser instance see
> > the same incorrect value and this cancels out in cases that are tested
> > by FATE (it would break if we had more thorough tests for repeating
> > single fields).
>
> This patch seems to change tbr
> ./ffmpeg -i fate-suite//h264/lossless.h264
>   Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 640x480, 25 fps, 60 tbr, 1200k tbn
> vs.
>   Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 640x480, 25 fps, 120 tbr, 1200k tbn
>
> with
> ./ffmpeg -i fate-suite//h264/lossless.h264 -f framecrc -
>
> The output uses 1/60. That's odd, because if every frame can be represented in
> 1/60 then tbr is 1/60 or coarser.
> OTOH if tbr is finer than 1/60 then not every frame can be represented in 1/60.
>
> Maybe I am missing something, but the new value seems worse and also
> not consistent with what ffmpeg actually uses.

ticks_per_frame was added by you in 3797c74ba53, and according to your
code it's supposed to be 2 for H.264. It just so happens that for this
specific sample libavformat invokes the parser without opening the
decoder, so it sees the default value of 1. If it did open the decoder,
it would see 2. This patch at least makes it consistent, even if it
might not always be the optimal choice.

As far as I'm concerned, the entire notion of 'tbr' is fundamentally
flawed and should be abandoned. There is no magical way for the code to
know what timebase is truly the right one here, without reading the
whole file.

Furthermore, the entire approach of "some sample X is now getting
slightly worse arbitrary numbers than before" seems highly questionable
to me. Our timestamps code is an unholy mess of hacks upon hacks upon
hacks. For pretty much ANY change one can find or construct a sample
that decodes worse after it. We should stop focusing on individual
samples and prioritize overall consistency and correctness.
On Mon, May 15, 2023 at 10:44:56PM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2023-05-15 20:59:42)
> > On Tue, May 09, 2023 at 10:44:50AM +0200, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2023-05-08 16:15:42)
> > > > On Sun, May 07, 2023 at 03:32:46PM +0200, Anton Khirnov wrote:
> > > > > H.264 and mpeg12 parsers need to be adjusted at the same time to stop
> > > > > using the value of AVCodecContext.ticks_per_frame, because it is not set
> > > > > correctly unless the codec has been opened. Previously this would result
> > > > > in both the parser and lavf seeing the same incorrect value, which would
> > > > > cancel out.
> > > > > Updating lavf and not the parsers would result in correct value in lavf,
> > > > > but the wrong one in parsers, which would break some tests.
> > > > > ---
> > > > > libavcodec/h264_parser.c      |  4 ++--
> > > > > libavcodec/mpegvideo_parser.c |  2 +-
> > > > > libavformat/avformat.c        |  9 ++++++---
> > > > > libavformat/demux.c           | 29 +++++++++++++++++++----------
> > > > > libavformat/internal.h        |  3 +++
> > > > > 5 files changed, 31 insertions(+), 16 deletions(-)
> > > >
> > > > Doesn't this sort of change need a major ABI bump?
> > > > It sounds like lavc and lavf interdepend here both ways.
> > >
> > > No, we do not guarantee bug compatibility.
> > >
> > > Libavformat seeing ticks_per_frame=1 for codecs that set it to 2 upon
> > > being opened is a bug. Same for the parser.
> > >
> > > It just so happens that libavformat AND its internal parser instance see
> > > the same incorrect value and this cancels out in cases that are tested
> > > by FATE (it would break if we had more thorough tests for repeating
> > > single fields).
> >
> > This patch seems to change tbr
> > ./ffmpeg -i fate-suite//h264/lossless.h264
> >   Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 640x480, 25 fps, 60 tbr, 1200k tbn
> > vs.
> >   Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 640x480, 25 fps, 120 tbr, 1200k tbn
> >
> > with
> > ./ffmpeg -i fate-suite//h264/lossless.h264 -f framecrc -
> >
> > The output uses 1/60. That's odd, because if every frame can be represented in
> > 1/60 then tbr is 1/60 or coarser.
> > OTOH if tbr is finer than 1/60 then not every frame can be represented in 1/60.
> >
> > Maybe I am missing something, but the new value seems worse and also
> > not consistent with what ffmpeg actually uses.
>
> ticks_per_frame was added by you in 3797c74ba53, and according to your
> code it's supposed to be 2 for H.264. It just so happens that for this
> specific sample libavformat invokes the parser without opening the
> decoder, so it sees the default value of 1. If it did open the decoder,
> it would see 2. This patch at least makes it consistent, even if it
> might not always be the optimal choice.

I am not sure how it is consistent if the value used is different from
the value displayed.

>
> As far as I'm concerned, the entire notion of 'tbr' is fundamentally
> flawed and should be abandoned. There is no magical way for the code to
> know what timebase is truly the right one here, without reading the
> whole file.
>
> Furthermore, the entire approach of "some sample X is now getting
> slightly worse arbitrary numbers than before" seems highly questionable
> to me. Our timestamps code is an unholy mess of hacks upon hacks upon
> hacks. For pretty much ANY change one can find or construct a sample
> that decodes worse after it. We should stop focusing on individual
> samples and prioritize overall consistency and correctness.

I think the important part is to provide the user with what (s)he wants.
If more files work better, that's a win.
The world of multimedia is a bit messy in some corners (as you know),
so I am not sure if true beauty, cleanliness and consistency can be achieved
while having well working/fast code.
But I certainly support making the code nicer.

About "correctness", I am not even sure what exactly is "correct" in some cases.
Take the recent hls case: the first 4 links I tried used 2 mime types the RFC
would consider wrong. I'd say adding them is "correct" given the "SHOULD"
recommendation, but others will surely disagree.

About tbr, I think it's a useful field. It won't be the globally optimal
value for some videos, but neither would width and height be, if they change
midstream.

But any improvement is good and I support that. In this case here I saw one file
change and I reported it.

Thanks

[...]
diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c
index 46134a1c48..43abc45f9c 100644
--- a/libavcodec/h264_parser.c
+++ b/libavcodec/h264_parser.c
@@ -568,7 +568,7 @@ static inline int parse_nal_units(AVCodecParserContext *s,
             if (p->sei.common.unregistered.x264_build < 44U)
                 den *= 2;
             av_reduce(&avctx->framerate.den, &avctx->framerate.num,
-                      sps->num_units_in_tick * avctx->ticks_per_frame, den, 1 << 30);
+                      sps->num_units_in_tick * 2, den, 1 << 30);
         }

         av_freep(&rbsp.rbsp_buffer);
@@ -625,7 +625,7 @@ static int h264_parse(AVCodecParserContext *s,
         parse_nal_units(s, avctx, buf, buf_size);

         if (avctx->framerate.num)
-            time_base = av_inv_q(av_mul_q(avctx->framerate, (AVRational){avctx->ticks_per_frame, 1}));
+            time_base = av_inv_q(av_mul_q(avctx->framerate, (AVRational){2, 1}));
         if (p->sei.picture_timing.cpb_removal_delay >= 0) {
             s->dts_sync_point = p->sei.buffering_period.present;
             s->dts_ref_dts_delta = p->sei.picture_timing.cpb_removal_delay;
diff --git a/libavcodec/mpegvideo_parser.c b/libavcodec/mpegvideo_parser.c
index 8e7e88ff25..08e5316960 100644
--- a/libavcodec/mpegvideo_parser.c
+++ b/libavcodec/mpegvideo_parser.c
@@ -129,6 +129,7 @@ static void mpegvideo_extract_headers(AVCodecParserContext *s,
                 s->pict_type = (buf[1] >> 3) & 7;
                 if (bytes_left >= 4)
                     vbv_delay = ((buf[1] & 0x07) << 13) | (buf[2] << 5) | (buf[3] >> 3);
+                s->repeat_pict = 1;
             }
             break;
         case SEQ_START_CODE:
@@ -186,7 +187,6 @@ static void mpegvideo_extract_headers(AVCodecParserContext *s,
                         progressive_frame = buf[4] & (1 << 7);

                         /* check if we must repeat the frame */
-                        s->repeat_pict = 1;
                         if (repeat_first_field) {
                             if (pc->progressive_sequence) {
                                 if (top_field_first)
diff --git a/libavformat/avformat.c b/libavformat/avformat.c
index 708d90b38c..fea905693d 100644
--- a/libavformat/avformat.c
+++ b/libavformat/avformat.c
@@ -679,6 +679,7 @@ AVRational av_guess_sample_aspect_ratio(AVFormatContext *format, AVStream *strea
 AVRational av_guess_frame_rate(AVFormatContext *format, AVStream *st, AVFrame *frame)
 {
     AVRational fr = st->r_frame_rate;
+    const AVCodecDescriptor *desc = cffstream(st)->codec_desc;
     AVCodecContext *const avctx = ffstream(st)->avctx;
     AVRational codec_fr = avctx->framerate;
     AVRational avg_fr = st->avg_frame_rate;
@@ -688,7 +689,7 @@ AVRational av_guess_frame_rate(AVFormatContext *format, AVStream *st, AVFrame *f
         fr = avg_fr;
     }

-    if (avctx->ticks_per_frame > 1) {
+    if (desc && (desc->props & AV_CODEC_PROP_FIELDS)) {
         if (   codec_fr.num > 0 && codec_fr.den > 0 &&
             (fr.num == 0 || av_q2d(codec_fr) < av_q2d(fr)*0.7 && fabs(1.0 - av_q2d(av_div_q(avg_fr, fr))) > 0.1))
             fr = codec_fr;
@@ -701,10 +702,12 @@ int avformat_transfer_internal_stream_timing_info(const AVOutputFormat *ofmt,
                                                   AVStream *ost, const AVStream *ist,
                                                   enum AVTimebaseSource copy_tb)
 {
+    const AVCodecDescriptor *desc = cffstream(ist)->codec_desc;
     const AVCodecContext *const dec_ctx = cffstream(ist)->avctx;
     AVCodecContext *const enc_ctx = ffstream(ost)->avctx;
-    AVRational dec_ctx_tb = dec_ctx->framerate.num ? av_inv_q(av_mul_q(dec_ctx->framerate,
-                                                   (AVRational){dec_ctx->ticks_per_frame, 1}))
+
+    AVRational mul = (AVRational){ desc && (desc->props & AV_CODEC_PROP_FIELDS) ? 2 : 1, 1 };
+    AVRational dec_ctx_tb = dec_ctx->framerate.num ? av_inv_q(av_mul_q(dec_ctx->framerate, mul))
                                                    : (ist->codecpar->codec_type == AVMEDIA_TYPE_AUDIO ? (AVRational){0, 1}
                                                                                                       : ist->time_base);

diff --git a/libavformat/demux.c b/libavformat/demux.c
index 45e5f5c4c2..1e47cd2bba 100644
--- a/libavformat/demux.c
+++ b/libavformat/demux.c
@@ -213,6 +213,8 @@ FF_ENABLE_DEPRECATION_WARNINGS
         if (ret < 0)
             return ret;

+        sti->codec_desc = avcodec_descriptor_get(sti->avctx->codec_id);
+
         sti->need_context_update = 0;
     }
     return 0;
@@ -677,10 +679,11 @@ static void compute_frame_duration(AVFormatContext *s, int *pnum, int *pden,
             *pnum = st->time_base.num;
             *pden = st->time_base.den;
         } else if (codec_framerate.den * 1000LL > codec_framerate.num) {
-            av_assert0(sti->avctx->ticks_per_frame);
+            int ticks_per_frame = (sti->codec_desc &&
+                                   (sti->codec_desc->props & AV_CODEC_PROP_FIELDS)) ? 2 : 1;
             av_reduce(pnum, pden,
                       codec_framerate.den,
-                      codec_framerate.num * (int64_t)sti->avctx->ticks_per_frame,
+                      codec_framerate.num * (int64_t)ticks_per_frame,
                       INT_MAX);

             if (pc && pc->repeat_pict) {
@@ -692,7 +695,8 @@ static void compute_frame_duration(AVFormatContext *s, int *pnum, int *pden,
             /* If this codec can be interlaced or progressive then we need
              * a parser to compute duration of a packet. Thus if we have
              * no parser in such case leave duration undefined. */
-            if (sti->avctx->ticks_per_frame > 1 && !pc)
+            if (sti->codec_desc &&
+                (sti->codec_desc->props & AV_CODEC_PROP_FIELDS) && !pc)
                 *pnum = *pden = 0;
         }
         break;
@@ -1288,6 +1292,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
                 return ret;
             }

+            sti->codec_desc = avcodec_descriptor_get(sti->avctx->codec_id);
+
             sti->need_context_update = 0;
         }

@@ -2164,9 +2170,10 @@ static int get_std_framerate(int i)
 static int tb_unreliable(AVFormatContext *ic, AVStream *st)
 {
     FFStream *const sti = ffstream(st);
+    const AVCodecDescriptor *desc = sti->codec_desc;
     AVCodecContext *c = sti->avctx;
-    AVRational time_base = c->framerate.num ? av_inv_q(av_mul_q(c->framerate,
-                                              (AVRational){c->ticks_per_frame, 1}))
+    AVRational mul = (AVRational){ desc && (desc->props & AV_CODEC_PROP_FIELDS) ? 2 : 1, 1 };
+    AVRational time_base = c->framerate.num ? av_inv_q(av_mul_q(c->framerate, mul))
     /* NOHEADER check added to not break existing behavior */
                                             : (((ic->ctx_flags & AVFMTCTX_NOHEADER) ||
                                                 st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) ? (AVRational){0, 1}
@@ -2718,13 +2725,14 @@ int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options)
                 break;
             }
             if (pkt->duration > 0) {
+                const int fields = sti->codec_desc && (sti->codec_desc->props & AV_CODEC_PROP_FIELDS);
                 if (avctx->codec_type == AVMEDIA_TYPE_SUBTITLE && pkt->pts != AV_NOPTS_VALUE && st->start_time != AV_NOPTS_VALUE && pkt->pts >= st->start_time
                     && (uint64_t)pkt->pts - st->start_time < INT64_MAX
                 ) {
                     sti->info->codec_info_duration = FFMIN(pkt->pts - st->start_time, sti->info->codec_info_duration + pkt->duration);
                 } else
                     sti->info->codec_info_duration += pkt->duration;
-                sti->info->codec_info_duration_fields += sti->parser && sti->need_parsing && avctx->ticks_per_frame == 2
+                sti->info->codec_info_duration_fields += sti->parser && sti->need_parsing && fields
                     ? sti->parser->repeat_pict + 1 : 2;
             }
         }
@@ -2864,15 +2872,16 @@ int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options)
                           best_fps, 12 * 1001, INT_MAX);
             }
             if (!st->r_frame_rate.num) {
-                AVRational time_base = avctx->framerate.num ? av_inv_q(av_mul_q(avctx->framerate,
-                                                              (AVRational){avctx->ticks_per_frame, 1}))
+                const AVCodecDescriptor *desc = sti->codec_desc;
+                AVRational mul = (AVRational){ desc && (desc->props & AV_CODEC_PROP_FIELDS) ? 2 : 1, 1 };
+                AVRational time_base = avctx->framerate.num ? av_inv_q(av_mul_q(avctx->framerate, mul))
                 /* NOHEADER check added to not break existing behavior */
                                                             : ((ic->ctx_flags & AVFMTCTX_NOHEADER) ? (AVRational){0, 1}
                                                                                                    : st->time_base);
                 if (   time_base.den * (int64_t) st->time_base.num
-                    <= time_base.num * (uint64_t)avctx->ticks_per_frame * st->time_base.den) {
+                    <= time_base.num * (uint64_t)mul.num * st->time_base.den) {
                     av_reduce(&st->r_frame_rate.num, &st->r_frame_rate.den,
-                              time_base.den, (int64_t)time_base.num * avctx->ticks_per_frame, INT_MAX);
+                              time_base.den, (int64_t)time_base.num * mul.num, INT_MAX);
                 } else {
                     st->r_frame_rate.num = st->time_base.den;
                     st->r_frame_rate.den = st->time_base.num;
diff --git a/libavformat/internal.h b/libavformat/internal.h
index f575064e8f..40c46311c8 100644
--- a/libavformat/internal.h
+++ b/libavformat/internal.h
@@ -23,6 +23,7 @@

 #include <stdint.h>

+#include "libavcodec/codec_desc.h"
 #include "libavcodec/packet_internal.h"

 #include "avformat.h"
@@ -408,6 +409,8 @@ typedef struct FFStream {
      */
     int64_t first_dts;
     int64_t cur_dts;
+
+    const AVCodecDescriptor *codec_desc;
 } FFStream;

 static av_always_inline FFStream *ffstream(AVStream *st)