From patchwork Wed Nov 2 22:09:34 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?Q2zDqW1lbnQgQsWTc2No?= X-Patchwork-Id: 1265 Delivered-To: ffmpegpatchwork@gmail.com Received: by 10.103.90.1 with SMTP id o1csp493050vsb; Wed, 2 Nov 2016 15:07:27 -0700 (PDT) X-Received: by 10.28.141.143 with SMTP id p137mr4787327wmd.5.1478124447008; Wed, 02 Nov 2016 15:07:27 -0700 (PDT) Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id zl8si5224246wjb.48.2016.11.02.15.07.26; Wed, 02 Nov 2016 15:07:26 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 72CB5689BD7; Thu, 3 Nov 2016 00:07:20 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from kimiko.pkh.me (pkh.me [5.196.73.185]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 5C8A768980A for ; Thu, 3 Nov 2016 00:07:14 +0200 (EET) Received: from localhost (kimiko.pkh.me [local]) by kimiko.pkh.me (OpenSMTPD) with ESMTPA id f80da8ee; Wed, 2 Nov 2016 22:09:36 +0000 (UTC) From: =?UTF-8?q?Cl=C3=A9ment=20B=C5=93sch?= To: ffmpeg-devel@ffmpeg.org Date: Wed, 2 Nov 2016 23:09:34 +0100 Message-Id: <20161102220934.26010-1-u@pkh.me> X-Mailer: git-send-email 2.10.0 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH] WIP: subtitles in AVFrame X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: =?UTF-8?q?Cl=C3=A9ment=20B=C5=93sch?= Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" --- So this is just a prototype that is starting to work. It needs a split (and completion of the long TODO list) before review but I wanted to share it for early feedbacks. So long story short: AVSubtitle → AVFrame, you have to use the new M:N API, and it's integrated into lavfi. Now a bit longer: the AVFrame->extended_data are AVFrameSubtitleRectangle (declared in lavu/frame). Frames are refcounted. So far this is not really meaningful as a frame ref will alloc and copy every rectangle but that's a limitation we will have to deal with for now but it allows proper integration into the framework. It actually simplifies a lot the workflow so I'm optimistic about the end result. Unfortunately, there is a lot of work remaining. End note: despite my obsession with subtitles, I'd like to remind everyone that I actually hate working with them, I just believe it's really important to get that historical burden out of sight. That means, if anyone wants to help with that, they are extremelly welcome. Thanks, TODO: - what to do about sub2video? - properly deal with -ss and -t (need strim filter?) - sub_start_display/sub_end_display needs to be honored - find a test case for dvbsub as it's likely broken (ffmpeg.c hack is removed and should be replaced by a EAGAIN logic in lavc/utils.c) - make it pass FATE: * fix cc/subcc * broke various other stuff - split commit - Changelog/APIchanges - proper API doxy - update lavfi/subtitles? --- ffmpeg.c | 231 ++++++++++++++++----------------------------- ffmpeg_filter.c | 94 +++++++++++++++++- ffmpeg_opt.c | 33 ++++++- libavcodec/avcodec.h | 9 ++ libavcodec/utils.c | 214 +++++++++++++++++++++++++++++++++++++++++ libavfilter/Makefile | 3 + libavfilter/allfilters.c | 8 ++ libavfilter/avfilter.c | 8 ++ libavfilter/buffersink.c | 28 ++++++ libavfilter/buffersrc.c | 59 ++++++++++++ libavfilter/fifo.c | 31 ++++++ libavfilter/formats.c | 4 + libavfilter/internal.h | 8 ++ libavfilter/sf_snull.c | 48 ++++++++++ libavfilter/subtitle.c | 60 ++++++++++++ libavfilter/subtitle.h | 31 ++++++ libavfilter/vf_subtitles.c | 42 +++++---- libavformat/assenc.c | 3 + libavutil/frame.c | 131 +++++++++++++++++++------ libavutil/frame.h | 59 ++++++++++-- tests/ref/fate/sub-charenc | 6 +- 21 files changed, 895 insertions(+), 215 deletions(-) create mode 100644 libavfilter/sf_snull.c create mode 100644 libavfilter/subtitle.c create mode 100644 libavfilter/subtitle.h diff --git a/ffmpeg.c b/ffmpeg.c index 36a921a..8222b8b 100644 --- a/ffmpeg.c +++ b/ffmpeg.c @@ -856,6 +856,50 @@ static int check_recording_time(OutputStream *ost) return 1; } +static void do_subtitle_out(OutputFile *of, + OutputStream *ost, + AVFrame *frame) +{ + AVCodecContext *enc = ost->enc_ctx; + AVPacket pkt; + int ret; + + av_init_packet(&pkt); + pkt.data = NULL; + pkt.size = 0; + + if (!check_recording_time(ost)) + return; + + ost->frames_encoded++; + + av_assert0(pkt.size || !pkt.data); + update_benchmark(NULL); + + ret = avcodec_send_frame(enc, frame); + if (ret < 0) + goto error; + + while (1) { + ret = avcodec_receive_packet(enc, &pkt); + if (ret == AVERROR(EAGAIN)) + break; + if (ret < 0) + goto error; + + update_benchmark("encode_subtitle %d.%d", ost->file_index, ost->index); + + av_packet_rescale_ts(&pkt, enc->time_base, ost->st->time_base); + + output_packet(of, &pkt, ost); + } + + return; +error: + av_log(NULL, AV_LOG_FATAL, "Subtitle encoding failed\n"); + exit_program(1); +} + static void do_audio_out(OutputFile *of, OutputStream *ost, AVFrame *frame) { @@ -916,89 +960,6 @@ error: exit_program(1); } -static void do_subtitle_out(OutputFile *of, - OutputStream *ost, - AVSubtitle *sub) -{ - int subtitle_out_max_size = 1024 * 1024; - int subtitle_out_size, nb, i; - AVCodecContext *enc; - AVPacket pkt; - int64_t pts; - - if (sub->pts == AV_NOPTS_VALUE) { - av_log(NULL, AV_LOG_ERROR, "Subtitle packets must have a pts\n"); - if (exit_on_error) - exit_program(1); - return; - } - - enc = ost->enc_ctx; - - if (!subtitle_out) { - subtitle_out = av_malloc(subtitle_out_max_size); - if (!subtitle_out) { - av_log(NULL, AV_LOG_FATAL, "Failed to allocate subtitle_out\n"); - exit_program(1); - } - } - - /* Note: DVB subtitle need one packet to draw them and one other - packet to clear them */ - /* XXX: signal it in the codec context ? */ - if (enc->codec_id == AV_CODEC_ID_DVB_SUBTITLE) - nb = 2; - else - nb = 1; - - /* shift timestamp to honor -ss and make check_recording_time() work with -t */ - pts = sub->pts; - if (output_files[ost->file_index]->start_time != AV_NOPTS_VALUE) - pts -= output_files[ost->file_index]->start_time; - for (i = 0; i < nb; i++) { - unsigned save_num_rects = sub->num_rects; - - ost->sync_opts = av_rescale_q(pts, AV_TIME_BASE_Q, enc->time_base); - if (!check_recording_time(ost)) - return; - - sub->pts = pts; - // start_display_time is required to be 0 - sub->pts += av_rescale_q(sub->start_display_time, (AVRational){ 1, 1000 }, AV_TIME_BASE_Q); - sub->end_display_time -= sub->start_display_time; - sub->start_display_time = 0; - if (i == 1) - sub->num_rects = 0; - - ost->frames_encoded++; - - subtitle_out_size = avcodec_encode_subtitle(enc, subtitle_out, - subtitle_out_max_size, sub); - if (i == 1) - sub->num_rects = save_num_rects; - if (subtitle_out_size < 0) { - av_log(NULL, AV_LOG_FATAL, "Subtitle encoding failed\n"); - exit_program(1); - } - - av_init_packet(&pkt); - pkt.data = subtitle_out; - pkt.size = subtitle_out_size; - pkt.pts = av_rescale_q(sub->pts, AV_TIME_BASE_Q, ost->st->time_base); - pkt.duration = av_rescale_q(sub->end_display_time, (AVRational){ 1, 1000 }, ost->st->time_base); - if (enc->codec_id == AV_CODEC_ID_DVB_SUBTITLE) { - /* XXX: the pts correction is handled here. Maybe handling - it in the codec would be better */ - if (i == 0) - pkt.pts += 90 * sub->start_display_time; - else - pkt.pts += 90 * sub->end_display_time; - } - pkt.dts = pkt.pts; - output_packet(of, &pkt, ost); - } -} - static void do_video_out(OutputFile *of, OutputStream *ost, AVFrame *next_picture, @@ -1431,6 +1392,8 @@ static int reap_filters(int flush) filtered_frame->pts = av_rescale_q(filtered_frame->pts, filter->inputs[0]->time_base, enc->time_base) - av_rescale_q(start_time, AV_TIME_BASE_Q, enc->time_base); + filtered_frame->pkt_duration = + av_rescale_q(filtered_frame->pkt_duration, filter->inputs[0]->time_base, enc->time_base); } //if (ost->source_index >= 0) // *filtered_frame= *input_streams[ost->source_index]->decoded_frame; //for me_threshold @@ -1458,8 +1421,10 @@ static int reap_filters(int flush) } do_audio_out(of, ost, filtered_frame); break; + case AVMEDIA_TYPE_SUBTITLE: + do_subtitle_out(of, ost, filtered_frame); + break; default: - // TODO support subtitle filters av_assert0(0); } @@ -2081,6 +2046,36 @@ static int send_frame_to_filters(InputStream *ist, AVFrame *decoded_frame) return ret; } +static int decode_subtitle(InputStream *ist, AVPacket *pkt, int *got_output) +{ + int i, ret, err; + AVFrame *decoded_frame; + + if (!ist->decoded_frame && !(ist->decoded_frame = av_frame_alloc())) + return AVERROR(ENOMEM); + if (!ist->filter_frame && !(ist->filter_frame = av_frame_alloc())) + return AVERROR(ENOMEM); + decoded_frame = ist->decoded_frame; + + update_benchmark(NULL); + ret = decode(ist->dec_ctx, decoded_frame, got_output, pkt); + update_benchmark("decode_subtitle %d.%d", ist->file_index, ist->st->index); + + if (ret != AVERROR_EOF) + check_decode_result(ist, got_output, ret); + + if (!*got_output || ret < 0) + return ret; + + ist->frames_decoded++; + + err = send_frame_to_filters(ist, decoded_frame); + + av_frame_unref(ist->filter_frame); + av_frame_unref(decoded_frame); + return err < 0 ? err : ret; +} + static int decode_audio(InputStream *ist, AVPacket *pkt, int *got_output) { AVFrame *decoded_frame; @@ -2335,65 +2330,6 @@ fail: return err < 0 ? err : ret; } -static int transcode_subtitles(InputStream *ist, AVPacket *pkt, int *got_output) -{ - AVSubtitle subtitle; - int i, ret = avcodec_decode_subtitle2(ist->dec_ctx, - &subtitle, got_output, pkt); - - check_decode_result(NULL, got_output, ret); - - if (ret < 0 || !*got_output) { - if (!pkt->size) - sub2video_flush(ist); - return ret; - } - - if (ist->fix_sub_duration) { - int end = 1; - if (ist->prev_sub.got_output) { - end = av_rescale(subtitle.pts - ist->prev_sub.subtitle.pts, - 1000, AV_TIME_BASE); - if (end < ist->prev_sub.subtitle.end_display_time) { - av_log(ist->dec_ctx, AV_LOG_DEBUG, - "Subtitle duration reduced from %d to %d%s\n", - ist->prev_sub.subtitle.end_display_time, end, - end <= 0 ? ", dropping it" : ""); - ist->prev_sub.subtitle.end_display_time = end; - } - } - FFSWAP(int, *got_output, ist->prev_sub.got_output); - FFSWAP(int, ret, ist->prev_sub.ret); - FFSWAP(AVSubtitle, subtitle, ist->prev_sub.subtitle); - if (end <= 0) - goto out; - } - - if (!*got_output) - return ret; - - sub2video_update(ist, &subtitle); - - if (!subtitle.num_rects) - goto out; - - ist->frames_decoded++; - - for (i = 0; i < nb_output_streams; i++) { - OutputStream *ost = output_streams[i]; - - if (!check_output_constraints(ist, ost) || !ost->encoding_needed - || ost->enc->type != AVMEDIA_TYPE_SUBTITLE) - continue; - - do_subtitle_out(output_files[ost->file_index], ost, &subtitle); - } - -out: - avsubtitle_free(&subtitle); - return ret; -} - static int send_filter_eof(InputStream *ist) { int i, ret; @@ -2477,11 +2413,7 @@ static int process_input_packet(InputStream *ist, const AVPacket *pkt, int no_eo ist->next_pts += duration; //FIXME the duration is not correct in some cases break; case AVMEDIA_TYPE_SUBTITLE: - if (repeating) - break; - ret = transcode_subtitles(ist, &avpkt, &got_output); - if (!pkt && ret >= 0) - ret = AVERROR_EOF; + ret = decode_subtitle(ist, repeating ? NULL : &avpkt, &got_output); break; default: return -1; @@ -3293,7 +3225,8 @@ static int transcode_init(void) #endif if ((enc_ctx->codec_type == AVMEDIA_TYPE_VIDEO || - enc_ctx->codec_type == AVMEDIA_TYPE_AUDIO) && + enc_ctx->codec_type == AVMEDIA_TYPE_AUDIO || + enc_ctx->codec_type == AVMEDIA_TYPE_SUBTITLE) && filtergraph_is_simple(ost->filter->graph)) { FilterGraph *fg = ost->filter->graph; if (configure_filtergraph(fg)) { diff --git a/ffmpeg_filter.c b/ffmpeg_filter.c index 27aeca0..743c96b 100644 --- a/ffmpeg_filter.c +++ b/ffmpeg_filter.c @@ -231,8 +231,10 @@ static void init_input_filter(FilterGraph *fg, AVFilterInOut *in) int i; // TODO: support other filter types - if (type != AVMEDIA_TYPE_VIDEO && type != AVMEDIA_TYPE_AUDIO) { - av_log(NULL, AV_LOG_FATAL, "Only video and audio filters supported " + if (type != AVMEDIA_TYPE_VIDEO && + type != AVMEDIA_TYPE_AUDIO && + type != AVMEDIA_TYPE_SUBTITLE) { + av_log(NULL, AV_LOG_FATAL, "Only video, audio and subtitle filters supported " "currently.\n"); exit_program(1); } @@ -633,6 +635,29 @@ static int configure_output_audio_filter(FilterGraph *fg, OutputFilter *ofilter, return 0; } +static int configure_output_subtitle_filter(FilterGraph *fg, OutputFilter *ofilter, AVFilterInOut *out) +{ + OutputStream *ost = ofilter->ost; + //OutputFile *of = output_files[ost->file_index]; + //AVCodecContext *codec = ost->enc_ctx; + AVFilterContext *last_filter = out->filter_ctx; + int pad_idx = out->pad_idx; + char name[255]; + int ret; + + snprintf(name, sizeof(name), "output stream %d:%d", ost->file_index, ost->index); + ret = avfilter_graph_create_filter(&ofilter->filter, + avfilter_get_by_name("sbuffersink"), + name, NULL, NULL, fg->graph); + if (ret < 0) + return ret; + + if ((ret = avfilter_link(last_filter, pad_idx, ofilter->filter, 0)) < 0) + return ret; + + return 0; +} + #define DESCRIBE_FILTER_LINK(f, inout, in) \ { \ AVFilterContext *ctx = inout->filter_ctx; \ @@ -663,6 +688,7 @@ int configure_output_filter(FilterGraph *fg, OutputFilter *ofilter, AVFilterInOu switch (avfilter_pad_get_type(out->filter_ctx->output_pads, out->pad_idx)) { case AVMEDIA_TYPE_VIDEO: return configure_output_video_filter(fg, ofilter, out); case AVMEDIA_TYPE_AUDIO: return configure_output_audio_filter(fg, ofilter, out); + case AVMEDIA_TYPE_SUBTITLE: return configure_output_subtitle_filter(fg, ofilter, out); default: av_assert0(0); } } @@ -957,6 +983,69 @@ static int configure_input_audio_filter(FilterGraph *fg, InputFilter *ifilter, return 0; } +static int configure_input_subtitle_filter(FilterGraph *fg, InputFilter *ifilter, + AVFilterInOut *in) +{ + AVFilterContext *last_filter; + const AVFilter *sbuffer_filt = avfilter_get_by_name("sbuffer"); + InputStream *ist = ifilter->ist; + const AVCodecContext *avctx = ist->dec_ctx; + //InputFile *f = input_files[ist->file_index]; + AVBPrint args; + char name[255]; + int ret; //, pad_idx = 0; + //int64_t tsoffset = 0; + const AVRational tb = ist->st->time_base; + + // XXX: codec should be opened here so we could access avctx->codec_descriptor + const AVCodecDescriptor *codec_desc = avcodec_descriptor_get(avctx->codec_id); + + if (avctx->codec_type != AVMEDIA_TYPE_SUBTITLE) { + av_log(NULL, AV_LOG_ERROR, "Cannot connect subtitle filter to %s input\n", + av_get_media_type_string(avctx->codec_type)); + return AVERROR(EINVAL); + } + + av_bprint_init(&args, 0, AV_BPRINT_SIZE_AUTOMATIC); + av_bprintf(&args, "time_base=%d/%d:format=", tb.num, tb.den); + + if (codec_desc->props & AV_CODEC_PROP_TEXT_SUB) + av_bprintf(&args, "text"); + else if (codec_desc->props & AV_CODEC_PROP_BITMAP_SUB) + av_bprintf(&args, "bitmap"); + else + av_bprintf(&args, "unspecified"); + + snprintf(name, sizeof(name), "graph %d input from stream %d:%d", fg->index, + ist->file_index, ist->st->index); + + if ((ret = avfilter_graph_create_filter(&ifilter->filter, sbuffer_filt, + name, args.str, NULL, + fg->graph)) < 0) + return ret; + last_filter = ifilter->filter; + +#if 0 + snprintf(name, sizeof(name), "trim for input stream %d:%d", + ist->file_index, ist->st->index); + if (copy_ts) { + tsoffset = f->start_time == AV_NOPTS_VALUE ? 0 : f->start_time; + if (!start_at_zero && f->ctx->start_time != AV_NOPTS_VALUE) + tsoffset += f->ctx->start_time; + } + ret = insert_trim(((f->start_time == AV_NOPTS_VALUE) || !f->accurate_seek) ? + AV_NOPTS_VALUE : tsoffset, f->recording_time, + &last_filter, &pad_idx, name); + if (ret < 0) + return ret; +#endif + + if ((ret = avfilter_link(last_filter, 0, in->filter_ctx, in->pad_idx)) < 0) + return ret; + + return 0; +} + static int configure_input_filter(FilterGraph *fg, InputFilter *ifilter, AVFilterInOut *in) { @@ -972,6 +1061,7 @@ static int configure_input_filter(FilterGraph *fg, InputFilter *ifilter, switch (avfilter_pad_get_type(in->filter_ctx->input_pads, in->pad_idx)) { case AVMEDIA_TYPE_VIDEO: return configure_input_video_filter(fg, ifilter, in); case AVMEDIA_TYPE_AUDIO: return configure_input_audio_filter(fg, ifilter, in); + case AVMEDIA_TYPE_SUBTITLE: return configure_input_subtitle_filter(fg, ifilter, in); default: av_assert0(0); } } diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c index 4d25fff..a8d9a9f 100644 --- a/ffmpeg_opt.c +++ b/ffmpeg_opt.c @@ -1487,8 +1487,13 @@ static char *get_ost_filters(OptionsContext *o, AVFormatContext *oc, else if (ost->filters) return av_strdup(ost->filters); - return av_strdup(st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO ? - "null" : "anull"); + switch (st->codecpar->codec_type) { + case AVMEDIA_TYPE_VIDEO: return av_strdup("null"); + case AVMEDIA_TYPE_AUDIO: return av_strdup("anull"); + case AVMEDIA_TYPE_SUBTITLE: return av_strdup("snull"); + default: + av_assert0(0); + } } static void check_streamcopy_filters(OptionsContext *o, AVFormatContext *oc, @@ -1824,6 +1829,9 @@ static OutputStream *new_subtitle_stream(OptionsContext *o, AVFormatContext *oc, MATCH_PER_STREAM_OPT(copy_initial_nonkeyframes, i, ost->copy_initial_nonkeyframes, oc, st); + MATCH_PER_STREAM_OPT(filter_scripts, str, ost->filters_script, oc, st); + MATCH_PER_STREAM_OPT(filters, str, ost->filters, oc, st); + if (!ost->stream_copy) { char *frame_size = NULL; @@ -1832,8 +1840,15 @@ static OutputStream *new_subtitle_stream(OptionsContext *o, AVFormatContext *oc, av_log(NULL, AV_LOG_FATAL, "Invalid frame size: %s.\n", frame_size); exit_program(1); } + + ost->avfilter = get_ost_filters(o, oc, ost); + if (!ost->avfilter) + exit_program(1); } + if (ost->stream_copy) + check_streamcopy_filters(o, oc, ost, AVMEDIA_TYPE_SUBTITLE); + return ost; } @@ -1961,6 +1976,7 @@ static void init_output_filter(OutputFilter *ofilter, OptionsContext *o, switch (ofilter->type) { case AVMEDIA_TYPE_VIDEO: ost = new_video_stream(o, oc, -1); break; case AVMEDIA_TYPE_AUDIO: ost = new_audio_stream(o, oc, -1); break; + case AVMEDIA_TYPE_SUBTITLE: ost = new_subtitle_stream(o, oc, -1); break; default: av_log(NULL, AV_LOG_FATAL, "Only video and audio filters are supported " "currently.\n"); @@ -1980,7 +1996,7 @@ static void init_output_filter(OutputFilter *ofilter, OptionsContext *o, } if (ost->avfilter && (ost->filters || ost->filters_script)) { - const char *opt = ost->filters ? "-vf/-af/-filter" : "-filter_script"; + const char *opt = ost->filters ? "-vf/-af/-sf/-filter" : "-filter_script"; av_log(NULL, AV_LOG_ERROR, "%s '%s' was specified through the %s option " "for output stream %d:%d, which is fed from a complex filtergraph.\n" @@ -2388,7 +2404,8 @@ loop_end: ist->decoding_needed |= DECODING_FOR_OST; if (ost->st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO || - ost->st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) { + ost->st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO || + ost->st->codecpar->codec_type == AVMEDIA_TYPE_SUBTITLE) { err = init_simple_filtergraph(ist, ost); if (err < 0) { av_log(NULL, AV_LOG_ERROR, @@ -2929,6 +2946,12 @@ static int opt_audio_filters(void *optctx, const char *opt, const char *arg) return parse_option(o, "filter:a", arg, options); } +static int opt_subtitle_filters(void *optctx, const char *opt, const char *arg) +{ + OptionsContext *o = optctx; + return parse_option(o, "filter:s", arg, options); +} + static int opt_vsync(void *optctx, const char *opt, const char *arg) { if (!av_strcasecmp(arg, "cfr")) video_sync_method = VSYNC_CFR; @@ -3540,6 +3563,8 @@ const OptionDef options[] = { "fix subtitles duration" }, { "canvas_size", OPT_SUBTITLE | HAS_ARG | OPT_STRING | OPT_SPEC | OPT_INPUT, { .off = OFFSET(canvas_sizes) }, "set canvas size (WxH or abbreviation)", "size" }, + { "sf", OPT_SUBTITLE | HAS_ARG | OPT_PERFILE | OPT_OUTPUT, { .func_arg = opt_subtitle_filters }, + "set subtitle filters", "filter_graph" }, /* grab options */ { "vc", HAS_ARG | OPT_EXPERT | OPT_VIDEO, { .func_arg = opt_video_channel }, diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h index 211112f..0912bff 100644 --- a/libavcodec/avcodec.h +++ b/libavcodec/avcodec.h @@ -4838,7 +4838,10 @@ int avcodec_decode_video2(AVCodecContext *avctx, AVFrame *picture, * must be freed with avsubtitle_free if *got_sub_ptr is set. * @param[in,out] got_sub_ptr Zero if no subtitle could be decompressed, otherwise, it is nonzero. * @param[in] avpkt The input AVPacket containing the input buffer. + * + * @deprecated Use avcodec_send_packet() and avcodec_receive_frame(). */ +attribute_deprecated int avcodec_decode_subtitle2(AVCodecContext *avctx, AVSubtitle *sub, int *got_sub_ptr, AVPacket *avpkt); @@ -5322,6 +5325,12 @@ attribute_deprecated int avcodec_encode_video2(AVCodecContext *avctx, AVPacket *avpkt, const AVFrame *frame, int *got_packet_ptr); +/** + * Encode a subtitle. + * + * @deprecated use avcodec_send_frame()/avcodec_receive_packet() instead + */ +attribute_deprecated int avcodec_encode_subtitle(AVCodecContext *avctx, uint8_t *buf, int buf_size, const AVSubtitle *sub); diff --git a/libavcodec/utils.c b/libavcodec/utils.c index 87de15f..add7edf 100644 --- a/libavcodec/utils.c +++ b/libavcodec/utils.c @@ -2050,6 +2050,86 @@ int avcodec_encode_subtitle(AVCodecContext *avctx, uint8_t *buf, int buf_size, return ret; } +// TODO: delete this compatibility code when all subtitles encoders moved +// to send_frame +static int encode_subtitle_frame(AVCodecContext *avctx, + AVPacket *avpkt, + const AVFrame *frame, + int *got_packet_ptr) +{ + int i, ret; + +FF_DISABLE_DEPRECATION_WARNINGS + AVPacket tmp_pkt; + AVSubtitle subtitle; + + get_subtitle_defaults(&subtitle); + + /* Allocate (once) a temporary large output subtitle buffer */ + if (!avctx->internal->byte_buffer) { + const int subtitle_out_max_size = 1024 * 1024; + uint8_t *subtitle_out = av_malloc(subtitle_out_max_size); + if (!subtitle_out) + return AVERROR(ENOMEM); + + avctx->internal->byte_buffer = subtitle_out; + avctx->internal->byte_buffer_size = subtitle_out_max_size; + } + + /* Craft an AVSubtitle from the AVFrame */ + subtitle.format = frame->format == AV_PIX_FMT_NONE; + subtitle.rects = av_mallocz_array(frame->sub_nb_rects, sizeof(*subtitle.rects)); + if (!subtitle.rects) + return AVERROR(ENOMEM); + subtitle.num_rects = frame->sub_nb_rects; + subtitle.pts = frame->pts; + + for (i = 0; i < frame->sub_nb_rects; i++) { + const AVFrameSubtitleRectangle *src_rect = (AVFrameSubtitleRectangle *)frame->extended_data[i]; + AVSubtitleRect *dst_rect; + + dst_rect = av_mallocz(sizeof(*subtitle.rects[i])); + if (!dst_rect) + return AVERROR(ENOMEM); + subtitle.rects[i] = dst_rect; + + if (subtitle.format) { + dst_rect->type = SUBTITLE_ASS; + dst_rect->ass = src_rect->text; + } else { + dst_rect->type = SUBTITLE_BITMAP; + dst_rect->x = src_rect->x; + dst_rect->y = src_rect->x; + dst_rect->w = src_rect->w; + dst_rect->h = src_rect->h; + memcpy(dst_rect->data, src_rect->data, sizeof(dst_rect->data)); + memcpy(dst_rect->linesize, src_rect->linesize, sizeof(dst_rect->linesize)); + } + dst_rect->flags = src_rect->flags; + } + + /* Send it to the old API */ + ret = avcodec_encode_subtitle(avctx, + avctx->internal->byte_buffer, + avctx->internal->byte_buffer_size, + &subtitle); + if (ret < 0) + return ret; + + /* Wrap the encoded buffer into a ref-counted AVPacket */ + av_init_packet(&tmp_pkt); + tmp_pkt.data = avctx->internal->byte_buffer; + tmp_pkt.size = ret; + av_packet_ref(avpkt, &tmp_pkt); + avpkt->pts = frame->pts; + avpkt->dts = frame->pkt_dts; + avpkt->duration = frame->pkt_duration; +FF_ENABLE_DEPRECATION_WARNINGS + + *got_packet_ptr = 1; + return ret; +} + /** * Attempt to guess proper monotonic timestamps for decoded video frames * which might have incorrect times. Input timestamps may wrap around, in @@ -2764,6 +2844,134 @@ void avsubtitle_free(AVSubtitle *sub) memset(sub, 0, sizeof(AVSubtitle)); } +static int decode_subtitle_frame(AVCodecContext *avctx, + AVFrame *frame, + int *got_frame_ptr, + const AVPacket *avpkt) +{ + int i, ret = 0; + + *got_frame_ptr = 0; + + if (!avctx->codec) + return AVERROR(EINVAL); + + if (!avpkt->data && avpkt->size) { + av_log(avctx, AV_LOG_ERROR, "invalid packet: NULL data, size != 0\n"); + return AVERROR(EINVAL); + } + + av_assert0(avctx->codec->type == AVMEDIA_TYPE_SUBTITLE); + + av_frame_unref(frame); + + if ((avctx->codec->capabilities & AV_CODEC_CAP_DELAY) || avpkt->size) { + AVPacket pkt_recoded; + AVPacket tmp = *avpkt; + int did_split = av_packet_split_side_data(&tmp); + + if (did_split) { + /* FFMIN() prevents overflow in case the packet wasn't allocated with + * proper padding. + * If the side data is smaller than the buffer padding size, the + * remaining bytes should have already been filled with zeros by the + * original packet allocation anyway. */ + memset(tmp.data + tmp.size, 0, + FFMIN(avpkt->size - tmp.size, AV_INPUT_BUFFER_PADDING_SIZE)); + } + pkt_recoded = tmp; + + ret = recode_subtitle(avctx, &pkt_recoded, &tmp); + if (ret >= 0) { + int err_ret; + AVSubtitle sub = {0}; + + avctx->internal->pkt = &pkt_recoded; + + ret = avctx->codec->decode(avctx, &sub, got_frame_ptr, &pkt_recoded); + av_assert1((ret >= 0) >= !!*got_frame_ptr && + !!*got_frame_ptr >= !!sub.num_rects); + + for (i = 0; i < sub.num_rects; i++) { + if (sub.rects[i]->ass && !utf8_check(sub.rects[i]->ass)) { + av_log(avctx, AV_LOG_ERROR, + "Invalid UTF-8 in decoded subtitles text; " + "maybe missing -sub_charenc option\n"); + avsubtitle_free(&sub); + return AVERROR_INVALIDDATA; + } + } + + frame->pts = + frame->best_effort_timestamp = avpkt->pts; + frame->pkt_duration = avpkt->duration; + frame->pkt_pos = avpkt->pos; + frame->sub_nb_rects = sub.num_rects; + + if (avctx->codec_descriptor->props & AV_CODEC_PROP_BITMAP_SUB) + frame->format = AV_PIX_FMT_PAL8; + else if (avctx->codec_descriptor->props & AV_CODEC_PROP_TEXT_SUB) + frame->format = AV_PIX_FMT_NONE; + + frame->sub_start_display = sub.start_display_time ? av_rescale_q(sub.start_display_time, av_make_q(1, 1000), AV_TIME_BASE_Q) + : AV_NOPTS_VALUE; + frame->sub_end_display = sub.end_display_time ? av_rescale_q(sub.end_display_time, av_make_q(1, 1000), AV_TIME_BASE_Q) + : AV_NOPTS_VALUE; + + /* Allocate sub_nb_rects AVFrameSubtitleRectangle */ + err_ret = av_frame_get_buffer(frame, 0); + if (err_ret < 0) { + avsubtitle_free(&sub); + return err_ret; + } + + /* Transfer data from AVSubtitleRect to AVFrameSubtitleRectangle */ + for (i = 0; i < sub.num_rects; i++) { + AVSubtitleRect *src_rect = sub.rects[i]; + AVFrameSubtitleRectangle *dst_rect = (AVFrameSubtitleRectangle *)frame->extended_data[i]; + + if (frame->format == AV_PIX_FMT_NONE) { + dst_rect->text = src_rect->ass; + } else { + dst_rect->x = src_rect->x; + dst_rect->y = src_rect->y; + dst_rect->w = src_rect->w; + dst_rect->h = src_rect->h; + memcpy(dst_rect->data, src_rect->data, sizeof(src_rect->data)); + memcpy(dst_rect->linesize, src_rect->linesize, sizeof(src_rect->linesize)); + } + dst_rect->flags = src_rect->flags; + + // we free and make sure it is set to NULL so the data inside + // is not freed after destructing the AVSubtitle + av_freep(&sub.rects[i]); + } + sub.num_rects = 0; + avsubtitle_free(&sub); + + if (tmp.data != pkt_recoded.data) { // did we recode? + /* prevent from destroying side data from original packet */ + pkt_recoded.side_data = NULL; + pkt_recoded.side_data_elems = 0; + + av_packet_unref(&pkt_recoded); + } + avctx->internal->pkt = NULL; + } + + if (did_split) { + av_packet_free_side_data(&tmp); + if(ret == tmp.size) + ret = avpkt->size; + } + + if (*got_frame_ptr) + avctx->frame_number++; + } + + return ret; +} + static int do_decode(AVCodecContext *avctx, AVPacket *pkt) { int got_frame; @@ -2792,6 +3000,9 @@ static int do_decode(AVCodecContext *avctx, AVPacket *pkt) } else if (avctx->codec_type == AVMEDIA_TYPE_AUDIO) { ret = avcodec_decode_audio4(avctx, avctx->internal->buffer_frame, &got_frame, pkt); + } else if (avctx->codec_type == AVMEDIA_TYPE_SUBTITLE) { + ret = decode_subtitle_frame(avctx, avctx->internal->buffer_frame, + &got_frame, pkt); } else { ret = AVERROR(EINVAL); } @@ -2941,6 +3152,9 @@ static int do_encode(AVCodecContext *avctx, const AVFrame *frame, int *got_packe } else if (avctx->codec_type == AVMEDIA_TYPE_AUDIO) { ret = avcodec_encode_audio2(avctx, avctx->internal->buffer_pkt, frame, got_packet); + } else if (avctx->codec_type == AVMEDIA_TYPE_SUBTITLE) { + ret = encode_subtitle_frame(avctx, avctx->internal->buffer_pkt, + frame, got_packet); } else { ret = AVERROR(EINVAL); } diff --git a/libavfilter/Makefile b/libavfilter/Makefile index 7ed4696..5972b27 100644 --- a/libavfilter/Makefile +++ b/libavfilter/Makefile @@ -329,6 +329,9 @@ OBJS-$(CONFIG_YUVTESTSRC_FILTER) += vsrc_testsrc.o OBJS-$(CONFIG_NULLSINK_FILTER) += vsink_nullsink.o +# subtitle filters +OBJS-$(CONFIG_SNULL_FILTER) += sf_snull.o + # multimedia filters OBJS-$(CONFIG_ADRAWGRAPH_FILTER) += f_drawgraph.o OBJS-$(CONFIG_AHISTOGRAM_FILTER) += avf_ahistogram.o diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c index 82a65ee..e5d20cd 100644 --- a/libavfilter/allfilters.c +++ b/libavfilter/allfilters.c @@ -45,6 +45,7 @@ void avfilter_register_all(void) return; initialized = 1; + /* audio filters */ REGISTER_FILTER(ABENCH, abench, af); REGISTER_FILTER(ACOMPRESSOR, acompressor, af); REGISTER_FILTER(ACROSSFADE, acrossfade, af); @@ -138,6 +139,7 @@ void avfilter_register_all(void) REGISTER_FILTER(ANULLSINK, anullsink, asink); + /* video filters */ REGISTER_FILTER(ALPHAEXTRACT, alphaextract, vf); REGISTER_FILTER(ALPHAMERGE, alphamerge, vf); REGISTER_FILTER(ASS, ass, vf); @@ -345,6 +347,9 @@ void avfilter_register_all(void) REGISTER_FILTER(NULLSINK, nullsink, vsink); + /* subtitle filters */ + REGISTER_FILTER(SNULL, snull, sf); + /* multimedia filters */ REGISTER_FILTER(ADRAWGRAPH, adrawgraph, avf); REGISTER_FILTER(AHISTOGRAM, ahistogram, avf); @@ -367,10 +372,13 @@ void avfilter_register_all(void) /* those filters are part of public or internal API => registered * unconditionally */ REGISTER_FILTER_UNCONDITIONAL(asrc_abuffer); + REGISTER_FILTER_UNCONDITIONAL(ssrc_sbuffer); REGISTER_FILTER_UNCONDITIONAL(vsrc_buffer); REGISTER_FILTER_UNCONDITIONAL(asink_abuffer); + REGISTER_FILTER_UNCONDITIONAL(ssink_sbuffer); REGISTER_FILTER_UNCONDITIONAL(vsink_buffer); REGISTER_FILTER_UNCONDITIONAL(af_afifo); + REGISTER_FILTER_UNCONDITIONAL(sf_sfifo); REGISTER_FILTER_UNCONDITIONAL(vf_fifo); ff_opencl_register_filter_kernel_code_all(); } diff --git a/libavfilter/avfilter.c b/libavfilter/avfilter.c index 1d469c3..908ab2c 100644 --- a/libavfilter/avfilter.c +++ b/libavfilter/avfilter.c @@ -314,6 +314,12 @@ int avfilter_config_links(AVFilterContext *filter) if (!link->time_base.num && !link->time_base.den) link->time_base = (AVRational) {1, link->sample_rate}; + break; + + case AVMEDIA_TYPE_SUBTITLE: + if (!link->time_base.num && !link->time_base.den) + link->time_base = inlink ? inlink->time_base : AV_TIME_BASE_Q; + break; } if (link->src->nb_inputs && link->src->inputs[0]->hw_frames_ctx && @@ -1201,6 +1207,8 @@ int ff_filter_frame(AVFilterLink *link, AVFrame *frame) av_assert1(frame->width == link->w); av_assert1(frame->height == link->h); } + } else if (link->type == AVMEDIA_TYPE_SUBTITLE) { + // TODO? } else { if (frame->format != link->format) { av_log(link->dst, AV_LOG_ERROR, "Format change is not supported\n"); diff --git a/libavfilter/buffersink.c b/libavfilter/buffersink.c index 2feb56d..f38fb1a 100644 --- a/libavfilter/buffersink.c +++ b/libavfilter/buffersink.c @@ -388,6 +388,11 @@ static int asink_query_formats(AVFilterContext *ctx) return 0; } +static av_cold int ssink_init(AVFilterContext *ctx, void *opaque) +{ + return common_init(ctx); +} + #define OFFSET(x) offsetof(BufferSinkContext, x) #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM static const AVOption buffersink_options[] = { @@ -405,9 +410,11 @@ static const AVOption abuffersink_options[] = { { NULL }, }; #undef FLAGS +static const AVOption sbuffersink_options[] = {NULL}; AVFILTER_DEFINE_CLASS(buffersink); AVFILTER_DEFINE_CLASS(abuffersink); +AVFILTER_DEFINE_CLASS(sbuffersink); static const AVFilterPad avfilter_vsink_buffer_inputs[] = { { @@ -452,3 +459,24 @@ AVFilter ff_asink_abuffer = { .inputs = avfilter_asink_abuffer_inputs, .outputs = NULL, }; + +static const AVFilterPad avfilter_ssink_sbuffer_inputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_SUBTITLE, + .filter_frame = filter_frame, + }, + { NULL } +}; + +AVFilter ff_ssink_sbuffer = { + .name = "sbuffersink", + .description = NULL_IF_CONFIG_SMALL("Buffer subtitle frames, and make them available to the end of the filter graph."), + .priv_class = &sbuffersink_class, + .priv_size = sizeof(BufferSinkContext), + .init_opaque = ssink_init, + .uninit = uninit, + //.query_formats = ssink_query_formats, + .inputs = avfilter_ssink_sbuffer_inputs, + .outputs = NULL, +}; diff --git a/libavfilter/buffersrc.c b/libavfilter/buffersrc.c index 9294811..c20b7b2 100644 --- a/libavfilter/buffersrc.c +++ b/libavfilter/buffersrc.c @@ -63,6 +63,9 @@ typedef struct BufferSourceContext { uint64_t channel_layout; char *channel_layout_str; + /* subtitle only */ + int sub_format; + int got_format_from_params; int eof; } BufferSourceContext; @@ -128,6 +131,8 @@ int av_buffersrc_parameters_set(AVFilterContext *ctx, AVBufferSrcParameters *par if (param->channel_layout) s->channel_layout = param->channel_layout; break; + case AVMEDIA_TYPE_SUBTITLE: + break; default: return AVERROR_BUG; } @@ -204,6 +209,8 @@ static int av_buffersrc_add_frame_internal(AVFilterContext *ctx, CHECK_AUDIO_PARAM_CHANGE(ctx, s, frame->sample_rate, frame->channel_layout, av_frame_get_channels(frame), frame->format); break; + case AVMEDIA_TYPE_SUBTITLE: + break; default: return AVERROR(EINVAL); } @@ -271,6 +278,7 @@ unsigned av_buffersrc_get_nb_failed_requests(AVFilterContext *buffer_src) #define OFFSET(x) offsetof(BufferSourceContext, x) #define A AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_AUDIO_PARAM #define V AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM +#define S AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_SUBTITLE_PARAM static const AVOption buffer_options[] = { { "width", NULL, OFFSET(w), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, V }, @@ -306,6 +314,17 @@ static const AVOption abuffer_options[] = { AVFILTER_DEFINE_CLASS(abuffer); +static const AVOption sbuffer_options[] = { + { "time_base", NULL, OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, 0, INT_MAX, S }, + { "format", NULL, OFFSET(sub_format), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, 1, S, "format" }, + { "unspecified", NULL, 0, AV_OPT_TYPE_CONST, {.i64=-1}, INT_MIN, INT_MAX, S, "format" }, + { "bitmap", NULL, 0, AV_OPT_TYPE_CONST, {.i64=0}, INT_MIN, INT_MAX, S, "format" }, + { "text", NULL, 0, AV_OPT_TYPE_CONST, {.i64=1}, INT_MIN, INT_MAX, S, "format" }, + { NULL } +}; + +AVFILTER_DEFINE_CLASS(sbuffer); + static av_cold int init_audio(AVFilterContext *ctx) { BufferSourceContext *s = ctx->priv; @@ -397,6 +416,10 @@ static int query_formats(AVFilterContext *ctx) if ((ret = ff_set_common_channel_layouts(ctx, channel_layouts)) < 0) return ret; break; + case AVMEDIA_TYPE_SUBTITLE: + if ((ret = ff_add_format(&formats, c->sub_format)) < 0) + return ret; + break; default: return AVERROR(EINVAL); } @@ -424,6 +447,8 @@ static int config_props(AVFilterLink *link) if (!c->channel_layout) c->channel_layout = link->channel_layout; break; + case AVMEDIA_TYPE_SUBTITLE: + break; default: return AVERROR(EINVAL); } @@ -461,6 +486,15 @@ static int poll_frame(AVFilterLink *link) return size/sizeof(AVFrame*); } +static av_cold int init_subtitle(AVFilterContext *ctx) +{ + BufferSourceContext *c = ctx->priv; + + if (!(c->fifo = av_fifo_alloc(sizeof(AVFrame*)))) + return AVERROR(ENOMEM); + return 0; +} + static const AVFilterPad avfilter_vsrc_buffer_outputs[] = { { .name = "default", @@ -510,3 +544,28 @@ AVFilter ff_asrc_abuffer = { .outputs = avfilter_asrc_abuffer_outputs, .priv_class = &abuffer_class, }; + +static const AVFilterPad avfilter_ssrc_sbuffer_outputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_SUBTITLE, + .request_frame = request_frame, + .poll_frame = poll_frame, + .config_props = config_props, + }, + { NULL } +}; + +AVFilter ff_ssrc_sbuffer = { + .name = "sbuffer", + .description = NULL_IF_CONFIG_SMALL("Buffer subtitle frames, and make them accessible to the filterchain."), + .priv_size = sizeof(BufferSourceContext), + .query_formats = query_formats, + + .init = init_subtitle, + .uninit = uninit, + + .inputs = NULL, + .outputs = avfilter_ssrc_sbuffer_outputs, + .priv_class = &sbuffer_class, +}; diff --git a/libavfilter/fifo.c b/libavfilter/fifo.c index abfbba1..feda5bd 100644 --- a/libavfilter/fifo.c +++ b/libavfilter/fifo.c @@ -313,3 +313,34 @@ AVFilter ff_af_afifo = { .inputs = avfilter_af_afifo_inputs, .outputs = avfilter_af_afifo_outputs, }; + +static const AVFilterPad avfilter_sf_sfifo_inputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_SUBTITLE, + .filter_frame = add_to_queue, + }, + { NULL } +}; + +static const AVFilterPad avfilter_sf_sfifo_outputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_SUBTITLE, + .request_frame = request_frame, + }, + { NULL } +}; + +AVFilter ff_sf_sfifo = { + .name = "sfifo", + .description = NULL_IF_CONFIG_SMALL("Buffer input subtitles and send them when they are requested."), + + .init = init, + .uninit = uninit, + + .priv_size = sizeof(FifoContext), + + .inputs = avfilter_sf_sfifo_inputs, + .outputs = avfilter_sf_sfifo_outputs, +}; diff --git a/libavfilter/formats.c b/libavfilter/formats.c index 20f45e3..66bf138 100644 --- a/libavfilter/formats.c +++ b/libavfilter/formats.c @@ -364,6 +364,10 @@ AVFilterFormats *ff_all_formats(enum AVMediaType type) return NULL; fmt++; } + } else if (type == AVMEDIA_TYPE_SUBTITLE) { + if (ff_add_format(&ret, 0 /* bitmap */) < 0 || + ff_add_format(&ret, 1 /* text */) < 0) + return NULL; } return ret; diff --git a/libavfilter/internal.h b/libavfilter/internal.h index 3856012..cd8db73 100644 --- a/libavfilter/internal.h +++ b/libavfilter/internal.h @@ -142,6 +142,14 @@ struct AVFilterPad { * input pads only. */ int needs_writable; + + /** + * Callback function to get a subtitle frame. If NULL, the filter system + * will use ff_default_get_subtitle_buffer(). + * + * Input subtitle pads only. + */ + AVFrame *(*get_subtitle_buffer)(AVFilterLink *link, int nb_rects); }; struct AVFilterGraphInternal { diff --git a/libavfilter/sf_snull.c b/libavfilter/sf_snull.c new file mode 100644 index 0000000..c20afdd --- /dev/null +++ b/libavfilter/sf_snull.c @@ -0,0 +1,48 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/** + * @file + * null subtitle filter + */ + +#include "avfilter.h" +#include "internal.h" + +static const AVFilterPad sf_snull_inputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_SUBTITLE, + }, + { NULL } +}; + +static const AVFilterPad sf_snull_outputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_SUBTITLE, + }, + { NULL } +}; + +AVFilter ff_sf_snull = { + .name = "snull", + .description = NULL_IF_CONFIG_SMALL("Pass the source unchanged to the output."), + .inputs = sf_snull_inputs, + .outputs = sf_snull_outputs, +}; diff --git a/libavfilter/subtitle.c b/libavfilter/subtitle.c new file mode 100644 index 0000000..88e43d5 --- /dev/null +++ b/libavfilter/subtitle.c @@ -0,0 +1,60 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/common.h" + +#include "subtitle.h" +#include "avfilter.h" +#include "internal.h" + +AVFrame *ff_null_get_subtitle_buffer(AVFilterLink *link, int nb_rects) +{ + return ff_get_subtitle_buffer(link->dst->outputs[0], nb_rects); +} + +AVFrame *ff_default_get_subtitle_buffer(AVFilterLink *link, int nb_rects) +{ + AVFrame *frame = av_frame_alloc(); + int ret; + + if (!frame) + return NULL; + + frame->sub_nb_rects = nb_rects; + frame->format = link->format; + ret = av_frame_get_buffer(frame, 0); + if (ret < 0) { + av_frame_free(&frame); + return NULL; + } + + return frame; +} + +AVFrame *ff_get_subtitle_buffer(AVFilterLink *link, int nb_samples) +{ + AVFrame *ret = NULL; + + if (link->dstpad->get_subtitle_buffer) + ret = link->dstpad->get_audio_buffer(link, nb_samples); + + if (!ret) + ret = ff_default_get_audio_buffer(link, nb_samples); + + return ret; +} diff --git a/libavfilter/subtitle.h b/libavfilter/subtitle.h new file mode 100644 index 0000000..1db9ebe --- /dev/null +++ b/libavfilter/subtitle.h @@ -0,0 +1,31 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFILTER_SUBTITLE_H +#define AVFILTER_SUBTITLE_H + +#include "avfilter.h" +#include "internal.h" + +AVFrame *ff_default_get_subtitle_buffer(AVFilterLink *link, int nb_rects); + +AVFrame *ff_null_get_subtitle_buffer(AVFilterLink *link, int nb_rects); + +AVFrame *ff_get_subtitle_buffer(AVFilterLink *link, int nb_rects); + +#endif /* AVFILTER_SUBTITLE_H */ diff --git a/libavfilter/vf_subtitles.c b/libavfilter/vf_subtitles.c index 0f22644..20b3c77 100644 --- a/libavfilter/vf_subtitles.c +++ b/libavfilter/vf_subtitles.c @@ -305,6 +305,10 @@ static av_cold int init_subtitles(AVFilterContext *ctx) AVStream *st; AVPacket pkt; AssContext *ass = ctx->priv; + AVFrame *frame = av_frame_alloc(); + + if (!frame) + return AVERROR(ENOMEM); /* Init libass */ ret = init(ctx); @@ -447,32 +451,32 @@ static av_cold int init_subtitles(AVFilterContext *ctx) pkt.data = NULL; pkt.size = 0; while (av_read_frame(fmt, &pkt) >= 0) { - int i, got_subtitle; - AVSubtitle sub = {0}; - if (pkt.stream_index == sid) { - ret = avcodec_decode_subtitle2(dec_ctx, &sub, &got_subtitle, &pkt); - if (ret < 0) { - av_log(ctx, AV_LOG_WARNING, "Error decoding: %s (ignored)\n", - av_err2str(ret)); - } else if (got_subtitle) { - const int64_t start_time = av_rescale_q(sub.pts, AV_TIME_BASE_Q, av_make_q(1, 1000)); - const int64_t duration = sub.end_display_time; - for (i = 0; i < sub.num_rects; i++) { - char *ass_line = sub.rects[i]->ass; + int i; + + ret = avcodec_send_packet(dec_ctx, &pkt); + if (ret < 0) + break; + // XXX probably not correct api usage + ret = avcodec_receive_frame(dec_ctx, frame); + if (ret >= 0) { + // XXX: honor start/end display time + const int64_t start_time = av_rescale_q(frame->pts, st->time_base, av_make_q(1, 1000)); + const int64_t duration = av_rescale_q(frame->pkt_duration, st->time_base, av_make_q(1, 1000)); + + for (i = 0; i < frame->sub_nb_rects; i++) { + AVSubtitleRect *rect = (AVSubtitleRect *)frame->extended_data[i]; // XXX move to lavu + char *ass_line = rect->ass; if (!ass_line) break; - if (LIBAVCODEC_VERSION_INT < AV_VERSION_INT(57,25,100)) - ass_process_data(ass->track, ass_line, strlen(ass_line)); - else - ass_process_chunk(ass->track, ass_line, strlen(ass_line), - start_time, duration); + ass_process_chunk(ass->track, ass_line, strlen(ass_line), + start_time, duration); } } + av_packet_unref(&pkt); } - av_packet_unref(&pkt); - avsubtitle_free(&sub); } + av_frame_free(&frame); end: av_dict_free(&codec_opts); diff --git a/libavformat/assenc.c b/libavformat/assenc.c index d50f18f..97a8de8 100644 --- a/libavformat/assenc.c +++ b/libavformat/assenc.c @@ -164,6 +164,9 @@ static int write_packet(AVFormatContext *s, AVPacket *pkt) int hh2, mm2, ss2, ms2; DialogueLine *dialogue = av_mallocz(sizeof(*dialogue)); + if (pkt->duration < 0) + end = INT64_MAX; + if (!dialogue) return AVERROR(ENOMEM); diff --git a/libavutil/frame.c b/libavutil/frame.c index 53e6174..8bb3ed2 100644 --- a/libavutil/frame.c +++ b/libavutil/frame.c @@ -216,52 +216,33 @@ fail: return AVERROR(ENOMEM); } -static int get_audio_buffer(AVFrame *frame, int align) +static int get_data_buffer(AVFrame *frame, int n, size_t size) { - int channels; - int planar = av_sample_fmt_is_planar(frame->format); - int planes; - int ret, i; - - if (!frame->channels) - frame->channels = av_get_channel_layout_nb_channels(frame->channel_layout); - - channels = frame->channels; - planes = planar ? channels : 1; - - CHECK_CHANNELS_CONSISTENCY(frame); - if (!frame->linesize[0]) { - ret = av_samples_get_buffer_size(&frame->linesize[0], channels, - frame->nb_samples, frame->format, - align); - if (ret < 0) - return ret; - } + int i; - if (planes > AV_NUM_DATA_POINTERS) { - frame->extended_data = av_mallocz_array(planes, - sizeof(*frame->extended_data)); - frame->extended_buf = av_mallocz_array((planes - AV_NUM_DATA_POINTERS), - sizeof(*frame->extended_buf)); + if (n > AV_NUM_DATA_POINTERS) { + frame->extended_data = av_mallocz_array(n, sizeof(*frame->extended_data)); + frame->extended_buf = av_mallocz_array(n - AV_NUM_DATA_POINTERS, + sizeof(*frame->extended_buf)); if (!frame->extended_data || !frame->extended_buf) { av_freep(&frame->extended_data); av_freep(&frame->extended_buf); return AVERROR(ENOMEM); } - frame->nb_extended_buf = planes - AV_NUM_DATA_POINTERS; + frame->nb_extended_buf = n - AV_NUM_DATA_POINTERS; } else frame->extended_data = frame->data; - for (i = 0; i < FFMIN(planes, AV_NUM_DATA_POINTERS); i++) { - frame->buf[i] = av_buffer_alloc(frame->linesize[0]); + for (i = 0; i < FFMIN(n, AV_NUM_DATA_POINTERS); i++) { + frame->buf[i] = av_buffer_alloc(size); if (!frame->buf[i]) { av_frame_unref(frame); return AVERROR(ENOMEM); } frame->extended_data[i] = frame->data[i] = frame->buf[i]->data; } - for (i = 0; i < planes - AV_NUM_DATA_POINTERS; i++) { - frame->extended_buf[i] = av_buffer_alloc(frame->linesize[0]); + for (i = 0; i < n - AV_NUM_DATA_POINTERS; i++) { + frame->extended_buf[i] = av_buffer_alloc(size); if (!frame->extended_buf[i]) { av_frame_unref(frame); return AVERROR(ENOMEM); @@ -269,11 +250,50 @@ static int get_audio_buffer(AVFrame *frame, int align) frame->extended_data[i + AV_NUM_DATA_POINTERS] = frame->extended_buf[i]->data; } return 0; +} + +static int get_audio_buffer(AVFrame *frame, int align) +{ + int channels; + int planar = av_sample_fmt_is_planar(frame->format); + int planes; + int ret; + + if (!frame->channels) + frame->channels = av_get_channel_layout_nb_channels(frame->channel_layout); + channels = frame->channels; + planes = planar ? channels : 1; + + CHECK_CHANNELS_CONSISTENCY(frame); + if (!frame->linesize[0]) { + ret = av_samples_get_buffer_size(&frame->linesize[0], channels, + frame->nb_samples, frame->format, + align); + if (ret < 0) + return ret; + } + + return get_data_buffer(frame, planes, frame->linesize[0]); +} + +static int get_subtitle_buffer(AVFrame *frame) +{ + int i, ret; + + ret = get_data_buffer(frame, frame->sub_nb_rects, sizeof(AVFrameSubtitleRectangle)); + if (ret < 0) + return ret; + for (i = 0; i < frame->sub_nb_rects; i++) + memset(frame->extended_data[i], 0, sizeof(AVFrameSubtitleRectangle)); + return 0; } int av_frame_get_buffer(AVFrame *frame, int align) { + if (frame->sub_nb_rects) + return get_subtitle_buffer(frame); + if (frame->format < 0) return AVERROR(EINVAL); @@ -320,6 +340,8 @@ FF_ENABLE_DEPRECATION_WARNINGS dst->colorspace = src->colorspace; dst->color_range = src->color_range; dst->chroma_location = src->chroma_location; + dst->sub_start_display = src->sub_start_display; + dst->sub_end_display = src->sub_end_display; av_dict_copy(&dst->metadata, src->metadata, 0); @@ -561,6 +583,7 @@ int av_frame_make_writable(AVFrame *frame) tmp.channels = frame->channels; tmp.channel_layout = frame->channel_layout; tmp.nb_samples = frame->nb_samples; + tmp.sub_nb_rects = frame->sub_nb_rects; ret = av_frame_get_buffer(&tmp, 32); if (ret < 0) return ret; @@ -716,11 +739,57 @@ static int frame_copy_audio(AVFrame *dst, const AVFrame *src) return 0; } +static int frame_copy_subtitle(AVFrame *dst, const AVFrame *src) +{ + int i, ret; + + for (i = 0; i < dst->sub_nb_rects; i++) { + const AVFrameSubtitleRectangle *src_rect = (const AVFrameSubtitleRectangle *)src->extended_data[i]; + AVFrameSubtitleRectangle *dst_rect = (AVFrameSubtitleRectangle *)dst->extended_data[i]; + + memset(dst_rect, 0, sizeof(*dst_rect)); + + dst_rect->flags = src_rect->flags; + + if (src->format != AV_PIX_FMT_NONE) { + /* (Alloc and) copy bitmap subtitle */ + + av_assert1(!src_rect->text); + + dst_rect->x = src_rect->x; + dst_rect->y = src_rect->y; + dst_rect->w = src_rect->w; + dst_rect->h = src_rect->h; + + ret = av_image_alloc(dst_rect->data, dst_rect->linesize, + dst_rect->w, dst_rect->h, + dst->format, 0); + if (ret < 0) + return ret; + av_image_copy(dst_rect->data, dst_rect->linesize, + src_rect->data, src_rect->linesize, + dst->format, dst_rect->w, dst_rect->h); + } else { + /* Copy text subtitle */ + + av_assert1(!src_rect->x && !src_rect->y); + av_assert1(!src_rect->w && !src_rect->w); + + dst_rect->text = av_strdup(src_rect->text); + if (!dst_rect->text) + return AVERROR(ENOMEM); + } + } + return 0; +} + int av_frame_copy(AVFrame *dst, const AVFrame *src) { - if (dst->format != src->format || dst->format < 0) + if (dst->format != src->format || (dst->format < 0 && !dst->sub_nb_rects)) return AVERROR(EINVAL); + if (dst->sub_nb_rects) + return frame_copy_subtitle(dst, src); if (dst->width > 0 && dst->height > 0) return frame_copy_video(dst, src); else if (dst->nb_samples > 0 && dst->channel_layout) diff --git a/libavutil/frame.h b/libavutil/frame.h index 8e51361..d531cfa 100644 --- a/libavutil/frame.h +++ b/libavutil/frame.h @@ -148,6 +148,25 @@ typedef struct AVFrameSideData { AVBufferRef *buf; } AVFrameSideData; +#define AV_NUM_DATA_POINTERS 8 + +/** + * This structure describes decoded subtitle rectangle + */ +typedef struct AVFrameSubtitleRectangle { + int x, y; + int w, h; + + /* image data for bitmap subtitles, in AVFrame.format (AVPixelFormat) */ + uint8_t *data[AV_NUM_DATA_POINTERS]; + int linesize[AV_NUM_DATA_POINTERS]; + + /* decoded text for text subtitles, in ASS */ + char *text; + + int flags; +} AVFrameSubtitleRectangle; + /** * This structure describes decoded (raw) audio or video data. * @@ -182,7 +201,6 @@ typedef struct AVFrameSideData { * for AVFrame can be obtained from avcodec_get_frame_class() */ typedef struct AVFrame { -#define AV_NUM_DATA_POINTERS 8 /** * pointer to the picture/channel planes. * This might be different from the first allocated byte @@ -224,9 +242,12 @@ typedef struct AVFrame { * For packed audio, there is just one data pointer, and linesize[0] * contains the total size of the buffer for all channels. * + * For subtitles, each pointer corresponds to an AVFrameSubtitleRectangle. + * * Note: Both data and extended_data should always be set in a valid frame, - * but for planar audio with more channels that can fit in data, - * extended_data must be used in order to access all channels. + * but for subtitles and planar audio with more channels that can fit in + * data, extended_data must be used in order to access all audio channels + * or subtitles rectangles. */ uint8_t **extended_data; @@ -359,10 +380,10 @@ typedef struct AVFrame { * also be non-NULL for all j < i. * * There may be at most one AVBuffer per data plane, so for video this array - * always contains all the references. For planar audio with more than - * AV_NUM_DATA_POINTERS channels, there may be more buffers than can fit in - * this array. Then the extra AVBufferRef pointers are stored in the - * extended_buf array. + * always contains all the references. For planar audio or subtitles with + * more than AV_NUM_DATA_POINTERS channels or rectangles, there may be more + * buffers than can fit in this array. Then the extra AVBufferRef pointers + * are stored in the extended_buf array. */ AVBufferRef *buf[AV_NUM_DATA_POINTERS]; @@ -532,6 +553,29 @@ typedef struct AVFrame { * AVHWFramesContext describing the frame. */ AVBufferRef *hw_frames_ctx; + + /** + * Number of rectangles available in extended_data + * - encoding: set by user (XXX: no encoding yet) + * - decoding: set by libavcodec, read by user + */ + int sub_nb_rects; + + /** + * Subtitle start display time in AV_TIME_BASE, relative to pts + * XXX: semantic for <=0? + * - encoding: unused + * - decoding: set by libavcodec, read by user. + */ + int64_t sub_start_display; + + /** + * Subtitle end display time in AV_TIME_BASE, relative to pts + * XXX: semantic for <=0? + * - encoding: unused + * - decoding: set by libavcodec, read by user. + */ + int64_t sub_end_display; } AVFrame; /** @@ -641,6 +685,7 @@ void av_frame_move_ref(AVFrame *dst, AVFrame *src); * - format (pixel format for video, sample format for audio) * - width and height for video * - nb_samples and channel_layout for audio + * - sub_nb_rects for subtitle * * This function will fill AVFrame.data and AVFrame.buf arrays and, if * necessary, allocate and fill AVFrame.extended_data and AVFrame.extended_buf. diff --git a/tests/ref/fate/sub-charenc b/tests/ref/fate/sub-charenc index a056cd1..050807b 100644 --- a/tests/ref/fate/sub-charenc +++ b/tests/ref/fate/sub-charenc @@ -23,7 +23,7 @@ Dialogue: 0,0:02:31.61,0:02:33.90,Default,,0,0,0,,- А ти как си?\N- До Dialogue: 0,0:02:40.16,0:02:42.79,Default,,0,0,0,,Монахът Дзенг каза,\Nче си в планината Удан. Dialogue: 0,0:02:43.04,0:02:46.54,Default,,0,0,0,,Каза, че практикуваш\Nдълбока медитация. Dialogue: 0,0:02:48.84,0:02:50.68,Default,,0,0,0,,Сигурно в планината\Nе много спокойно. -Dialogue: 0,0:02:51.25,0:02:53.46,Default,,0,0,0,,Завиждам ти. +Dialogue: 0,0:02:51.26,0:02:53.47,Default,,0,0,0,,Завиждам ти. Dialogue: 0,0:02:53.67,0:02:58.34,Default,,0,0,0,,Имам толкова много работа,\Nпочти не ми остава\Nвреме за почивка. Dialogue: 0,0:03:00.26,0:03:03.89,Default,,0,0,0,,Оставих обучението рано. Dialogue: 0,0:03:05.69,0:03:11.28,Default,,0,0,0,,Защо? Ти си боец на Удан.\NОбучението е всичко. @@ -40,7 +40,7 @@ Dialogue: 0,0:03:53.40,0:03:56.57,Default,,0,0,0,,Не можах да издъ Dialogue: 0,0:03:57.49,0:03:59.74,Default,,0,0,0,,Прекъснах медитацията си. Dialogue: 0,0:03:59.95,0:04:02.24,Default,,0,0,0,,Не можах да продължа. Dialogue: 0,0:04:03.20,0:04:07.79,Default,,0,0,0,,Нещо...\Nме дърпаше назад. -Dialogue: 0,0:04:09.62,0:04:10.91,Default,,0,0,0,,Какво беше? +Dialogue: 0,0:04:09.63,0:04:10.92,Default,,0,0,0,,Какво беше? Dialogue: 0,0:04:15.46,0:04:18.00,Default,,0,0,0,,Нещо, от което не\Nмога да се освободя. Dialogue: 0,0:04:23.39,0:04:24.68,Default,,0,0,0,,Скоро ли ще тръгваш? Dialogue: 0,0:04:26.77,0:04:30.27,Default,,0,0,0,,Подготвяме охрана\Nза една доставка... @@ -52,7 +52,7 @@ Dialogue: 0,0:04:48.37,0:04:52.67,Default,,0,0,0,,Да. Той винаги е Dialogue: 0,0:04:52.88,0:04:56.55,Default,,0,0,0,,Не разбирам.\NКак можеш да се разделиш с него? Dialogue: 0,0:04:56.76,0:04:59.93,Default,,0,0,0,,Той винаги е бил с теб. Dialogue: 0,0:05:01.18,0:05:05.52,Default,,0,0,0,,Твърде много хора са\Nзагинали от това острие. -Dialogue: 0,0:05:09.68,0:05:14.52,Default,,0,0,0,,Чисто е единствено защото\Nкръвта се отмива лесно. +Dialogue: 0,0:05:09.69,0:05:14.53,Default,,0,0,0,,Чисто е единствено защото\Nкръвта се отмива лесно. Dialogue: 0,0:05:15.40,0:05:20.61,Default,,0,0,0,,Ти го използваш справедливо.\NДостоен си за него. Dialogue: 0,0:05:23.66,0:05:27.37,Default,,0,0,0,,Дойде време\Nда го оставя. Dialogue: 0,0:05:27.58,0:05:31.21,Default,,0,0,0,,Е, какво ще правиш\Nот сега нататък?