From patchwork Tue Oct 13 09:25:49 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Evans
X-Patchwork-Id: 22925
From: Dave Evans
Date: Tue, 13 Oct 2020 10:25:49 +0100
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH v2] avformat/webvttdec, enc: correctly process files containing STYLE, REGION blocks

This patch fixes the total failure to parse cues when style and region
definition blocks are contained in the input file, and ensures those
blocks are written to the output when copying.

The sample attached needs to be added to samples at the path shown in
the patch in order to validate that the original issue is fixed.

Same as v1 except the test has been changed as requested in the
original review.

Cheers,
Dave

Subject: [PATCH] avformat/webvttdec,enc: correctly decode files containing STYLE, REGION blocks

Add the ability to extract cues from files containing one or more
WebVTT region definition blocks or WebVTT style blocks. Previously the
decoder would misinterpret these and fail to parse any cues at all.

Also add the ability to write these, or alternative header information,
back out in the encoder using the codec extradata.
---
 libavformat/webvttdec.c                   | 16 ++++++-
 libavformat/webvttenc.c                   | 12 ++++++
 tests/fate/subtitles.mak                  |  3 ++
 tests/ref/fate/sub-webvtt-styleandregions | 51 +++++++++++++++++++++++
 4 files changed, 80 insertions(+), 2 deletions(-)
 create mode 100644 tests/ref/fate/sub-webvtt-styleandregions

diff --git a/libavformat/webvttdec.c b/libavformat/webvttdec.c
index 8d2fdfed37..c6cc367383 100644
--- a/libavformat/webvttdec.c
+++ b/libavformat/webvttdec.c
@@ -60,7 +60,7 @@ static int64_t read_ts(const char *s)
 static int webvtt_read_header(AVFormatContext *s)
 {
     WebVTTContext *webvtt = s->priv_data;
-    AVBPrint cue;
+    AVBPrint cue, header;
     int res = 0;
     AVStream *st = avformat_new_stream(s, NULL);
 
@@ -72,6 +72,7 @@ static int webvtt_read_header(AVFormatContext *s)
     st->disposition |= webvtt->kind;
 
     av_bprint_init(&cue, 0, AV_BPRINT_SIZE_UNLIMITED);
+    av_bprint_init(&header, 0, AV_BPRINT_SIZE_UNLIMITED);
 
     for (;;) {
         int i;
@@ -89,12 +90,18 @@ static int webvtt_read_header(AVFormatContext *s)
         p = identifier = cue.str;
         pos = avio_tell(s->pb);
 
-        /* ignore header chunk */
+        /* ignore the magic word and any comments */
         if (!strncmp(p, "\xEF\xBB\xBFWEBVTT", 9) ||
             !strncmp(p, "WEBVTT", 6) ||
             !strncmp(p, "NOTE", 4))
             continue;
 
+        /* store the style and region blocks from the header */
+        if (!strncmp(p, "STYLE", 5) || !strncmp(p, "REGION", 6)) {
+            av_bprintf(&header, "%s%s", header.len ? "\n\n" : "", p);
+            continue;
+        }
+
         /* optional cue identifier (can be a number like in SRT or some kind of
          * chaptering id) */
         for (i = 0; p[i] && p[i] != '\n' && p[i] != '\r'; i++) {
@@ -161,12 +168,17 @@ static int webvtt_read_header(AVFormatContext *s)
         SET_SIDE_DATA(settings, AV_PKT_DATA_WEBVTT_SETTINGS);
     }
 
+    res = ff_bprint_to_codecpar_extradata(st->codecpar, &header);
+    if (res < 0)
+        goto end;
+
     ff_subtitles_queue_finalize(s, &webvtt->q);
 
 end:
     if (res < 0)
         ff_subtitles_queue_clean(&webvtt->q);
     av_bprint_finalize(&cue, NULL);
+    av_bprint_finalize(&header, NULL);
     return res;
 }
 
diff --git a/libavformat/webvttenc.c b/libavformat/webvttenc.c
index cbd989dcb6..fcbd3ee10a 100644
--- a/libavformat/webvttenc.c
+++ b/libavformat/webvttenc.c
@@ -58,6 +58,18 @@ static int webvtt_write_header(AVFormatContext *ctx)
 
     avio_printf(pb, "WEBVTT\n");
 
+    if (par->extradata_size > 0) {
+        size_t header_size = par->extradata_size;
+
+        if (par->extradata[0] != '\n')
+            avio_printf(pb, "\n");
+
+        avio_write(pb, par->extradata, header_size);
+
+        if (par->extradata[header_size - 1] != '\n')
+            avio_printf(pb, "\n");
+    }
+
     return 0;
 }
 
diff --git a/tests/fate/subtitles.mak b/tests/fate/subtitles.mak
index 6323d0f93d..375f81ef93 100644
--- a/tests/fate/subtitles.mak
+++ b/tests/fate/subtitles.mak
@@ -91,6 +91,9 @@ fate-sub-webvtt: CMD = fmtstdout ass -i $(TARGET_SAMPLES)/sub/WebVTT_capability_
 FATE_SUBTITLES_ASS-$(call DEMDEC, WEBVTT, WEBVTT) += fate-sub-webvtt2
 fate-sub-webvtt2: CMD = fmtstdout ass -i $(TARGET_SAMPLES)/sub/WebVTT_extended_tester.vtt
 
+FATE_SUBTITLES-$(call ALLYES, WEBVTT_DEMUXER, WEBVTT_MUXER) += fate-sub-webvtt-styleandregions
+fate-sub-webvtt-styleandregions: CMD = fmtstdout webvtt -i $(TARGET_SAMPLES)/sub/webvtt_style_and_regions.vtt -c:s copy
+
 FATE_SUBTITLES-$(call ALLYES, SRT_DEMUXER SUBRIP_DECODER WEBVTT_ENCODER WEBVTT_MUXER) += fate-sub-webvttenc
 fate-sub-webvttenc: CMD = fmtstdout webvtt -i $(TARGET_SAMPLES)/sub/SubRip_capability_tester.srt
 
diff --git a/tests/ref/fate/sub-webvtt-styleandregions b/tests/ref/fate/sub-webvtt-styleandregions
new file mode 100644
index 0000000000..8c8776adab
--- /dev/null
+++ b/tests/ref/fate/sub-webvtt-styleandregions
@@ -0,0 +1,51 @@
+WEBVTT
+
+REGION
+id:son
+width:40%
+lines:3
+regionanchor:20%,80%
+viewportanchor:20%,80%
+scroll:up
+
+REGION
+id:father
+width:40%
+lines:3
+regionanchor:80%,80%
+viewportanchor:80%,80%
+scroll:up
+
+STYLE
+::cue(i) {
+  /* make i tags italic */
+  font-style: italic
+}
+
+STYLE
+::cue(v[voice="Son"]) {
+  color: magenta
+}
+
+STYLE
+::cue(v[voice="Father"]) {
+  color: yellow
+}
+
+00:10.000 --> 00:25.000 region:son align:left
+Can I tell you a joke, Dad?
+
+00:12.500 --> 00:27.500 region:father align:right
+Sure, I could do with a laugh.
+
+00:15.000 --> 00:30.000 region:son align:left
+Where do sheep go to get their hair cut?
+
+00:17.500 --> 00:32.500 region:father align:right
+I don't know, son. Where do sheep go to get their hair cut?
+
+00:20.000 --> 00:35.000 region:son align:left
+To the baa-baa shop!
+
+00:22.500 --> 00:37.500 region:father align:right
+[facepalms]
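

For readers who want to check the header handling in isolation, the round trip described in the commit message can be sketched outside FFmpeg roughly as follows. This is a standalone illustration under stated assumptions, not code from the patch: the helper names is_header_block, append_header_block and format_file_header are hypothetical, and fixed-size char buffers stand in for AVBPrint, the codec extradata and the AVIOContext.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: nonzero if a chunk starts a STYLE or REGION block,
 * using the same prefix tests as the demuxer hunk. */
static int is_header_block(const char *chunk)
{
    return !strncmp(chunk, "STYLE", 5) || !strncmp(chunk, "REGION", 6);
}

/* Hypothetical helper: append a block to the accumulated header string,
 * separating consecutive blocks with one blank line (the "%s%s" idea).
 * Returns 0 on success, -1 if the block does not fit in the buffer. */
static int append_header_block(char *header, size_t cap, const char *chunk)
{
    size_t len = strlen(header);
    int n;

    assert(len < cap); /* header must already be a valid string within cap */
    n = snprintf(header + len, cap - len, "%s%s", len ? "\n\n" : "", chunk);
    if (n < 0 || (size_t)n >= cap - len) {
        header[len] = '\0'; /* undo the partial write */
        return -1;
    }
    return 0;
}

/* Hypothetical helper: format what the muxer hunk emits, i.e. the magic
 * line, then the stored header padded so that it starts on a new line and
 * ends with a newline. Returns the byte count, or -1 if it does not fit. */
static int format_file_header(char *out, size_t cap,
                              const char *extradata, size_t size)
{
    int n;

    if (size == 0)
        n = snprintf(out, cap, "WEBVTT\n");
    else
        n = snprintf(out, cap, "WEBVTT\n%s%.*s%s",
                     extradata[0] != '\n' ? "\n" : "",
                     (int)size, extradata,
                     extradata[size - 1] != '\n' ? "\n" : "");
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

A caller would start from an empty header string, pass every chunk that satisfies is_header_block to append_header_block, and hand the result to format_file_header. The padding logic means a header that already begins or ends with a newline is written unchanged, which appears to be why the patch checks the first and last bytes of the extradata.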