From patchwork Sun Jan 7 15:07:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marth64
X-Patchwork-Id: 45518
From: Marth64
To: ffmpeg-devel@ffmpeg.org
Cc: Marth64
Date: Sun, 7 Jan 2024 09:07:01 -0600
Message-Id: <20240107150700.1604665-1-marth64@proxyid.net>
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH v3] libavformat: add RCWT closed caption muxer

Thanks, long night. Should come together nicer now.

Signed-off-by: Marth64
---
 Changelog                |   1 +
 doc/muxers.texi          |  22 +++++
 libavformat/Makefile     |   1 +
 libavformat/allformats.c |   1 +
 libavformat/rcwtenc.c    | 202 +++++++++++++++++++++++++++++++++++++++
 tests/fate/subtitles.mak |   3 +
 tests/ref/fate/sub-rcwt  |   1 +
 7 files changed, 231 insertions(+)
 create mode 100644 libavformat/rcwtenc.c
 create mode 100644 tests/ref/fate/sub-rcwt

diff --git a/Changelog b/Changelog
index 5b2899d05b..4e7c1ce2c1 100644
--- a/Changelog
+++ b/Changelog
@@ -18,6 +18,7 @@ version :
 - lavu/eval: introduce randomi() function in expressions
 - VVC decoder
 - fsync filter
+- Raw Captions with Time (RCWT) closed caption muxer
 
 version 6.1:
 - libaribcaption decoder
diff --git a/doc/muxers.texi b/doc/muxers.texi
index 7b705b6a9e..9cacbfc23e 100644
--- a/doc/muxers.texi
+++ b/doc/muxers.texi
@@ -2232,6 +2232,28 @@ Extensions: thd
 
 SMPTE 421M / VC-1 video.
 
+@anchor{rcwt}
+@section rcwt
+
+Raw Captions With Time (RCWT) is a format native to ccextractor, a commonly
+used open source tool for processing 608/708 closed caption (CC) sources.
+It can be used to archive the original, raw CC bitstream and to produce
+a source file for later CC processing or conversion. As a result,
+it also allows for interoperability with ccextractor for processing CC data
+extracted via ffmpeg. The format is simple to parse and can be used
+to retain all lines and variants of CC.
+
+This muxer implements the specification as of 2024-01-05, which has
+been stable and unchanged for 10 years as of this writing.
+
+This muxer differs in a few details from the way that ccextractor muxes RCWT.
+No compatibility issues when processing the output with ccextractor
+have been observed as a result of this so far, but mileage may vary
+and outputs will not be a bit-exact match.
+
+A free specification of RCWT can be found here:
+@url{https://github.com/CCExtractor/ccextractor/blob/master/docs/BINARY_FILE_FORMAT.TXT}
+
 @anchor{segment}
 @section segment, stream_segment, ssegment
 
diff --git a/libavformat/Makefile b/libavformat/Makefile
index 581e378d95..dcc99eeac4 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -490,6 +490,7 @@ OBJS-$(CONFIG_QOA_DEMUXER) += qoadec.o
 OBJS-$(CONFIG_R3D_DEMUXER) += r3d.o
 OBJS-$(CONFIG_RAWVIDEO_DEMUXER) += rawvideodec.o
 OBJS-$(CONFIG_RAWVIDEO_MUXER) += rawenc.o
+OBJS-$(CONFIG_RCWT_MUXER) += rcwtenc.o subtitles.o
 OBJS-$(CONFIG_REALTEXT_DEMUXER) += realtextdec.o subtitles.o
 OBJS-$(CONFIG_REDSPARK_DEMUXER) += redspark.o
 OBJS-$(CONFIG_RKA_DEMUXER) += rka.o apetag.o img2.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index ce6be5f04d..b04b43cab3 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -389,6 +389,7 @@ extern const AVInputFormat ff_qoa_demuxer;
 extern const AVInputFormat ff_r3d_demuxer;
 extern const AVInputFormat ff_rawvideo_demuxer;
 extern const FFOutputFormat ff_rawvideo_muxer;
+extern const FFOutputFormat ff_rcwt_muxer;
 extern const AVInputFormat ff_realtext_demuxer;
 extern const AVInputFormat ff_redspark_demuxer;
 extern const AVInputFormat ff_rka_demuxer;
diff --git a/libavformat/rcwtenc.c b/libavformat/rcwtenc.c
new file mode 100644
index 0000000000..839436ce84
--- /dev/null
+++ b/libavformat/rcwtenc.c
@@ -0,0 +1,202 @@
+/*
+ * Raw Captions With Time (RCWT) muxer
+ * Author: Marth64
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/*
+ * Raw Captions With Time (RCWT) is a format native to ccextractor, a commonly
+ * used open source tool for processing 608/708 closed caption (CC) sources.
+ * It can be used to archive the original, raw CC bitstream and to produce
+ * a source file for later CC processing or conversion. As a result,
+ * it also allows for interoperability with ccextractor for processing CC data
+ * extracted via ffmpeg. The format is simple to parse and can be used
+ * to retain all lines and variants of CC.
+ *
+ * This muxer implements the specification as of 2024-01-05, which has
+ * been stable and unchanged for 10 years as of this writing.
+ *
+ * This muxer differs in a few details from the way that ccextractor muxes RCWT.
+ * No compatibility issues when processing the output with ccextractor
+ * have been observed as a result of this so far, but mileage may vary
+ * and outputs will not be a bit-exact match.
+ *
+ * Specifically, the differences are:
+ * (1) This muxer will identify as "FF" as the writing program identifier, so
+ *     as to be honest about the output's origin.
+ *
+ * (2) ffmpeg's MPEG-1/2, H264, HEVC, etc. decoders extract closed captioning
+ *     data differently than ccextractor from embedded SEI/user data.
+ *     For example, DVD captioning bytes will be translated to ATSC A53 format.
+ *     This allows ffmpeg to handle 608/708 in a consistent way downstream.
+ *     This is a lossless conversion and the meaningful data is retained.
+ *
+ * (3) This muxer will not alter the extracted data except to remove invalid
+ *     packets in between valid CC blocks. On the other hand, ccextractor
+ *     will by default remove mid-stream padding, and add padding at the end
+ *     of the stream (in order to convey the end time of the source video).
+ *
+ * A free specification of RCWT can be found here:
+ * https://github.com/CCExtractor/ccextractor/blob/master/docs/BINARY_FILE_FORMAT.TXT
+ */
+
+#include "avformat.h"
+#include "internal.h"
+#include "mux.h"
+#include "libavutil/log.h"
+#include "libavutil/intreadwrite.h"
+
+#define RCWT_CLUSTER_MAX_BLOCKS 65535
+#define RCWT_BLOCK_SIZE 3
+
+typedef struct RCWTContext {
+    int cluster_nb_blocks;
+    int cluster_pos;
+    int64_t cluster_pts;
+    uint8_t cluster_buf[RCWT_CLUSTER_MAX_BLOCKS * RCWT_BLOCK_SIZE];
+} RCWTContext;
+
+static void rcwt_init_cluster(AVFormatContext *avf)
+{
+    RCWTContext *rcwt = avf->priv_data;
+
+    rcwt->cluster_nb_blocks = 0;
+    rcwt->cluster_pos = 0;
+    rcwt->cluster_pts = AV_NOPTS_VALUE;
+    memset(rcwt->cluster_buf, 0, sizeof(rcwt->cluster_buf));
+}
+
+static void rcwt_flush_cluster(AVFormatContext *avf)
+{
+    RCWTContext *rcwt = avf->priv_data;
+
+    if (rcwt->cluster_nb_blocks > 0) {
+        avio_wl64(avf->pb, rcwt->cluster_pts);
+        avio_wl16(avf->pb, rcwt->cluster_nb_blocks);
+        avio_write(avf->pb, rcwt->cluster_buf, (rcwt->cluster_nb_blocks * RCWT_BLOCK_SIZE));
+    }
+
+    rcwt_init_cluster(avf);
+}
+
+static int rcwt_write_header(AVFormatContext *avf)
+{
+    if (avf->nb_streams != 1 || avf->streams[0]->codecpar->codec_id != AV_CODEC_ID_EIA_608) {
+        av_log(avf, AV_LOG_ERROR,
+               "RCWT supports only one CC (608/708) stream, more than one stream was "
+               "provided or its codec type was not CC (608/708)\n");
+        return AVERROR(EINVAL);
+    }
+
+    avpriv_set_pts_info(avf->streams[0], 64, 1, 1000);
+
+    /* magic number */
+    avio_wb16(avf->pb, 0xCCCC);
+    avio_w8(avf->pb, 0xED);
+
+    /* program version (identify as ffmpeg) */
+    avio_wb16(avf->pb, 0xFF00);
+    avio_w8(avf->pb, 0x60);
+
+    /* format version, only version 0.001 supported for now */
+    avio_wb16(avf->pb, 0x0001);
+
+    /* reserved */
+    avio_wb16(avf->pb, 0x0000);
+    avio_w8(avf->pb, 0x00);
+
+    rcwt_init_cluster(avf);
+
+    return 0;
+}
+
+static int rcwt_write_packet(AVFormatContext *avf, AVPacket *pkt)
+{
+    RCWTContext *rcwt = avf->priv_data;
+
+    int in_block = 0;
+    int nb_block_bytes = 0;
+
+    if (pkt->size == 0)
+        return 0;
+
+    /* new PTS, new cluster */
+    if (pkt->pts != rcwt->cluster_pts) {
+        rcwt_flush_cluster(avf);
+        rcwt->cluster_pts = pkt->pts;
+    }
+
+    if (pkt->pts == AV_NOPTS_VALUE) {
+        av_log(avf, AV_LOG_WARNING, "Ignoring CC packet with no PTS\n");
+        return 0;
+    }
+
+    for (int i = 0; i < pkt->size; i++) {
+        uint8_t cc_valid;
+        uint8_t cc_type;
+
+        if (rcwt->cluster_nb_blocks == RCWT_CLUSTER_MAX_BLOCKS) {
+            av_log(avf, AV_LOG_WARNING, "Starting new cluster due to size\n");
+            rcwt_flush_cluster(avf);
+        }
+
+        cc_valid = (pkt->data[i] & 0x04) >> 2;
+        cc_type = pkt->data[i] & 0x03;
+
+        if (!in_block && !(cc_valid || cc_type == 3))
+            continue;
+
+        memcpy(&rcwt->cluster_buf[rcwt->cluster_pos], &pkt->data[i], 1);
+        rcwt->cluster_pos++;
+
+        if (!in_block) {
+            in_block = 1;
+            nb_block_bytes = 1;
+            continue;
+        }
+
+        nb_block_bytes++;
+
+        if (nb_block_bytes == RCWT_BLOCK_SIZE) {
+            in_block = 0;
+            nb_block_bytes = 0;
+            rcwt->cluster_nb_blocks++;
+        }
+    }
+
+    return 0;
+}
+
+static int rcwt_write_trailer(AVFormatContext *avf)
+{
+    rcwt_flush_cluster(avf);
+
+    return 0;
+}
+
+const FFOutputFormat ff_rcwt_muxer = {
+    .p.name = "rcwt",
+    .p.long_name = NULL_IF_CONFIG_SMALL("Raw Captions With Time"),
+    .p.extensions = "bin",
+    .p.flags = AVFMT_GLOBALHEADER | AVFMT_VARIABLE_FPS | AVFMT_TS_NONSTRICT,
+    .p.subtitle_codec = AV_CODEC_ID_EIA_608,
+    .priv_data_size = sizeof(RCWTContext),
+    .write_header = rcwt_write_header,
+    .write_packet = rcwt_write_packet,
+    .write_trailer = rcwt_write_trailer
+};
diff --git a/tests/fate/subtitles.mak b/tests/fate/subtitles.mak
index 59595b9cc1..d7edd31e85 100644
--- a/tests/fate/subtitles.mak
+++ b/tests/fate/subtitles.mak
@@ -118,6 +118,9 @@ fate-sub-scc: CMD = fmtstdout ass -ss 57 -i $(TARGET_SAMPLES)/sub/witch.scc
 FATE_SUBTITLES-$(call DEMMUX, SCC, SCC) += fate-sub-scc-remux
 fate-sub-scc-remux: CMD = fmtstdout scc -i $(TARGET_SAMPLES)/sub/witch.scc -ss 4:00 -map 0 -c copy
 
+FATE_SUBTITLES-$(call DEMMUX, SCC, RCWT) += fate-sub-rcwt
+fate-sub-rcwt: CMD = md5 -i $(TARGET_SAMPLES)/sub/witch.scc -map 0 -c copy -f rcwt
+
 FATE_SUBTITLES-$(call ALLYES, MPEGTS_DEMUXER DVBSUB_DECODER DVBSUB_ENCODER) += fate-sub-dvb
 fate-sub-dvb: CMD = framecrc -i $(TARGET_SAMPLES)/sub/dvbsubtest_filter.ts -map s:0 -c dvbsub
 
diff --git a/tests/ref/fate/sub-rcwt b/tests/ref/fate/sub-rcwt
new file mode 100644
index 0000000000..722cbe1c5b
--- /dev/null
+++ b/tests/ref/fate/sub-rcwt
@@ -0,0 +1 @@
+d86f179094a5752d68aa97d82cf887b0
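
For anyone wanting to sanity-check the muxer output by hand, here is a minimal reader sketch. It is not part of the patch, and it assumes only the byte layout that rcwt_write_header() and rcwt_flush_cluster() produce above: an 11-byte header starting with CC CC ED, then clusters of an 8-byte little-endian PTS in milliseconds, a 2-byte little-endian block count, and 3 bytes per block. The names RCWTCluster, rd_le, rcwt_check_header and rcwt_next_cluster are hypothetical helpers made up for illustration. They are not FFmpeg or ccextractor API, and the sketch has not been checked against BINARY_FILE_FORMAT.TXT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RCWT_HEADER_SIZE 11 /* 3 magic + 3 program + 2 version + 3 reserved */
#define RCWT_CLUSTER_HDR 10 /* 8-byte PTS + 2-byte block count */
#define RCWT_BLOCK_SIZE  3

/* Hypothetical view of one cluster; blocks points into the caller's buffer. */
typedef struct RCWTCluster {
    int64_t pts;           /* milliseconds, as set by avpriv_set_pts_info() */
    uint16_t nb_blocks;
    const uint8_t *blocks; /* nb_blocks * RCWT_BLOCK_SIZE bytes */
} RCWTCluster;

/* Read an n-byte little-endian unsigned integer. */
static uint64_t rd_le(const uint8_t *p, int n)
{
    uint64_t v = 0;

    for (int i = n - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Check the magic number and set *pos to the first cluster. 0 on success. */
static int rcwt_check_header(const uint8_t *buf, size_t size, size_t *pos)
{
    static const uint8_t magic[3] = { 0xCC, 0xCC, 0xED };

    if (size < RCWT_HEADER_SIZE || memcmp(buf, magic, sizeof(magic)) != 0)
        return -1;
    *pos = RCWT_HEADER_SIZE;
    return 0;
}

/* Returns 1 if a cluster was read, 0 at end of data, -1 if truncated. */
static int rcwt_next_cluster(const uint8_t *buf, size_t size, size_t *pos,
                             RCWTCluster *c)
{
    size_t payload;

    if (*pos >= size)
        return 0;
    if (size - *pos < RCWT_CLUSTER_HDR)
        return -1;

    c->pts = (int64_t)rd_le(buf + *pos, 8);
    c->nb_blocks = (uint16_t)rd_le(buf + *pos + 8, 2);
    payload = (size_t)c->nb_blocks * RCWT_BLOCK_SIZE;

    if (size - *pos - RCWT_CLUSTER_HDR < payload)
        return -1;

    c->blocks = buf + *pos + RCWT_CLUSTER_HDR;
    *pos += RCWT_CLUSTER_HDR + payload;
    return 1;
}
```

To use it, load the output file into memory, call rcwt_check_header() once, then call rcwt_next_cluster() in a loop until it returns 0. The cluster struct borrows from the input buffer rather than copying, so the buffer has to outlive any cluster read from it.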