From patchwork Sat Mar 24 12:43:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: wm4
X-Patchwork-Id: 8137
From: wm4
To: ffmpeg-devel@ffmpeg.org
Cc: wm4
Date: Sat, 24 Mar 2018 13:43:21 +0100
Message-Id: <20180324124321.25932-1-nfxjfg@googlemail.com>
X-Mailer: git-send-email 2.16.1
Subject: [FFmpeg-devel] [PATCH] avcodec: add a subcharenc mode that disables UTF-8 check

This is for applications that want to check for invalid UTF-8 themselves
and take actions that are better than silently dropping invalid
subtitles. (The drop is effectively silent, because sporadic avcodec
error messages are so common that you can't reasonably display them in a
prominent and meaningful way in an application GUI.)
---
 doc/APIchanges             | 3 +++
 libavcodec/avcodec.h       | 1 +
 libavcodec/decode.c        | 3 ++-
 libavcodec/options_table.h | 1 +
 libavcodec/version.h       | 2 +-
 5 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index a099afd9bc..95b5cd772f 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -15,6 +15,9 @@ libavutil:     2017-10-21
 
 API changes, most recent first:
 
+2018-03-xx - xxxxxxx - lavc 58.16.100 - avcodec.h
+  Add FF_SUB_CHARENC_MODE_IGNORE.
+
 2018-xx-xx - xxxxxxx - lavu 56.8.100 - encryption_info.h
   Add AVEncryptionInitInfo and AVEncryptionInfo structures to hold new side-data
   for encryption info.
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 495242faf0..50c34dbff9 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -3092,6 +3092,7 @@ typedef struct AVCodecContext {
 #define FF_SUB_CHARENC_MODE_DO_NOTHING  -1  ///< do nothing (demuxer outputs a stream supposed to be already in UTF-8, or the codec is bitmap for instance)
 #define FF_SUB_CHARENC_MODE_AUTOMATIC    0  ///< libavcodec will select the mode itself
 #define FF_SUB_CHARENC_MODE_PRE_DECODER  1  ///< the AVPacket data needs to be recoded to UTF-8 before being fed to the decoder, requires iconv
+#define FF_SUB_CHARENC_MODE_IGNORE       2  ///< neither convert the subtitles, nor check them for valid UTF-8
 
     /**
      * Skip processing alpha if supported by codec.
diff --git a/libavcodec/decode.c b/libavcodec/decode.c
index ea2168ad0c..40c8a8855c 100644
--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -1057,7 +1057,8 @@ int avcodec_decode_subtitle2(AVCodecContext *avctx, AVSubtitle *sub,
                 sub->format = 1;
 
             for (i = 0; i < sub->num_rects; i++) {
-                if (sub->rects[i]->ass && !utf8_check(sub->rects[i]->ass)) {
+                if (avctx->sub_charenc_mode != FF_SUB_CHARENC_MODE_IGNORE &&
+                    sub->rects[i]->ass && !utf8_check(sub->rects[i]->ass)) {
                     av_log(avctx, AV_LOG_ERROR,
                            "Invalid UTF-8 in decoded subtitles text; "
                            "maybe missing -sub_charenc option\n");
diff --git a/libavcodec/options_table.h b/libavcodec/options_table.h
index 5a5eae65fb..099261e168 100644
--- a/libavcodec/options_table.h
+++ b/libavcodec/options_table.h
@@ -447,6 +447,7 @@ static const AVOption avcodec_options[] = {
 {"do_nothing", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = FF_SUB_CHARENC_MODE_DO_NOTHING}, INT_MIN, INT_MAX, S|D, "sub_charenc_mode"},
 {"auto", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = FF_SUB_CHARENC_MODE_AUTOMATIC}, INT_MIN, INT_MAX, S|D, "sub_charenc_mode"},
 {"pre_decoder", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = FF_SUB_CHARENC_MODE_PRE_DECODER}, INT_MIN, INT_MAX, S|D, "sub_charenc_mode"},
+{"ignore", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = FF_SUB_CHARENC_MODE_IGNORE}, INT_MIN, INT_MAX, S|D, "sub_charenc_mode"},
 #if FF_API_ASS_TIMING
 {"sub_text_format", "set decoded text subtitle format", OFFSET(sub_text_format), AV_OPT_TYPE_INT, {.i64 = FF_SUB_TEXT_FMT_ASS_WITH_TIMINGS}, 0, 1, S|D, "sub_text_format"},
 #else
diff --git a/libavcodec/version.h b/libavcodec/version.h
index a5b7f752d1..8ac4626da7 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,7 +28,7 @@
 #include "libavutil/version.h"
 
 #define LIBAVCODEC_VERSION_MAJOR  58
-#define LIBAVCODEC_VERSION_MINOR  15
+#define LIBAVCODEC_VERSION_MINOR  16
 #define LIBAVCODEC_VERSION_MICRO 100
 
 #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
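
For illustration only, here is a rough sketch of what an application might do once it has set avctx->sub_charenc_mode to FF_SUB_CHARENC_MODE_IGNORE (or passed sub_charenc_mode=ignore) and therefore receives subtitle text that libavcodec has not checked. The helpers app_utf8_is_valid() and app_utf8_sanitize() are hypothetical names invented for this example; they are not part of libavcodec and are not the utf8_check() used in decode.c. Replacing bad bytes with '?' is just one possible policy, assumed here for brevity; an application could equally retry with a guessed legacy charset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Length (1..4) of the well-formed UTF-8 sequence starting at s, or 0 if
 * the bytes at s do not form one. n is the number of bytes available. */
static size_t utf8_seq_len(const uint8_t *s, size_t n)
{
    uint32_t cp, min;
    size_t len, i;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;                       /* plain ASCII */
    if ((s[0] & 0xE0) == 0xC0) {
        len = 2; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                       /* stray continuation or invalid lead byte */
    }
    if (len > n)
        return 0;                       /* sequence truncated by end of string */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong encodings, surrogates and values above U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    return len;
}

/* Hypothetical application helper: 1 if text is valid UTF-8, else 0. */
int app_utf8_is_valid(const char *text)
{
    const uint8_t *s;
    size_t n;

    assert(text != NULL);
    s = (const uint8_t *)text;
    n = strlen(text);
    while (n) {
        size_t len = utf8_seq_len(s, n);
        if (!len)
            return 0;
        s += len;
        n -= len;
    }
    return 1;
}

/* Hypothetical application policy: overwrite every byte that cannot start a
 * valid sequence with '?', in place. Returns the number of bytes replaced. */
size_t app_utf8_sanitize(char *text)
{
    uint8_t *s;
    size_t n, replaced = 0;

    assert(text != NULL);
    s = (uint8_t *)text;
    n = strlen(text);
    while (n) {
        size_t len = utf8_seq_len(s, n);
        if (!len) {
            *s = '?';
            replaced++;
            len = 1;
        }
        s += len;
        n -= len;
    }
    return replaced;
}
```

With something along these lines, an application could run app_utf8_is_valid() on each sub->rects[i]->ass after avcodec_decode_subtitle2() returns, and decide for itself whether to warn the user, sanitize the text, or try a different charset, instead of losing the subtitle.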