From patchwork Mon Dec 20 08:31:05 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gyan Doshi
X-Patchwork-Id: 32744
From: Gyan Doshi
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 20 Dec 2021 14:01:05 +0530
Message-Id: <20211220083105.1452-1-ffmpeg@gyani.pro>
Subject: [FFmpeg-devel] [PATCH v2] avformat/mov: add option max_stts_delta
List-Id: FFmpeg development discussions and patches

Very high stts sample deltas may occasionally be intended but usually
they are written in error or
used to store a negative value for dts correction when treated as signed
32-bit integers.

This option lets the user set an upper limit, beyond which the delta is
clamped to 1. Values greater than the limit that are negative when cast
to int32 are used to adjust subsequent dts.

Unit is the track time scale. Default is UINT_MAX - 48000*10, which
allows up to a 10 second dts correction for 48 kHz audio streams while
accommodating 99.9% of the uint32 range.
---
v2 changes:

mp4-negative-stts-problem.mp4 plays in sync with the default value chosen.

stts adjustment shifted to mov_read_stts so that any downstream code that
references raw stts does not have to contend with possibly changed stts values.

 libavformat/isom.h |  1 +
 libavformat/mov.c  | 64 +++++++++++++++++++++++++++-------------------
 2 files changed, 39 insertions(+), 26 deletions(-)

diff --git a/libavformat/isom.h b/libavformat/isom.h
index ef8f19b18c..625dea8421 100644
--- a/libavformat/isom.h
+++ b/libavformat/isom.h
@@ -305,6 +305,7 @@ typedef struct MOVContext {
     int32_t movie_display_matrix[3][3]; ///< display matrix from mvhd
     int have_read_mfra_size;
     uint32_t mfra_size;
+    uint32_t max_stts_delta;
 } MOVContext;
 
 int ff_mp4_read_descr_len(AVIOContext *pb);
diff --git a/libavformat/mov.c b/libavformat/mov.c
index 2aed6e80ef..a36752c6a4 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -2925,6 +2925,9 @@ static int mov_read_stts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     unsigned int i, entries, alloc_size = 0;
     int64_t duration = 0;
     int64_t total_sample_count = 0;
+    int64_t current_dts = 0;
+    int64_t last_dts = 0;
+    int64_t dts_correction = 0;
 
     if (c->fc->nb_streams < 1)
         return 0;
@@ -2948,6 +2951,7 @@ static int mov_read_stts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     for (i = 0; i < entries && !pb->eof_reached; i++) {
         unsigned int sample_duration;
         unsigned int sample_count;
+        unsigned int stts_warn = 0;
         unsigned int min_entries = FFMIN(FFMAX(i + 1, 1024 * 1024), entries);
         MOVStts *stts_data = av_fast_realloc(sc->stts_data, &alloc_size,
                                              min_entries * sizeof(*sc->stts_data));
@@ -2965,13 +2969,41 @@ static int mov_read_stts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
         sc->stts_data[i].count= sample_count;
         sc->stts_data[i].duration= sample_duration;
 
-        av_log(c->fc, AV_LOG_TRACE, "sample_count=%d, sample_duration=%d\n",
+        av_log(c->fc, AV_LOG_TRACE, "sample_count=%u, sample_duration=%u\n",
                 sample_count, sample_duration);
 
-        duration+=(int64_t)sample_duration*(uint64_t)sample_count;
-        total_sample_count+=sample_count;
-    }
+        for (int j = 0; j < sample_count; j++) {
+            /* STTS sample offsets are uint32 but some files store it as int32
+             * with negative values used to correct DTS delays.
+               There may be abnormally large values as well. */
+            if (sample_duration > c->max_stts_delta) {
+                // assume high delta is a negative correction if greater than c->max_stts_delta
+                int32_t delta_magnitude = *((int32_t *)&sample_duration);
+                av_log_once(c->fc, AV_LOG_WARNING, AV_LOG_DEBUG, &stts_warn,
+                            "Too large sample offset %u in stts entry %u with count %u in st:%d. Clipping to 1.\n",
+                            sample_duration, i, sample_count, st->index);
+                sc->stts_data[i].duration = 1;
+                dts_correction += (delta_magnitude < 0 ? delta_magnitude - 1 : 0);
+            }
 
+            current_dts += sc->stts_data[i].duration;
+
+            if (!dts_correction || current_dts + dts_correction > last_dts) {
+                current_dts += dts_correction;
+                if (!j)
+                    sc->stts_data[i].duration += dts_correction/sample_count;
+                dts_correction = 0;
+            } else {
+                /* Avoid creating non-monotonous DTS */
+                dts_correction += current_dts - last_dts - 1;
+                current_dts = last_dts + 1;
+            }
+            last_dts = current_dts;
+        }
+        duration+=(int64_t)sc->stts_data[i].duration*(uint64_t)sc->stts_data[i].count;
+        total_sample_count+=sc->stts_data[i].count;
+
+    }
     sc->stts_count = i;
 
     if (duration > 0 &&
@@ -3856,13 +3888,10 @@ static void mov_build_index(MOVContext *mov, AVStream *st)
     unsigned int distance = 0;
     unsigned int rap_group_index = 0;
     unsigned int rap_group_sample = 0;
-    int64_t last_dts = 0;
-    int64_t dts_correction = 0;
     int rap_group_present = sc->rap_group_count && sc->rap_group;
     int key_off = (sc->keyframe_count && sc->keyframes[0] > 0) || (sc->stps_count && sc->stps_data[0] > 0);
 
     current_dts -= sc->dts_shift;
-    last_dts = current_dts;
 
     if (!sc->sample_count || sti->nb_index_entries)
         return;
@@ -3973,26 +4002,8 @@ static void mov_build_index(MOVContext *mov, AVStream *st)
                 current_offset += sample_size;
                 stream_size += sample_size;
 
-                /* A negative sample duration is invalid based on the spec,
-                 * but some samples need it to correct the DTS. */
-                if (sc->stts_data[stts_index].duration < 0) {
-                    av_log(mov->fc, AV_LOG_WARNING,
-                           "Invalid SampleDelta %d in STTS, at %d st:%d\n",
-                           sc->stts_data[stts_index].duration, stts_index,
-                           st->index);
-                    dts_correction += sc->stts_data[stts_index].duration - 1;
-                    sc->stts_data[stts_index].duration = 1;
-                }
                 current_dts += sc->stts_data[stts_index].duration;
-                if (!dts_correction || current_dts + dts_correction > last_dts) {
-                    current_dts += dts_correction;
-                    dts_correction = 0;
-                } else {
-                    /* Avoid creating non-monotonous DTS */
-                    dts_correction += current_dts - last_dts - 1;
-                    current_dts = last_dts + 1;
-                }
-                last_dts = current_dts;
+
                 distance++;
                 stts_sample++;
                 current_sample++;
@@ -8577,6 +8588,7 @@ static const AVOption mov_options[] = {
     { "decryption_key", "The media decryption key (hex)", OFFSET(decryption_key), AV_OPT_TYPE_BINARY, .flags = AV_OPT_FLAG_DECODING_PARAM },
     { "enable_drefs", "Enable external track support.", OFFSET(enable_drefs), AV_OPT_TYPE_BOOL,
         {.i64 = 0}, 0, 1, FLAGS },
+    { "max_stts_delta", "treat offsets above this value as invalid", OFFSET(max_stts_delta), AV_OPT_TYPE_INT, {.i64 = UINT_MAX-48000*10 }, 0, UINT_MAX, .flags = AV_OPT_FLAG_DECODING_PARAM },
     { NULL },
 };