From patchwork Fri Oct 16 13:16:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= <jeebjp@gmail.com>
X-Patchwork-Id: 23017
From: =?utf-8?q?Jan_Ekstr=C3=B6m?= <jeebjp@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 16 Oct 2020 16:16:46 +0300
Message-Id: <20201016131649.4361-4-jeebjp@gmail.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20201016131649.4361-1-jeebjp@gmail.com>
References: <20201016131649.4361-1-jeebjp@gmail.com>
Subject: [FFmpeg-devel] [PATCH v3 3/6] ffmpeg: move A/V non-streamcopy
	initialization to a later point

- For video, this means a single initialization point in do_video_out.
- For audio we unfortunately need to do it in two places, just
  before the buffer sink is utilized (if av_buffersink_get_samples
  still worked according to its specification after a call to
  avfilter_graph_request_oldest, we could at least remove the one
  in transcode_step).

Other adjustments to make things work:
- As the AVFrame PTS adjustment to the encoder time base needs the
  encoder to be initialized, it is now moved to do_{video,audio}_out,
  right after the encoder has been initialized. Due to this, the
  additional parameter of do_video_out is removed, as it is no
  longer necessary.
---
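
Note for reviewers: the patch relies on init_output_stream_wrapper being
safe to call from several places (do_video_out, reap_filters,
transcode_step), because it returns early once ost->initialized is set.
Below is a minimal, self-contained sketch of that guard pattern. It does
not use any FFmpeg API: MockStream, mock_init_stream and
mock_init_stream_wrapper are hypothetical stand-ins invented for this
illustration, and the frame size of 1024 is an arbitrary value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for OutputStream; only what the sketch needs. */
typedef struct MockStream {
    int initialized; /* set once the mock encoder has been opened */
    int init_calls;  /* how often the real initialization ran */
    int frame_size;  /* only known after initialization */
    int fail_init;   /* knob for forcing an initialization failure */
} MockStream;

/* Hypothetical stand-in for init_output_stream(). */
static int mock_init_stream(MockStream *st)
{
    st->init_calls++;
    if (st->fail_init)
        return -1;

    st->frame_size  = 1024; /* arbitrary value for the sketch */
    st->initialized = 1;
    return 0;
}

/*
 * Idempotent wrapper: repeated calls are cheap no-ops once the stream
 * is initialized, so each call site can simply call it before use.
 */
static int mock_init_stream_wrapper(MockStream *st, unsigned int fatal)
{
    int ret;

    assert(st != NULL);

    if (st->initialized)
        return 0;

    ret = mock_init_stream(st);
    if (ret < 0) {
        fprintf(stderr, "Error initializing mock stream\n");

        if (fatal)
            exit(1);
    }

    return ret;
}
```

With this shape, a second call on an already initialized stream returns 0
without running the initialization again, and a failed non-fatal call
leaves the stream uninitialized so that the error can be propagated.
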
 fftools/ffmpeg.c | 112 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 77 insertions(+), 35 deletions(-)

diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c
index 0d8ed26912..08db67a6ab 100644
--- a/fftools/ffmpeg.c
+++ b/fftools/ffmpeg.c
@@ -941,6 +941,28 @@ early_exit:
     return float_pts;
 }
 
+static int init_output_stream(OutputStream *ost, char *error, int error_len);
+
+static int init_output_stream_wrapper(OutputStream *ost, unsigned int fatal)
+{
+    int ret = AVERROR_BUG;
+    char error[1024] = {0};
+
+    if (ost->initialized)
+        return 0;
+
+    ret = init_output_stream(ost, error, sizeof(error));
+    if (ret < 0) {
+        av_log(NULL, AV_LOG_ERROR, "Error initializing output stream %d:%d -- %s\n",
+               ost->file_index, ost->index, error);
+
+        if (fatal)
+            exit_program(1);
+    }
+
+    return ret;
+}
+
 static void do_audio_out(OutputFile *of, OutputStream *ost,
                          AVFrame *frame)
 {
@@ -952,6 +974,8 @@ static void do_audio_out(OutputFile *of, OutputStream *ost,
     pkt.data = NULL;
     pkt.size = 0;
 
+    adjust_frame_pts_to_encoder_tb(of, ost, frame);
+
     if (!check_recording_time(ost))
         return;
 
@@ -1086,8 +1110,7 @@ static void do_subtitle_out(OutputFile *of,
 
 static void do_video_out(OutputFile *of,
                          OutputStream *ost,
-                         AVFrame *next_picture,
-                         double sync_ipts)
+                         AVFrame *next_picture)
 {
     int ret, format_video_sync;
     AVPacket pkt;
@@ -1097,10 +1120,14 @@ static void do_video_out(OutputFile *of,
     int nb_frames, nb0_frames, i;
     double delta, delta0;
     double duration = 0;
+    double sync_ipts = AV_NOPTS_VALUE;
     int frame_size = 0;
     InputStream *ist = NULL;
     AVFilterContext *filter = ost->filter->filter;
 
+    init_output_stream_wrapper(ost, 1);
+    sync_ipts = adjust_frame_pts_to_encoder_tb(of, ost, next_picture);
+
     if (ost->source_index >= 0)
         ist = input_streams[ost->source_index];
 
@@ -1434,28 +1461,6 @@ static void do_video_stats(OutputStream *ost, int frame_size)
     }
 }
 
-static int init_output_stream(OutputStream *ost, char *error, int error_len);
-
-static int init_output_stream_wrapper(OutputStream *ost, unsigned int fatal)
-{
-    int ret = AVERROR_BUG;
-    char error[1024] = {0};
-
-    if (ost->initialized)
-        return 0;
-
-    ret = init_output_stream(ost, error, sizeof(error));
-    if (ret < 0) {
-        av_log(NULL, AV_LOG_ERROR, "Error initializing output stream %d:%d -- %s\n",
-               ost->file_index, ost->index, error);
-
-        if (fatal)
-            exit_program(1);
-    }
-
-    return ret;
-}
-
 static void finish_output_stream(OutputStream *ost)
 {
     OutputFile *of = output_files[ost->file_index];
@@ -1492,7 +1497,17 @@ static int reap_filters(int flush)
             continue;
         filter = ost->filter->filter;
 
-        init_output_stream_wrapper(ost, 1);
+        /*
+         * Unlike video, with audio the audio frame size matters.
+         * Currently we are fully reliant on the lavfi filter chain to
+         * do the buffering for us, and thus the frame size parameter
+         * needs to be set accordingly. Where does one get the required
+         * frame size? From the initialized AVCodecContext of an audio
+         * encoder. Thus, if we have gotten to an audio stream, initialize
+         * the encoder earlier than receiving the first AVFrame.
+         */
+        if (av_buffersink_get_type(filter) == AVMEDIA_TYPE_AUDIO)
+            init_output_stream_wrapper(ost, 1);
 
         if (!ost->filtered_frame && !(ost->filtered_frame = av_frame_alloc())) {
             return AVERROR(ENOMEM);
@@ -1500,7 +1515,6 @@ static int reap_filters(int flush)
         filtered_frame = ost->filtered_frame;
 
         while (1) {
-            double float_pts = AV_NOPTS_VALUE; // this is identical to filtered_frame.pts but with higher precision
             ret = av_buffersink_get_frame_flags(filter, filtered_frame,
                                                AV_BUFFERSINK_FLAG_NO_REQUEST);
             if (ret < 0) {
@@ -1509,7 +1523,7 @@ static int reap_filters(int flush)
                            "Error in av_buffersink_get_frame_flags(): %s\n", av_err2str(ret));
                 } else if (flush && ret == AVERROR_EOF) {
                     if (av_buffersink_get_type(filter) == AVMEDIA_TYPE_VIDEO)
-                        do_video_out(of, ost, NULL, AV_NOPTS_VALUE);
+                        do_video_out(of, ost, NULL);
                 }
                 break;
             }
@@ -1518,15 +1532,12 @@ static int reap_filters(int flush)
                 continue;
             }
 
-            float_pts = adjust_frame_pts_to_encoder_tb(of, ost,
-                                                       filtered_frame);
-
             switch (av_buffersink_get_type(filter)) {
             case AVMEDIA_TYPE_VIDEO:
                 if (!ost->frame_aspect_ratio.num)
                     enc->sample_aspect_ratio = filtered_frame->sample_aspect_ratio;
 
-                do_video_out(of, ost, filtered_frame, float_pts);
+                do_video_out(of, ost, filtered_frame);
                 break;
             case AVMEDIA_TYPE_AUDIO:
                 if (!(enc->codec->capabilities & AV_CODEC_CAP_PARAM_CHANGE) &&
@@ -3691,10 +3702,19 @@ static int transcode_init(void)
             goto dump_format;
         }
 
-    /* open each encoder */
+    /*
+     * Initialize stream copy and subtitle/data streams here.
+     * Encoded AVFrame based streams get initialized later:
+     * - video: when the first AVFrame is received in do_video_out
+     * - audio: just before the first AVFrame is received in either
+     *   transcode_step or reap_filters, as the filter chain buffer sink
+     *   has to be configured with the correct audio frame size, which is
+     *   only known after the encoder is initialized.
+     */
     for (i = 0; i < nb_output_streams; i++) {
-        // skip streams fed from filtergraphs until we have a frame for them
-        if (output_streams[i]->filter)
+        if (!output_streams[i]->stream_copy &&
+            (output_streams[i]->enc_ctx->codec_type == AVMEDIA_TYPE_VIDEO ||
+             output_streams[i]->enc_ctx->codec_type == AVMEDIA_TYPE_AUDIO))
             continue;
 
         ret = init_output_stream_wrapper(output_streams[i], 0);
@@ -4608,7 +4628,29 @@ static int transcode_step(void)
     }
 
     if (ost->filter && ost->filter->graph->graph) {
-        init_output_stream_wrapper(ost, 1);
+        /*
+         * Similar case to the early audio initialization in reap_filters.
+         * Audio is special in ffmpeg.c currently as we depend on lavfi's
+         * audio frame buffering/creation to get the output audio frame size
+         * in samples correct. The audio frame size for the filter chain is
+         * configured during the output stream initialization.
+         *
+         * Apparently avfilter_graph_request_oldest (called in
+         * transcode_from_filter just down the line) peeks. Peeking already
+         * puts one frame "ready to be given out", which means that any
+         * update in filter buffer sink configuration afterwards will not
+         * help us. And yes, even if it were utilized,
+         * av_buffersink_get_samples is affected, as it internally utilizes
+         * the same early exit for peeked frames.
+         *
+         * In other words, if avfilter_graph_request_oldest did not make
+         * further filter chain configuration or usage of
+         * av_buffersink_get_samples useless (by just causing the return
+         * of the peeked AVFrame as-is), we could get rid of this additional
+         * early encoder initialization.
+         */
+        if (av_buffersink_get_type(ost->filter->filter) == AVMEDIA_TYPE_AUDIO)
+            init_output_stream_wrapper(ost, 1);
 
         if ((ret = transcode_from_filter(ost->filter->graph, &ist)) < 0)
             return ret;