From patchwork Sun Jan 30 17:30:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Romain Beauxis
X-Patchwork-Id: 33945
From: toots@rastageeks.org
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 30 Jan 2022 11:30:47 -0600
Message-Id: <20220130173045.32690-4-toots@rastageeks.org>
X-Mailer: git-send-email 2.32.0 (Apple Git-132)
In-Reply-To: <20220130173045.32690-1-toots@rastageeks.org>
References: <20220130173045.32690-1-toots@rastageeks.org>
Subject: [FFmpeg-devel] [PATCH 3/4] libavdevice/avfoundation.m: use setAudioSettings, extend supported formats
Cc: thilo.borgmann@mail.de, Romain Beauxis, epirat07@gmail.com

From: Romain Beauxis

This patch reworks the audio input workflow to extend the range of
supported input formats and to delegate audio format conversion to
the underlying OS.

A previous version of these changes used the AudioConverter API to
perform the conversion explicitly; however, it proved bug-prone, with
issues seemingly coming from the underlying OS.

This fixes: https://trac.ffmpeg.org/ticket/9502

Signed-off-by: Romain Beauxis
---
 libavdevice/avfoundation.m | 206 ++++++++++++-------------------------
 1 file changed, 63 insertions(+), 143 deletions(-)

diff --git a/libavdevice/avfoundation.m b/libavdevice/avfoundation.m
index a837042a6d..db9501445d 100644
--- a/libavdevice/avfoundation.m
+++ b/libavdevice/avfoundation.m
@@ -96,6 +96,11 @@
     AVRational framerate;
     int width, height;
 
+    int channels;
+    int big_endian;
+    int sample_rate;
+    enum AVSampleFormat sample_format;
+
     int capture_cursor;
     int capture_mouse_clicks;
     int capture_raw_data;
@@ -114,17 +119,6 @@
 
     int num_video_devices;
 
-    int audio_channels;
-    int audio_bits_per_sample;
-    int audio_float;
-    int audio_be;
-    int audio_signed_integer;
-    int audio_packed;
-    int audio_non_interleaved;
-
-    int32_t *audio_buffer;
-    int audio_buffer_size;
-
     enum AVPixelFormat pixel_format;
 
     AVCaptureSession *capture_session;
@@ -301,14 +295,6 @@ static void destroy_context(AVFContext* ctx)
     ctx->audio_output = NULL;
     ctx->avf_delegate = NULL;
     ctx->avf_audio_delegate = NULL;
-
-    av_freep(&ctx->audio_buffer);
-
-    pthread_mutex_destroy(&ctx->frame_lock);
-
-    if (ctx->current_frame) {
-        CFRelease(ctx->current_frame);
-    }
 }
 
 static void parse_device_name(AVFormatContext *s)
@@ -674,88 +660,62 @@ static int get_video_config(AVFormatContext *s)
 static int get_audio_config(AVFormatContext *s)
 {
     AVFContext *ctx = (AVFContext*)s->priv_data;
-    CMFormatDescriptionRef format_desc;
-    AVStream* stream = avformat_new_stream(s, NULL);
+    AVStream* stream;
+    int bits_per_sample, is_float;
 
-    if (!stream) {
-        return 1;
-    }
+    enum AVCodecID codec_id = av_get_pcm_codec(ctx->sample_format, ctx->big_endian);
 
-    // Take stream info from the first frame.
-    while (ctx->audio_frames_captured < 1) {
-        CFRunLoopRunInMode(kCFRunLoopDefaultMode, 0.1, YES);
+    if (codec_id == AV_CODEC_ID_NONE) {
+        av_log(ctx, AV_LOG_ERROR, "Error: invalid sample format!\n");
+        return AVERROR(EINVAL);
     }
 
-    lock_frames(ctx);
-
-    ctx->audio_stream_index = stream->index;
-
-    avpriv_set_pts_info(stream, 64, 1, avf_time_base);
-
-    format_desc = CMSampleBufferGetFormatDescription(ctx->current_audio_frame);
-    const AudioStreamBasicDescription *basic_desc = CMAudioFormatDescriptionGetStreamBasicDescription(format_desc);
+    switch (ctx->sample_format) {
+    case AV_SAMPLE_FMT_S16:
+        bits_per_sample = 16;
+        is_float = 0;
+        break;
+    case AV_SAMPLE_FMT_S32:
+        bits_per_sample = 32;
+        is_float = 0;
+        break;
+    case AV_SAMPLE_FMT_FLT:
+        bits_per_sample = 32;
+        is_float = 1;
+        break;
+    default:
+        av_log(ctx, AV_LOG_ERROR, "Error: invalid sample format!\n");
+        unlock_frames(ctx);
+        return AVERROR(EINVAL);
+    }
 
-    if (!basic_desc) {
+    [ctx->audio_output setAudioSettings:@{
+        AVFormatIDKey: @(kAudioFormatLinearPCM),
+        AVLinearPCMBitDepthKey: @(bits_per_sample),
+        AVLinearPCMIsFloatKey: @(is_float),
+        AVLinearPCMIsBigEndianKey: @(ctx->big_endian),
+        AVNumberOfChannelsKey: @(ctx->channels),
+        AVLinearPCMIsNonInterleaved: @NO,
+        AVSampleRateKey: @(ctx->sample_rate)
+    }];
+
+    stream = avformat_new_stream(s, NULL);
+    if (!stream) {
         unlock_frames(ctx);
-        av_log(s, AV_LOG_ERROR, "audio format not available\n");
-        return 1;
+        return -1;
     }
 
+    avpriv_set_pts_info(stream, 64, 1, avf_time_base);
+
     stream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
-    stream->codecpar->sample_rate = basic_desc->mSampleRate;
-    stream->codecpar->channels = basic_desc->mChannelsPerFrame;
+    stream->codecpar->sample_rate = ctx->sample_rate;
+    stream->codecpar->channels = ctx->channels;
     stream->codecpar->channel_layout = av_get_default_channel_layout(stream->codecpar->channels);
+    stream->codecpar->codec_id = codec_id;
 
-    ctx->audio_channels = basic_desc->mChannelsPerFrame;
-    ctx->audio_bits_per_sample = basic_desc->mBitsPerChannel;
-    ctx->audio_float = basic_desc->mFormatFlags & kAudioFormatFlagIsFloat;
-    ctx->audio_be = basic_desc->mFormatFlags & kAudioFormatFlagIsBigEndian;
-    ctx->audio_signed_integer = basic_desc->mFormatFlags & kAudioFormatFlagIsSignedInteger;
-    ctx->audio_packed = basic_desc->mFormatFlags & kAudioFormatFlagIsPacked;
-    ctx->audio_non_interleaved = basic_desc->mFormatFlags & kAudioFormatFlagIsNonInterleaved;
-
-    if (basic_desc->mFormatID == kAudioFormatLinearPCM &&
-        ctx->audio_float &&
-        ctx->audio_bits_per_sample == 32 &&
-        ctx->audio_packed) {
-        stream->codecpar->codec_id = ctx->audio_be ? AV_CODEC_ID_PCM_F32BE : AV_CODEC_ID_PCM_F32LE;
-    } else if (basic_desc->mFormatID == kAudioFormatLinearPCM &&
-        ctx->audio_signed_integer &&
-        ctx->audio_bits_per_sample == 16 &&
-        ctx->audio_packed) {
-        stream->codecpar->codec_id = ctx->audio_be ? AV_CODEC_ID_PCM_S16BE : AV_CODEC_ID_PCM_S16LE;
-    } else if (basic_desc->mFormatID == kAudioFormatLinearPCM &&
-        ctx->audio_signed_integer &&
-        ctx->audio_bits_per_sample == 24 &&
-        ctx->audio_packed) {
-        stream->codecpar->codec_id = ctx->audio_be ? AV_CODEC_ID_PCM_S24BE : AV_CODEC_ID_PCM_S24LE;
-    } else if (basic_desc->mFormatID == kAudioFormatLinearPCM &&
-        ctx->audio_signed_integer &&
-        ctx->audio_bits_per_sample == 32 &&
-        ctx->audio_packed) {
-        stream->codecpar->codec_id = ctx->audio_be ? AV_CODEC_ID_PCM_S32BE : AV_CODEC_ID_PCM_S32LE;
-    } else {
-        unlock_frames(ctx);
-        av_log(s, AV_LOG_ERROR, "audio format is not supported\n");
-        return 1;
-    }
-
-    if (ctx->audio_non_interleaved) {
-        CMBlockBufferRef block_buffer = CMSampleBufferGetDataBuffer(ctx->current_audio_frame);
-        ctx->audio_buffer_size = CMBlockBufferGetDataLength(block_buffer);
-        ctx->audio_buffer = av_malloc(ctx->audio_buffer_size);
-        if (!ctx->audio_buffer) {
-            unlock_frames(ctx);
-            av_log(s, AV_LOG_ERROR, "error allocating audio buffer\n");
-            return 1;
-        }
-    }
-
-    CFRelease(ctx->current_audio_frame);
-    ctx->current_audio_frame = nil;
+    ctx->audio_stream_index = stream->index;
 
     unlock_frames(ctx);
-
     return 0;
 }
 
@@ -1056,6 +1016,7 @@ static int avf_read_header(AVFormatContext *s)
         goto fail;
     }
     if (audio_device && add_audio_device(s, audio_device)) {
+        goto fail;
     }
 
     [ctx->capture_session startRunning];
@@ -1129,6 +1090,7 @@ static int copy_cvpixelbuffer(AVFormatContext *s,
 
 static int avf_read_packet(AVFormatContext *s, AVPacket *pkt)
 {
+    OSStatus ret;
     AVFContext* ctx = (AVFContext*)s->priv_data;
 
     do {
@@ -1172,7 +1134,7 @@ static int avf_read_packet(AVFormatContext *s, AVPacket *pkt)
                 status = copy_cvpixelbuffer(s, image_buffer, pkt);
             } else {
                 status = 0;
-                OSStatus ret = CMBlockBufferCopyDataBytes(block_buffer, 0, pkt->size, pkt->data);
+                ret = CMBlockBufferCopyDataBytes(block_buffer, 0, pkt->size, pkt->data);
                 if (ret != kCMBlockBufferNoErr) {
                     status = AVERROR(EIO);
                 }
@@ -1186,19 +1148,17 @@ static int avf_read_packet(AVFormatContext *s, AVPacket *pkt)
             }
         } else if (ctx->current_audio_frame != nil) {
             CMBlockBufferRef block_buffer = CMSampleBufferGetDataBuffer(ctx->current_audio_frame);
-            int block_buffer_size = CMBlockBufferGetDataLength(block_buffer);
 
-            if (!block_buffer || !block_buffer_size) {
-                unlock_frames(ctx);
-                return AVERROR(EIO);
-            }
+            size_t buffer_size = CMBlockBufferGetDataLength(block_buffer);
 
-            if (ctx->audio_non_interleaved && block_buffer_size > ctx->audio_buffer_size) {
+            int status = av_new_packet(pkt, buffer_size);
+            if (status < 0) {
                 unlock_frames(ctx);
-                return AVERROR_BUFFER_TOO_SMALL;
+                return status;
             }
 
-            if (av_new_packet(pkt, block_buffer_size) < 0) {
+            ret = CMBlockBufferCopyDataBytes(block_buffer, 0, pkt->size, pkt->data);
+            if (ret != kCMBlockBufferNoErr) {
                 unlock_frames(ctx);
                 return AVERROR(EIO);
             }
@@ -1214,54 +1174,10 @@ static int avf_read_packet(AVFormatContext *s, AVPacket *pkt)
             pkt->stream_index = ctx->audio_stream_index;
             pkt->flags |= AV_PKT_FLAG_KEY;
 
-            if (ctx->audio_non_interleaved) {
-                int sample, c, shift, num_samples;
-
-                OSStatus ret = CMBlockBufferCopyDataBytes(block_buffer, 0, pkt->size, ctx->audio_buffer);
-                if (ret != kCMBlockBufferNoErr) {
-                    unlock_frames(ctx);
-                    return AVERROR(EIO);
-                }
-
-                num_samples = pkt->size / (ctx->audio_channels * (ctx->audio_bits_per_sample >> 3));
-
-                // transform decoded frame into output format
-                #define INTERLEAVE_OUTPUT(bps) \
-                { \
-                    int##bps##_t **src; \
-                    int##bps##_t *dest; \
-                    src = av_malloc(ctx->audio_channels * sizeof(int##bps##_t*)); \
-                    if (!src) { \
-                        unlock_frames(ctx); \
-                        return AVERROR(EIO); \
-                    } \
-                    \
-                    for (c = 0; c < ctx->audio_channels; c++) { \
-                        src[c] = ((int##bps##_t*)ctx->audio_buffer) + c * num_samples; \
-                    } \
-                    dest = (int##bps##_t*)pkt->data; \
-                    shift = bps - ctx->audio_bits_per_sample; \
-                    for (sample = 0; sample < num_samples; sample++) \
-                        for (c = 0; c < ctx->audio_channels; c++) \
-                            *dest++ = src[c][sample] << shift; \
-                    av_freep(&src); \
-                }
-
-                if (ctx->audio_bits_per_sample <= 16) {
-                    INTERLEAVE_OUTPUT(16)
-                } else {
-                    INTERLEAVE_OUTPUT(32)
-                }
-            } else {
-                OSStatus ret = CMBlockBufferCopyDataBytes(block_buffer, 0, pkt->size, pkt->data);
-                if (ret != kCMBlockBufferNoErr) {
-                    unlock_frames(ctx);
-                    return AVERROR(EIO);
-                }
-            }
-
             CFRelease(ctx->current_audio_frame);
             ctx->current_audio_frame = nil;
+
+            unlock_frames(ctx);
         } else {
             pkt->data = NULL;
             unlock_frames(ctx);
@@ -1286,6 +1202,10 @@ static int avf_close(AVFormatContext *s)
 }
 
 static const AVOption options[] = {
+    { "channels", "number of audio channels", offsetof(AVFContext, channels), AV_OPT_TYPE_INT, {.i64=2}, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM },
+    { "sample_rate", "audio sample rate", offsetof(AVFContext, sample_rate), AV_OPT_TYPE_INT, {.i64=44100}, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM },
+    { "big_endian", "return big endian samples for audio data", offsetof(AVFContext, big_endian), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, AV_OPT_FLAG_ENCODING_PARAM },
+    { "sample_format", "audio sample format", offsetof(AVFContext, sample_format), AV_OPT_TYPE_INT, {.i64=AV_SAMPLE_FMT_S16}, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM },
     { "list_devices", "list available devices", offsetof(AVFContext, list_devices), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, AV_OPT_FLAG_DECODING_PARAM },
     { "video_device_index", "select video device by index for devices with same name (starts at 0)", offsetof(AVFContext, video_device_index), AV_OPT_TYPE_INT, {.i64 = -1}, -1, INT_MAX, AV_OPT_FLAG_DECODING_PARAM },
     { "audio_device_index", "select audio device by index for devices with same name (starts at 0)", offsetof(AVFContext, audio_device_index), AV_OPT_TYPE_INT, {.i64 = -1}, -1, INT_MAX, AV_OPT_FLAG_DECODING_PARAM },