From patchwork Mon May 20 20:16:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yigithan Yigit
X-Patchwork-Id: 49068
From: Yigithan Yigit
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 20 May 2024 23:16:05 +0300
Message-ID: <20240520201606.90567-3-yigithanyigitdevel@gmail.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240520201606.90567-1-yigithanyigitdevel@gmail.com>
References: <20240520201606.90567-1-yigithanyigitdevel@gmail.com>
Subject: [FFmpeg-devel] [PATCH v3 2/3]
 avfilter/af_volumedetect.c: Add 32bit float audio support
Cc: thilo.borgmann@mail.de, yigithanyigitdevel@gmail.com

---
 libavfilter/af_volumedetect.c | 159 ++++++++++++++++++++++++++++------
 1 file changed, 133 insertions(+), 26 deletions(-)

diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
index 327801a7f9..dbbcd037a5 100644
--- a/libavfilter/af_volumedetect.c
+++ b/libavfilter/af_volumedetect.c
@@ -20,27 +20,51 @@
 
 #include "libavutil/channel_layout.h"
 #include "libavutil/avassert.h"
+#include "libavutil/mem.h"
 #include "audio.h"
 #include "avfilter.h"
 #include "internal.h"
 
+#define MAX_DB_FLT 1024
 #define MAX_DB 91
+#define HISTOGRAM_SIZE 0x10000
+#define HISTOGRAM_SIZE_FLT (MAX_DB_FLT*2)
 
 typedef struct VolDetectContext {
-    /**
-     * Number of samples at each PCM value.
-     * histogram[0x8000 + i] is the number of samples at value i.
-     * The extra element is there for symmetry.
-     */
-    uint64_t histogram[0x10001];
+    uint64_t* histogram; ///< for integer number of samples at each PCM value, for float number of samples at each dB
+    uint64_t nb_samples; ///< number of samples
+    double sum2; ///< sum of the squares of the samples
+    double max; ///< maximum sample value
+    int is_float; ///< true if the input is in floating point
 } VolDetectContext;
 
-static inline double logdb(uint64_t v)
+static inline double logdb(double v, enum AVSampleFormat sample_fmt)
 {
-    double d = v / (double)(0x8000 * 0x8000);
-    if (!v)
-        return MAX_DB;
-    return -log10(d) * 10;
+    if (sample_fmt == AV_SAMPLE_FMT_FLT) {
+        if (!v)
+            return MAX_DB_FLT;
+        return -log10(v) * 10;
+    } else {
+        double d = v / (double)(0x8000 * 0x8000);
+        if (!v)
+            return MAX_DB;
+        return -log10(d) * 10;
+    }
+}
+
+static void update_float_stats(VolDetectContext *vd, float *audio_data)
+{
+    double sample;
+    int idx;
+    if(!isnormal(*audio_data))
+        return;
+    sample = fabsf(*audio_data);
+    if (sample > vd->max)
+        vd->max = sample;
+    vd->sum2 += sample * sample;
+    idx = lrintf(floorf(logdb(sample * sample, AV_SAMPLE_FMT_FLT))) + MAX_DB_FLT;
+    vd->histogram[idx]++;
+    vd->nb_samples++;
 }
 
 static int filter_frame(AVFilterLink *inlink, AVFrame *samples)
@@ -51,18 +75,41 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *samples)
     int nb_channels = samples->ch_layout.nb_channels;
     int nb_planes = nb_channels;
     int plane, i;
-    int16_t *pcm;
+    int planar = 0;
 
-    if (!av_sample_fmt_is_planar(samples->format)) {
-        nb_samples *= nb_channels;
+    planar = av_sample_fmt_is_planar(samples->format);
+    if (!planar)
         nb_planes = 1;
+    if (vd->is_float) {
+        float *audio_data;
+        for (plane = 0; plane < nb_planes; plane++) {
+            audio_data = (float *)samples->extended_data[plane];
+            for (i = 0; i < nb_samples; i++) {
+                if (planar) {
+                    update_float_stats(vd, &audio_data[i]);
+                } else {
+                    for (int j = 0; j < nb_channels; j++)
+                        update_float_stats(vd, &audio_data[i * nb_channels + j]);
+                }
+            }
+        }
+    } else {
+        int16_t *pcm;
+        for (plane = 0; plane < nb_planes; plane++) {
+            pcm = (int16_t *)samples->extended_data[plane];
+            for (i = 0; i < nb_samples; i++) {
+                if (planar) {
+                    vd->histogram[pcm[i] + 0x8000]++;
+                    vd->nb_samples++;
+                } else {
+                    for (int j = 0; j < nb_channels; j++) {
+                        vd->histogram[pcm[i * nb_channels + j] + 0x8000]++;
+                        vd->nb_samples++;
+                    }
+                }
+            }
+        }
     }
-    for (plane = 0; plane < nb_planes; plane++) {
-        pcm = (int16_t *)samples->extended_data[plane];
-        for (i = 0; i < nb_samples; i++)
-            vd->histogram[pcm[i] + 0x8000]++;
-    }
-
     return ff_filter_frame(inlink->dst->outputs[0], samples);
 }
 
@@ -73,6 +120,20 @@ static void print_stats(AVFilterContext *ctx)
     uint64_t nb_samples = 0, power = 0, nb_samples_shift = 0, sum = 0;
     uint64_t histdb[MAX_DB + 1] = { 0 };
 
+    if (!vd->nb_samples)
+        return;
+    if (vd->is_float) {
+        av_log(ctx, AV_LOG_INFO, "n_samples: %" PRId64 "\n", vd->nb_samples);
+        av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb(vd->sum2 / vd->nb_samples, AV_SAMPLE_FMT_FLT));
+        av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -2.0*logdb(vd->max, AV_SAMPLE_FMT_FLT));
+        for (i = 0; i < HISTOGRAM_SIZE_FLT && !vd->histogram[i]; i++);
+        for (; i >= 0 && sum < vd->nb_samples / 1000; i++) {
+            if (!vd->histogram[i])
+                continue;
+            av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %" PRId64 "\n", MAX_DB_FLT - i, vd->histogram[i]);
+            sum += vd->histogram[i];
+        }
+    } else {
     for (i = 0; i < 0x10000; i++)
         nb_samples += vd->histogram[i];
     av_log(ctx, AV_LOG_INFO, "n_samples: %"PRId64"\n", nb_samples);
@@ -92,26 +153,61 @@ static void print_stats(AVFilterContext *ctx)
         return;
     power = (power + nb_samples_shift / 2) / nb_samples_shift;
     av_assert0(power <= 0x8000 * 0x8000);
-    av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb(power));
+    av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb((double)power, AV_SAMPLE_FMT_S16));
 
     max_volume = 0x8000;
     while (max_volume > 0 && !vd->histogram[0x8000 + max_volume] &&
                              !vd->histogram[0x8000 - max_volume])
         max_volume--;
-    av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb(max_volume * max_volume));
+    av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb((double)(max_volume * max_volume), AV_SAMPLE_FMT_S16));
 
     for (i = 0; i < 0x10000; i++)
-        histdb[(int)logdb((i - 0x8000) * (i - 0x8000))] += vd->histogram[i];
+        histdb[(int)logdb((double)(i - 0x8000) * (i - 0x8000), AV_SAMPLE_FMT_S16)] += vd->histogram[i];
     for (i = 0; i <= MAX_DB && !histdb[i]; i++);
     for (; i <= MAX_DB && sum < nb_samples / 1000; i++) {
-        av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", i, histdb[i]);
+        av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", -i, histdb[i]);
         sum += histdb[i];
     }
+    }
+}
+
+static int config_output(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    VolDetectContext *vd = ctx->priv;
+    size_t histogram_size;
+
+    vd->is_float = outlink->format == AV_SAMPLE_FMT_FLT ||
+                   outlink->format == AV_SAMPLE_FMT_FLTP;
+
+    if (!vd->is_float) {
+        /*
+         * Number of samples at each PCM value.
+         * Only used for integer formats.
+         * For 16 bit signed PCM there are 65536.
+         * histogram[0x8000 + i] is the number of samples at value i.
+         * The extra element is there for symmetry.
+         */
+        histogram_size = HISTOGRAM_SIZE + 1;
+    } else {
+        /*
+         * The histogram is used to store the number of samples at each dB
+         * instead of the number of samples at each PCM value.
+         */
+        histogram_size = HISTOGRAM_SIZE_FLT + 1;
+    }
+    vd->histogram = av_calloc(histogram_size, sizeof(uint64_t));
+    if (!vd->histogram)
+        return AVERROR(ENOMEM);
+    return 0;
 }
 
 static av_cold void uninit(AVFilterContext *ctx)
 {
+    VolDetectContext *vd = ctx->priv;
     print_stats(ctx);
+    if (vd->histogram)
+        av_freep(&vd->histogram);
 }
 
 static const AVFilterPad volumedetect_inputs[] = {
@@ -122,6 +218,14 @@ static const AVFilterPad volumedetect_inputs[] = {
     },
 };
 
+static const AVFilterPad volumedetect_outputs[] = {
+    {
+        .name = "default",
+        .type = AVMEDIA_TYPE_AUDIO,
+        .config_props = config_output,
+    },
+};
+
 const AVFilter ff_af_volumedetect = {
     .name = "volumedetect",
     .description = NULL_IF_CONFIG_SMALL("Detect audio volume."),
@@ -129,6 +233,9 @@ const AVFilter ff_af_volumedetect = {
     .uninit = uninit,
     .flags = AVFILTER_FLAG_METADATA_ONLY,
     FILTER_INPUTS(volumedetect_inputs),
-    FILTER_OUTPUTS(ff_audio_default_filterpad),
-    FILTER_SAMPLEFMTS(AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_S16P),
+    FILTER_OUTPUTS(volumedetect_outputs),
+    FILTER_SAMPLEFMTS(AV_SAMPLE_FMT_S16,
+                      AV_SAMPLE_FMT_S16P,
+                      AV_SAMPLE_FMT_FLT,
+                      AV_SAMPLE_FMT_FLTP),
 };