From patchwork Sun Feb 11 05:30:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Baudanza
X-Patchwork-Id: 46171
From: jon@jonb.org
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 11 Feb 2024 14:30:38 +0900
Message-ID: <20240211053038.74908-1-jon@jonb.org>
X-Mailer: git-send-email 2.41.0
Subject: [FFmpeg-devel] [RFC PATCH] avformat/rtpdec: Audio level RTP extension RFC6464
List-Id: FFmpeg development discussions and patches
Cc: Jonathan Baudanza

From: Jonathan Baudanza

libwebrtc adds an audio level (in decibels) and a VAD flag to each RTP
packet via the RFC 6464 header extension. This patch exports both values
as packet side data.

I've been using this patch in production for about a year on live audio
RTP streams to detect when users are speaking without decoding the audio
data.
---
 libavcodec/avpacket.c |  1 +
 libavcodec/defs.h     | 15 ++++++++
 libavcodec/packet.h   |  5 +++
 libavformat/rtpdec.c  | 87 +++++++++++++++++++++++++++++++++++++++++++
 libavformat/rtpdec.h  |  5 +++
 libavformat/rtsp.c    | 16 ++++++++
 libavformat/rtsp.h    |  2 +
 7 files changed, 131 insertions(+)

diff --git a/libavcodec/avpacket.c b/libavcodec/avpacket.c
index e118bbaad1..73e0341bf7 100644
--- a/libavcodec/avpacket.c
+++ b/libavcodec/avpacket.c
@@ -305,6 +305,7 @@ const char *av_packet_side_data_name(enum AVPacketSideDataType type)
     case AV_PKT_DATA_IAMF_MIX_GAIN_PARAM:        return "IAMF Mix Gain Parameter Data";
     case AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM:   return "IAMF Demixing Info Parameter Data";
     case AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM: return "IAMF Recon Gain Info Parameter Data";
+    case AV_PKT_DATA_SSRC_AUDIO_LEVEL:           return "RTP SSRC Audio Level";
     }
     return NULL;
 }
diff --git a/libavcodec/defs.h b/libavcodec/defs.h
index 00d840ec19..87e8814760 100644
--- a/libavcodec/defs.h
+++ b/libavcodec/defs.h
@@ -323,6 +323,21 @@ typedef struct AVProducerReferenceTime {
     int flags;
 } AVProducerReferenceTime;
 
+/**
+ * Audio level structure from the ssrc-audio-level RTP header extension.
+ */
+typedef struct AVAudioLevel {
+    /**
+     * Audio level for this packet, measured in dBov: -127 - 0
+     */
+    int8_t level;
+
+    /**
+     * Set to 1 if the encoder believes this packet contains voice.
+     */
+    int voice;
+} AVAudioLevel;
+
 /**
  * Encode extradata length to a buffer. Used by xiph codecs.
  *
diff --git a/libavcodec/packet.h b/libavcodec/packet.h
index 8558ae849e..f7f1deb6e0 100644
--- a/libavcodec/packet.h
+++ b/libavcodec/packet.h
@@ -330,6 +330,11 @@ enum AVPacketSideDataType {
      */
     AV_PKT_DATA_AMBIENT_VIEWING_ENVIRONMENT,
 
+    /**
+     * Audio Level and VAD data from the RTP header extension as defined by RFC 6464.
+     */
+    AV_PKT_DATA_SSRC_AUDIO_LEVEL,
+
     /**
      * The number of side data types.
      * This is not part of the public API/ABI in the sense that it may
diff --git a/libavformat/rtpdec.c b/libavformat/rtpdec.c
index fa7544cc07..479ea2e245 100644
--- a/libavformat/rtpdec.c
+++ b/libavformat/rtpdec.c
@@ -694,6 +694,79 @@ static void finalize_packet(RTPDemuxContext *s, AVPacket *pkt, uint32_t timestam
                    s->base_timestamp;
 }
 
+
+static const uint8_t* find_header_ext_data(int id, const uint8_t *buf, uint8_t *len) {
+    int buflen = (AV_RB16(buf + 2)) * 4;
+
+    const uint8_t *p = buf + 4;
+    int idx = 0;
+    int this_id;
+    int this_len;
+
+    // This is a one-byte extension format, as defined by RFC 5285
+    if (buf[0] == 0xbe && buf[1] == 0xde) {
+        while (idx + 1 < buflen) {
+            if (p[idx] == 0) {
+                idx++; // skip padding
+            } else {
+                this_id = p[idx] >> 4;
+                this_len = (p[idx] & 0xf) + 1;
+
+                // spec says 15 is reserved
+                if (this_id == 15) {
+                    break; // reject
+                }
+
+                if (this_id == id) {
+                    if (this_len > buflen - idx - 1) {
+                        break; // reject
+                    }
+
+                    if (len != NULL)
+                        *len = this_len;
+
+                    return p + idx + 1;
+                }
+
+                idx += 1 + this_len;
+            }
+        }
+    } else if (buf[0] == 0x10 && (buf[1] & 0xff) == 0) {
+        // This is a two-byte extension format
+        while (idx + 1 < buflen) {
+            if (p[idx] == 0) {
+                idx++; // Skip padding
+            } else {
+                this_id = p[idx];
+                this_len = p[idx + 1];
+
+                // spec says 15 is reserved
+                if (this_id == 15) {
+                    break; // reject
+                }
+
+                if (this_id == id) {
+                    if (this_len > buflen - idx - 2) {
+                        break; // reject
+                    }
+
+                    if (len != NULL)
+                        *len = this_len;
+                    return p + idx + 2;
+                }
+
+                idx += 2 + this_len;
+            }
+        }
+    }
+
+    if (len != NULL)
+        *len = 0;
+
+    return NULL;
+}
+
+
 static int rtp_parse_packet_internal(RTPDemuxContext *s, AVPacket *pkt,
                                      const uint8_t *buf, int len)
 {
@@ -703,6 +776,7 @@ static int rtp_parse_packet_internal(RTPDemuxContext *s, AVPacket *pkt,
     AVStream *st;
     uint32_t timestamp;
     int rv = 0;
+    const uint8_t *audio_level_data = NULL;
 
     csrc         = buf[0] & 0x0f;
     ext          = buf[0] & 0x10;
@@ -753,6 +827,11 @@ static int rtp_parse_packet_internal(RTPDemuxContext *s, AVPacket *pkt,
 
         if (len < ext)
             return -1;
+
+        if (s->ssrc_audio_level_ext_id) {
+            audio_level_data = find_header_ext_data(s->ssrc_audio_level_ext_id, buf, NULL);
+        }
+
         // skip past RTP header extension
         len -= ext;
         buf += ext;
@@ -774,6 +853,14 @@ static int rtp_parse_packet_internal(RTPDemuxContext *s, AVPacket *pkt,
     // now perform timestamp things....
     finalize_packet(s, pkt, timestamp);
 
+    if (audio_level_data) {
+        AVAudioLevel *side_data = (struct AVAudioLevel *)av_packet_new_side_data(pkt, AV_PKT_DATA_SSRC_AUDIO_LEVEL, sizeof(AVAudioLevel));
+        if (side_data) {
+            side_data->voice = ((*audio_level_data & 0x80) == 0x80);
+            side_data->level = -(*audio_level_data & 0x7f);
+        }
+    }
+
     return rv;
 }
 
diff --git a/libavformat/rtpdec.h b/libavformat/rtpdec.h
index 5a02e72dc2..91a338200a 100644
--- a/libavformat/rtpdec.h
+++ b/libavformat/rtpdec.h
@@ -188,6 +188,11 @@ struct RTPDemuxContext {
     /* dynamic payload stuff */
     const RTPDynamicProtocolHandler *handler;
     PayloadContext *dynamic_protocol_context;
+
+    /**
+     * RFC 6464 header extension id
+     */
+    int ssrc_audio_level_ext_id;
 };
 
 /**
diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c
index c7d9b48684..63bc67fdf7 100644
--- a/libavformat/rtsp.c
+++ b/libavformat/rtsp.c
@@ -691,6 +691,21 @@ static void sdp_parse_line(AVFormatContext *s, SDPParseState *s1,
                     }
                 }
             }
+        } else if (av_strstart(p, "extmap:", &p)) {
+            char *end;
+            int id;
+            id = strtol(p, &end, 10);
+            if (p == end) {
+                break;
+            }
+            p = end;
+
+            get_word(buf1, sizeof(buf1), &p);
+
+            if (!strcmp(buf1, "urn:ietf:params:rtp-hdrext:ssrc-audio-level")) {
+                rtsp_st = rt->rtsp_streams[rt->nb_rtsp_streams - 1];
+                rtsp_st->ssrc_audio_level_ext_id = id;
+            }
         } else {
             if (rt->server_type == RTSP_SERVER_WMS)
                 ff_wms_parse_sdp_a_line(s, p);
@@ -868,6 +883,7 @@ int ff_rtsp_open_transport_ctx(AVFormatContext *s, RTSPStream *rtsp_st)
                s->iformat) {
         RTPDemuxContext *rtpctx = rtsp_st->transport_priv;
         rtpctx->ssrc = rtsp_st->ssrc;
+        rtpctx->ssrc_audio_level_ext_id = rtsp_st->ssrc_audio_level_ext_id;
         if (rtsp_st->dynamic_handler) {
             ff_rtp_parse_set_dynamic_protocol(rtsp_st->transport_priv,
                                               rtsp_st->dynamic_protocol_context,
diff --git a/libavformat/rtsp.h b/libavformat/rtsp.h
index 83b2e3f4fb..4315bbe2c8 100644
--- a/libavformat/rtsp.h
+++ b/libavformat/rtsp.h
@@ -483,6 +483,8 @@ typedef struct RTSPStream {
 
     char crypto_suite[40];
     char crypto_params[100];
+
+    int ssrc_audio_level_ext_id;
 } RTSPStream;
 
 void ff_rtsp_parse_line(AVFormatContext *s,
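
For anyone who wants to sanity-check the parsing outside of FFmpeg, here is a minimal standalone sketch. It is not part of the patch. It walks a one-byte-header extension block (profile 0xBEDE) and decodes the single audio-level byte the same way the patch does: voice flag in the top bit, level as the negated low seven bits. The names `audio_level`, `find_one_byte_ext`, `decode_audio_level` and `audio_level_self_check`, as well as the sample buffer, are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AVAudioLevel struct added by the patch. */
struct audio_level {
    int level; /* dBov, in the range -127..0 */
    int voice; /* 1 if the sender set the V (voice activity) bit */
};

/*
 * Find element `id` in a one-byte-header extension block (profile 0xBEDE).
 * `ext` points at the 4-byte extension header and `size` is the number of
 * readable bytes. Returns a pointer to the element data, or NULL.
 */
static const uint8_t *find_one_byte_ext(const uint8_t *ext, size_t size,
                                        int id, size_t *out_len)
{
    size_t body, idx = 0;
    const uint8_t *p;

    if (size < 4 || ext[0] != 0xbe || ext[1] != 0xde)
        return NULL;

    /* The length field counts 32-bit words following the 4-byte header. */
    body = (((size_t)ext[2] << 8) | ext[3]) * 4;
    if (body > size - 4)
        return NULL;
    p = ext + 4;

    while (idx < body) {
        int this_id;
        size_t this_len;

        if (p[idx] == 0) {      /* padding byte */
            idx++;
            continue;
        }
        this_id  = p[idx] >> 4;
        this_len = (size_t)(p[idx] & 0x0f) + 1;

        /* ID 15 is reserved; also stop if the element overruns the block. */
        if (this_id == 15 || this_len > body - idx - 1)
            break;

        if (this_id == id) {
            if (out_len)
                *out_len = this_len;
            return p + idx + 1;
        }
        idx += 1 + this_len;
    }
    return NULL;
}

/* Decode the audio-level byte: V flag in bit 7, level in bits 0-6. */
static struct audio_level decode_audio_level(uint8_t b)
{
    struct audio_level al;
    al.voice = (b & 0x80) != 0;
    al.level = -(int)(b & 0x7f);
    return al;
}

/* Usage sketch on a hand-built block: one element (id 1) plus padding. */
static void audio_level_self_check(void)
{
    static const uint8_t ext[] = { 0xbe, 0xde, 0x00, 0x01,
                                   0x10, 0x8a, 0x00, 0x00 };
    size_t n = 0;
    const uint8_t *d = find_one_byte_ext(ext, sizeof(ext), 1, &n);
    struct audio_level al;

    assert(d != NULL && n == 1);
    al = decode_audio_level(*d);
    assert(al.voice == 1 && al.level == -10);
}
```

Unlike find_header_ext_data() in the patch, this sketch takes the readable buffer size, bounds-checks every element rather than only the matching one, and handles the one-byte format only. Those are choices made for the illustration, not a statement about how the patch should behave.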