From patchwork Sat Feb 25 03:39:01 2023
X-Patchwork-Submitter: JonHGee
X-Patchwork-Id: 40520
From: JonHGee
To: ffmpeg-devel@ffmpeg.org
Cc: JonHGee
Date: Sat, 25 Feb 2023 03:39:01 +0000
Message-Id: <20230225033901.3264517-1-JonHGee@gmail.com>
Subject: [FFmpeg-devel] [PATCH] libavcodec/libfdk-aacenc: Enable writing DRC metadata

Added basic DRC options and ref levels to AV options.
If any options are selected, the metadata mode is set accordingly and the
metadata is added to the input buffer, per the FDK encoder API.
---
 libavcodec/libfdk-aacenc.c | 72 ++++++++++++++++++++++++++++++--------
 1 file changed, 58 insertions(+), 14 deletions(-)

diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index 54549de473..4e67b1ece3 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -46,6 +46,12 @@ typedef struct AACContext {
     int latm;
     int header_period;
     int vbr;
+    int drc_profile;
+    int drc_target_ref;
+    int comp_profile;
+    int comp_target_ref;
+    int prog_ref;
+    AACENC_MetaData metaDataSetup;

     AudioFrameQueue afq;
 } AACContext;
@@ -64,6 +70,11 @@ static const AVOption aac_enc_options[] = {
     { "latm", "Output LATM/LOAS encapsulated data", offsetof(AACContext, latm), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
     { "header_period", "StreamMuxConfig and PCE repetition period (in frames)", offsetof(AACContext, header_period), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 0xffff, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
     { "vbr", "VBR mode (1-5)", offsetof(AACContext, vbr), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 5, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
+    { "drc_profile", "The desired compression profile for AAC DRC", offsetof(AACContext, drc_profile), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 256, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
+    { "drc_target_ref", "Expected target reference level at decoder side in dB (for clipping prevention/limiter)", offsetof(AACContext, drc_target_ref), AV_OPT_TYPE_INT, { .i64 = 0.0 }, -31.75, 0, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
+    { "comp_profile", "The desired compression profile for AAC DRC", offsetof(AACContext, comp_profile), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 256, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
+    { "comp_target_ref", "Expected target reference level at decoder side in dB (for clipping prevention/limiter)", offsetof(AACContext, comp_target_ref), AV_OPT_TYPE_INT, { .i64 = 0.0 }, -31.75, 0, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
+    { "prog_ref", "The program reference level or dialog level in dB", offsetof(AACContext, prog_ref), AV_OPT_TYPE_INT, { .i64 = 0.0 }, -31.75, 0, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
     FF_AAC_PROFILE_OPTS
     { NULL }
 };
@@ -127,6 +138,7 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
     AACENC_ERROR err;
     int aot = FF_PROFILE_AAC_LOW + 1;
     int sce = 0, cpe = 0;
+    int metadata_mode = 0;

     if ((err = aacEncOpen(&s->handle, 0, avctx->ch_layout.nb_channels)) != AACENC_OK) {
         av_log(avctx, AV_LOG_ERROR, "Unable to open the encoder: %s\n",
@@ -319,6 +331,29 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
         }
     }

+    if (s->prog_ref) {
+        metadata_mode = 1;
+        s->metaDataSetup.prog_ref_level_present = 1;
+        s->metaDataSetup.prog_ref_level = s->prog_ref << 16;
+    }
+    if (s->drc_profile) {
+        metadata_mode = 1;
+        s->metaDataSetup.drc_profile = s->drc_profile;
+        s->metaDataSetup.drc_TargetRefLevel = s->drc_target_ref << 16;
+        if (s->comp_profile) {
+            // Including the comp_profile means that we need to set the mode to ETSI
+            metadata_mode = 2;
+            s->metaDataSetup.comp_profile = s->comp_profile;
+            s->metaDataSetup.comp_TargetRefLevel = s->comp_target_ref << 16;
+        }
+    }
+
+    if ((err = aacEncoder_SetParam(s->handle, AACENC_METADATA_MODE, metadata_mode)) != AACENC_OK) {
+        av_log(avctx, AV_LOG_ERROR, "Unable to set metadata mode to %d: %s\n",
+               metadata_mode, aac_get_error(err));
+        goto error;
+    }
+
     if ((err = aacEncEncode(s->handle, NULL, NULL, NULL, NULL)) != AACENC_OK) {
         av_log(avctx, AV_LOG_ERROR, "Unable to initialize the encoder: %s\n",
                aac_get_error(err));
@@ -363,11 +398,13 @@ static int aac_encode_frame(AVCodecContext *avctx, AVPacket *avpkt,
     AACENC_BufDesc in_buf = { 0 }, out_buf = { 0 };
     AACENC_InArgs in_args = { 0 };
     AACENC_OutArgs out_args = { 0 };
-    int in_buffer_identifier = IN_AUDIO_DATA;
-    int in_buffer_size, in_buffer_element_size;
+    void* inBuffer[] = { 0, &s->metaDataSetup };
+    int in_buffer_identifiers[] = { IN_AUDIO_DATA, IN_METADATA_SETUP };
+    int in_buffer_element_sizes[] = { 2, sizeof(AACENC_MetaData) };
+    int in_buffer_sizes[] = { 0 , sizeof(s->metaDataSetup) };
+    void *out_ptr;
     int out_buffer_identifier = OUT_BITSTREAM_DATA;
     int out_buffer_size, out_buffer_element_size;
-    void *in_ptr, *out_ptr;
     int ret;
     uint8_t dummy_buf[1];
     AACENC_ERROR err;
@@ -376,27 +413,34 @@ static int aac_encode_frame(AVCodecContext *avctx, AVPacket *avpkt,
     if (!frame) {
         /* Must be a non-null pointer, even if it's a dummy. We could use
          * the address of anything else on the stack as well. */
-        in_ptr = dummy_buf;
-        in_buffer_size = 0;
+        inBuffer[0] = dummy_buf;

         in_args.numInSamples = -1;
     } else {
-        in_ptr = frame->data[0];
-        in_buffer_size = 2 * avctx->ch_layout.nb_channels * frame->nb_samples;
+        inBuffer[0] = frame->data[0];
+        in_buffer_sizes[0] = 2 * avctx->channels * frame->nb_samples;

-        in_args.numInSamples = avctx->ch_layout.nb_channels * frame->nb_samples;
+        in_args.numInSamples = avctx->channels * frame->nb_samples;

         /* add current frame to the queue */
         if ((ret = ff_af_queue_add(&s->afq, frame)) < 0)
             return ret;
     }

-    in_buffer_element_size = 2;
-    in_buf.numBufs = 1;
-    in_buf.bufs = &in_ptr;
-    in_buf.bufferIdentifiers = &in_buffer_identifier;
-    in_buf.bufSizes = &in_buffer_size;
-    in_buf.bufElSizes = &in_buffer_element_size;
+    // Only use audio input data if metadata mode is none.
+    if (aacEncoder_GetParam(s->handle, AACENC_METADATA_MODE) == 0) {
+        in_buf.numBufs = 1;
+        in_buf.bufs = &inBuffer[0];
+        in_buf.bufferIdentifiers = &in_buffer_identifiers[0];
+        in_buf.bufSizes = &in_buffer_sizes[0];
+        in_buf.bufElSizes = &in_buffer_element_sizes[0];
+    } else {
+        in_buf.numBufs = 2;
+        in_buf.bufs = inBuffer;
+        in_buf.bufferIdentifiers = in_buffer_identifiers;
+        in_buf.bufSizes = in_buffer_sizes;
+        in_buf.bufElSizes = in_buffer_element_sizes;
+    }

     /* The maximum packet size is 6144 bits aka 768 bytes per channel. */
     ret = ff_alloc_packet(avctx, avpkt, FFMAX(8192, 768 * avctx->ch_layout.nb_channels));
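
For illustration only, here is a minimal standalone sketch of the two pieces of logic the init hunk adds: scaling a dB level by 2^16 (what `<< 16` does in the patch) and choosing the metadata mode from the selected options. It does not use the FDK headers. `DrcOptions`, `db_to_q16()` and `select_metadata_mode()` are hypothetical names that do not appear in the patch, and the level is held as a double (an assumption) so that 0.25 dB steps such as -31.75 can be represented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the option fields the patch adds to AACContext.
 * The level is a double here (an assumption, the patch uses int). */
typedef struct DrcOptions {
    int    drc_profile;   /* 0 = no DRC profile selected */
    int    comp_profile;  /* 0 = no compression profile selected */
    double prog_ref_db;   /* 0 = not set, otherwise -31.75 .. 0 dB */
} DrcOptions;

/* Scale a level in dB to the x * 2^16 representation.
 * Multiplying rather than left-shifting avoids shifting a negative value,
 * which C leaves undefined; the +/-0.5 term rounds to the nearest integer. */
static int32_t db_to_q16(double db)
{
    double scaled = db * 65536.0;
    return (int32_t)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
}

/* Same decision order as the patch: any selected metadata gives mode 1,
 * and a comp_profile on top of a drc_profile gives mode 2 (ETSI). */
static int select_metadata_mode(const DrcOptions *opts)
{
    int mode = 0;

    assert(opts != NULL);
    if (opts->prog_ref_db != 0.0)
        mode = 1;
    if (opts->drc_profile) {
        mode = 1;
        if (opts->comp_profile)
            mode = 2;
    }
    return mode;
}
```

In the patch's terms, the value returned by the hypothetical `db_to_q16()` corresponds to what is stored in `s->metaDataSetup.prog_ref_level`, and the value returned by `select_metadata_mode()` corresponds to what is passed to `aacEncoder_SetParam()` for `AACENC_METADATA_MODE`. As in the patch, a `comp_profile` without a `drc_profile` leaves the mode unchanged.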