[FFmpeg-devel,v4] decklink: Add support for compressed AC-3 output over SDI

Message ID 20230403212823.890-1-dheitmueller@ltnglobal.com
State New

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Devin Heitmueller April 3, 2023, 9:28 p.m. UTC
Extend the decklink output to include support for compressed AC-3,
encapsulated using the SMPTE ST 337:2015 standard.

This functionality can be exercised by using the "copy" codec when
the input audio stream is AC-3.  For example:

./ffmpeg -i ~/foo.ts -codec:a copy -f decklink 'UltraStudio Mini Monitor'

Note that the default behavior continues to be to do PCM output,
which means without specifying the copy codec a stream containing
AC-3 will be decoded and downmixed to stereo audio before output.

Thanks to Marton Balint for providing feedback.

Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
---
 libavdevice/decklink_enc.cpp | 97 ++++++++++++++++++++++++++++++------
 1 file changed, 82 insertions(+), 15 deletions(-)

Comments

Marton Balint April 5, 2023, 9:52 p.m. UTC | #1
On Mon, 3 Apr 2023, Devin Heitmueller wrote:

> Extend the decklink output to include support for compressed AC-3,
> encapsulated using the SMPTE ST 337:2015 standard.
>
> This functionality can be exercised by using the "copy" codec when
> the input audio stream is AC-3.  For example:
>
> ./ffmpeg -i ~/foo.ts -codec:a copy -f decklink 'UltraStudio Mini Monitor'
>
> Note that the default behavior continues to be to do PCM output,
> which means without specifying the copy codec a stream containing
> AC-3 will be decoded and downmixed to stereo audio before output.
>
> Thanks to Marton Balint for providing feedback.
>
> Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
> ---
> libavdevice/decklink_enc.cpp | 97 ++++++++++++++++++++++++++++++------
> 1 file changed, 82 insertions(+), 15 deletions(-)
>
> diff --git a/libavdevice/decklink_enc.cpp b/libavdevice/decklink_enc.cpp
> index 8d423f6b6e..9ee1925fd0 100644

[...]

> --- a/libavdevice/decklink_enc.cpp
> +++ b/libavdevice/decklink_enc.cpp
> +/* Wrap the AC-3 packet into an S337 payload that is in S16LE format which can be easily
> +   injected into the PCM stream.  Note: despite the function name, only AC-3 is implemented */
> +static int create_s337_payload(AVPacket *pkt, enum AVCodecID codec_id, uint8_t **outbuf, int *outsize)

Actually you can remove the codec_id parameter as well...

> +{
> +    // Note: if the packet is an odd-number of bytes, we need to make
> +    // the actual payload one byte larger to ensure it ends on an S16LE boundary
> +    int payload_size = pkt->size + (pkt->size % 2) + 8;

FFALIGN(pkt->size, 2). But you'd actually want FFALIGN(pkt->size, 4),
because the buffer size has to be divisible by 4: later decklink needs a
sample count...

> +    uint16_t bitcount = pkt->size * 8;

Is this supposed to be aligned too? I see similar code in 
libavformat/spdifenc.c and FFALIGN(pkt->size, 2) << 3 is used there.

> +    uint8_t *s337_payload;
> +    PutByteContext pb;
> +
> +    /* Sanity check:  According to SMPTE ST 340:2015 Sec 4.1, the AC-3 sync frame will
> +       exactly match the 1536 samples of baseband (PCM) audio that it represents.  */
> +    if (pkt->size > 1536)
> +        return AVERROR(EINVAL);
> +
> +    /* Encapsulate AC3 syncframe into SMPTE 337 packet */
> +    s337_payload = (uint8_t *) av_malloc(payload_size);
> +    if (s337_payload == NULL)
> +        return AVERROR(ENOMEM);
> +    bytestream2_init_writer(&pb, s337_payload, payload_size);
> +    bytestream2_put_le16u(&pb, 0xf872); /* Sync word 1 */
> +    bytestream2_put_le16u(&pb, 0x4e1f); /* Sync word 2 */
> +    bytestream2_put_le16u(&pb, 0x0001); /* Burst Info, including data type (1=ac3) */
> +    bytestream2_put_le16u(&pb, bitcount); /* Length code */
> +    for (int i = 0; i < (pkt->size - 1); i += 2)
> +        bytestream2_put_le16u(&pb, (pkt->data[i] << 8) | pkt->data[i+1]);
> +    if (pkt->size % 2)

pkt->size & 1

> +        bytestream2_put_le16u(&pb, pkt->data[pkt->size - 1] << 8);
> +

And you likely want a bytestream2_put_le16(&pb, 0) at the end so that even
the tail of the 4-byte aligned buffer is properly zeroed.

Thanks,
Marton
Devin Heitmueller April 7, 2023, 7:03 p.m. UTC | #2
Hello Marton,

Thanks for the continued feedback.  Comments inline.

On Wed, Apr 5, 2023 at 5:52 PM Marton Balint <cus@passwd.hu> wrote:
> > --- a/libavdevice/decklink_enc.cpp
> > +++ b/libavdevice/decklink_enc.cpp
> > +/* Wrap the AC-3 packet into an S337 payload that is in S16LE format which can be easily
> > +   injected into the PCM stream.  Note: despite the function name, only AC-3 is implemented */
> > +static int create_s337_payload(AVPacket *pkt, enum AVCodecID codec_id, uint8_t **outbuf, int *outsize)
>
> Actually you can remove the codec_id parameter as well...

Ok.

> > +{
> > +    // Note: if the packet is an odd-number of bytes, we need to make
> > +    // the actual payload one byte larger to ensure it ends on an S16LE boundary
> > +    int payload_size = pkt->size + (pkt->size % 2) + 8;
>
> FFALIGN(pkt->size, 2). But you'd actually want FFALIGN(pkt->size, 4),
> because the buffer size has to be divisible by 4: later decklink needs a
> sample count...

Ok.

> > +    uint16_t bitcount = pkt->size * 8;
>
> Is this supposed to be aligned too? I see similar code in
> libavformat/spdifenc.c and FFALIGN(pkt->size, 2) << 3 is used there.

I reviewed SMPTE ST337:2015 as well as ST338:2016, and I think this
might actually be a mistake in spdifenc.c.  There's nothing to suggest
a hard requirement that the payload be aligned on a two byte boundary,
and in fact I suspect it would cause checksum failures in certain
codecs given the checksums are often at the end of the packet payload
(and adding a padding byte would cause the checksum field itself to be
in the wrong position relative to the end of the packet).

Now in practice I suspect you are unlikely to find packets that aren't
aligned on a two-byte boundary, simply because of the nature of the
codecs used (e.g. an AC-3 syncframe is always a whole number of 16-bit
words).  This would certainly explain how the logic in spdifenc.c can be
incorrect and yet never cause any failures in real-world use.

I'm inclined to leave the logic as-is, unless somebody can offer a
good counter argument.
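
To illustrate the difference under discussion: the two length-code
computations only disagree for odd-sized packets.  A minimal sketch in
plain C, where ALIGN_UP is a local stand-in for FFmpeg's FFALIGN macro and
both helper names are hypothetical (neither exists in the patch or in
spdifenc.c):

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for FFALIGN: round x up to a multiple of a (a power of two). */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Length code as computed in this patch: the exact number of payload bits. */
static uint16_t bitcount_exact(int pkt_size)
{
    assert(pkt_size >= 0);
    return (uint16_t)(pkt_size * 8);
}

/* Length code in the style quoted from spdifenc.c: the size is first
   rounded up to a whole number of 16-bit words. */
static uint16_t bitcount_word_aligned(int pkt_size)
{
    assert(pkt_size >= 0);
    return (uint16_t)(ALIGN_UP(pkt_size, 2) << 3);
}
```

For an even-sized packet both helpers return the same value, so the choice
only matters if an odd-sized packet ever shows up.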

> > +    if (pkt->size % 2)
>
> pkt->size & 1

Ok.

> > +        bytestream2_put_le16u(&pb, pkt->data[pkt->size - 1] << 8);
> > +
>
> And you likely want a bytestream2_put_le16(&pb, 0) at the end so that even
> the tail of the 4-byte aligned buffer is properly zeroed.

Ok.

I will submit an updated patch reflecting the changes above.

Devin
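
The changes agreed in this thread (no codec_id parameter, a payload size
aligned to 4 bytes, a zeroed tail, and an unaligned length code) could look
roughly like the sketch below.  It is not the updated patch: it is
standalone C without the FFmpeg bytestream helpers, and wrap_ac3_s337,
put_le16 and ALIGN_UP are hypothetical names introduced for illustration.
It also uses calloc to zero the padding rather than writing an explicit
trailing zero word.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-in for FFALIGN: round x up to a multiple of a (a power of two). */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Store a 16-bit value little-endian and return the advanced write pointer. */
static uint8_t *put_le16(uint8_t *p, uint16_t v)
{
    p[0] = v & 0xff;
    p[1] = v >> 8;
    return p + 2;
}

/* Wrap one AC-3 syncframe into an S337 burst laid out as S16LE samples.
   Returns 0 on success, -1 on error; the caller frees *outbuf. */
static int wrap_ac3_s337(const uint8_t *data, int size,
                         uint8_t **outbuf, int *outsize)
{
    int payload_size;
    uint8_t *buf, *p;

    assert(data && outbuf && outsize);
    /* Upper bound as in the patch; the size <= 0 guard is an addition here. */
    if (size <= 0 || size > 1536)
        return -1;

    /* 8 header bytes plus the data rounded up to 4 bytes, so the total is a
       whole number of 2-channel 16-bit sample frames. */
    payload_size = ALIGN_UP(size, 4) + 8;
    buf = calloc(1, payload_size);          /* padding stays zeroed */
    if (!buf)
        return -1;

    p = put_le16(buf, 0xf872);              /* Sync word 1 */
    p = put_le16(p, 0x4e1f);                /* Sync word 2 */
    p = put_le16(p, 0x0001);                /* Burst info, data type 1 = AC-3 */
    p = put_le16(p, (uint16_t)(size * 8));  /* Length code in bits, unaligned */
    for (int i = 0; i < size - 1; i += 2)
        p = put_le16(p, (uint16_t)((data[i] << 8) | data[i + 1]));
    if (size & 1)
        p = put_le16(p, (uint16_t)(data[size - 1] << 8));

    *outbuf = buf;
    *outsize = payload_size;
    return 0;
}
```

With this layout the caller can derive the sample count as outsize / 4
without a remainder, which is the reason for the 4-byte alignment.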
Patch

diff --git a/libavdevice/decklink_enc.cpp b/libavdevice/decklink_enc.cpp
index 8d423f6b6e..9ee1925fd0 100644
--- a/libavdevice/decklink_enc.cpp
+++ b/libavdevice/decklink_enc.cpp
@@ -32,6 +32,7 @@  extern "C" {
 
 extern "C" {
 #include "libavformat/avformat.h"
+#include "libavcodec/bytestream.h"
 #include "libavutil/internal.h"
 #include "libavutil/imgutils.h"
 #include "avdevice.h"
@@ -243,19 +244,32 @@  static int decklink_setup_audio(AVFormatContext *avctx, AVStream *st)
         av_log(avctx, AV_LOG_ERROR, "Only one audio stream is supported!\n");
         return -1;
     }
-    if (c->sample_rate != 48000) {
-        av_log(avctx, AV_LOG_ERROR, "Unsupported sample rate!"
-               " Only 48kHz is supported.\n");
-        return -1;
-    }
-    if (c->ch_layout.nb_channels != 2 && c->ch_layout.nb_channels != 8 && c->ch_layout.nb_channels != 16) {
-        av_log(avctx, AV_LOG_ERROR, "Unsupported number of channels!"
-               " Only 2, 8 or 16 channels are supported.\n");
+
+    if (c->codec_id == AV_CODEC_ID_AC3) {
+        /* Regardless of the number of channels in the codec, we're only
+           using 2 SDI audio channels at 48000Hz */
+        ctx->channels = 2;
+    } else if (c->codec_id == AV_CODEC_ID_PCM_S16LE) {
+        if (c->sample_rate != 48000) {
+            av_log(avctx, AV_LOG_ERROR, "Unsupported sample rate!"
+                   " Only 48kHz is supported.\n");
+            return -1;
+        }
+        if (c->ch_layout.nb_channels != 2 && c->ch_layout.nb_channels != 8 && c->ch_layout.nb_channels != 16) {
+            av_log(avctx, AV_LOG_ERROR, "Unsupported number of channels!"
+                   " Only 2, 8 or 16 channels are supported.\n");
+            return -1;
+        }
+        ctx->channels = c->ch_layout.nb_channels;
+    } else {
+        av_log(avctx, AV_LOG_ERROR, "Unsupported codec specified!"
+               " Only PCM_S16LE and AC-3 are supported.\n");
         return -1;
     }
+
     if (ctx->dlo->EnableAudioOutput(bmdAudioSampleRate48kHz,
                                     bmdAudioSampleType16bitInteger,
-                                    c->ch_layout.nb_channels,
+                                    ctx->channels,
                                     bmdAudioOutputStreamTimestamped) != S_OK) {
         av_log(avctx, AV_LOG_ERROR, "Could not enable audio output!\n");
         return -1;
@@ -266,14 +280,48 @@  static int decklink_setup_audio(AVFormatContext *avctx, AVStream *st)
     }
 
     /* The device expects the sample rate to be fixed. */
-    avpriv_set_pts_info(st, 64, 1, c->sample_rate);
-    ctx->channels = c->ch_layout.nb_channels;
+    avpriv_set_pts_info(st, 64, 1, 48000);
 
     ctx->audio = 1;
 
     return 0;
 }
 
+/* Wrap the AC-3 packet into an S337 payload that is in S16LE format which can be easily
+   injected into the PCM stream.  Note: despite the function name, only AC-3 is implemented */
+static int create_s337_payload(AVPacket *pkt, enum AVCodecID codec_id, uint8_t **outbuf, int *outsize)
+{
+    // Note: if the packet is an odd-number of bytes, we need to make
+    // the actual payload one byte larger to ensure it ends on an S16LE boundary
+    int payload_size = pkt->size + (pkt->size % 2) + 8;
+    uint16_t bitcount = pkt->size * 8;
+    uint8_t *s337_payload;
+    PutByteContext pb;
+
+    /* Sanity check:  According to SMPTE ST 340:2015 Sec 4.1, the AC-3 sync frame will
+       exactly match the 1536 samples of baseband (PCM) audio that it represents.  */
+    if (pkt->size > 1536)
+        return AVERROR(EINVAL);
+
+    /* Encapsulate AC3 syncframe into SMPTE 337 packet */
+    s337_payload = (uint8_t *) av_malloc(payload_size);
+    if (s337_payload == NULL)
+        return AVERROR(ENOMEM);
+    bytestream2_init_writer(&pb, s337_payload, payload_size);
+    bytestream2_put_le16u(&pb, 0xf872); /* Sync word 1 */
+    bytestream2_put_le16u(&pb, 0x4e1f); /* Sync word 2 */
+    bytestream2_put_le16u(&pb, 0x0001); /* Burst Info, including data type (1=ac3) */
+    bytestream2_put_le16u(&pb, bitcount); /* Length code */
+    for (int i = 0; i < (pkt->size - 1); i += 2)
+        bytestream2_put_le16u(&pb, (pkt->data[i] << 8) | pkt->data[i+1]);
+    if (pkt->size % 2)
+        bytestream2_put_le16u(&pb, pkt->data[pkt->size - 1] << 8);
+
+    *outsize = payload_size;
+    *outbuf = s337_payload;
+    return 0;
+}
+
 av_cold int ff_decklink_write_trailer(AVFormatContext *avctx)
 {
     struct decklink_cctx *cctx = (struct decklink_cctx *)avctx->priv_data;
@@ -531,21 +579,40 @@  static int decklink_write_audio_packet(AVFormatContext *avctx, AVPacket *pkt)
 {
     struct decklink_cctx *cctx = (struct decklink_cctx *)avctx->priv_data;
     struct decklink_ctx *ctx = (struct decklink_ctx *)cctx->ctx;
-    int sample_count = pkt->size / (ctx->channels << 1);
+    AVStream *st = avctx->streams[pkt->stream_index];
+    int sample_count;
     uint32_t buffered;
+    uint8_t *outbuf = NULL;
+    int ret = 0;
 
     ctx->dlo->GetBufferedAudioSampleFrameCount(&buffered);
     if (pkt->pts > 1 && !buffered)
         av_log(avctx, AV_LOG_WARNING, "There's no buffered audio."
                " Audio will misbehave!\n");
 
-    if (ctx->dlo->ScheduleAudioSamples(pkt->data, sample_count, pkt->pts,
+    if (st->codecpar->codec_id == AV_CODEC_ID_AC3) {
+        /* Encapsulate AC3 syncframe into SMPTE 337 packet */
+        int outbuf_size;
+        ret = create_s337_payload(pkt, st->codecpar->codec_id,
+                                  &outbuf, &outbuf_size);
+        if (ret < 0)
+            return ret;
+        sample_count = outbuf_size / 4;
+    } else {
+        sample_count = pkt->size / (ctx->channels << 1);
+        outbuf = pkt->data;
+    }
+
+    if (ctx->dlo->ScheduleAudioSamples(outbuf, sample_count, pkt->pts,
                                        bmdAudioSampleRate48kHz, NULL) != S_OK) {
         av_log(avctx, AV_LOG_ERROR, "Could not schedule audio samples.\n");
-        return AVERROR(EIO);
+        ret = AVERROR(EIO);
     }
 
-    return 0;
+    if (st->codecpar->codec_id == AV_CODEC_ID_AC3)
+        av_freep(&outbuf);
+
+    return ret;
 }
 
 extern "C" {