[FFmpeg-devel] decklink 24/32 bit question

Message ID 1e3594e415405a5899ec7be9730bb469@dx9s.net
State New

Commit Message

ffmpeg@dx9s.net Oct. 4, 2017, 3:39 a.m. UTC
After digging around in places, made the following changes:

dx@x299:~/git/ffmpeg$ git diff
      avpriv_set_pts_info(st, 64, 1, 1000000);  /* 64 bits pts in us */
@@ -1028,7 +1028,7 @@ av_cold int ff_decklink_read_header(AVFormatContext *avctx)
      }

      av_log(avctx, AV_LOG_VERBOSE, "Using %d input audio channels\n", ctx->audio_st->codecpar->channels);
-    result = ctx->dli->EnableAudioInput(bmdAudioSampleRate48kHz, bmdAudioSampleType16bitInteger, ctx->audio_st->codecpar->channels);
+    result = ctx->dli->EnableAudioInput(bmdAudioSampleRate48kHz, bmdAudioSampleType32bitInteger, ctx->audio_st->codecpar->channels);

      if (result != S_OK) {
          av_log(avctx, AV_LOG_ERROR, "Cannot enable audio input\n");


It doesn't work (the audio capture is close but wrong), but I believe it 
is a step in the right direction. Does anybody have a clue? I saw several 
names in the .cpp, .c, and .h files, including: Ramiro Polla, Luca Barbato, 
Deti Fliegl, Rafaël Carré and Akamai Technologies Inc.

Thanks in advance!

--Doug (dx9s)
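
A minimal sketch of one thing worth checking after a change like the one above. It is an assumption, not something confirmed in this message: if any buffer or packet size elsewhere is still computed for 16-bit samples, a 32-bit capture would come out close but wrong. The helper name audio_packet_bytes and its parameters are hypothetical and are not part of FFmpeg or the DeckLink SDK.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes occupied by one packet of interleaved PCM audio.
 * sample_frames: number of sample frames (one sample per channel each)
 * channels:      interleaved channel count
 * bits:          sample depth, 16 or 32 (the two depths discussed here)
 * Hypothetical helper for illustration only. */
static size_t audio_packet_bytes(size_t sample_frames, int channels, int bits)
{
    assert(channels > 0);
    assert(bits == 16 || bits == 32);
    return sample_frames * (size_t)channels * (size_t)(bits / 8);
}
```

For example, 1920 sample frames of stereo audio take 7680 bytes at 16 bits and 15360 bytes at 32 bits, so a size that is still derived from a hard-coded 16-bit depth would cover only half of each 32-bit packet.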

Comments

Moritz Barsnick Oct. 17, 2017, 6:59 p.m. UTC | #1
Hi Doug,

On Tue, Oct 03, 2017 at 20:39:49 -0700, Douglas Marsh wrote:
> After digging around in places, made the following changes:
[...]
> It doesn't work (the audio capture is close but wrong), but believe it 
> is a step in the correct direction. Anybody have a clue? Saw several 
> names in cpp,c,h files including: Ramiro Polla, Luca Barbato, Deti 
> Fliegl, Rafaël Carré and Akamai Technologies Inc.

Did you check out Dave Rice's recent patch (on this list)? It touches
code in a few more places, and adds an option to select 16 vs. 32 bits.
Please test, if you can.

Is your subject indicating that 24 bits depth could also be supported?
If so, Dave perhaps should expand his patch to cover that.

Moritz
Dave Rice Oct. 17, 2017, 7:12 p.m. UTC | #2
> On Oct 17, 2017, at 2:59 PM, Moritz Barsnick <barsnick@gmx.net> wrote:
> 
> Hi Doug,
> 
> On Tue, Oct 03, 2017 at 20:39:49 -0700, Douglas Marsh wrote:
>> After digging around in places, made the following changes:
> [...]
>> It doesn't work (the audio capture is close but wrong), but believe it 
>> is a step in the correct direction. Anybody have a clue? Saw several 
>> names in cpp,c,h files including: Ramiro Polla, Luca Barbato, Deti 
>> Fliegl, Rafaël Carré and Akamai Technologies Inc.
> 
> Did you check out Dave Rice's recent patch (on this list)? It touches
> code in a few more places, and adds an option to select 16 vs. 32 bits.
> Please test, if you can.
> 
> Is your subject indicating that 24 bits depth could also be supported?
> If so, Dave perhaps should expand his patch to cover that.

The decklink sdk only defines two BMDAudioSampleType values: bmdAudioSampleType16bitInteger and bmdAudioSampleType32bitInteger. I don't think there's an easy way to support a 24 bit input here. Generally in this case I've used bmdAudioSampleType32bitInteger and then encode it at pcm_s24le.
Dave Rice
ffmpeg@dx9s.net Oct. 17, 2017, 7:16 p.m. UTC | #3
On 2017-10-17 11:59, Moritz Barsnick wrote:

> Did you check out Dave Rice's recent patch (on this list)? It touches
> code in a few more places, and adds an option to select 16 vs. 32 bits.
> Please test, if you can.
> 
> Is your subject indicating that 24 bits depth could also be supported?
> If so, Dave perhaps should expand his patch to cover that.
> 

I did see it!

Been busy, trying to get a video project done and under the belt before 
I start tinkering with dependent things. I *WILL* report back my 
experiences (positive or negative). I was hacking the source before that 
patch and see the two areas that I missed. But his patch added command 
line parameters for selecting (which I prefer). Was going to try it out 
and change the default to 32-bit as well.

I am guessing that some capture hardware may not support higher than 16 
bit depths? If all support higher depths, then why should the default 
not be higher and then let FFMPEG convert the input stream to the 
correct output stream (it might be 16 bits)? Is truncating CPU 
intensive? Should dithering be applied? What does the DAC do in hardware 
when outputting 16-bits (truncate or dither)?

Also from what I understand word/sample size is 32-bits but the DAC is 
24 (assuming 8-bits are padded) -- hence why I said 24/32. I only see in 
the SDK 16-bits and 32-bits, but know the documentation for the card I 
have (Studio 4K) states 24-bits. Do others have cards that state 
anything additional over 16 and 32?

--Doug (dx9s)
Devin Heitmueller Oct. 17, 2017, 7:44 p.m. UTC | #4
> 
> The decklink sdk only defines two BMDAudioSampleType values: bmdAudioSampleType16bitInteger and bmdAudioSampleType32bitInteger. I don't think there's an easy way to support a 24 bit input here. Generally in this case I've used bmdAudioSampleType32bitInteger and then encode it at pcm_s24le.
> Dave Rice

For what it’s worth, I’ve got deinterleaving code in the works to handle capture of multiple pairs of audio (i.e. break 16 channels into 8 pairs and announce them as separate S16LE streams).  If we really thought 24-bit was desirable, that code could be adjusted accordingly (the hardware would still capture 32-bit, but the deinterleaver would put out S24LE).

Devin
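
A rough sketch of what deinterleaving one stereo pair out of a multichannel interleaved S16 buffer could look like. This is not Devin's code; the function name extract_stereo_pair and its signature are hypothetical, and it assumes the channels of each sample frame are stored back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy one stereo pair out of an interleaved multichannel S16 buffer.
 * in:       interleaved input, frames * channels samples
 * out:      interleaved stereo output, frames * 2 samples
 * frames:   number of sample frames in the input
 * channels: input channel count (must be even)
 * pair:     pair to extract (0 = channels 0/1, 1 = channels 2/3, ...)
 * Hypothetical helper for illustration only. */
static void extract_stereo_pair(const int16_t *in, int16_t *out,
                                size_t frames, int channels, int pair)
{
    assert(channels >= 2 && channels % 2 == 0);
    assert(pair >= 0 && pair < channels / 2);
    for (size_t f = 0; f < frames; f++) {
        const int16_t *frame = in + f * (size_t)channels;
        out[2 * f]     = frame[2 * pair];
        out[2 * f + 1] = frame[2 * pair + 1];
    }
}
```

Calling it once per pair (8 times for a 16-channel capture) would yield the 8 separate stereo buffers described above.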
Marton Balint Oct. 18, 2017, 7:02 p.m. UTC | #5
On Tue, 17 Oct 2017, Devin Heitmueller wrote:

>> > The decklink sdk only defines two BMDAudioSampleType values: bmdAudioSampleType16bitInteger and bmdAudioSampleType32bitInteger. I don't think there's an easy way to support a 24 bit input here. Generally in this case I've used bmdAudioSampleType32bitInteger and then encode it at pcm_s24le.
>> Dave Rice
>
> For what it’s worth, I’ve got deinterleaving code in the works to handle 
> capture of multiple pairs of audio (i.e. break 16 channels into 8 pairs 
> and announce them as separate S16LE streams).  If we really thought 
> 24-bit was desirable, that code could be adjusted accordingly (the 
> hardware would still capture 32-bit, but the deinterleaver would put out 
> S24LE).

Breaking 8/16 channels to stereo streams can be done by an audio filter 
(by using "asplit" to multiply the source to 4 outputs and then "pan" or 
"channelmap" on each output to select the proper source channels), so I 
don't think direct support of splitting channels in the decklink device is 
acceptable.

If you have performance or convenience problems with using the 
filters above then propose a filter which does this in a single step or 
extend one of the existing ones to be able to do this.

Regards,
Marton
Devin Heitmueller Oct. 18, 2017, 7:27 p.m. UTC | #6
Hello Marton,

> On Oct 18, 2017, at 3:02 PM, Marton Balint <cus@passwd.hu> wrote:
> 
> 
> 
> On Tue, 17 Oct 2017, Devin Heitmueller wrote:
> 
>>> > The decklink sdk only defines two BMDAudioSampleType values: bmdAudioSampleType16bitInteger and bmdAudioSampleType32bitInteger. I don't think there's an easy way to support a 24 bit input here. Generally in this case I've used bmdAudioSampleType32bitInteger and then encode it at pcm_s24le.
>>> Dave Rice
>> 
>> For what it’s worth, I’ve got deinterleaving code in the works to handle capture of multiple pairs of audio (i.e. break 16 channels into 8 pairs and announce them as separate S16LE streams).  If we really thought 24-bit was desirable, that code could be adjusted accordingly (the hardware would still capture 32-bit, but the deinterleaver would put out S24LE).
> 
> Breaking 8/16 channels to stereo streams can be done by an audio filter (by using "asplit" to multiply the source to 4 outputs and then "pan" or "channelmap" on each output to select the proper source channels), so I don't think direct support of splitting channels in the decklink device is acceptable.

So using an audio filter sounds like a great idea and was my initial instinct.  However then I dug into the ffmpeg API interfaces and discovered that audio filters cannot output anything but audio samples.  This prevents me from doing detection of compressed audio over SDI (i.e. S302M) during the probing phase (i.e. for Dolby-E or AC-3), since the output would be something other than uncompressed audio samples.

It also means you cannot do a simple use case for having the decklink demux announce 8 streams which can be easily fed to eight different encoders through the standard map facility.  You would have to probe the input, and use filter_complex to insert an audio filter which deinterleaves the audio, and then manually instantiate a series of audio encoders mapped to each output.

The approach I’ve proposed will “just work” in the ffmpeg command line use case.  Run the command, say “-map 0”, and eight streams will be fed to eight encoders (or pass through with the copy codec for compressed audio) and inserted into the transport stream.

I’m not against refactoring the existing S302 codec to create a demux to detect audio and create the streams with either raw or compressed codec type based on detection.  In that case the decklink demux would spawn an S302 sub-demux for each of the streams.  However in any case it would still require the audio to be de-interleaved in order to detect the S302 packets.

Now I welcome someone to better design the filter interface to allow filters which can take in raw audio and output compressed data.  Likewise I welcome someone to introduce improvements which allow filters to create new streams.  However in the absence of either of these rather large redesigns of ffmpeg’s internals, the approach I’m proposing is the only thing I could come up with which allows for these decisions to be made at the probing phase.

I’m also not against the notion of invoking a filter from inside the demux  if such a deinterleaving filter exists (and then have the decklink demux create the streams based on passing the data through the filter).  That would allow for some code reuse but is still a hack to overcome the limitations of ffmpeg’s probing framework.

I’m also not against the notion of creating a demux which is fed all eight audio streams as one blob and having that demux create the streams (and then invoking that demux from inside the decklink demux).  While that might lend itself to a bit of code sharing if we had other SDI input cards we wanted to support (assuming they provide audio in the same basic format), it still wouldn’t be an audio filter and I’m not confident the extra abstraction is worth the effort.

Devin
ffmpeg@dx9s.net Oct. 18, 2017, 8:15 p.m. UTC | #7
On 2017-10-17 12:44, Devin Heitmueller wrote:
>> 
>> The decklink sdk only defines two BMDAudioSampleType values: 
>> bmdAudioSampleType16bitInteger and bmdAudioSampleType32bitInteger. I 
>> don't think there's an easy way to support a 24 bit input here. 
>> Generally in this case I've used bmdAudioSampleType32bitInteger and 
>> then encode it at pcm_s24le.
>> Dave Rice
> 
> For what it’s worth, I’ve got deinterleaving code in the works to
> handle capture of multiple pairs of audio (i.e. break 16 channels into
> 8 pairs and announce them as separate S16LE streams).  If we really
> thought 24-bit was desirable, that code could be adjusted accordingly
> (the hardware would still capture 32-bit, but the deinterleaver would
> put out S24LE).
> 

I am not really sure I follow. I am not sure supporting 24-bit is a big 
issue. A sample size of 32-bit should work fine for most folks. I can 
only think of people (in the output stream) converting to 24-bits (via 
truncate or dither*) to save disk space or pre-processing for some other 
step [compression] (but video is really the bit-hog). I only mentioned 
24-bits because the ADC/DACs are described as supporting PCM 24-bits 
natively, meaning the PCI card is (I assume) padding the LSB (hence 
truncation is the more logical choice for any 32->24 conversion). As for 
what comes in digitally over SDI or HDMI, that too is assumed to support 
only PCM 24-bits (but it is subject to standards that can be updated).

Capturing the stream at 32 bits makes the workflow simpler and moves 
any bit-depth conversion to the output stream side. The only concern 
would be CPU processing (though truncating bits is very fast, and 
logical given the assumed padding).

--Doug (dx9s)

*dither= I am not aware of any audio bit-depth dithering algorithms in 
FFMPEG, however it would make sense they do exist as this software is 
quite simply an amazing 'swiss-army knife'
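
A minimal sketch of the 32->24 truncation described above, assuming the 24 significant bits sit in the top of each 32-bit sample and the lowest 8 bits are padding (this is the assumption stated in the message, not a verified property of the hardware). The function name s32_to_s24le is hypothetical; the output is packed 3-byte little-endian samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Truncate signed 32-bit samples to packed 24-bit little-endian samples
 * by dropping the least significant byte of each sample.
 * in:  n samples, 32 bits each
 * out: 3 * n bytes
 * Hypothetical helper for illustration only. */
static void s32_to_s24le(const int32_t *in, uint8_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t u = (uint32_t)in[i];          /* avoid shifting a signed value */
        out[3 * i]     = (uint8_t)(u >> 8);    /* lowest kept byte  */
        out[3 * i + 1] = (uint8_t)(u >> 16);   /* middle byte       */
        out[3 * i + 2] = (uint8_t)(u >> 24);   /* most significant  */
    }
}
```

The loop is a few shifts and stores per sample, which supports the point that truncation is cheap. If the low byte really is padding, nothing audible is discarded and dithering would not be needed for this particular step.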
Devin Heitmueller Oct. 18, 2017, 8:27 p.m. UTC | #8
Hi Doug,

> On Oct 18, 2017, at 4:15 PM, Douglas Marsh <ffmpeg@dx9s.net> wrote:
> 
> I am not really sure I follow. I am not sure supporting 24-bit is a big issue. A sample size of 32-bit should work fine for most folks. I can only think of people (in the output stream) converting to 24-bits (via truncate or dither*) to save disk space or pre-processing for some other step [compression] (but video is really the bit-hog). I only mentioned 24-bits because the ADC/DACs are mentioned at supporting PCM 24-bits natively meaning the PCI card is (assuming) padding the LSB (hence truncate is more logical for any conversion 32->24). AS for what comes in digitally over SDI or HDMI is too assumed to only support PCM 24-bits (but it is subject to standards that can update).
> 
> Making the workflow (stream capturing) of 32-bits is simpler, and moving any bit-depth conversion to the output stream side. Only concerns would be CPU processing (of which truncating bits is very fast and logical due to the assumed padding).

I think you and I are on the same page.  It wasn't clear to me what would prompt someone to say they want 24-bit audio as opposed to 32 (which is way easier to work with because of alignment).  That said, if you had such a use case I think this could be done.

That said, I'm all about not adding functionality nobody cares about.

Thanks for adding this functionality, as I need it to reliably do compressed audio detection (which is next on my list after support for multi-channel audio is working).

Devin
Michael Niedermayer Oct. 19, 2017, 1:41 a.m. UTC | #9
On Wed, Oct 18, 2017 at 01:15:21PM -0700, Douglas Marsh wrote:
[...]
> *dither= I am not aware of any audio bit-depth dithering algorithms
> in FFMPEG, however it would make sense they do exist as this
> software is quite simply an amazing 'swiss-army knife'

I didn't follow the thread, but there's dithering code for audio in
libswresample

[...]

Patch

diff --git a/libavdevice/decklink_dec.cpp b/libavdevice/decklink_dec.cpp
index 3ce2cab..afd255f 100644
--- a/libavdevice/decklink_dec.cpp
+++ b/libavdevice/decklink_dec.cpp
@@ -937,7 +937,7 @@  av_cold int ff_decklink_read_header(AVFormatContext *avctx)
          goto error;
      }
      st->codecpar->codec_type  = AVMEDIA_TYPE_AUDIO;
-    st->codecpar->codec_id    = AV_CODEC_ID_PCM_S16LE;
+    st->codecpar->codec_id    = AV_CODEC_ID_PCM_S32LE;
      st->codecpar->sample_rate = bmdAudioSampleRate48kHz;
      st->codecpar->channels    = cctx->audio_channels;