
[FFmpeg-devel] avcodec: add a WavPack DSD decoder

Message ID a3cc0d1f-c544-3815-6cb3-cd0b5fa25dcc@wavpack.com
State Superseded

Commit Message

David Bryant July 25, 2019, 5:12 a.m. UTC
On 7/21/19 5:57 PM, Lynne wrote:
> Jul 22, 2019, 12:03 AM by david@wavpack.com:
>
>> Hi,
>>
>> As I promised late last year, here is a patch to add a WavPack DSD decoder.
>>
>> Thanks!
>>
>> -David Bryant
>>
>> +    unsigned char probabilities [MAX_HISTORY_BINS] [256];
>> +    unsigned char *value_lookup [MAX_HISTORY_BINS];
> Use uint8_t throughout the patch.
> Also don't add spaces between array declarations or lookups.
Done. New patch attached for all changes (hope that's the right way to do it).
>
>
>> +static void init_ptable (int *table, int rate_i, int rate_s)
>> +{
>> +    int value = 0x808000, rate = rate_i << 8, c, i;
>> +
>> +    for (c = (rate + 128) >> 8; c--;)
>> +        value += (DOWN - value) >> DECAY;
>> +
>> +    for (i = 0; i < PTABLE_BINS/2; ++i) {
> What's up with the random increment position in loops? It changes to before and after the variable throughout. Make it consistent and after the variable.
> Also, we support declaring the loop variable inside the for statement (for (int i = 0; ...)). That can save lines.

Fixed random increment position and put in declarative int loops (even when they didn't save a line).


>
>
>> +    DSDfilters filters [2], *sp = filters;
> Same, spaces after variables for arrays, all throughout the file.

Done. Also fixed spaces after function names.


>
>
>> +            if (code > max_probability) {
>> +                int zcount = code - max_probability;
>> +
>> +                while (outptr < outend && zcount--)
>> +                    *outptr++ = 0;
>> +            }
>> +            else if (code)
>> +                *outptr++ = code;
>> +            else
>> +                break;
> We don't put else on a new line, and we prefer to have each branch wrapped in braces unless all branches are single lines.

Fixed.


>
>
>> +    for (p0 = 0; p0 < history_bins; ++p0) {
>> +        int32_t sum_values;
>> +        unsigned char *vp;
>> +
>> +        for (sum_values = i = 0; i < 256; ++i)
>> +            s->summed_probabilities [p0] [i] = sum_values += s->probabilities [p0] [i];
> sum_values is uninitialized. Does your compiler not warn about this?

As was already pointed out, sum_values is in fact initialized (in the for statement), but since that obviously wasn't clear I made it explicit.


>
>
>> +        if (sum_values) {
>> +            total_summed_probabilities += sum_values;
>> +            vp = s->value_lookup [p0] = av_malloc (sum_values);
> I don't like the per-frame alloc. The maximum sum_values can be is 255*255 = UINT16_MAX.
>  60k of memory isn't much at all, just define value_lookup[255*255] in the context and you'll probably plug a few out of bounds accesses too.

It's actually up to 32 allocs per frame because there's one for each history bin (value_lookup is an array of pointers, not
uint8s), and I didn't like it either because I had to worry about de-allocing on error. Refactored to use a single array in the
context as a pool. Thanks for the suggestion!
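
To illustrate the pool idea, here is a minimal standalone sketch, not the code from the patch. LookupPool, build_lookup() and BINS are hypothetical names chosen for the example; only the per-bin budget of 1280 bytes and the 32-bin limit are taken from the patch. Each bin gets a slice of one fixed buffer, so nothing has to be freed on an error path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BINS          32     /* mirrors MAX_HISTORY_BINS in the patch */
#define MAX_BIN_BYTES 1280   /* per-bin budget, as in the patch */

/* Hypothetical container: one shared pool instead of one malloc per bin. */
typedef struct LookupPool {
    uint8_t  buffer[BINS * MAX_BIN_BYTES]; /* backing storage for all bins */
    uint8_t *lookup[BINS];                 /* per-bin slice, NULL if unused */
    uint16_t total[BINS];                  /* sum of probabilities per bin */
} LookupPool;

/* Returns 0 on success, -1 if the tables would not fit in the pool. */
static int build_lookup(LookupPool *p, const uint8_t prob[][256], int bins)
{
    uint8_t *next = p->buffer;
    size_t used = 0;

    assert(bins > 0 && bins <= BINS);

    for (int b = 0; b < bins; b++) {
        unsigned sum = 0;

        for (int i = 0; i < 256; i++)
            sum += prob[b][i];

        p->lookup[b] = NULL;
        p->total[b]  = 0;
        if (!sum)
            continue;

        /* Reject the input before writing anything past the budget. */
        used += sum;
        if (used > (size_t)bins * MAX_BIN_BYTES)
            return -1;

        p->total[b]  = sum;
        p->lookup[b] = next;

        /* Emit each symbol i as many times as its probability count. */
        for (int i = 0; i < 256; i++)
            for (int c = prob[b][i]; c--;)
                *next++ = i;
    }

    return 0;
}
```

The bound is checked before the slice is filled, so a corrupt probability table is rejected without writing outside the buffer.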


>
>
>> +            mult = high / s->summed_probabilities [p0] [255];
> s->summed_probabilities [p0] [255]; can be zero, you already check if it's zero when allocating currently. You should probably check for divide by zero unless you're very sure it can't happen.

I'm very sure. The checks are within a few lines above each of the three divides.


>
>
>> +        crc += (crc << 1) + code;
> Don't NIH CRCs, we have av_crc in lavu. See below how to use it.

It's not a standard crc, but more of a recirculating checksum, so the NIH code is required.
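
For reference, the update step quoted above can be written as a standalone function. This is only a sketch: wv_dsd_checksum() is a name made up for the example, and the 0xFFFFFFFF start value is the one the decoder functions in the patch use. Each step computes crc = 3 * crc + byte modulo 2^32, which is why av_crc() cannot reproduce it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Recirculating checksum over a byte buffer (hypothetical helper). */
static uint32_t wv_dsd_checksum(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFF;

    assert(buf || !len);

    for (size_t i = 0; i < len; i++)
        crc += (crc << 1) + buf[i];   /* same as crc = 3 * crc + buf[i] */

    return crc;
}
```

In the decoder the bytes are fed in decode order, so for stereo the left and right bytes of each sample alternate rather than being summed per channel.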


>
>
>> +static int wv_unpack_dsd_copy(WavpackFrameContext *s, void *dst_l, void *dst_r)
>> +{
>> +    uint8_t *dsd_l              = dst_l;
>> +    uint8_t *dsd_r              = dst_r;
> You're shadowing arguments. Your compiler doesn't warn on this either?
> You're calling the function with uint8_ts anyway, just change the type.

They're not shadowed (dsd vs. dst) which is why my compiler didn't complain, but I took your suggestion of just changing the types.


>
>
>> +    while (total_samples--) {
>> +        crc += (crc << 1) + (*dsd_l = bytestream2_get_byte(&s->gb));
>> +        dsd_l += 4;
>> +
>> +        if (dst_r) {
>> +            crc += (crc << 1) + (*dsd_r = bytestream2_get_byte(&s->gb));
>> +            dsd_r += 4;
>> +        }
>> +    }
> av_crc(av_crc_get_table(AV_CRC_32_IEEE/LE), UINT32_MAX, dsd_start_r/l, dsd_r/l - dsd_start_r/l) should work and be faster.

See above: this is not a standard CRC, so av_crc() can't be used here.


>
>
>> +    s->fdec_num = 0;
> Private codec context is always zeroed already.

removed


>
>
>> +    int chan = 0, chmask = 0, sample_rate = 0, rate_x = 1, dsd_mode = 0;
>> +                chmask = avctx->channel_layout;
>>       uint32_t chmask, flags;
> frame->channel_layout is uint64_t.

good to know...fixed


>
>
>> +    samples_l = frame->extended_data[wc->ch_offset];
>> +    if (s->stereo)
>> +        samples_r = frame->extended_data[wc->ch_offset + 1];
>> +
>> +    wc->ch_offset += 1 + s->stereo;
> Have you checked that non-stereo decodes fine and that the channels are correctly ordered?

Yes.


>
>
>> +        if (id & WP_IDF_LONG) {
>> +            size |= (bytestream2_get_byte(&gb)) << 8;
>> +            size |= (bytestream2_get_byte(&gb)) << 16;
>> +        }
> Could use bytestream2_get_le16u/be16u to save 2 lines.

Thanks. Also found a few places where I could use bytestream2_get_be32()!
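
As an illustration of the size field being discussed, here is a self-contained sketch of the sub-block header parse without the bytestream2 helpers. SubBlock and parse_subblock_header() are hypothetical names, and the flag values 0x40 and 0x80 are what I believe wavpack.h defines for WP_IDF_ODD and WP_IDF_LONG, so treat them as assumptions. The two extra size bytes are read as one little-endian 16-bit value and shifted up by 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDF_ODD  0x40   /* assumed value of WP_IDF_ODD  */
#define IDF_LONG 0x80   /* assumed value of WP_IDF_LONG */

typedef struct SubBlock {
    int id;        /* raw id byte, flags included */
    int header;    /* header length in bytes: 2 or 4 */
    int payload;   /* payload size in bytes */
    int padded;    /* bytes occupied in the stream after the header */
} SubBlock;

/* Returns 0 on success, -1 on a truncated or inconsistent header. */
static int parse_subblock_header(const uint8_t *buf, size_t len, SubBlock *sb)
{
    int size;

    assert(buf && sb);

    if (len < 2)
        return -1;

    sb->id     = buf[0];
    size       = buf[1];
    sb->header = 2;

    if (sb->id & IDF_LONG) {
        if (len < 4)
            return -1;
        /* 16-bit little-endian value supplies bits 8..23 of the size */
        size |= (buf[2] | (buf[3] << 8)) << 8;
        sb->header = 4;
    }

    size <<= 1;                 /* size is stored in 16-bit words */
    sb->padded  = size;
    sb->payload = (sb->id & IDF_ODD) ? size - 1 : size;

    if (sb->payload < 0)        /* odd flag with a zero word count */
        return -1;

    return 0;
}
```

Keeping the padded and payload sizes separate mirrors the size/ssize pair in the patch: the bounds check uses the padded size, while the data handlers see the payload size.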


>
>
>> +    if (!got_dsd) {
>> +        av_log(avctx, AV_LOG_ERROR, "Packed samples not found\n");
>> +        return AVERROR_INVALIDDATA;
>> +    }
> I think you should check avctx is completely configured before this, after parsing all WP_IDs, in case something is corrupt.

I don't think anything is required here because all the metadata blocks that we recognize are checked when parsed, and
WP_ID_DSD_DATA is the only one required for DSD audio; all the others are silently ignored.


>
>
>> +        frame->nb_samples = s->samples + 1;
>> +        if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
>> +            return ret;
>> +        frame->nb_samples = s->samples;
> ?. Is the extra sample used as temporary buffer or something?

Your guess is as good as mine. This was part of the code "borrowed" from the PCM version (with the threading removed) so maybe
there is (or was) a situation that was writing one extra sample off the end. The code here certainly doesn't, but it seemed
pretty innocuous and I don't like just ripping out things I don't understand.


>
>> +AVCodec ff_wavpack_dsd_decoder = {
>> +    .name           = "wavpack_dsd",
>> +    .long_name      = NULL_IF_CONFIG_SMALL("WavPack DSD"),
>> +    .type           = AVMEDIA_TYPE_AUDIO,
>> +    .id             = AV_CODEC_ID_WAVPACK_DSD,
>> +    .priv_data_size = sizeof(WavpackContext),
>> +    .init           = wavpack_decode_init,
>> +    .close          = wavpack_decode_end,
>> +    .decode         = wavpack_decode_frame,
>> +    .capabilities   = AV_CODEC_CAP_DR1,
>> +};
> Seeking is probably broken. You should add a flush function to reset the decoder entirely.

Seeking seems to work fine, again because it was working with PCM and I didn't mess with any of that. I added a flush to clear
the dsd2pcm contexts because that might have left an audible tick (that's the only context that's held over between frames that
isn't identical for the whole file).
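
Roughly, the flush amounts to the following. This is a sketch with mock types, not the function from the patch: MockDSD, MockFrameCtx, MockCtx, mock_flush() and FIFO_SIZE are all hypothetical stand-ins for the real contexts, and only the 0x69 fill byte and the limit of 14 frame decoders come from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DSD_SILENCE  0x69   /* fill byte the patch uses for DSD buffers */
#define FIFO_SIZE    16     /* hypothetical history length */
#define MAX_DECODERS 14     /* WV_MAX_FRAME_DECODERS in the patch */

/* Mock stand-ins for DSDContext, WavpackFrameContext and WavpackContext. */
typedef struct MockDSD      { uint8_t buf[FIFO_SIZE]; unsigned pos; } MockDSD;
typedef struct MockFrameCtx { MockDSD dsdctx[2]; } MockFrameCtx;
typedef struct MockCtx {
    MockFrameCtx *fdec[MAX_DECODERS];
    int fdec_num;
} MockCtx;

/* Reset the DSD-to-PCM history of every allocated frame decoder. */
static void mock_flush(MockCtx *wc)
{
    assert(wc && wc->fdec_num >= 0 && wc->fdec_num <= MAX_DECODERS);

    for (int i = 0; i < wc->fdec_num; i++)
        for (int ch = 0; ch < 2; ch++)
            memset(wc->fdec[i]->dsdctx[ch].buf, DSD_SILENCE,
                   sizeof(wc->fdec[i]->dsdctx[ch].buf));
}
```

Only the converter history is touched; everything else in the frame contexts is rewritten from the block header on the next packet.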

Kind regards,

David

From 8c13bdb9e4a7748d3f9d937db573fc65ef645d65 Mon Sep 17 00:00:00 2001
From: David Bryant <david@wavpack.com>
Date: Wed, 24 Jul 2019 21:25:26 -0700
Subject: [PATCH] avcodec: add a WavPack DSD decoder

Signed-off-by: David Bryant <david@wavpack.com>
---
 libavcodec/Makefile      |   1 +
 libavcodec/allcodecs.c   |   1 +
 libavcodec/avcodec.h     |   1 +
 libavcodec/codec_desc.c  |   7 +
 libavcodec/wavpack.h     |   2 +
 libavcodec/wavpack_dsd.c | 775 +++++++++++++++++++++++++++++++++++++++++++++++
 libavformat/wvdec.c      |  32 +-
 7 files changed, 807 insertions(+), 12 deletions(-)
 create mode 100644 libavcodec/wavpack_dsd.c

Comments

Paul B Mahol July 25, 2019, 6:55 a.m. UTC | #1
Hi.

Until my comments are resolved, this should not get applied.
Lynne July 25, 2019, 3:36 p.m. UTC | #2
Jul 25, 2019, 6:12 AM by david@wavpack.com:

>>> +        crc += (crc << 1) + code;
>>>
>> Don't NIH CRCs, we have av_crc in lavu. See below how to use it.
>>
>
> It's not a standard crc, but more of a recirculating checksum, so the NIH code is required.
>

Could you not call it a CRC then? "checksum" is more appropriate.
I wish a CRC had been used; it's so much better than a checksum and only slightly slower.



>>> +        frame->nb_samples = s->samples + 1;
>>> +        if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
>>> +            return ret;
>>> +        frame->nb_samples = s->samples;
>>>
>> ?. Is the extra sample used as temporary buffer or something?
>>
>
> Your guess is as good as mine. This was part of the code "borrowed" from the PCM version (with the threading removed) so maybe
> there is (or was) a situation that was writing one extra sample off the end. The code here certainly doesn't, but it seemed
> pretty innocuous and I don't like just ripping out things I don't understand.
>

Just change it and run it through valgrind, I can't see the code using it.


Rest looks fine to me.

Patch

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 3cd73fb..b94327e 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -686,6 +686,7 @@  OBJS-$(CONFIG_VPLAYER_DECODER)         += textdec.o ass.o
 OBJS-$(CONFIG_VP9_V4L2M2M_DECODER)     += v4l2_m2m_dec.o
 OBJS-$(CONFIG_VQA_DECODER)             += vqavideo.o
 OBJS-$(CONFIG_WAVPACK_DECODER)         += wavpack.o
+OBJS-$(CONFIG_WAVPACK_DSD_DECODER)     += wavpack_dsd.o dsd.o
 OBJS-$(CONFIG_WAVPACK_ENCODER)         += wavpackenc.o
 OBJS-$(CONFIG_WCMV_DECODER)            += wcmv.o
 OBJS-$(CONFIG_WEBP_DECODER)            += webp.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index d2f9a39..a2f414b 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -485,6 +485,7 @@  extern AVCodec ff_vorbis_encoder;
 extern AVCodec ff_vorbis_decoder;
 extern AVCodec ff_wavpack_encoder;
 extern AVCodec ff_wavpack_decoder;
+extern AVCodec ff_wavpack_dsd_decoder;
 extern AVCodec ff_wmalossless_decoder;
 extern AVCodec ff_wmapro_decoder;
 extern AVCodec ff_wmav1_encoder;
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index d234271..8d3a551 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -652,6 +652,7 @@  enum AVCodecID {
     AV_CODEC_ID_SBC,
     AV_CODEC_ID_ATRAC9,
     AV_CODEC_ID_HCOM,
+    AV_CODEC_ID_WAVPACK_DSD,
 
     /* subtitle codecs */
     AV_CODEC_ID_FIRST_SUBTITLE = 0x17000,          ///< A dummy ID pointing at the start of subtitle codecs.
diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index 4d033c2..bee88b8 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -2985,6 +2985,13 @@  static const AVCodecDescriptor codec_descriptors[] = {
         .long_name = NULL_IF_CONFIG_SMALL("HCOM Audio"),
         .props     = AV_CODEC_PROP_LOSSY,
     },
+    {
+        .id        = AV_CODEC_ID_WAVPACK_DSD,
+        .type      = AVMEDIA_TYPE_AUDIO,
+        .name      = "wavpack_dsd",
+        .long_name = NULL_IF_CONFIG_SMALL("WavPack DSD"),
+        .props     = AV_CODEC_PROP_LOSSLESS,
+    },
 
     /* subtitle codecs */
     {
diff --git a/libavcodec/wavpack.h b/libavcodec/wavpack.h
index 6caad03..43aaac8 100644
--- a/libavcodec/wavpack.h
+++ b/libavcodec/wavpack.h
@@ -35,6 +35,7 @@ 
 #define WV_FLOAT_DATA     0x00000080
 #define WV_INT32_DATA     0x00000100
 #define WV_FALSE_STEREO   0x40000000
+#define WV_DSD_DATA       0x80000000
 
 #define WV_HYBRID_MODE    0x00000008
 #define WV_HYBRID_SHAPE   0x00000008
@@ -77,6 +78,7 @@  enum WP_ID {
     WP_ID_CORR,
     WP_ID_EXTRABITS,
     WP_ID_CHANINFO,
+    WP_ID_DSD_DATA,
     WP_ID_SAMPLE_RATE = 0x27,
 };
 
diff --git a/libavcodec/wavpack_dsd.c b/libavcodec/wavpack_dsd.c
new file mode 100644
index 0000000..4a89817
--- /dev/null
+++ b/libavcodec/wavpack_dsd.c
@@ -0,0 +1,775 @@ 
+/*
+ * WavPack lossless DSD audio decoder
+ * Copyright (c) 2006,2011 Konstantin Shishkov
+ * Copyright (c) 2019 David Bryant
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/channel_layout.h"
+
+#include "avcodec.h"
+#include "bytestream.h"
+#include "internal.h"
+#include "wavpack.h"
+#include "dsd.h"
+
+/**
+ * @file
+ * WavPack lossless DSD audio decoder
+ */
+
+#define DSD_BYTE_READY(low,high) (!(((low) ^ (high)) & 0xff000000))
+
+#define PTABLE_BITS 8
+#define PTABLE_BINS (1<<PTABLE_BITS)
+#define PTABLE_MASK (PTABLE_BINS-1)
+
+#define UP   0x010000fe
+#define DOWN 0x00010000
+#define DECAY 8
+
+#define PRECISION 20
+#define VALUE_ONE (1 << PRECISION)
+#define PRECISION_USE 12
+
+#define RATE_S 20
+
+#define MAX_HISTORY_BITS    5
+#define MAX_HISTORY_BINS    (1 << MAX_HISTORY_BITS)
+#define MAX_BIN_BYTES       1280    // for value_lookup, per bin (2k - 512 - 256)
+
+typedef struct WavpackFrameContext {
+    AVCodecContext *avctx;
+    int stereo, stereo_in;
+    uint32_t CRC;
+    int samples;
+    GetByteContext gb;
+    int ptable[PTABLE_BINS];
+    uint8_t value_lookup_buffer[MAX_HISTORY_BINS*MAX_BIN_BYTES];
+    int16_t summed_probabilities[MAX_HISTORY_BINS][256];
+    uint8_t probabilities[MAX_HISTORY_BINS][256];
+    uint8_t *value_lookup[MAX_HISTORY_BINS];
+    DSDContext dsdctx[2];
+} WavpackFrameContext;
+
+#define WV_MAX_FRAME_DECODERS 14
+
+typedef struct WavpackContext {
+    AVCodecContext *avctx;
+
+    WavpackFrameContext *fdec[WV_MAX_FRAME_DECODERS];
+    int fdec_num;
+
+    int block;
+    int samples;
+    int ch_offset;
+} WavpackContext;
+
+static inline int wv_check_crc(WavpackFrameContext *s, uint32_t crc)
+{
+    if (crc != s->CRC) {
+        av_log(s->avctx, AV_LOG_ERROR, "CRC error\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    return 0;
+}
+
+static void init_ptable(int *table, int rate_i, int rate_s)
+{
+    int value = 0x808000, rate = rate_i << 8;
+
+    for (int c = (rate + 128) >> 8; c--;)
+        value += (DOWN - value) >> DECAY;
+
+    for (int i = 0; i < PTABLE_BINS/2; i++) {
+        table[i] = value;
+        table[PTABLE_BINS-1-i] = 0x100ffff - value;
+
+        if (value > 0x010000) {
+            rate += (rate * rate_s + 128) >> 8;
+
+            for (int c = (rate + 64) >> 7; c--;)
+                value += (DOWN - value) >> DECAY;
+        }
+    }
+}
+
+typedef struct {
+    int32_t value, fltr0, fltr1, fltr2, fltr3, fltr4, fltr5, fltr6, factor, byte;
+} DSDfilters;
+
+static int wv_unpack_dsd_high(WavpackFrameContext *s, uint8_t *dst_l, uint8_t *dst_r)
+{
+    uint32_t crc                    = 0xFFFFFFFF;
+    int total_samples = s->samples, stereo = dst_r ? 1 : 0;
+    DSDfilters filters[2], *sp = filters;
+    int rate_i, rate_s;
+    uint32_t low, high, value;
+
+    if (bytestream2_get_bytes_left(&s->gb) < (stereo ? 20 : 13))
+        return AVERROR_INVALIDDATA;
+
+    rate_i = bytestream2_get_byte(&s->gb);
+    rate_s = bytestream2_get_byte(&s->gb);
+
+    if (rate_s != RATE_S)
+        return AVERROR_INVALIDDATA;
+
+    init_ptable(s->ptable, rate_i, rate_s);
+
+    for (int channel = 0; channel < stereo + 1; channel++) {
+        DSDfilters *sp = filters + channel;
+
+        sp->fltr1 = bytestream2_get_byte(&s->gb) << (PRECISION - 8);
+        sp->fltr2 = bytestream2_get_byte(&s->gb) << (PRECISION - 8);
+        sp->fltr3 = bytestream2_get_byte(&s->gb) << (PRECISION - 8);
+        sp->fltr4 = bytestream2_get_byte(&s->gb) << (PRECISION - 8);
+        sp->fltr5 = bytestream2_get_byte(&s->gb) << (PRECISION - 8);
+        sp->fltr6 = 0;
+        sp->factor = bytestream2_get_byte(&s->gb) & 0xff;
+        sp->factor |= (bytestream2_get_byte(&s->gb) << 8) & 0xff00;
+        sp->factor = (sp->factor << 16) >> 16;
+    }
+
+    value = bytestream2_get_be32(&s->gb);
+    high = 0xffffffff;
+    low = 0x0;
+
+    memset(dst_l, 0x69, total_samples * 4);
+
+    if (stereo)
+        memset(dst_r, 0x69, total_samples * 4);
+
+    while (total_samples--) {
+        int bitcount = 8;
+
+        sp[0].value = sp[0].fltr1 - sp[0].fltr5 + ((sp[0].fltr6 * sp[0].factor) >> 2);
+
+        if (stereo)
+            sp[1].value = sp[1].fltr1 - sp[1].fltr5 + ((sp[1].fltr6 * sp[1].factor) >> 2);
+
+        while (bitcount--) {
+            int32_t *pp = s->ptable + ((sp[0].value >> (PRECISION - PRECISION_USE)) & PTABLE_MASK);
+            uint32_t split = low + ((high - low) >> 8) * (*pp >> 16);
+
+            if (value <= split) {
+                high = split;
+                *pp += (UP - *pp) >> DECAY;
+                sp[0].fltr0 = -1;
+            } else {
+                low = split + 1;
+                *pp += (DOWN - *pp) >> DECAY;
+                sp[0].fltr0 = 0;
+            }
+
+            while (DSD_BYTE_READY(high, low) && bytestream2_get_bytes_left(&s->gb)) {
+                value = (value << 8) | bytestream2_get_byte(&s->gb);
+                high = (high << 8) | 0xff;
+                low <<= 8;
+            }
+
+            sp[0].value += sp[0].fltr6 << 3;
+            sp[0].byte = (sp[0].byte << 1) | (sp[0].fltr0 & 1);
+            sp[0].factor += (((sp[0].value ^ sp[0].fltr0) >> 31) | 1) &
+                ((sp[0].value ^ (sp[0].value - (sp[0].fltr6 << 4))) >> 31);
+            sp[0].fltr1 += ((sp[0].fltr0 & VALUE_ONE) - sp[0].fltr1) >> 6;
+            sp[0].fltr2 += ((sp[0].fltr0 & VALUE_ONE) - sp[0].fltr2) >> 4;
+            sp[0].fltr3 += (sp[0].fltr2 - sp[0].fltr3) >> 4;
+            sp[0].fltr4 += (sp[0].fltr3 - sp[0].fltr4) >> 4;
+            sp[0].value = (sp[0].fltr4 - sp[0].fltr5) >> 4;
+            sp[0].fltr5 += sp[0].value;
+            sp[0].fltr6 += (sp[0].value - sp[0].fltr6) >> 3;
+            sp[0].value = sp[0].fltr1 - sp[0].fltr5 + ((sp[0].fltr6 * sp[0].factor) >> 2);
+
+            if (!stereo)
+                continue;
+
+            pp = s->ptable + ((sp[1].value >> (PRECISION - PRECISION_USE)) & PTABLE_MASK);
+            split = low + ((high - low) >> 8) * (*pp >> 16);
+
+            if (value <= split) {
+                high = split;
+                *pp += (UP - *pp) >> DECAY;
+                sp[1].fltr0 = -1;
+            } else {
+                low = split + 1;
+                *pp += (DOWN - *pp) >> DECAY;
+                sp[1].fltr0 = 0;
+            }
+
+            while (DSD_BYTE_READY(high, low) && bytestream2_get_bytes_left(&s->gb)) {
+                value = (value << 8) | bytestream2_get_byte(&s->gb);
+                high = (high << 8) | 0xff;
+                low <<= 8;
+            }
+
+            sp[1].value += sp[1].fltr6 << 3;
+            sp[1].byte = (sp[1].byte << 1) | (sp[1].fltr0 & 1);
+            sp[1].factor += (((sp[1].value ^ sp[1].fltr0) >> 31) | 1) &
+                ((sp[1].value ^ (sp[1].value - (sp[1].fltr6 << 4))) >> 31);
+            sp[1].fltr1 += ((sp[1].fltr0 & VALUE_ONE) - sp[1].fltr1) >> 6;
+            sp[1].fltr2 += ((sp[1].fltr0 & VALUE_ONE) - sp[1].fltr2) >> 4;
+            sp[1].fltr3 += (sp[1].fltr2 - sp[1].fltr3) >> 4;
+            sp[1].fltr4 += (sp[1].fltr3 - sp[1].fltr4) >> 4;
+            sp[1].value = (sp[1].fltr4 - sp[1].fltr5) >> 4;
+            sp[1].fltr5 += sp[1].value;
+            sp[1].fltr6 += (sp[1].value - sp[1].fltr6) >> 3;
+            sp[1].value = sp[1].fltr1 - sp[1].fltr5 + ((sp[1].fltr6 * sp[1].factor) >> 2);
+        }
+
+        crc += (crc << 1) + (*dst_l = sp[0].byte & 0xff);
+        sp[0].factor -= (sp[0].factor + 512) >> 10;
+        dst_l += 4;
+
+        if (stereo) {
+            crc += (crc << 1) + (*dst_r = filters[1].byte & 0xff);
+            filters[1].factor -= (filters[1].factor + 512) >> 10;
+            dst_r += 4;
+        }
+    }
+
+    if ((s->avctx->err_recognition & AV_EF_CRCCHECK) && wv_check_crc(s, crc))
+        return AVERROR_INVALIDDATA;
+
+    return 0;
+}
+
+static int wv_unpack_dsd_fast(WavpackFrameContext *s, uint8_t *dst_l, uint8_t *dst_r)
+{
+    uint8_t history_bits, max_probability;
+    int total_summed_probabilities  = 0;
+    int total_samples               = s->samples;
+    uint8_t *vlb                    = s->value_lookup_buffer;
+    int history_bins, p0, p1, chan;
+    uint32_t crc                    = 0xFFFFFFFF;
+    uint32_t low, high, value;
+
+    if (!bytestream2_get_bytes_left(&s->gb))
+        return AVERROR_INVALIDDATA;
+
+    history_bits = bytestream2_get_byte(&s->gb);
+
+    if (!bytestream2_get_bytes_left(&s->gb) || history_bits > MAX_HISTORY_BITS)
+        return AVERROR_INVALIDDATA;
+
+    history_bins = 1 << history_bits;
+    max_probability = bytestream2_get_byte(&s->gb);
+
+    if (max_probability < 0xff) {
+        uint8_t *outptr = (uint8_t *) s->probabilities;
+        uint8_t *outend = outptr + sizeof (*s->probabilities) * history_bins;
+
+        while (outptr < outend && bytestream2_get_bytes_left(&s->gb)) {
+            int code = bytestream2_get_byte(&s->gb);
+
+            if (code > max_probability) {
+                int zcount = code - max_probability;
+
+                while (outptr < outend && zcount--)
+                    *outptr++ = 0;
+            } else if (code) {
+                *outptr++ = code;
+            }
+            else {
+                break;
+            }
+        }
+
+        if (outptr < outend ||
+            (bytestream2_get_bytes_left(&s->gb) && bytestream2_get_byte(&s->gb)))
+                return AVERROR_INVALIDDATA;
+    } else if (bytestream2_get_bytes_left(&s->gb) > (int) sizeof (*s->probabilities) * history_bins) {
+        bytestream2_get_buffer(&s->gb, (uint8_t *) s->probabilities,
+            sizeof (*s->probabilities) * history_bins);
+    } else {
+        return AVERROR_INVALIDDATA;
+    }
+
+    for (p0 = 0; p0 < history_bins; p0++) {
+        int32_t sum_values = 0;
+
+        for (int i = 0; i < 256; i++)
+            s->summed_probabilities[p0][i] = sum_values += s->probabilities[p0][i];
+
+        if (sum_values) {
+            total_summed_probabilities += sum_values;
+
+            if (total_summed_probabilities > history_bins * MAX_BIN_BYTES)
+                return AVERROR_INVALIDDATA;
+
+            s->value_lookup[p0] = vlb;
+
+            for (int i = 0; i < 256; i++) {
+                int c = s->probabilities[p0][i];
+
+                while (c--)
+                    *vlb++ = i;
+            }
+        }
+    }
+
+    if (bytestream2_get_bytes_left(&s->gb) < 4)
+        return AVERROR_INVALIDDATA;
+
+    chan = p0 = p1 = 0;
+    low = 0; high = 0xffffffff;
+    value = bytestream2_get_be32(&s->gb);
+
+    memset(dst_l, 0x69, total_samples * 4);
+
+    if (dst_r) {
+        memset(dst_r, 0x69, total_samples * 4);
+        total_samples *= 2;
+    }
+
+    while (total_samples--) {
+        int mult, index, code;
+
+        if (!s->summed_probabilities[p0][255])
+            return AVERROR_INVALIDDATA;
+
+        mult = (high - low) / s->summed_probabilities[p0][255];
+
+        if (!mult) {
+            if (bytestream2_get_bytes_left(&s->gb) >= 4)
+                value = bytestream2_get_be32(&s->gb);
+
+            low = 0;
+            high = 0xffffffff;
+            mult = high / s->summed_probabilities[p0][255];
+
+            if (!mult)
+                return AVERROR_INVALIDDATA;
+        }
+
+        index = (value - low) / mult;
+
+        if (index >= s->summed_probabilities[p0][255])
+            return AVERROR_INVALIDDATA;
+
+        if (!dst_r) {
+            if ((*dst_l = code = s->value_lookup[p0][index]))
+                low += s->summed_probabilities[p0][code-1] * mult;
+
+            dst_l += 4;
+        } else {
+            if ((code = s->value_lookup[p0][index]))
+                low += s->summed_probabilities[p0][code-1] * mult;
+
+            if (chan) {
+                *dst_r = code;
+                dst_r += 4;
+            }
+            else {
+                *dst_l = code;
+                dst_l += 4;
+            }
+
+            chan ^= 1;
+        }
+
+        high = low + s->probabilities[p0][code] * mult - 1;
+        crc += (crc << 1) + code;
+
+        if (!dst_r) {
+            p0 = code & (history_bins-1);
+        } else {
+            p0 = p1;
+            p1 = code & (history_bins-1);
+        }
+
+        while (DSD_BYTE_READY(high, low) && bytestream2_get_bytes_left(&s->gb)) {
+            value = (value << 8) | bytestream2_get_byte(&s->gb);
+            high = (high << 8) | 0xff;
+            low <<= 8;
+        }
+    }
+
+    if ((s->avctx->err_recognition & AV_EF_CRCCHECK) && wv_check_crc(s, crc))
+        return AVERROR_INVALIDDATA;
+
+    return 0;
+}
+
+static int wv_unpack_dsd_copy(WavpackFrameContext *s, uint8_t *dst_l, uint8_t *dst_r)
+{
+    int total_samples           = s->samples;
+    uint32_t crc                = 0xFFFFFFFF;
+
+    if (bytestream2_get_bytes_left(&s->gb) != total_samples * (dst_r ? 2 : 1))
+        return AVERROR_INVALIDDATA;
+
+    memset(dst_l, 0x69, total_samples * 4);
+
+    if (dst_r)
+        memset(dst_r, 0x69, total_samples * 4);
+
+    while (total_samples--) {
+        crc += (crc << 1) + (*dst_l = bytestream2_get_byte(&s->gb));
+        dst_l += 4;
+
+        if (dst_r) {
+            crc += (crc << 1) + (*dst_r = bytestream2_get_byte(&s->gb));
+            dst_r += 4;
+        }
+    }
+
+    if ((s->avctx->err_recognition & AV_EF_CRCCHECK) && wv_check_crc(s, crc))
+        return AVERROR_INVALIDDATA;
+
+    return 0;
+}
+
+static av_cold int wv_alloc_frame_context(WavpackContext *c)
+{
+    if (c->fdec_num == WV_MAX_FRAME_DECODERS)
+        return -1;
+
+    c->fdec[c->fdec_num] = av_mallocz(sizeof(**c->fdec));
+    if (!c->fdec[c->fdec_num])
+        return -1;
+    c->fdec_num++;
+    c->fdec[c->fdec_num - 1]->avctx = c->avctx;
+    memset(c->fdec[c->fdec_num - 1]->dsdctx[0].buf, 0x69,
+        sizeof(c->fdec[c->fdec_num - 1]->dsdctx[0].buf));
+    memset(c->fdec[c->fdec_num - 1]->dsdctx[1].buf, 0x69,
+        sizeof(c->fdec[c->fdec_num - 1]->dsdctx[1].buf));
+
+    return 0;
+}
+
+static av_cold int wavpack_decode_init(AVCodecContext *avctx)
+{
+    WavpackContext *s = avctx->priv_data;
+
+    s->avctx = avctx;
+
+    s->fdec_num = 0;
+
+    ff_init_dsd_data();
+
+    return 0;
+}
+
+static av_cold int wavpack_decode_end(AVCodecContext *avctx)
+{
+    WavpackContext *s = avctx->priv_data;
+
+    s->fdec_num = 0;
+
+    return 0;
+}
+
+static int wavpack_decode_block(AVCodecContext *avctx, int block_no,
+                                AVFrame *frame, const uint8_t *buf, int buf_size)
+{
+    WavpackContext *wc = avctx->priv_data;
+    WavpackFrameContext *s;
+    GetByteContext gb;
+    void *samples_l = NULL, *samples_r = NULL;
+    int ret;
+    int got_dsd = 0;
+    int id, size, ssize;
+    int chan = 0, sample_rate = 0, rate_x = 1, dsd_mode = 0;
+    int frame_flags, multiblock;
+    uint64_t chmask = 0;
+
+    if (block_no >= wc->fdec_num && wv_alloc_frame_context(wc) < 0) {
+        av_log(avctx, AV_LOG_ERROR, "Error creating frame decode context\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    s = wc->fdec[block_no];
+    if (!s) {
+        av_log(avctx, AV_LOG_ERROR, "Context for block %d is not present\n",
+               block_no);
+        return AVERROR_INVALIDDATA;
+    }
+
+    bytestream2_init(&gb, buf, buf_size);
+
+    s->samples = bytestream2_get_le32(&gb);
+    if (s->samples != wc->samples) {
+        av_log(avctx, AV_LOG_ERROR, "Mismatching number of samples in "
+               "a sequence: %d and %d\n", wc->samples, s->samples);
+        return AVERROR_INVALIDDATA;
+    }
+    frame_flags = bytestream2_get_le32(&gb);
+    multiblock     = (frame_flags & WV_SINGLE_BLOCK) != WV_SINGLE_BLOCK;
+
+    s->stereo         = !(frame_flags & WV_MONO);
+    s->stereo_in      =  (frame_flags & WV_FALSE_STEREO) ? 0 : s->stereo;
+    s->CRC            = bytestream2_get_le32(&gb);
+
+    // parse metadata blocks
+    while (bytestream2_get_bytes_left(&gb)) {
+        id   = bytestream2_get_byte(&gb);
+        size = bytestream2_get_byte(&gb);
+        if (id & WP_IDF_LONG)
+            size |= (bytestream2_get_le16u(&gb)) << 8;
+        size <<= 1; // size is specified in words
+        ssize  = size;
+        if (id & WP_IDF_ODD)
+            size--;
+        if (size < 0) {
+            av_log(avctx, AV_LOG_ERROR,
+                   "Got incorrect block %02X with size %i\n", id, size);
+            break;
+        }
+        if (bytestream2_get_bytes_left(&gb) < ssize) {
+            av_log(avctx, AV_LOG_ERROR,
+                   "Block size %i is out of bounds\n", size);
+            break;
+        }
+        switch (id & WP_IDF_MASK) {
+        case WP_ID_DSD_DATA:
+            if (size < 2) {
+                av_log(avctx, AV_LOG_ERROR, "Invalid DSD_DATA, size = %i\n",
+                       size);
+                bytestream2_skip(&gb, ssize);
+                continue;
+            }
+            rate_x = 1 << bytestream2_get_byte(&gb);
+            dsd_mode = bytestream2_get_byte(&gb);
+            if (dsd_mode && dsd_mode != 1 && dsd_mode != 3) {
+                av_log(avctx, AV_LOG_ERROR, "Invalid DSD encoding mode: %d\n",
+                    dsd_mode);
+                return AVERROR_INVALIDDATA;
+            }
+            bytestream2_init(&s->gb, gb.buffer, size-2);
+            bytestream2_skip(&gb, size-2);
+            got_dsd      = 1;
+            break;
+        case WP_ID_CHANINFO:
+            if (size <= 1) {
+                av_log(avctx, AV_LOG_ERROR,
+                       "Insufficient channel information\n");
+                return AVERROR_INVALIDDATA;
+            }
+            chan = bytestream2_get_byte(&gb);
+            switch (size - 2) {
+            case 0:
+                chmask = bytestream2_get_byte(&gb);
+                break;
+            case 1:
+                chmask = bytestream2_get_le16(&gb);
+                break;
+            case 2:
+                chmask = bytestream2_get_le24(&gb);
+                break;
+            case 3:
+                chmask = bytestream2_get_le32(&gb);
+                break;
+            case 4:
+                size = bytestream2_get_byte(&gb);
+                chan  |= (bytestream2_get_byte(&gb) & 0xF) << 8;
+                chan  += 1;
+                if (avctx->channels != chan)
+                    av_log(avctx, AV_LOG_WARNING, "%i channels signalled"
+                           " instead of %i.\n", chan, avctx->channels);
+                chmask = bytestream2_get_le24(&gb);
+                break;
+            case 5:
+                size = bytestream2_get_byte(&gb);
+                chan  |= (bytestream2_get_byte(&gb) & 0xF) << 8;
+                chan  += 1;
+                if (avctx->channels != chan)
+                    av_log(avctx, AV_LOG_WARNING, "%i channels signalled"
+                           " instead of %i.\n", chan, avctx->channels);
+                chmask = bytestream2_get_le32(&gb);
+                break;
+            default:
+                av_log(avctx, AV_LOG_ERROR, "Invalid channel info size %d\n",
+                       size);
+                chan   = avctx->channels;
+                chmask = avctx->channel_layout;
+            }
+            break;
+        case WP_ID_SAMPLE_RATE:
+            if (size != 3) {
+                av_log(avctx, AV_LOG_ERROR, "Invalid custom sample rate.\n");
+                return AVERROR_INVALIDDATA;
+            }
+            sample_rate = bytestream2_get_le24(&gb);
+            break;
+        default:
+            bytestream2_skip(&gb, size);
+        }
+        if (id & WP_IDF_ODD)
+            bytestream2_skip(&gb, 1);
+    }
+
+    if (!got_dsd) {
+        av_log(avctx, AV_LOG_ERROR, "Packed samples not found\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    if (!wc->ch_offset) {
+        int sr = (frame_flags >> 23) & 0xf;
+        if (sr == 0xf) {
+            if (!sample_rate) {
+                av_log(avctx, AV_LOG_ERROR, "Custom sample rate missing.\n");
+                return AVERROR_INVALIDDATA;
+            }
+            avctx->sample_rate = sample_rate * rate_x;
+        } else {
+            avctx->sample_rate = wv_rates[sr] * rate_x;
+        }
+
+        if (multiblock) {
+            if (chan)
+                avctx->channels = chan;
+            if (chmask)
+                avctx->channel_layout = chmask;
+        } else {
+            avctx->channels       = s->stereo ? 2 : 1;
+            avctx->channel_layout = s->stereo ? AV_CH_LAYOUT_STEREO :
+                                                AV_CH_LAYOUT_MONO;
+        }
+
+        /* get output buffer */
+        frame->nb_samples = s->samples + 1;
+        if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
+            return ret;
+        frame->nb_samples = s->samples;
+    }
+
+    if (wc->ch_offset + s->stereo >= avctx->channels) {
+        av_log(avctx, AV_LOG_WARNING, "Too many channels coded in a packet.\n");
+        return ((avctx->err_recognition & AV_EF_EXPLODE) || !wc->ch_offset) ?
+            AVERROR_INVALIDDATA : 0;
+    }
+
+    samples_l = frame->extended_data[wc->ch_offset];
+    if (s->stereo)
+        samples_r = frame->extended_data[wc->ch_offset + 1];
+
+    wc->ch_offset += 1 + s->stereo;
+
+    if (s->stereo_in) {
+        if (dsd_mode == 3)
+            ret = wv_unpack_dsd_high(s, samples_l, samples_r);
+        else if (dsd_mode == 1)
+            ret = wv_unpack_dsd_fast(s, samples_l, samples_r);
+        else
+            ret = wv_unpack_dsd_copy(s, samples_l, samples_r);
+    } else {
+        if (dsd_mode == 3)
+            ret = wv_unpack_dsd_high(s, samples_l, NULL);
+        else if (dsd_mode == 1)
+            ret = wv_unpack_dsd_fast(s, samples_l, NULL);
+        else
+            ret = wv_unpack_dsd_copy(s, samples_l, NULL);
+
+        if (s->stereo)
+            memcpy(samples_r, samples_l, 4 * s->samples);
+    }
+
+    ff_dsd2pcm_translate(&s->dsdctx[0], s->samples, 0, samples_l, 4, samples_l, 1);
+
+    if (s->stereo)
+        ff_dsd2pcm_translate(&s->dsdctx[1], s->samples, 0, samples_r, 4, samples_r, 1);
+
+    return ret;
+}
+
+static void wavpack_decode_flush(AVCodecContext *avctx)
+{
+    WavpackContext *s = avctx->priv_data;
+
+    for (int i = 0; i < s->fdec_num; i++) {
+        memset(s->fdec[i]->dsdctx[0].buf, 0x69, sizeof(s->fdec[i]->dsdctx[0].buf));
+        memset(s->fdec[i]->dsdctx[1].buf, 0x69, sizeof(s->fdec[i]->dsdctx[1].buf));
+    }
+}
+
+static int wavpack_decode_frame(AVCodecContext *avctx, void *data,
+                                int *got_frame_ptr, AVPacket *avpkt)
+{
+    WavpackContext *s  = avctx->priv_data;
+    const uint8_t *buf = avpkt->data;
+    int buf_size       = avpkt->size;
+    AVFrame *frame     = data;
+    int frame_size, ret, frame_flags;
+
+    if (avpkt->size <= WV_HEADER_SIZE)
+        return AVERROR_INVALIDDATA;
+
+    s->block     = 0;
+    s->ch_offset = 0;
+
+    /* determine number of samples */
+    s->samples  = AV_RL32(buf + 20);
+    frame_flags = AV_RL32(buf + 24);
+    if (s->samples <= 0 || s->samples > WV_MAX_SAMPLES) {
+        av_log(avctx, AV_LOG_ERROR, "Invalid number of samples: %d\n",
+               s->samples);
+        return AVERROR_INVALIDDATA;
+    }
+
+    if (!(frame_flags & WV_DSD_DATA)) {
+        av_log(avctx, AV_LOG_ERROR, "Encountered a non-DSD frame\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    avctx->sample_fmt = AV_SAMPLE_FMT_FLTP;
+
+    while (buf_size > 0) {
+        if (buf_size <= WV_HEADER_SIZE)
+            break;
+        frame_size = AV_RL32(buf + 4) - 12;
+        buf       += 20;
+        buf_size  -= 20;
+        if (frame_size <= 0 || frame_size > buf_size) {
+            av_log(avctx, AV_LOG_ERROR,
+                   "Block %d has invalid size (size %d vs. %d bytes left)\n",
+                   s->block, frame_size, buf_size);
+            return AVERROR_INVALIDDATA;
+        }
+        if ((ret = wavpack_decode_block(avctx, s->block,
+                                        frame, buf, frame_size)) < 0) {
+            return ret;
+        }
+        s->block++;
+        buf      += frame_size;
+        buf_size -= frame_size;
+    }
+
+    if (s->ch_offset != avctx->channels) {
+        av_log(avctx, AV_LOG_ERROR, "Not enough channels coded in a packet.\n");
+        return AVERROR_INVALIDDATA;
+    }
+
+    *got_frame_ptr = 1;
+
+    return avpkt->size;
+}
+
+AVCodec ff_wavpack_dsd_decoder = {
+    .name           = "wavpack_dsd",
+    .long_name      = NULL_IF_CONFIG_SMALL("WavPack DSD"),
+    .type           = AVMEDIA_TYPE_AUDIO,
+    .id             = AV_CODEC_ID_WAVPACK_DSD,
+    .priv_data_size = sizeof(WavpackContext),
+    .init           = wavpack_decode_init,
+    .close          = wavpack_decode_end,
+    .decode         = wavpack_decode_frame,
+    .flush          = wavpack_decode_flush,
+    .capabilities   = AV_CODEC_CAP_DR1,
+};
diff --git a/libavformat/wvdec.c b/libavformat/wvdec.c
index 649791d..50f4079 100644
--- a/libavformat/wvdec.c
+++ b/libavformat/wvdec.c
@@ -79,7 +79,7 @@  static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb)
 {
     WVContext *wc = ctx->priv_data;
     int ret;
-    int rate, bpp, chan;
+    int rate, rate_x, bpp, chan;
     uint32_t chmask, flags;
 
     wc->pos = avio_tell(pb);
@@ -98,11 +98,6 @@  static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb)
         return ret;
     }
 
-    if (wc->header.flags & WV_DSD) {
-        avpriv_report_missing_feature(ctx, "WV DSD");
-        return AVERROR_PATCHWELCOME;
-    }
-
     if (wc->header.version < 0x402 || wc->header.version > 0x410) {
         avpriv_report_missing_feature(ctx, "WV version 0x%03X",
                                       wc->header.version);
@@ -115,7 +110,8 @@  static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb)
         return 0;
     // parse flags
     flags  = wc->header.flags;
-    bpp    = ((flags & 3) + 1) << 3;
+    rate_x = (flags & WV_DSD) ? 4 : 1;
+    bpp    = (flags & WV_DSD) ? 0 : ((flags & 3) + 1) << 3;
     chan   = 1 + !(flags & WV_MONO);
     chmask = flags & WV_MONO ? AV_CH_LAYOUT_MONO : AV_CH_LAYOUT_STEREO;
     rate   = wv_rates[(flags >> 23) & 0xF];
@@ -124,7 +120,7 @@  static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb)
         chan   = wc->chan;
         chmask = wc->chmask;
     }
-    if ((rate == -1 || !chan) && !wc->block_parsed) {
+    if ((rate == -1 || !chan || flags & WV_DSD) && !wc->block_parsed) {
         int64_t block_end = avio_tell(pb) + wc->header.blocksize;
         if (!(pb->seekable & AVIO_SEEKABLE_NORMAL)) {
             av_log(ctx, AV_LOG_ERROR,
@@ -177,6 +173,16 @@  static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb)
                     return AVERROR_INVALIDDATA;
                 }
                 break;
+            case 0xE:
+                if (size <= 1) {
+                    av_log(ctx, AV_LOG_ERROR,
+                           "Invalid DSD block\n");
+                    return AVERROR_INVALIDDATA;
+                }
+                rate_x = 1 << avio_r8(pb);
+                /* size > 1 is guaranteed by the check above */
+                avio_skip(pb, size - 1);
+                break;
             case 0x27:
                 rate = avio_rl24(pb);
                 break;
@@ -200,7 +206,7 @@  static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb)
     if (!wc->chmask)
         wc->chmask = chmask;
     if (!wc->rate)
-        wc->rate   = rate;
+        wc->rate   = rate * rate_x;
 
     if (flags && bpp != wc->bpp) {
         av_log(ctx, AV_LOG_ERROR,
@@ -214,10 +220,10 @@  static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb)
                chan, wc->chan);
         return AVERROR_INVALIDDATA;
     }
-    if (flags && rate != -1 && rate != wc->rate) {
+    if (flags && rate != -1 && !(flags & WV_DSD) && rate * rate_x != wc->rate) {
         av_log(ctx, AV_LOG_ERROR,
                "Sampling rate differ, this block: %i, header block: %i\n",
-               rate, wc->rate);
+               rate * rate_x, wc->rate);
         return AVERROR_INVALIDDATA;
     }
     return 0;
@@ -245,7 +251,9 @@  static int wv_read_header(AVFormatContext *s)
     if (!st)
         return AVERROR(ENOMEM);
     st->codecpar->codec_type            = AVMEDIA_TYPE_AUDIO;
-    st->codecpar->codec_id              = AV_CODEC_ID_WAVPACK;
+    st->codecpar->codec_id              = wc->header.flags & WV_DSD ?
+                                          AV_CODEC_ID_WAVPACK_DSD :
+                                          AV_CODEC_ID_WAVPACK;
     st->codecpar->channels              = wc->chan;
     st->codecpar->channel_layout        = wc->chmask;
     st->codecpar->sample_rate           = wc->rate;