From patchwork Thu Jul 25 05:12:51 2019
X-Patchwork-Submitter: David Bryant
X-Patchwork-Id: 14066
To: ffmpeg-devel@ffmpeg.org
From: David Bryant
Date: Wed, 24 Jul 2019 22:12:51 -0700
Subject: Re: [FFmpeg-devel] avcodec: add a WavPack DSD decoder
List-Id: FFmpeg development discussions and patches

On 7/21/19 5:57 PM, Lynne wrote:
> Jul 22, 2019, 12:03 AM by david@wavpack.com:
>
>> Hi,
>>
>> As I promised late last year, here is a patch to add a WavPack DSD decoder.
>>
>> Thanks!
>>
>> -David Bryant
>>
>> +    unsigned char probabilities [MAX_HISTORY_BINS] [256];
>> +    unsigned char *value_lookup [MAX_HISTORY_BINS];
> Use uint8_t throughout the patch.
> Also don't add spaces between array declarations or lookups.

Done. New patch attached for all changes (hope that's the right way to do it).

>
>
>> +static void init_ptable (int *table, int rate_i, int rate_s)
>> +{
>> +    int value = 0x808000, rate = rate_i << 8, c, i;
>> +
>> +    for (c = (rate + 128) >> 8; c--;)
>> +        value += (DOWN - value) >> DECAY;
>> +
>> +    for (i = 0; i < PTABLE_BINS/2; ++i) {
> What's up with the random increment position in loops? It changes to before and after the variable throughout. Make it consistent and after the variable.
> Also we support declarative for (int loops. Can save lines.

Fixed random increment position and put in declarative int loops (even when they didn't save a line).

>
>
>> +    DSDfilters filters [2], *sp = filters;
> Same, spaces after variables for arrays, all throughout the file.

Done. Also fixed spaces after function names.
>
>
>> +            if (code > max_probability) {
>> +                int zcount = code - max_probability;
>> +
>> +                while (outptr < outend && zcount--)
>> +                    *outptr++ = 0;
>> +            }
>> +            else if (code)
>> +                *outptr++ = code;
>> +            else
>> +                break;
> We don't put else on a new line, and prefer to have each branch wrapped in brackets unless all branches are single lines.

Fixed.

>
>
>> +    for (p0 = 0; p0 < history_bins; ++p0) {
>> +        int32_t sum_values;
>> +        unsigned char *vp;
>> +
>> +        for (sum_values = i = 0; i < 256; ++i)
>> +            s->summed_probabilities [p0] [i] = sum_values += s->probabilities [p0] [i];
> sum_values is uninitialized. Does your compiler not warn about this?

This was pointed out already as actually initialized, but I made it clearer (obviously it wasn't clear enough).

>
>
>> +        if (sum_values) {
>> +            total_summed_probabilities += sum_values;
>> +            vp = s->value_lookup [p0] = av_malloc (sum_values);
> I don't like the per-frame alloc. The maximum sum_values can be is 255*255 = UINT16_MAX.
> 60k of memory isn't much at all, just define value_lookup[255*255] in the context and you'll probably plug a few out of bounds accesses too.

It's actually up to 32 allocs per frame because there's one for each history bin (value_lookup is an array of pointers, not uint8s), and I didn't like it either because I had to worry about de-allocating on error. Refactored to use a single array in the context as a pool. Thanks for the suggestion!

>
>
>> + mult = high / s->summed_probabilities [p0] [255];
> s->summed_probabilities [p0] [255]; can be zero, you already check if it's zero when allocating currently. You should probably check for divide by zero unless you're very sure it can't happen.

I'm very sure. The checks are within a few lines above each of the three divides.
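
To illustrate the pooled lookup described above, here is a minimal sketch, not the code from the patch. The names `LookupSketch`, `build_lookup`, `BINS` and `POOL_BYTES` are made up for the example, and the pool size is arbitrary. Each bin's table is carved out of one shared buffer, so nothing needs freeing on error, and a bin whose probabilities sum to zero gets no table, which is the case a decoder has to reject before dividing.

```c
#include <stddef.h>
#include <stdint.h>

#define BINS       4      /* hypothetical; stands in for the number of history bins */
#define POOL_BYTES 1024   /* hypothetical size of the shared pool */

typedef struct {
    uint8_t  probabilities[BINS][256];
    uint16_t summed[BINS][256];       /* running totals per bin */
    uint8_t *value_lookup[BINS];      /* points into pool, NULL for an empty bin */
    uint8_t  pool[POOL_BYTES];        /* one buffer shared by every bin */
} LookupSketch;

/* Returns 0 on success, -1 if the tables would not fit in the pool. */
static int build_lookup(LookupSketch *s)
{
    uint8_t *next = s->pool;
    size_t used = 0;

    for (int bin = 0; bin < BINS; bin++) {
        unsigned sum = 0;

        for (int i = 0; i < 256; i++) {
            sum += s->probabilities[bin][i];
            s->summed[bin][i] = sum;
        }

        s->value_lookup[bin] = NULL;
        if (!sum)
            continue;                 /* empty bin: no table, must not be divided by */

        used += sum;
        if (used > POOL_BYTES)
            return -1;                /* checked before anything is written */

        s->value_lookup[bin] = next;
        for (int i = 0; i < 256; i++)
            for (int c = s->probabilities[bin][i]; c--;)
                *next++ = i;          /* value i occupies probabilities[bin][i] slots */
    }

    return 0;
}
```

With 256 entries of at most 255 each, a bin's total is at most 65280, so the running sums fit in a `uint16_t`.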
>
>
>> +        crc += (crc << 1) + code;
> Don't NIH CRCs, we have av_crc in lavu. See below how to use it.

It's not a standard crc, but more of a recirculating checksum, so the NIH code is required.

>
>
>> +static int wv_unpack_dsd_copy(WavpackFrameContext *s, void *dst_l, void *dst_r)
>> +{
>> +    uint8_t *dsd_l              = dst_l;
>> +    uint8_t *dsd_r              = dst_r;
> You're shadowing arguments. Your compiler doesn't warn on this either?
> You're calling the function with uint8_ts anyway, just change the type.

They're not shadowed (dsd vs. dst) which is why my compiler didn't complain, but I took your suggestion of just changing the types.

>
>
>> +    while (total_samples--) {
>> +        crc += (crc << 1) + (*dsd_l = bytestream2_get_byte(&s->gb));
>> +        dsd_l += 4;
>> +
>> +        if (dst_r) {
>> +            crc += (crc << 1) + (*dsd_r = bytestream2_get_byte(&s->gb));
>> +            dsd_r += 4;
>> +        }
>> +    }
> av_crc(av_crc_get_table(AV_CRC_32_IEEE/LE), UINT32_MAX, dsd_start_r/l, dsd_r/l - dsd_start_r/l) should work and be faster.

see above

>
>
>> +    s->fdec_num = 0;
> Private codec context is always zeroed already.

removed

>
>
>> +    int chan = 0, chmask = 0, sample_rate = 0, rate_x = 1, dsd_mode = 0;
>> +                chmask = avctx->channel_layout;
>>       uint32_t chmask, flags;
> frame->channel_layout is uint64_t.

good to know...fixed

>
>
>> +    samples_l = frame->extended_data[wc->ch_offset];
>> +    if (s->stereo)
>> +        samples_r = frame->extended_data[wc->ch_offset + 1];
>> +
>> +    wc->ch_offset += 1 + s->stereo;
> Have you checked non-stereo decodes fine and the channels are correctly ordered?

Yes.

>
>
>> +        if (id & WP_IDF_LONG) {
>> +            size |= (bytestream2_get_byte(&gb)) << 8;
>> +            size |= (bytestream2_get_byte(&gb)) << 16;
>> +        }
> Could use bytestream2_get_le16u/be16u to save 2 lines.

Thanks. Also found a few places where I could use bytestream2_get_be32()!
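
The "recirculating checksum" mentioned above can be sketched in isolation as follows. This is only an illustration: the function name `wv_style_checksum` is hypothetical, and just the update step `crc += (crc << 1) + byte` and the 0xFFFFFFFF seed are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Each step computes crc = crc * 3 + byte, modulo 2^32.
 * There is no polynomial division involved, so a table-driven
 * CRC routine cannot reproduce the same value.
 */
static uint32_t wv_style_checksum(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFF;        /* seed used by the unpack functions */

    for (size_t i = 0; i < len; i++)
        crc += (crc << 1) + data[i];  /* unsigned arithmetic wraps on overflow */

    return crc;
}
```

Because the running value is multiplied before each byte is added, the result depends on byte order, which a plain sum of bytes would not.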
>
>
>> +    if (!got_dsd) {
>> +        av_log(avctx, AV_LOG_ERROR, "Packed samples not found\n");
>> +        return AVERROR_INVALIDDATA;
>> +    }
> I think you should check avctx is completely configured before this, after parsing all WP_IDs, in case something is corrupt.

I don't think anything is required here because all the metadata frames that we recognize are checked when parsed, and WP_ID_DSD_DATA is the only one required for DSD audio, and all the others are silently ignored.

>
>
>> +        frame->nb_samples = s->samples + 1;
>> +        if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
>> +            return ret;
>> +        frame->nb_samples = s->samples;
> ?. Is the extra sample used as temporary buffer or something?

Your guess is as good as mine. This was part of the code "borrowed" from the PCM version (with the threading removed) so maybe there is (or was) a situation that was writing one extra sample off the end. The code here certainly doesn't, but it seemed pretty innocuous and I don't like just ripping out things I don't understand.

>
>> +AVCodec ff_wavpack_dsd_decoder = {
>> +    .name           = "wavpack_dsd",
>> +    .long_name      = NULL_IF_CONFIG_SMALL("WavPack DSD"),
>> +    .type           = AVMEDIA_TYPE_AUDIO,
>> +    .id             = AV_CODEC_ID_WAVPACK_DSD,
>> +    .priv_data_size = sizeof(WavpackContext),
>> +    .init           = wavpack_decode_init,
>> +    .close          = wavpack_decode_end,
>> +    .decode         = wavpack_decode_frame,
>> +    .capabilities   = AV_CODEC_CAP_DR1,
>> +};
> Seeking is probably broken. You should add a flush function to reset the decoder entirely.

Seeking seems to work fine, again because it was working with PCM and I didn't mess with any of that. I added a flush to clear the dsd2pcm contexts because that might have left an audible tick (that's the only context that's held over between frames that isn't identical for the whole file).
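
As a rough sketch of what such a flush does, assuming a simplified history buffer: the struct `Dsd2PcmSketch`, its size `FIFO_BYTES` and both function names are invented for this example and are not the real dsd2pcm context. Only the 0x69 fill value comes from the patch. That byte has as many one bits as zero bits, so a history filled with it carries no net offset into the first samples decoded after a seek.

```c
#include <stdint.h>
#include <string.h>

#define DSD_FILL   0x69   /* 01101001: four one bits, four zero bits */
#define FIFO_BYTES 16     /* hypothetical history length */

typedef struct {
    uint8_t buf[FIFO_BYTES];  /* DSD bytes carried over from earlier frames */
} Dsd2PcmSketch;

/* Discard the carried-over history by overwriting it with the fill pattern. */
static void dsd2pcm_sketch_flush(Dsd2PcmSketch *ctx)
{
    memset(ctx->buf, DSD_FILL, sizeof(ctx->buf));
}

/* Number of one bits minus number of zero bits in the history. */
static int dsd2pcm_sketch_balance(const Dsd2PcmSketch *ctx)
{
    int balance = 0;

    for (size_t i = 0; i < sizeof(ctx->buf); i++)
        for (int bit = 0; bit < 8; bit++)
            balance += ((ctx->buf[i] >> bit) & 1) ? 1 : -1;

    return balance;
}
```

A balance of zero after the flush means the stale bits from before the seek no longer bias the filter output.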
Kind regards,
David

From 8c13bdb9e4a7748d3f9d937db573fc65ef645d65 Mon Sep 17 00:00:00 2001
From: David Bryant
Date: Wed, 24 Jul 2019 21:25:26 -0700
Subject: [PATCH] avcodec: add a WavPack DSD decoder

Signed-off-by: David Bryant
---
 libavcodec/Makefile      |   1 +
 libavcodec/allcodecs.c   |   1 +
 libavcodec/avcodec.h     |   1 +
 libavcodec/codec_desc.c  |   7 +
 libavcodec/wavpack.h     |   2 +
 libavcodec/wavpack_dsd.c | 775 +++++++++++++++++++++++++++++++++++++++++++++++
 libavformat/wvdec.c      |  32 +-
 7 files changed, 807 insertions(+), 12 deletions(-)
 create mode 100644 libavcodec/wavpack_dsd.c

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 3cd73fb..b94327e 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -686,6 +686,7 @@ OBJS-$(CONFIG_VPLAYER_DECODER) += textdec.o ass.o
 OBJS-$(CONFIG_VP9_V4L2M2M_DECODER) += v4l2_m2m_dec.o
 OBJS-$(CONFIG_VQA_DECODER) += vqavideo.o
 OBJS-$(CONFIG_WAVPACK_DECODER) += wavpack.o
+OBJS-$(CONFIG_WAVPACK_DSD_DECODER) += wavpack_dsd.o dsd.o
 OBJS-$(CONFIG_WAVPACK_ENCODER) += wavpackenc.o
 OBJS-$(CONFIG_WCMV_DECODER) += wcmv.o
 OBJS-$(CONFIG_WEBP_DECODER) += webp.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index d2f9a39..a2f414b 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -485,6 +485,7 @@ extern AVCodec ff_vorbis_encoder;
 extern AVCodec ff_vorbis_decoder;
 extern AVCodec ff_wavpack_encoder;
 extern AVCodec ff_wavpack_decoder;
+extern AVCodec ff_wavpack_dsd_decoder;
 extern AVCodec ff_wmalossless_decoder;
 extern AVCodec ff_wmapro_decoder;
 extern AVCodec ff_wmav1_encoder;
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index d234271..8d3a551 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -652,6 +652,7 @@ enum AVCodecID {
     AV_CODEC_ID_SBC,
     AV_CODEC_ID_ATRAC9,
     AV_CODEC_ID_HCOM,
+    AV_CODEC_ID_WAVPACK_DSD,
 
     /* subtitle codecs */
     AV_CODEC_ID_FIRST_SUBTITLE = 0x17000, ///< A dummy ID pointing at the start of subtitle codecs.
diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index 4d033c2..bee88b8 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -2985,6 +2985,13 @@ static const AVCodecDescriptor codec_descriptors[] = {
         .long_name = NULL_IF_CONFIG_SMALL("HCOM Audio"),
         .props     = AV_CODEC_PROP_LOSSY,
     },
+    {
+        .id        = AV_CODEC_ID_WAVPACK_DSD,
+        .type      = AVMEDIA_TYPE_AUDIO,
+        .name      = "wavpack_dsd",
+        .long_name = NULL_IF_CONFIG_SMALL("WavPack DSD"),
+        .props     = AV_CODEC_PROP_LOSSLESS,
+    },
 
     /* subtitle codecs */
     {
diff --git a/libavcodec/wavpack.h b/libavcodec/wavpack.h
index 6caad03..43aaac8 100644
--- a/libavcodec/wavpack.h
+++ b/libavcodec/wavpack.h
@@ -35,6 +35,7 @@
 #define WV_FLOAT_DATA 0x00000080
 #define WV_INT32_DATA 0x00000100
 #define WV_FALSE_STEREO 0x40000000
+#define WV_DSD_DATA 0x80000000
 
 #define WV_HYBRID_MODE 0x00000008
 #define WV_HYBRID_SHAPE 0x00000008
@@ -77,6 +78,7 @@ enum WP_ID {
     WP_ID_CORR,
     WP_ID_EXTRABITS,
     WP_ID_CHANINFO,
+    WP_ID_DSD_DATA,
     WP_ID_SAMPLE_RATE = 0x27,
 };
 
diff --git a/libavcodec/wavpack_dsd.c b/libavcodec/wavpack_dsd.c
new file mode 100644
index 0000000..4a89817
--- /dev/null
+++ b/libavcodec/wavpack_dsd.c
@@ -0,0 +1,775 @@
+/*
+ * WavPack lossless DSD audio decoder
+ * Copyright (c) 2006,2011 Konstantin Shishkov
+ * Copyright (c) 2019 David Bryant
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/channel_layout.h" + +#include "avcodec.h" +#include "bytestream.h" +#include "internal.h" +#include "wavpack.h" +#include "dsd.h" + +/** + * @file + * WavPack lossless DSD audio decoder + */ + +#define DSD_BYTE_READY(low,high) (!(((low) ^ (high)) & 0xff000000)) + +#define PTABLE_BITS 8 +#define PTABLE_BINS (1<CRC) { + av_log(s->avctx, AV_LOG_ERROR, "CRC error\n"); + return AVERROR_INVALIDDATA; + } + + return 0; +} + +static void init_ptable(int *table, int rate_i, int rate_s) +{ + int value = 0x808000, rate = rate_i << 8; + + for (int c = (rate + 128) >> 8; c--;) + value += (DOWN - value) >> DECAY; + + for (int i = 0; i < PTABLE_BINS/2; i++) { + table[i] = value; + table[PTABLE_BINS-1-i] = 0x100ffff - value; + + if (value > 0x010000) { + rate += (rate * rate_s + 128) >> 8; + + for (int c = (rate + 64) >> 7; c--;) + value += (DOWN - value) >> DECAY; + } + } +} + +typedef struct { + int32_t value, fltr0, fltr1, fltr2, fltr3, fltr4, fltr5, fltr6, factor, byte; +} DSDfilters; + +static int wv_unpack_dsd_high(WavpackFrameContext *s, uint8_t *dst_l, uint8_t *dst_r) +{ + uint32_t crc = 0xFFFFFFFF; + int total_samples = s->samples, stereo = dst_r ? 1 : 0; + DSDfilters filters[2], *sp = filters; + int rate_i, rate_s; + uint32_t low, high, value; + + if (bytestream2_get_bytes_left(&s->gb) < (stereo ? 
20 : 13)) + return AVERROR_INVALIDDATA; + + rate_i = bytestream2_get_byte(&s->gb); + rate_s = bytestream2_get_byte(&s->gb); + + if (rate_s != RATE_S) + return AVERROR_INVALIDDATA; + + init_ptable(s->ptable, rate_i, rate_s); + + for (int channel = 0; channel < stereo + 1; channel++) { + DSDfilters *sp = filters + channel; + + sp->fltr1 = bytestream2_get_byte(&s->gb) << (PRECISION - 8); + sp->fltr2 = bytestream2_get_byte(&s->gb) << (PRECISION - 8); + sp->fltr3 = bytestream2_get_byte(&s->gb) << (PRECISION - 8); + sp->fltr4 = bytestream2_get_byte(&s->gb) << (PRECISION - 8); + sp->fltr5 = bytestream2_get_byte(&s->gb) << (PRECISION - 8); + sp->fltr6 = 0; + sp->factor = bytestream2_get_byte(&s->gb) & 0xff; + sp->factor |= (bytestream2_get_byte(&s->gb) << 8) & 0xff00; + sp->factor = (sp->factor << 16) >> 16; + } + + value = bytestream2_get_be32(&s->gb); + high = 0xffffffff; + low = 0x0; + + memset(dst_l, 0x69, total_samples * 4); + + if (stereo) + memset(dst_r, 0x69, total_samples * 4); + + while (total_samples--) { + int bitcount = 8; + + sp[0].value = sp[0].fltr1 - sp[0].fltr5 + ((sp[0].fltr6 * sp[0].factor) >> 2); + + if (stereo) + sp[1].value = sp[1].fltr1 - sp[1].fltr5 + ((sp[1].fltr6 * sp[1].factor) >> 2); + + while (bitcount--) { + int32_t *pp = s->ptable + ((sp[0].value >> (PRECISION - PRECISION_USE)) & PTABLE_MASK); + uint32_t split = low + ((high - low) >> 8) * (*pp >> 16); + + if (value <= split) { + high = split; + *pp += (UP - *pp) >> DECAY; + sp[0].fltr0 = -1; + } else { + low = split + 1; + *pp += (DOWN - *pp) >> DECAY; + sp[0].fltr0 = 0; + } + + while (DSD_BYTE_READY(high, low) && bytestream2_get_bytes_left(&s->gb)) { + value = (value << 8) | bytestream2_get_byte(&s->gb); + high = (high << 8) | 0xff; + low <<= 8; + } + + sp[0].value += sp[0].fltr6 << 3; + sp[0].byte = (sp[0].byte << 1) | (sp[0].fltr0 & 1); + sp[0].factor += (((sp[0].value ^ sp[0].fltr0) >> 31) | 1) & + ((sp[0].value ^ (sp[0].value - (sp[0].fltr6 << 4))) >> 31); + sp[0].fltr1 += 
((sp[0].fltr0 & VALUE_ONE) - sp[0].fltr1) >> 6; + sp[0].fltr2 += ((sp[0].fltr0 & VALUE_ONE) - sp[0].fltr2) >> 4; + sp[0].fltr3 += (sp[0].fltr2 - sp[0].fltr3) >> 4; + sp[0].fltr4 += (sp[0].fltr3 - sp[0].fltr4) >> 4; + sp[0].value = (sp[0].fltr4 - sp[0].fltr5) >> 4; + sp[0].fltr5 += sp[0].value; + sp[0].fltr6 += (sp[0].value - sp[0].fltr6) >> 3; + sp[0].value = sp[0].fltr1 - sp[0].fltr5 + ((sp[0].fltr6 * sp[0].factor) >> 2); + + if (!stereo) + continue; + + pp = s->ptable + ((sp[1].value >> (PRECISION - PRECISION_USE)) & PTABLE_MASK); + split = low + ((high - low) >> 8) * (*pp >> 16); + + if (value <= split) { + high = split; + *pp += (UP - *pp) >> DECAY; + sp[1].fltr0 = -1; + } else { + low = split + 1; + *pp += (DOWN - *pp) >> DECAY; + sp[1].fltr0 = 0; + } + + while (DSD_BYTE_READY(high, low) && bytestream2_get_bytes_left(&s->gb)) { + value = (value << 8) | bytestream2_get_byte(&s->gb); + high = (high << 8) | 0xff; + low <<= 8; + } + + sp[1].value += sp[1].fltr6 << 3; + sp[1].byte = (sp[1].byte << 1) | (sp[1].fltr0 & 1); + sp[1].factor += (((sp[1].value ^ sp[1].fltr0) >> 31) | 1) & + ((sp[1].value ^ (sp[1].value - (sp[1].fltr6 << 4))) >> 31); + sp[1].fltr1 += ((sp[1].fltr0 & VALUE_ONE) - sp[1].fltr1) >> 6; + sp[1].fltr2 += ((sp[1].fltr0 & VALUE_ONE) - sp[1].fltr2) >> 4; + sp[1].fltr3 += (sp[1].fltr2 - sp[1].fltr3) >> 4; + sp[1].fltr4 += (sp[1].fltr3 - sp[1].fltr4) >> 4; + sp[1].value = (sp[1].fltr4 - sp[1].fltr5) >> 4; + sp[1].fltr5 += sp[1].value; + sp[1].fltr6 += (sp[1].value - sp[1].fltr6) >> 3; + sp[1].value = sp[1].fltr1 - sp[1].fltr5 + ((sp[1].fltr6 * sp[1].factor) >> 2); + } + + crc += (crc << 1) + (*dst_l = sp[0].byte & 0xff); + sp[0].factor -= (sp[0].factor + 512) >> 10; + dst_l += 4; + + if (stereo) { + crc += (crc << 1) + (*dst_r = filters[1].byte & 0xff); + filters[1].factor -= (filters[1].factor + 512) >> 10; + dst_r += 4; + } + } + + if ((s->avctx->err_recognition & AV_EF_CRCCHECK) && wv_check_crc(s, crc)) + return AVERROR_INVALIDDATA; + + return 0; 
+} + +static int wv_unpack_dsd_fast(WavpackFrameContext *s, uint8_t *dst_l, uint8_t *dst_r) +{ + uint8_t history_bits, max_probability; + int total_summed_probabilities = 0; + int total_samples = s->samples; + uint8_t *vlb = s->value_lookup_buffer; + int history_bins, p0, p1, chan; + uint32_t crc = 0xFFFFFFFF; + uint32_t low, high, value; + + if (!bytestream2_get_bytes_left(&s->gb)) + return AVERROR_INVALIDDATA; + + history_bits = bytestream2_get_byte(&s->gb); + + if (!bytestream2_get_bytes_left(&s->gb) || history_bits > MAX_HISTORY_BITS) + return AVERROR_INVALIDDATA; + + history_bins = 1 << history_bits; + max_probability = bytestream2_get_byte(&s->gb); + + if (max_probability < 0xff) { + uint8_t *outptr = (uint8_t *) s->probabilities; + uint8_t *outend = outptr + sizeof (*s->probabilities) * history_bins; + + while (outptr < outend && bytestream2_get_bytes_left(&s->gb)) { + int code = bytestream2_get_byte(&s->gb); + + if (code > max_probability) { + int zcount = code - max_probability; + + while (outptr < outend && zcount--) + *outptr++ = 0; + } else if (code) { + *outptr++ = code; + } + else { + break; + } + } + + if (outptr < outend || + (bytestream2_get_bytes_left(&s->gb) && bytestream2_get_byte(&s->gb))) + return AVERROR_INVALIDDATA; + } else if (bytestream2_get_bytes_left(&s->gb) > (int) sizeof (*s->probabilities) * history_bins) { + bytestream2_get_buffer(&s->gb, (uint8_t *) s->probabilities, + sizeof (*s->probabilities) * history_bins); + } else { + return AVERROR_INVALIDDATA; + } + + for (p0 = 0; p0 < history_bins; p0++) { + int32_t sum_values = 0; + + for (int i = 0; i < 256; i++) + s->summed_probabilities[p0][i] = sum_values += s->probabilities[p0][i]; + + if (sum_values) { + total_summed_probabilities += sum_values; + + if (total_summed_probabilities > history_bins * MAX_BIN_BYTES) + return AVERROR_INVALIDDATA; + + s->value_lookup[p0] = vlb; + + for (int i = 0; i < 256; i++) { + int c = s->probabilities[p0][i]; + + while (c--) + *vlb++ = i; + } + } + } 
+ + if (bytestream2_get_bytes_left(&s->gb) < 4) + return AVERROR_INVALIDDATA; + + chan = p0 = p1 = 0; + low = 0; high = 0xffffffff; + value = bytestream2_get_be32(&s->gb); + + memset(dst_l, 0x69, total_samples * 4); + + if (dst_r) { + memset(dst_r, 0x69, total_samples * 4); + total_samples *= 2; + } + + while (total_samples--) { + int mult, index, code; + + if (!s->summed_probabilities[p0][255]) + return AVERROR_INVALIDDATA; + + mult = (high - low) / s->summed_probabilities[p0][255]; + + if (!mult) { + if (bytestream2_get_bytes_left(&s->gb) >= 4) + value = bytestream2_get_be32(&s->gb); + + low = 0; + high = 0xffffffff; + mult = high / s->summed_probabilities[p0][255]; + + if (!mult) + return AVERROR_INVALIDDATA; + } + + index = (value - low) / mult; + + if (index >= s->summed_probabilities[p0][255]) + return AVERROR_INVALIDDATA; + + if (!dst_r) { + if ((*dst_l = code = s->value_lookup[p0][index])) + low += s->summed_probabilities[p0][code-1] * mult; + + dst_l += 4; + } else { + if ((code = s->value_lookup[p0][index])) + low += s->summed_probabilities[p0][code-1] * mult; + + if (chan) { + *dst_r = code; + dst_r += 4; + } + else { + *dst_l = code; + dst_l += 4; + } + + chan ^= 1; + } + + high = low + s->probabilities[p0][code] * mult - 1; + crc += (crc << 1) + code; + + if (!dst_r) { + p0 = code & (history_bins-1); + } else { + p0 = p1; + p1 = code & (history_bins-1); + } + + while (DSD_BYTE_READY(high, low) && bytestream2_get_bytes_left(&s->gb)) { + value = (value << 8) | bytestream2_get_byte(&s->gb); + high = (high << 8) | 0xff; + low <<= 8; + } + } + + if ((s->avctx->err_recognition & AV_EF_CRCCHECK) && wv_check_crc(s, crc)) + return AVERROR_INVALIDDATA; + + return 0; +} + +static int wv_unpack_dsd_copy(WavpackFrameContext *s, uint8_t *dst_l, uint8_t *dst_r) +{ + int total_samples = s->samples; + uint32_t crc = 0xFFFFFFFF; + + if (bytestream2_get_bytes_left(&s->gb) != total_samples * (dst_r ? 
2 : 1)) + return AVERROR_INVALIDDATA; + + memset(dst_l, 0x69, total_samples * 4); + + if (dst_r) + memset(dst_r, 0x69, total_samples * 4); + + while (total_samples--) { + crc += (crc << 1) + (*dst_l = bytestream2_get_byte(&s->gb)); + dst_l += 4; + + if (dst_r) { + crc += (crc << 1) + (*dst_r = bytestream2_get_byte(&s->gb)); + dst_r += 4; + } + } + + if ((s->avctx->err_recognition & AV_EF_CRCCHECK) && wv_check_crc(s, crc)) + return AVERROR_INVALIDDATA; + + return 0; +} + +static av_cold int wv_alloc_frame_context(WavpackContext *c) +{ + if (c->fdec_num == WV_MAX_FRAME_DECODERS) + return -1; + + c->fdec[c->fdec_num] = av_mallocz(sizeof(**c->fdec)); + if (!c->fdec[c->fdec_num]) + return -1; + c->fdec_num++; + c->fdec[c->fdec_num - 1]->avctx = c->avctx; + memset(c->fdec[c->fdec_num - 1]->dsdctx[0].buf, 0x69, + sizeof(c->fdec[c->fdec_num - 1]->dsdctx[0].buf)); + memset(c->fdec[c->fdec_num - 1]->dsdctx[1].buf, 0x69, + sizeof(c->fdec[c->fdec_num - 1]->dsdctx[1].buf)); + + return 0; +} + +static av_cold int wavpack_decode_init(AVCodecContext *avctx) +{ + WavpackContext *s = avctx->priv_data; + + s->avctx = avctx; + + s->fdec_num = 0; + + ff_init_dsd_data(); + + return 0; +} + +static av_cold int wavpack_decode_end(AVCodecContext *avctx) +{ + WavpackContext *s = avctx->priv_data; + + s->fdec_num = 0; + + return 0; +} + +static int wavpack_decode_block(AVCodecContext *avctx, int block_no, + AVFrame *frame, const uint8_t *buf, int buf_size) +{ + WavpackContext *wc = avctx->priv_data; + WavpackFrameContext *s; + GetByteContext gb; + void *samples_l = NULL, *samples_r = NULL; + int ret; + int got_dsd = 0; + int id, size, ssize; + int chan = 0, sample_rate = 0, rate_x = 1, dsd_mode = 0; + int frame_flags, multiblock; + uint64_t chmask = 0; + + if (block_no >= wc->fdec_num && wv_alloc_frame_context(wc) < 0) { + av_log(avctx, AV_LOG_ERROR, "Error creating frame decode context\n"); + return AVERROR_INVALIDDATA; + } + + s = wc->fdec[block_no]; + if (!s) { + av_log(avctx, 
AV_LOG_ERROR, "Context for block %d is not present\n", + block_no); + return AVERROR_INVALIDDATA; + } + + bytestream2_init(&gb, buf, buf_size); + + s->samples = bytestream2_get_le32(&gb); + if (s->samples != wc->samples) { + av_log(avctx, AV_LOG_ERROR, "Mismatching number of samples in " + "a sequence: %d and %d\n", wc->samples, s->samples); + return AVERROR_INVALIDDATA; + } + frame_flags = bytestream2_get_le32(&gb); + multiblock = (frame_flags & WV_SINGLE_BLOCK) != WV_SINGLE_BLOCK; + + s->stereo = !(frame_flags & WV_MONO); + s->stereo_in = (frame_flags & WV_FALSE_STEREO) ? 0 : s->stereo; + s->CRC = bytestream2_get_le32(&gb); + + // parse metadata blocks + while (bytestream2_get_bytes_left(&gb)) { + id = bytestream2_get_byte(&gb); + size = bytestream2_get_byte(&gb); + if (id & WP_IDF_LONG) + size |= (bytestream2_get_le16u(&gb)) << 8; + size <<= 1; // size is specified in words + ssize = size; + if (id & WP_IDF_ODD) + size--; + if (size < 0) { + av_log(avctx, AV_LOG_ERROR, + "Got incorrect block %02X with size %i\n", id, size); + break; + } + if (bytestream2_get_bytes_left(&gb) < ssize) { + av_log(avctx, AV_LOG_ERROR, + "Block size %i is out of bounds\n", size); + break; + } + switch (id & WP_IDF_MASK) { + case WP_ID_DSD_DATA: + if (size < 2) { + av_log(avctx, AV_LOG_ERROR, "Invalid DSD_DATA, size = %i\n", + size); + bytestream2_skip(&gb, ssize); + continue; + } + rate_x = 1 << bytestream2_get_byte(&gb); + dsd_mode = bytestream2_get_byte(&gb); + if (dsd_mode && dsd_mode != 1 && dsd_mode != 3) { + av_log(avctx, AV_LOG_ERROR, "Invalid DSD encoding mode: %d\n", + dsd_mode); + return AVERROR_INVALIDDATA; + } + bytestream2_init(&s->gb, gb.buffer, size-2); + bytestream2_skip(&gb, size-2); + got_dsd = 1; + break; + case WP_ID_CHANINFO: + if (size <= 1) { + av_log(avctx, AV_LOG_ERROR, + "Insufficient channel information\n"); + return AVERROR_INVALIDDATA; + } + chan = bytestream2_get_byte(&gb); + switch (size - 2) { + case 0: + chmask = bytestream2_get_byte(&gb); + break; + 
case 1: + chmask = bytestream2_get_le16(&gb); + break; + case 2: + chmask = bytestream2_get_le24(&gb); + break; + case 3: + chmask = bytestream2_get_le32(&gb); + break; + case 4: + size = bytestream2_get_byte(&gb); + chan |= (bytestream2_get_byte(&gb) & 0xF) << 8; + chan += 1; + if (avctx->channels != chan) + av_log(avctx, AV_LOG_WARNING, "%i channels signalled" + " instead of %i.\n", chan, avctx->channels); + chmask = bytestream2_get_le24(&gb); + break; + case 5: + size = bytestream2_get_byte(&gb); + chan |= (bytestream2_get_byte(&gb) & 0xF) << 8; + chan += 1; + if (avctx->channels != chan) + av_log(avctx, AV_LOG_WARNING, "%i channels signalled" + " instead of %i.\n", chan, avctx->channels); + chmask = bytestream2_get_le32(&gb); + break; + default: + av_log(avctx, AV_LOG_ERROR, "Invalid channel info size %d\n", + size); + chan = avctx->channels; + chmask = avctx->channel_layout; + } + break; + case WP_ID_SAMPLE_RATE: + if (size != 3) { + av_log(avctx, AV_LOG_ERROR, "Invalid custom sample rate.\n"); + return AVERROR_INVALIDDATA; + } + sample_rate = bytestream2_get_le24(&gb); + break; + default: + bytestream2_skip(&gb, size); + } + if (id & WP_IDF_ODD) + bytestream2_skip(&gb, 1); + } + + if (!got_dsd) { + av_log(avctx, AV_LOG_ERROR, "Packed samples not found\n"); + return AVERROR_INVALIDDATA; + } + + if (!wc->ch_offset) { + int sr = (frame_flags >> 23) & 0xf; + if (sr == 0xf) { + if (!sample_rate) { + av_log(avctx, AV_LOG_ERROR, "Custom sample rate missing.\n"); + return AVERROR_INVALIDDATA; + } + avctx->sample_rate = sample_rate * rate_x; + } else { + avctx->sample_rate = wv_rates[sr] * rate_x; + } + + if (multiblock) { + if (chan) + avctx->channels = chan; + if (chmask) + avctx->channel_layout = chmask; + } else { + avctx->channels = s->stereo ? 2 : 1; + avctx->channel_layout = s->stereo ? 
AV_CH_LAYOUT_STEREO : + AV_CH_LAYOUT_MONO; + } + + /* get output buffer */ + frame->nb_samples = s->samples + 1; + if ((ret = ff_get_buffer(avctx, frame, 0)) < 0) + return ret; + frame->nb_samples = s->samples; + } + + if (wc->ch_offset + s->stereo >= avctx->channels) { + av_log(avctx, AV_LOG_WARNING, "Too many channels coded in a packet.\n"); + return ((avctx->err_recognition & AV_EF_EXPLODE) || !wc->ch_offset) ? + AVERROR_INVALIDDATA : 0; + } + + samples_l = frame->extended_data[wc->ch_offset]; + if (s->stereo) + samples_r = frame->extended_data[wc->ch_offset + 1]; + + wc->ch_offset += 1 + s->stereo; + + if (s->stereo_in) { + if (dsd_mode == 3) + ret = wv_unpack_dsd_high(s, samples_l, samples_r); + else if (dsd_mode == 1) + ret = wv_unpack_dsd_fast(s, samples_l, samples_r); + else + ret = wv_unpack_dsd_copy(s, samples_l, samples_r); + } else { + if (dsd_mode == 3) + ret = wv_unpack_dsd_high(s, samples_l, NULL); + else if (dsd_mode == 1) + ret = wv_unpack_dsd_fast(s, samples_l, NULL); + else + ret = wv_unpack_dsd_copy(s, samples_l, NULL); + + if (s->stereo) + memcpy(samples_r, samples_l, 4 * s->samples); + } + + ff_dsd2pcm_translate(&s->dsdctx[0], s->samples, 0, samples_l, 4, samples_l, 1); + + if (s->stereo) + ff_dsd2pcm_translate(&s->dsdctx[1], s->samples, 0, samples_r, 4, samples_r, 1); + + return ret; +} + +static void wavpack_decode_flush(AVCodecContext *avctx) +{ + WavpackContext *s = avctx->priv_data; + + for (int i = 0; i < s->fdec_num; i++) { + memset(s->fdec[i]->dsdctx[0].buf, 0x69, sizeof(s->fdec[i]->dsdctx[0].buf)); + memset(s->fdec[i]->dsdctx[1].buf, 0x69, sizeof(s->fdec[i]->dsdctx[1].buf)); + } +} + +static int wavpack_decode_frame(AVCodecContext *avctx, void *data, + int *got_frame_ptr, AVPacket *avpkt) +{ + WavpackContext *s = avctx->priv_data; + const uint8_t *buf = avpkt->data; + int buf_size = avpkt->size; + AVFrame *frame = data; + int frame_size, ret, frame_flags; + + if (avpkt->size <= WV_HEADER_SIZE) + return AVERROR_INVALIDDATA; + + 
s->block = 0; + s->ch_offset = 0; + + /* determine number of samples */ + s->samples = AV_RL32(buf + 20); + frame_flags = AV_RL32(buf + 24); + if (s->samples <= 0 || s->samples > WV_MAX_SAMPLES) { + av_log(avctx, AV_LOG_ERROR, "Invalid number of samples: %d\n", + s->samples); + return AVERROR_INVALIDDATA; + } + + if (!(frame_flags & WV_DSD_DATA)) { + av_log(avctx, AV_LOG_ERROR, "Encountered a non-DSD frame\n"); + return AVERROR_INVALIDDATA; + } + + avctx->sample_fmt = AV_SAMPLE_FMT_FLTP; + + while (buf_size > 0) { + if (buf_size <= WV_HEADER_SIZE) + break; + frame_size = AV_RL32(buf + 4) - 12; + buf += 20; + buf_size -= 20; + if (frame_size <= 0 || frame_size > buf_size) { + av_log(avctx, AV_LOG_ERROR, + "Block %d has invalid size (size %d vs. %d bytes left)\n", + s->block, frame_size, buf_size); + return AVERROR_INVALIDDATA; + } + if ((ret = wavpack_decode_block(avctx, s->block, + frame, buf, frame_size)) < 0) { + return ret; + } + s->block++; + buf += frame_size; + buf_size -= frame_size; + } + + if (s->ch_offset != avctx->channels) { + av_log(avctx, AV_LOG_ERROR, "Not enough channels coded in a packet.\n"); + return AVERROR_INVALIDDATA; + } + + *got_frame_ptr = 1; + + return avpkt->size; +} + +AVCodec ff_wavpack_dsd_decoder = { + .name = "wavpack_dsd", + .long_name = NULL_IF_CONFIG_SMALL("WavPack DSD"), + .type = AVMEDIA_TYPE_AUDIO, + .id = AV_CODEC_ID_WAVPACK_DSD, + .priv_data_size = sizeof(WavpackContext), + .init = wavpack_decode_init, + .close = wavpack_decode_end, + .decode = wavpack_decode_frame, + .flush = wavpack_decode_flush, + .capabilities = AV_CODEC_CAP_DR1, +}; diff --git a/libavformat/wvdec.c b/libavformat/wvdec.c index 649791d..50f4079 100644 --- a/libavformat/wvdec.c +++ b/libavformat/wvdec.c @@ -79,7 +79,7 @@ static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb) { WVContext *wc = ctx->priv_data; int ret; - int rate, bpp, chan; + int rate, rate_x, bpp, chan; uint32_t chmask, flags; wc->pos = avio_tell(pb); @@ -98,11 +98,6 @@ 
static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb) return ret; } - if (wc->header.flags & WV_DSD) { - avpriv_report_missing_feature(ctx, "WV DSD"); - return AVERROR_PATCHWELCOME; - } - if (wc->header.version < 0x402 || wc->header.version > 0x410) { avpriv_report_missing_feature(ctx, "WV version 0x%03X", wc->header.version); @@ -115,7 +110,8 @@ static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb) return 0; // parse flags flags = wc->header.flags; - bpp = ((flags & 3) + 1) << 3; + rate_x = (flags & WV_DSD) ? 4 : 1; + bpp = (flags & WV_DSD) ? 0 : ((flags & 3) + 1) << 3; chan = 1 + !(flags & WV_MONO); chmask = flags & WV_MONO ? AV_CH_LAYOUT_MONO : AV_CH_LAYOUT_STEREO; rate = wv_rates[(flags >> 23) & 0xF]; @@ -124,7 +120,7 @@ static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb) chan = wc->chan; chmask = wc->chmask; } - if ((rate == -1 || !chan) && !wc->block_parsed) { + if ((rate == -1 || !chan || flags & WV_DSD) && !wc->block_parsed) { int64_t block_end = avio_tell(pb) + wc->header.blocksize; if (!(pb->seekable & AVIO_SEEKABLE_NORMAL)) { av_log(ctx, AV_LOG_ERROR, @@ -177,6 +173,16 @@ static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb) return AVERROR_INVALIDDATA; } break; + case 0xE: + if (size <= 1) { + av_log(ctx, AV_LOG_ERROR, + "Invalid DSD block\n"); + return AVERROR_INVALIDDATA; + } + rate_x = 1 << avio_r8(pb); + if (size) + avio_skip(pb, size-1); + break; case 0x27: rate = avio_rl24(pb); break; @@ -200,7 +206,7 @@ static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb) if (!wc->chmask) wc->chmask = chmask; if (!wc->rate) - wc->rate = rate; + wc->rate = rate * rate_x; if (flags && bpp != wc->bpp) { av_log(ctx, AV_LOG_ERROR, @@ -214,10 +220,10 @@ static int wv_read_block_header(AVFormatContext *ctx, AVIOContext *pb) chan, wc->chan); return AVERROR_INVALIDDATA; } - if (flags && rate != -1 && rate != wc->rate) { + if (flags && rate != -1 && !(flags & WV_DSD) && rate * rate_x != 
wc->rate) { av_log(ctx, AV_LOG_ERROR, "Sampling rate differ, this block: %i, header block: %i\n", - rate, wc->rate); + rate * rate_x, wc->rate); return AVERROR_INVALIDDATA; } return 0; @@ -245,7 +251,9 @@ static int wv_read_header(AVFormatContext *s) if (!st) return AVERROR(ENOMEM); st->codecpar->codec_type = AVMEDIA_TYPE_AUDIO; - st->codecpar->codec_id = AV_CODEC_ID_WAVPACK; + st->codecpar->codec_id = wc->header.flags & WV_DSD ? + AV_CODEC_ID_WAVPACK_DSD : + AV_CODEC_ID_WAVPACK; st->codecpar->channels = wc->chan; st->codecpar->channel_layout = wc->chmask; st->codecpar->sample_rate = wc->rate;