[FFmpeg-devel,RFC,1/3] aacdec: always skip the first 2048 samples if there's no side data

Message ID Ne7-l4q--3-9@lynne.ee
State New
Series [FFmpeg-devel,RFC,1/3] aacdec: always skip the first 2048 samples if there's no side data

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 fail Make fate failed

Commit Message

Lynne Sept. 12, 2023, 6:10 a.m. UTC
For some reason, this was never set, which meant all **raw** AAC in ADTS
streams, except faac, had extra samples at the start.

Despite this being a standard MDCT-based codec with a frame size of 1024,
hence a delay of 1024 samples at the start, all major encoders, excluding
faac and FFmpeg, use 2048 samples of padding.

The FFmpeg encoder will be modified to also output 2048 samples of padding
at the start, to make it in line with other encoders.

Yes, this leaves FATE pretty sad. Will fix it with the real version of the patch.
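
As a rough illustration of what the skip count means for decoder output, here is a minimal sketch, not the FFmpeg implementation. trim_priming is a hypothetical helper; it assumes 1024-sample frames and that the skip count is consumed from the very start of the stream.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not FFmpeg API: returns how many samples of a decoded
 * frame remain after discarding priming samples. *skip holds the number of
 * samples still to be dropped and is reduced by what this frame consumed. */
static int trim_priming(int frame_samples, int64_t *skip)
{
    assert(frame_samples >= 0 && *skip >= 0);

    int64_t drop = *skip < frame_samples ? *skip : frame_samples;

    *skip -= drop;
    return frame_samples - (int)drop;
}
```

With a skip of 2048, the first two 1024-sample frames are dropped entirely and the third is the first one output in full. With a skip of 1024, only the first frame is dropped.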

Comments

Lynne Sept. 12, 2023, 6:15 a.m. UTC | #1
Sep 12, 2023, 08:14 by dev@lynne.ee:

> As it happens, there's no standard for the SBR startup delay between
> decoders either. libfdk-aac uses 5056 samples, but Apple's encoder (via afconvert)
> uses 3136.
>
> Currently, this only fixes libfdk-aac. Would like to have more samples from more
> encoders so I can fix all known cases.
>

Wrong patch attached.
Andreas Rheinhardt Sept. 12, 2023, 7:10 a.m. UTC | #2
Lynne:
> For some reason, this was never set, which meant all **raw** AAC in ADTS
> streams, except faac, had extra samples at the start.
> 
> Despite this being a standard MDCT-based codec with a frame size of 1024,
> hence a delay of 1024 samples at the start, all major encoders, excluding
> faac and FFmpeg, use 2048 samples of padding.
> 
> The FFmpeg encoder will be modified to also output 2048 samples of padding
> at the start, to make it in line with other encoders.

Does this also have actual advantages besides "being in line with other
encoders"?

> 
> Yes, this leaves FATE pretty sad. Will fix it with the real version of the patch.
> 

Didn't we once guess the number of skip samples like this, only for this
guesswork to be removed intentionally? (This is not a rhetorical
question; I thought it to be true, but I see that there is still code
for faac in decode_fill(); maybe I misremember.)

- Andreas
Lynne Sept. 12, 2023, 4:25 p.m. UTC | #3
Sep 12, 2023, 09:43 by andreas.rheinhardt@outlook.com:

> Lynne:
>
>> For some reason, this was never set, which meant all **raw** AAC in ADTS
>> streams, except faac, had extra samples at the start.
>>
>> Despite this being a standard MDCT-based codec with a frame size of 1024,
>> hence a delay of 1024 samples at the start, all major encoders, excluding
>> faac and FFmpeg, use 2048 samples of padding.
>>
>> The FFmpeg encoder will be modified to also output 2048 samples of padding
>> at the start, to make it in line with other encoders.
>>
>
> Does this also have actual advantages besides "being in line with other
> encoders"?
>

Not really. I don't have an opinion on this. 1024 is the natural
delay of the codec, so maybe it would be best to leave it at that.


>> Yes, this leaves FATE pretty sad. Will fix it with the real version of the patch.
>>
>
> Didn't we once guess the number of skip samples like this, only for this
> guesswork to be removed intentionally? (This is not a rhetorical
> question; I thought it to be true, but I see that there is still code
> for faac in decode_fill(); maybe I misremember.)
>

I don't remember something like that. The faac workaround dates back
to 2012 (bfe735b5824c7d10ba42932a17d786db50e3b2d4), and it's only for faac.
It's less of a guess, as most encoders use the FIL extension to signal
themselves.
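
The encoder-string check described here can be sketched roughly as follows. This is an illustration under assumptions, not the decoder's code: guess_skip_samples is a hypothetical name, the two format strings are the ones that appear in the patch, and the function returns a value where the patch adjusts skip_samples in place.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper mirroring the idea in the patch: start from a default
 * of 2048 priming samples and fall back to 1024 when the text carried in the
 * fill element matches an encoder pattern listed in the patch. */
static int guess_skip_samples(const char *fill_text)
{
    int major, minor, micro;
    int skip = 2048; /* default proposed by the patch */

    assert(fill_text != NULL);

    if (sscanf(fill_text, "libfaac %d.%d", &major, &minor) == 2)
        skip = 1024;
    else if (sscanf(fill_text, "avc %d.%d.%d", &major, &minor, &micro) == 3)
        skip = 1024;

    return skip;
}
```

If the fill element is missing or has been stripped, no pattern matches and the 2048 default stays in effect, which is the case discussed later in the thread.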
Thierry Foucu Sept. 12, 2023, 9:24 p.m. UTC | #4
On Tue, Sep 12, 2023 at 9:25 AM Lynne <dev@lynne.ee> wrote:

> Sep 12, 2023, 09:43 by andreas.rheinhardt@outlook.com:
>
> > Lynne:
> >
> >> For some reason, this was never set, which meant all **raw** AAC in ADTS
> >> streams, except faac, had extra samples at the start.
> >>
> >> Despite this being a standard MDCT-based codec with a frame size of
> 1024,
> >> hence a delay of 1024 samples at the start, all major encoders,
> excluding
> >> faac and FFmpeg, use 2048 samples of padding.
> >>
> >> The FFmpeg encoder will be modified to also output 2048 samples of
> padding
> >> at the start, to make it in line with other encoders.
> >>
> >
> > Does this also have actual advantages besides "being in line with other
> > encoders"?
> >
>
> Not really. I don't have an opinion on this. 1024 is the natural
> delay of the codec, so maybe it would be best to leave it at that.
>
>

Note:
Not all encoders add 2048. Another version of the Fraunhofer encoder adds
only 1600 samples, and for HE-AAC the same encoder adds 3200 samples.
Should we not then have an option to set it?
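
One way such an option could interact with detection, as a sketch only: resolve_skip_samples and SKIP_UNSET are hypothetical names, and the precedence shown (user value, then detected value, then the 2048 default) is an assumption rather than anything decided in this thread.

```c
#include <assert.h>

#define SKIP_UNSET (-1) /* hypothetical sentinel: no value supplied */

/* Hypothetical resolution order: an explicit user value wins, then a value
 * detected from the stream, then the 2048-sample default from the patch. */
static int resolve_skip_samples(int user_value, int detected_value)
{
    assert(user_value >= SKIP_UNSET && detected_value >= SKIP_UNSET);

    if (user_value != SKIP_UNSET)
        return user_value;
    if (detected_value != SKIP_UNSET)
        return detected_value;
    return 2048;
}
```

A user who knows the stream came from the 1600-sample encoder would pass 1600 and override whatever the decoder guessed; leaving the option unset keeps the current behaviour.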


> >> Yes, this leaves FATE pretty sad. Will fix it with the real version of
> the patch.
> >>
> >
> > Didn't we once guess the number of skip samples like this, only for this
> > guesswork to be removed intentionally? (This is not a rhetorical
> > question; I thought it to be true, but I see that there is still code
> > for faac in decode_fill(); maybe I misremember.)
> >
>
> I don't remember something like that. The faac workaround dates back
> to 2012 (bfe735b5824c7d10ba42932a17d786db50e3b2d4), and it's only for
> faac.
> It's less of a guess, as most encoders use the FIL extension to signal
> themselves.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
Lynne Oct. 3, 2023, 4:09 a.m. UTC | #5
Sep 12, 2023, 23:25 by tfoucu@gmail.com:

> On Tue, Sep 12, 2023 at 9:25 AM Lynne <dev@lynne.ee> wrote:
>
>> Sep 12, 2023, 09:43 by andreas.rheinhardt@outlook.com:
>>
>> > Lynne:
>> >
>> >> For some reason, this was never set, which meant all **raw** AAC in ADTS
>> >> streams, except faac, had extra samples at the start.
>> >>
>> >> Despite this being a standard MDCT-based codec with a frame size of
>> 1024,
>> >> hence a delay of 1024 samples at the start, all major encoders,
>> excluding
>> >> faac and FFmpeg, use 2048 samples of padding.
>> >>
>> >> The FFmpeg encoder will be modified to also output 2048 samples of
>> padding
>> >> at the start, to make it in line with other encoders.
>> >>
>> >
>> > Does this also have actual advantages besides "being in line with other
>> > encoders"?
>> >
>>
>> Not really. I don't have an opinion on this. 1024 is the natural
>> delay of the codec, so maybe it would be best to leave it at that.
>>
>>
> Note:
> Not all encoders add 2048. Another version of the Fraunhofer encoder adds
> only 1600 samples, and for HE-AAC the same encoder adds 3200 samples.
> Should we not then have an option to set it?
>

That sounds reasonable, I've resent the patch.
Do you think it's reasonable to go with 2048 samples?

It does cut off the default AAC decoder, in case the FIL extension
has been stripped.

Patch

From 079235e1f1a9caeadfd2b8d78b3fe2273d86018a Mon Sep 17 00:00:00 2001
From: Lynne <dev@lynne.ee>
Date: Fri, 11 Aug 2023 17:50:54 +0200
Subject: [PATCH 1/3] aacdec: always skip the first 2048 samples if there's no
 side data

For some reason, this was never set, which meant all **raw** AAC in ADTS
streams, except faac, had extra samples at the start.

Despite this being a standard MDCT-based codec with a frame size of 1024,
hence a delay of 1024 samples at the start, all major encoders, excluding
faac and FFmpeg, use 2048 samples of padding.

The FFmpeg encoder will be modified to also output 2048 samples of padding
at the start, to make it in line with other encoders.
---
 libavcodec/aacdec_template.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/libavcodec/aacdec_template.c b/libavcodec/aacdec_template.c
index f8039e490b..0e4a274fea 100644
--- a/libavcodec/aacdec_template.c
+++ b/libavcodec/aacdec_template.c
@@ -1273,6 +1273,9 @@  static av_cold int aac_decode_init(AVCodecContext *avctx)
     if (ret < 0)
         return ret;
 
+    /* Usually overridden by side data */
+    avctx->internal->skip_samples = 2048;
+
     return 0;
 }
 
@@ -2417,14 +2420,16 @@  static int decode_dynamic_range(DynamicRangeControl *che_drc,
     return n;
 }
 
-static int decode_fill(AACContext *ac, GetBitContext *gb, int len) {
+static int decode_fill(AACContext *ac, GetBitContext *gb, int len)
+{
     uint8_t buf[256];
-    int i, major, minor;
+    int i, major, minor, micro;
 
     if (len < 13+7*8)
         goto unknown;
 
-    get_bits(gb, 13); len -= 13;
+    get_bits(gb, 13);
+    len -= 13;
 
     for(i=0; i+1<sizeof(buf) && len>=8; i++, len-=8)
         buf[i] = get_bits(gb, 8);
@@ -2434,7 +2439,11 @@  static int decode_fill(AACContext *ac, GetBitContext *gb, int len) {
         av_log(ac->avctx, AV_LOG_DEBUG, "FILL:%s\n", buf);
 
     if (sscanf(buf, "libfaac %d.%d", &major, &minor) == 2){
-        ac->avctx->internal->skip_samples = 1024;
+        ac->avctx->internal->skip_samples -= 1024;
+    }
+
+    if ((sscanf(buf, "avc %d.%d.%d", &major, &minor, &micro) == 3)) {
+        ac->avctx->internal->skip_samples -= 1024;
     }
 
 unknown:
-- 
2.40.1