diff mbox series

[FFmpeg-devel] libswresample: avoid s16p internal processing format

Message ID CAPYw7P4DhGPgcTfDKsN8DV4-_v0K=d-MATMWo8+MXc9xHLM=gw@mail.gmail.com
State New
Headers show
Series [FFmpeg-devel] libswresample: avoid s16p internal processing format | expand

Checks

Context Check Description
yinshiyou/configure_loongarch64 warning Failed to apply patch
andriy/make_x86 success Make finished
andriy/make_fate_x86 fail Make fate failed

Commit Message

Paul B Mahol Jan. 5, 2023, 12:44 p.m. UTC
Patch attached.

Comments

Michael Niedermayer Jan. 5, 2023, 8:53 p.m. UTC | #1
On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> Patch attached.

>  swresample.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> eee7a0685b44aa867562138a2e2437ecb8844612  0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00 2001
> From: Paul B Mahol <onemda@gmail.com>
> Date: Thu, 5 Jan 2023 13:40:12 +0100
> Subject: [PATCH] libswresample/swresample: avoid s16p internal transfer format
> 
> Instead use float one by default for sample rate conversions.
> The s16p internal transfer format produces visible and hearable
> quantization artifacts.

When does this occur and why?
This change should be limited to the case that benefits, this would force this
even without resampling in some cases.

thx

[...]
Paul B Mahol Jan. 5, 2023, 10:08 p.m. UTC | #2
On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> > Patch attached.
>
> >  swresample.c |    3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > eee7a0685b44aa867562138a2e2437ecb8844612
> 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00 2001
> > From: Paul B Mahol <onemda@gmail.com>
> > Date: Thu, 5 Jan 2023 13:40:12 +0100
> > Subject: [PATCH] libswresample/swresample: avoid s16p internal transfer
> format
> >
> > Instead use float one by default for sample rate conversions.
> > The s16p internal transfer format produces visible and hearable
> > quantization artifacts.
>
> When does this occur and why?
>

It occurs always. Just compare output with 16bit and int32/float/double.
Look at other people report on internet.
Look at src.infinitewave.ca


> This change should be limited to the case that benefits, this would force
> this
> even without resampling in some cases.
>

It is forced only if sample rates between input and output differs.


>
> thx
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The day soldiers stop bringing you their problems is the day you have
> stopped
> leading them. They have either lost confidence that you can help or
> concluded
> you do not care. Either case is a failure of leadership. - Colin Powell
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
Michael Niedermayer Jan. 6, 2023, 5:25 p.m. UTC | #3
On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
> On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <michael@niedermayer.cc>
> wrote:
> 
> > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> > > Patch attached.
> >
> > >  swresample.c |    3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > eee7a0685b44aa867562138a2e2437ecb8844612
> > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00 2001
> > > From: Paul B Mahol <onemda@gmail.com>
> > > Date: Thu, 5 Jan 2023 13:40:12 +0100
> > > Subject: [PATCH] libswresample/swresample: avoid s16p internal transfer
> > format
> > >
> > > Instead use float one by default for sample rate conversions.
> > > The s16p internal transfer format produces visible and hearable
> > > quantization artifacts.
> >
> > When does this occur and why?
> >
> 
> It occurs always. Just compare output with 16bit and int32/float/double.
> Look at other people report on internet.
> Look at src.infinitewave.ca

src.infinitewave.ca uses 32bit none of what it shows should touch the codepath
you change.

if we look at src.infinitewave.ca for swr we see 2 types of artifacts
1. Aliassing which is at maybe -120db with the actual signal at 0db
   i would like to see some evidence that a human can hear this
2. Reflection and attenuation at the transition frequency
With linear filters there is a tradeof between attenuation of the
passband, reflection of frequencies beyond, latency and so on
You can have a perfect sharp cutoff with no attenuation and no refelection
that requires a infinitly long filter. And while this looks best in this
frequency plot, does it actually sound best ? If you can hear -120db
signals you surely would then also hear the ringing long before a gunshot
from such long filter.

also what actually is the optimal frequency response of this filter ?
with a 22khz cutoff, a 22.1khz sine should be silence is that
really subjectively better than a 21.9khz sine ?
Iam not sure about this. Has someone done actual hearing tests with
actual real audio? the sinc filter originates from the idea of lossless
reconstruction of frequencies below nyquist if iam not mistaken, but humans
are not trying to losslessly restore a block of frequencies. A human listener
generally wants to enjoy listening to some media. Has someone looked into
what is actually best for that real use case ?
This question matters because with it we can tune the filter parameters to
target humans.

But lets push the doubts about choosing resampling purely based on frequency
analysis away.
swresample has several parameters with which you can tune this:
we have a filter_size, if thats bigger you should get closer to the ideal
sinc. Theres phase_shift which may reduce the (i assume) unhearable aliasing. 
And cutoff which should allow to tune the (i assume) hearable 
reflection/attenuation tradeoff also theres filter_type to allow you to tune the
window function. 

If there are issues reported by people using their ears, please provide more
details, iam interrested in these cases.


> 
> 
> > This change should be limited to the case that benefits, this would force
> > this
> > even without resampling in some cases.
> >
> 
> It is forced only if sample rates between input and output differs.

If iam not mistaken it affects rematrixing without resampling too

thx

[...]
Paul B Mahol Jan. 6, 2023, 6:01 p.m. UTC | #4
On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
> michael@niedermayer.cc>
> > wrote:
> >
> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> > > > Patch attached.
> > >
> > > >  swresample.c |    3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > eee7a0685b44aa867562138a2e2437ecb8844612
> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
> 2001
> > > > From: Paul B Mahol <onemda@gmail.com>
> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
> > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
> transfer
> > > format
> > > >
> > > > Instead use float one by default for sample rate conversions.
> > > > The s16p internal transfer format produces visible and hearable
> > > > quantization artifacts.
> > >
> > > When does this occur and why?
> > >
> >
> > It occurs always. Just compare output with 16bit and int32/float/double.
> > Look at other people report on internet.
> > Look at src.infinitewave.ca
>
> src.infinitewave.ca uses 32bit none of what it shows should touch the
> codepath
> you change.
>
> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
> 1. Aliassing which is at maybe -120db with the actual signal at 0db
>    i would like to see some evidence that a human can hear this
>

For s16p<->s16p it is much lower, around -78dB thus this patch.

Also for others and reports for swr its is lower than exact -120dB


2. Reflection and attenuation at the transition frequency
> With linear filters there is a tradeof between attenuation of the
> passband, reflection of frequencies beyond, latency and so on
> You can have a perfect sharp cutoff with no attenuation and no refelection
> that requires a infinitly long filter. And while this looks best in this
> frequency plot, does it actually sound best ? If you can hear -120db
> signals you surely would then also hear the ringing long before a gunshot
> from such long filter.
>
> also what actually is the optimal frequency response of this filter ?
> with a 22khz cutoff, a 22.1khz sine should be silence is that
> really subjectively better than a 21.9khz sine ?
> Iam not sure about this. Has someone done actual hearing tests with
> actual real audio? the sinc filter originates from the idea of lossless
> reconstruction of frequencies below nyquist if iam not mistaken, but humans
> are not trying to losslessly restore a block of frequencies. A human
> listener
> generally wants to enjoy listening to some media. Has someone looked into
> what is actually best for that real use case ?
> This question matters because with it we can tune the filter parameters to
> target humans.
>
> But lets push the doubts about choosing resampling purely based on
> frequency
> analysis away.
> swresample has several parameters with which you can tune this:
> we have a filter_size, if thats bigger you should get closer to the ideal
> sinc. Theres phase_shift which may reduce the (i assume) unhearable
> aliasing.
> And cutoff which should allow to tune the (i assume) hearable
> reflection/attenuation tradeoff also theres filter_type to allow you to
> tune the
> window function.
>
> If there are issues reported by people using their ears, please provide
> more
> details, iam interrested in these cases.
>
>
> >
> >
> > > This change should be limited to the case that benefits, this would
> force
> > > this
> > > even without resampling in some cases.
> > >
> >
> > It is forced only if sample rates between input and output differs.
>
> If iam not mistaken it affects rematrixing without resampling too
>

How so?
I really doubt that this patch do that.


>
> thx
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The real ebay dictionary, page 1
> "Used only once"    - "Some unspecified defect prevented a second use"
> "In good condition" - "Can be repaird by experienced expert"
> "As is" - "You wouldnt want it even if you were payed for it, if you knew
> ..."
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
Paul B Mahol Jan. 6, 2023, 6:04 p.m. UTC | #5
On Fri, Jan 6, 2023 at 7:01 PM Paul B Mahol <onemda@gmail.com> wrote:

>
>
> On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer <michael@niedermayer.cc>
> wrote:
>
>> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>> michael@niedermayer.cc>
>> > wrote:
>> >
>> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>> > > > Patch attached.
>> > >
>> > > >  swresample.c |    3 ++-
>> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > > > eee7a0685b44aa867562138a2e2437ecb8844612
>> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
>> 2001
>> > > > From: Paul B Mahol <onemda@gmail.com>
>> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
>> > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
>> transfer
>> > > format
>> > > >
>> > > > Instead use float one by default for sample rate conversions.
>> > > > The s16p internal transfer format produces visible and hearable
>> > > > quantization artifacts.
>> > >
>> > > When does this occur and why?
>> > >
>> >
>> > It occurs always. Just compare output with 16bit and int32/float/double.
>> > Look at other people report on internet.
>> > Look at src.infinitewave.ca
>>
>> src.infinitewave.ca uses 32bit none of what it shows should touch the
>> codepath
>> you change.
>>
>> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
>> 1. Aliassing which is at maybe -120db with the actual signal at 0db
>>    i would like to see some evidence that a human can hear this
>>
>
> For s16p<->s16p it is much lower, around -78dB thus this patch.
>
> Also for others and reports for swr its is lower than exact -120dB
>
>
> 2. Reflection and attenuation at the transition frequency
>> With linear filters there is a tradeof between attenuation of the
>> passband, reflection of frequencies beyond, latency and so on
>> You can have a perfect sharp cutoff with no attenuation and no refelection
>> that requires a infinitly long filter. And while this looks best in this
>> frequency plot, does it actually sound best ? If you can hear -120db
>> signals you surely would then also hear the ringing long before a gunshot
>> from such long filter.
>>
>
One can always change linear FIR to be min phase FIR kernel.


>
>> also what actually is the optimal frequency response of this filter ?
>> with a 22khz cutoff, a 22.1khz sine should be silence is that
>> really subjectively better than a 21.9khz sine ?
>> Iam not sure about this. Has someone done actual hearing tests with
>> actual real audio? the sinc filter originates from the idea of lossless
>> reconstruction of frequencies below nyquist if iam not mistaken, but
>> humans
>> are not trying to losslessly restore a block of frequencies. A human
>> listener
>> generally wants to enjoy listening to some media. Has someone looked into
>> what is actually best for that real use case ?
>> This question matters because with it we can tune the filter parameters to
>> target humans.
>>
>> But lets push the doubts about choosing resampling purely based on
>> frequency
>> analysis away.
>> swresample has several parameters with which you can tune this:
>> we have a filter_size, if thats bigger you should get closer to the ideal
>> sinc. Theres phase_shift which may reduce the (i assume) unhearable
>> aliasing.
>> And cutoff which should allow to tune the (i assume) hearable
>> reflection/attenuation tradeoff also theres filter_type to allow you to
>> tune the
>> window function.
>>
>> If there are issues reported by people using their ears, please provide
>> more
>> details, iam interrested in these cases.
>>
>>
>> >
>> >
>> > > This change should be limited to the case that benefits, this would
>> force
>> > > this
>> > > even without resampling in some cases.
>> > >
>> >
>> > It is forced only if sample rates between input and output differs.
>>
>> If iam not mistaken it affects rematrixing without resampling too
>>
>
> How so?
> I really doubt that this patch do that.
>
>
>>
>> thx
>>
>> [...]
>> --
>> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>>
>> The real ebay dictionary, page 1
>> "Used only once"    - "Some unspecified defect prevented a second use"
>> "In good condition" - "Can be repaird by experienced expert"
>> "As is" - "You wouldnt want it even if you were payed for it, if you knew
>> ..."
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>>
>
Michael Niedermayer Jan. 8, 2023, 2:45 p.m. UTC | #6
On Fri, Jan 06, 2023 at 07:01:06PM +0100, Paul B Mahol wrote:
> On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer <michael@niedermayer.cc>
> wrote:
> 
> > On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
> > > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
> > michael@niedermayer.cc>
> > > wrote:
> > >
> > > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> > > > > Patch attached.
> > > >
> > > > >  swresample.c |    3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > eee7a0685b44aa867562138a2e2437ecb8844612
> > > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> > > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
> > 2001
> > > > > From: Paul B Mahol <onemda@gmail.com>
> > > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
> > > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
> > transfer
> > > > format
> > > > >
> > > > > Instead use float one by default for sample rate conversions.
> > > > > The s16p internal transfer format produces visible and hearable
> > > > > quantization artifacts.
> > > >
> > > > When does this occur and why?
> > > >
> > >
> > > It occurs always. Just compare output with 16bit and int32/float/double.
> > > Look at other people report on internet.
> > > Look at src.infinitewave.ca
> >
> > src.infinitewave.ca uses 32bit none of what it shows should touch the
> > codepath
> > you change.
> >
> > if we look at src.infinitewave.ca for swr we see 2 types of artifacts
> > 1. Aliassing which is at maybe -120db with the actual signal at 0db
> >    i would like to see some evidence that a human can hear this
> >
> 
> For s16p<->s16p it is much lower, around -78dB thus this patch.

ok but you pointed to the website that apparently uses >=32bit if i trust
what they write.
And even if they test this i cannot use that website to replicate the issue
and the fix 

I just wanted to know how i can test this. You are testing it too ...
so i can see what you see
Id like to make sure this is the correct fix for the problem and
Id like to make sure its used when it makes sense and not when not.


> 
> Also for others and reports for swr its is lower than exact -120dB

The 120 was by "eye" from teh chart on  the web i didnt meassure it


> 
> 
> 2. Reflection and attenuation at the transition frequency
> > With linear filters there is a tradeof between attenuation of the
> > passband, reflection of frequencies beyond, latency and so on
> > You can have a perfect sharp cutoff with no attenuation and no refelection
> > that requires a infinitly long filter. And while this looks best in this
> > frequency plot, does it actually sound best ? If you can hear -120db
> > signals you surely would then also hear the ringing long before a gunshot
> > from such long filter.
> >
> > also what actually is the optimal frequency response of this filter ?
> > with a 22khz cutoff, a 22.1khz sine should be silence is that
> > really subjectively better than a 21.9khz sine ?
> > Iam not sure about this. Has someone done actual hearing tests with
> > actual real audio? the sinc filter originates from the idea of lossless
> > reconstruction of frequencies below nyquist if iam not mistaken, but humans
> > are not trying to losslessly restore a block of frequencies. A human
> > listener
> > generally wants to enjoy listening to some media. Has someone looked into
> > what is actually best for that real use case ?
> > This question matters because with it we can tune the filter parameters to
> > target humans.
> >
> > But lets push the doubts about choosing resampling purely based on
> > frequency
> > analysis away.
> > swresample has several parameters with which you can tune this:
> > we have a filter_size, if thats bigger you should get closer to the ideal
> > sinc. Theres phase_shift which may reduce the (i assume) unhearable
> > aliasing.
> > And cutoff which should allow to tune the (i assume) hearable
> > reflection/attenuation tradeoff also theres filter_type to allow you to
> > tune the
> > window function.
> >
> > If there are issues reported by people using their ears, please provide
> > more
> > details, iam interrested in these cases.
> >
> >
> > >
> > >
> > > > This change should be limited to the case that benefits, this would
> > force
> > > > this
> > > > even without resampling in some cases.
> > > >
> > >
> > > It is forced only if sample rates between input and output differs.
> >
> > If iam not mistaken it affects rematrixing without resampling too
> >
> 
> How so?
> I really doubt that this patch do that.

I could be missing something but 
int_sample_fmt is set to before 16bit and afterwards 32bit
and alot of things are using this:
    set_audiodata_fmt(&s->postin, s->int_sample_fmt);
    set_audiodata_fmt(&s->midbuf, s->int_sample_fmt);
    set_audiodata_fmt(&s->preout, s->int_sample_fmt);

rematrix seems using these
            swri_rematrix(s, preout, midbuf, out_count, preout==out);
...
            swri_rematrix(s, midbuf, postin, in_count, midbuf==out);
            
so i assumed that this patch makes a difference for it. Again i could be missing
something
            
thx

[...]
Michael Niedermayer Jan. 8, 2023, 2:52 p.m. UTC | #7
On Fri, Jan 06, 2023 at 07:04:59PM +0100, Paul B Mahol wrote:
> On Fri, Jan 6, 2023 at 7:01 PM Paul B Mahol <onemda@gmail.com> wrote:
> 
> >
> >
> > On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer <michael@niedermayer.cc>
> > wrote:
> >
> >> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
> >> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
> >> michael@niedermayer.cc>
> >> > wrote:
> >> >
> >> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> >> > > > Patch attached.
> >> > >
> >> > > >  swresample.c |    3 ++-
> >> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> > > > eee7a0685b44aa867562138a2e2437ecb8844612
> >> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> >> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
> >> 2001
> >> > > > From: Paul B Mahol <onemda@gmail.com>
> >> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
> >> > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
> >> transfer
> >> > > format
> >> > > >
> >> > > > Instead use float one by default for sample rate conversions.
> >> > > > The s16p internal transfer format produces visible and hearable
> >> > > > quantization artifacts.
> >> > >
> >> > > When does this occur and why?
> >> > >
> >> >
> >> > It occurs always. Just compare output with 16bit and int32/float/double.
> >> > Look at other people report on internet.
> >> > Look at src.infinitewave.ca
> >>
> >> src.infinitewave.ca uses 32bit none of what it shows should touch the
> >> codepath
> >> you change.
> >>
> >> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
> >> 1. Aliassing which is at maybe -120db with the actual signal at 0db
> >>    i would like to see some evidence that a human can hear this
> >>
> >
> > For s16p<->s16p it is much lower, around -78dB thus this patch.
> >
> > Also for others and reports for swr its is lower than exact -120dB
> >
> >
> > 2. Reflection and attenuation at the transition frequency
> >> With linear filters there is a tradeof between attenuation of the
> >> passband, reflection of frequencies beyond, latency and so on
> >> You can have a perfect sharp cutoff with no attenuation and no refelection
> >> that requires a infinitly long filter. And while this looks best in this
> >> frequency plot, does it actually sound best ? If you can hear -120db
> >> signals you surely would then also hear the ringing long before a gunshot
> >> from such long filter.
> >>
> >
> One can always change linear FIR to be min phase FIR kernel.

I certainly would welcome a wider range of filters in swr, if you want to add
any low delay sinc approximation or in fact i would welcome any filter
you want to add.

thx

[...]
Paul B Mahol Jan. 8, 2023, 3:18 p.m. UTC | #8
On 1/8/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Fri, Jan 06, 2023 at 07:01:06PM +0100, Paul B Mahol wrote:
>> On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
>> <michael@niedermayer.cc>
>> wrote:
>>
>> > On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>> > > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>> > michael@niedermayer.cc>
>> > > wrote:
>> > >
>> > > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>> > > > > Patch attached.
>> > > >
>> > > > >  swresample.c |    3 ++-
>> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > > > > eee7a0685b44aa867562138a2e2437ecb8844612
>> > > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>> > > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
>> > 2001
>> > > > > From: Paul B Mahol <onemda@gmail.com>
>> > > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
>> > > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
>> > transfer
>> > > > format
>> > > > >
>> > > > > Instead use float one by default for sample rate conversions.
>> > > > > The s16p internal transfer format produces visible and hearable
>> > > > > quantization artifacts.
>> > > >
>> > > > When does this occur and why?
>> > > >
>> > >
>> > > It occurs always. Just compare output with 16bit and
>> > > int32/float/double.
>> > > Look at other people report on internet.
>> > > Look at src.infinitewave.ca
>> >
>> > src.infinitewave.ca uses 32bit none of what it shows should touch the
>> > codepath
>> > you change.
>> >
>> > if we look at src.infinitewave.ca for swr we see 2 types of artifacts
>> > 1. Aliassing which is at maybe -120db with the actual signal at 0db
>> >    i would like to see some evidence that a human can hear this
>> >
>>
>> For s16p<->s16p it is much lower, around -78dB thus this patch.
>
> ok but you pointed to the website that apparently uses >=32bit if i trust
> what they write.
> And even if they test this i cannot use that website to replicate the issue
> and the fix

If one use pure 16bit processing sweep results are even worse.

Just resample using fltp/dblp/s32p and s16p and compare (it does not
matter what, just not simple sine constant frequencies waves)

The s16p result is much worse and contains huge quantization noises.

They are not that obviously easy to hear, but are there, and
difference is > -80dB for dithered 16bit input.

You can generate and display sweep with/out resampling all with
ffmpeg/ffplay.ffplay -f lavfi -i
aevalsrc="sin(PI*t*t*t*100):s=96000",aresample=44100:tsf=s16p,showspectrum=scale=log:color=cool:overlap=0:fps=25:drange=96:legend=1

Play with tsf=values and drange=[96-150]
So, for 16bit input, drange=96 and tsf=s16p looks fine, but that web
pages shows bad results,
thus I think that they use >16bit and tsf=s16p.

But anyway the filter kernel is quantized to 16bit for s16p and if
16bit input is dithered results are also worse for s16p, because if
you reformat s16p to flt/s32p and resample and then convert back to
s16p errors are bigger than 1/2 bits range when compared to direct s16
resampling.



>
> I just wanted to know how i can test this. You are testing it too ...
> so i can see what you see
> Id like to make sure this is the correct fix for the problem and
> Id like to make sure its used when it makes sense and not when not.
>
>
>>
>> Also for others and reports for swr its is lower than exact -120dB
>
> The 120 was by "eye" from teh chart on  the web i didnt meassure it
>
>
>>
>>
>> 2. Reflection and attenuation at the transition frequency
>> > With linear filters there is a tradeof between attenuation of the
>> > passband, reflection of frequencies beyond, latency and so on
>> > You can have a perfect sharp cutoff with no attenuation and no
>> > refelection
>> > that requires a infinitly long filter. And while this looks best in
>> > this
>> > frequency plot, does it actually sound best ? If you can hear -120db
>> > signals you surely would then also hear the ringing long before a
>> > gunshot
>> > from such long filter.
>> >
>> > also what actually is the optimal frequency response of this filter ?
>> > with a 22khz cutoff, a 22.1khz sine should be silence is that
>> > really subjectively better than a 21.9khz sine ?
>> > Iam not sure about this. Has someone done actual hearing tests with
>> > actual real audio? the sinc filter originates from the idea of lossless
>> > reconstruction of frequencies below nyquist if iam not mistaken, but
>> > humans
>> > are not trying to losslessly restore a block of frequencies. A human
>> > listener
>> > generally wants to enjoy listening to some media. Has someone looked
>> > into
>> > what is actually best for that real use case ?
>> > This question matters because with it we can tune the filter parameters
>> > to
>> > target humans.
>> >
>> > But lets push the doubts about choosing resampling purely based on
>> > frequency
>> > analysis away.
>> > swresample has several parameters with which you can tune this:
>> > we have a filter_size, if thats bigger you should get closer to the
>> > ideal
>> > sinc. Theres phase_shift which may reduce the (i assume) unhearable
>> > aliasing.
>> > And cutoff which should allow to tune the (i assume) hearable
>> > reflection/attenuation tradeoff also theres filter_type to allow you to
>> > tune the
>> > window function.
>> >
>> > If there are issues reported by people using their ears, please provide
>> > more
>> > details, iam interrested in these cases.
>> >
>> >
>> > >
>> > >
>> > > > This change should be limited to the case that benefits, this would
>> > force
>> > > > this
>> > > > even without resampling in some cases.
>> > > >
>> > >
>> > > It is forced only if sample rates between input and output differs.
>> >
>> > If iam not mistaken it affects rematrixing without resampling too
>> >
>>
>> How so?
>> I really doubt that this patch do that.
>
> I could be missing something but
> int_sample_fmt is set to before 16bit and afterwards 32bit
> and alot of things are using this:
>     set_audiodata_fmt(&s->postin, s->int_sample_fmt);
>     set_audiodata_fmt(&s->midbuf, s->int_sample_fmt);
>     set_audiodata_fmt(&s->preout, s->int_sample_fmt);
>
> rematrix seems using these
>             swri_rematrix(s, preout, midbuf, out_count, preout==out);
> ...
>             swri_rematrix(s, midbuf, postin, in_count, midbuf==out);
>
> so i assumed that this patch makes a difference for it. Again i could be
> missing
> something

Yes, but only if sample rates differ the fltp is used instead of s16p
for resampling.

>
> thx
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The educated differ from the uneducated as much as the living from the
> dead. -- Aristotle
>
Paul B Mahol Jan. 8, 2023, 3:27 p.m. UTC | #9
On 1/8/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Fri, Jan 06, 2023 at 07:04:59PM +0100, Paul B Mahol wrote:
>> On Fri, Jan 6, 2023 at 7:01 PM Paul B Mahol <onemda@gmail.com> wrote:
>>
>> >
>> >
>> > On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
>> > <michael@niedermayer.cc>
>> > wrote:
>> >
>> >> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>> >> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>> >> michael@niedermayer.cc>
>> >> > wrote:
>> >> >
>> >> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>> >> > > > Patch attached.
>> >> > >
>> >> > > >  swresample.c |    3 ++-
>> >> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >> > > > eee7a0685b44aa867562138a2e2437ecb8844612
>> >> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>> >> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17
>> >> > > > 00:00:00
>> >> 2001
>> >> > > > From: Paul B Mahol <onemda@gmail.com>
>> >> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
>> >> > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
>> >> transfer
>> >> > > format
>> >> > > >
>> >> > > > Instead use float one by default for sample rate conversions.
>> >> > > > The s16p internal transfer format produces visible and hearable
>> >> > > > quantization artifacts.
>> >> > >
>> >> > > When does this occur and why?
>> >> > >
>> >> >
>> >> > It occurs always. Just compare output with 16bit and
>> >> > int32/float/double.
>> >> > Look at other people report on internet.
>> >> > Look at src.infinitewave.ca
>> >>
>> >> src.infinitewave.ca uses 32bit none of what it shows should touch the
>> >> codepath
>> >> you change.
>> >>
>> >> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
>> >> 1. Aliassing which is at maybe -120db with the actual signal at 0db
>> >>    i would like to see some evidence that a human can hear this
>> >>
>> >
>> > For s16p<->s16p it is much lower, around -78dB thus this patch.
>> >
>> > Also for others and reports for swr its is lower than exact -120dB
>> >
>> >
>> > 2. Reflection and attenuation at the transition frequency
>> >> With linear filters there is a tradeof between attenuation of the
>> >> passband, reflection of frequencies beyond, latency and so on
>> >> You can have a perfect sharp cutoff with no attenuation and no
>> >> refelection
>> >> that requires a infinitly long filter. And while this looks best in
>> >> this
>> >> frequency plot, does it actually sound best ? If you can hear -120db
>> >> signals you surely would then also hear the ringing long before a
>> >> gunshot
>> >> from such long filter.
>> >>
>> >
>> One can always change linear FIR to be min phase FIR kernel.
>
> I certainly would welcome a wider range of filters in swr, if you want to
> add
> any low delay sinc approximation or in fact i would welcome any filter
> you want to add.

I have hack for simple cases when one of decimation/interpolation factor is 1
than there is single only FIR kernel applied and then one do not use
naive convolution which is O(N^2) but O(Nlog(N)) when using RDFT/FFT.

For bigger filter sizes it makes filtering amazingly faster. (from
<80x to >400x).

So I'm interested in making it faster if possible for any combination
of decimation/interpolation.

>
> thx
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The real ebay dictionary, page 1
> "Used only once"    - "Some unspecified defect prevented a second use"
> "In good condition" - "Can be repaird by experienced expert"
> "As is" - "You wouldnt want it even if you were payed for it, if you knew
> ..."
>
Tobias Rapp Jan. 9, 2023, 10:41 a.m. UTC | #10
On 08/01/2023 15:45, Michael Niedermayer wrote:

> On Fri, Jan 06, 2023 at 07:01:06PM +0100, Paul B Mahol wrote:
>> On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer <michael@niedermayer.cc>
>> wrote:
>>
>>> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>>>> On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>>> michael@niedermayer.cc>
>>>> wrote:
>>>>
>>>>> On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>>>>>> Patch attached.
>>>>>>   swresample.c |    3 ++-
>>>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>> eee7a0685b44aa867562138a2e2437ecb8844612
>>>>> 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>>>>>>  From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
>>> 2001
>>>>>> From: Paul B Mahol <onemda@gmail.com>
>>>>>> Date: Thu, 5 Jan 2023 13:40:12 +0100
>>>>>> Subject: [PATCH] libswresample/swresample: avoid s16p internal
>>> transfer
>>>>> format
>>>>>> Instead use float one by default for sample rate conversions.
>>>>>> The s16p internal transfer format produces visible and hearable
>>>>>> quantization artifacts.
>>>>> When does this occur and why?
>>>>>
>>>> It occurs always. Just compare output with 16bit and int32/float/double.
>>>> Look at other people report on internet.
>>>> Look at src.infinitewave.ca
>>> src.infinitewave.ca uses 32bit none of what it shows should touch the
>>> codepath
>>> you change.
>>>
>>> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
>>> 1. Aliassing which is at maybe -120db with the actual signal at 0db
>>>     i would like to see some evidence that a human can hear this
>>>
>> For s16p<->s16p it is much lower, around -78dB thus this patch.
> ok but you pointed to the website that apparently uses >=32bit if i trust
> what they write.
> And even if they test this i cannot use that website to replicate the issue
> and the fix
>
> I just wanted to know how i can test this. You are testing it too ...
> so i can see what you see
> Id like to make sure this is the correct fix for the problem and
> Id like to make sure its used when it makes sense and not when not.

IIRC a similar sweep graph can be created with "Spek" from 
http://spek.cc/ but haven't used that application for a while. The SRC 
comparison website apparently provides the input files for testing, see 
the instructions at http://src.infinitewave.ca/faq.html.

Regards, Tobias
Paul B Mahol Jan. 9, 2023, 12:25 p.m. UTC | #11
On 1/9/23, Tobias Rapp <t.rapp@noa-archive.com> wrote:
> On 08/01/2023 15:45, Michael Niedermayer wrote:
>
>> On Fri, Jan 06, 2023 at 07:01:06PM +0100, Paul B Mahol wrote:
>>> On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
>>> <michael@niedermayer.cc>
>>> wrote:
>>>
>>>> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>>>>> On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>>>> michael@niedermayer.cc>
>>>>> wrote:
>>>>>
>>>>>> On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>>>>>>> Patch attached.
>>>>>>>   swresample.c |    3 ++-
>>>>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>> eee7a0685b44aa867562138a2e2437ecb8844612
>>>>>> 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>>>>>>>  From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
>>>> 2001
>>>>>>> From: Paul B Mahol <onemda@gmail.com>
>>>>>>> Date: Thu, 5 Jan 2023 13:40:12 +0100
>>>>>>> Subject: [PATCH] libswresample/swresample: avoid s16p internal
>>>> transfer
>>>>>> format
>>>>>>> Instead use float one by default for sample rate conversions.
>>>>>>> The s16p internal transfer format produces visible and hearable
>>>>>>> quantization artifacts.
>>>>>> When does this occur and why?
>>>>>>
>>>>> It occurs always. Just compare output with 16bit and
>>>>> int32/float/double.
>>>>> Look at other people report on internet.
>>>>> Look at src.infinitewave.ca
>>>> src.infinitewave.ca uses 32bit none of what it shows should touch the
>>>> codepath
>>>> you change.
>>>>
>>>> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
>>>> 1. Aliassing which is at maybe -120db with the actual signal at 0db
>>>>     i would like to see some evidence that a human can hear this
>>>>
>>> For s16p<->s16p it is much lower, around -78dB thus this patch.
>> ok but you pointed to the website that apparently uses >=32bit if i trust
>> what they write.
>> And even if they test this i cannot use that website to replicate the
>> issue
>> and the fix
>>
>> I just wanted to know how i can test this. You are testing it too ...
>> so i can see what you see
>> Id like to make sure this is the correct fix for the problem and
>> Id like to make sure its used when it makes sense and not when not.
>
> IIRC a similar sweep graph can be created with "Spek" from
> http://spek.cc/ but haven't used that application for a while. The SRC

Spek is both slow and limited in range of dB, so it should never be
used by pro users.

> comparison website apparently provides the input files for testing, see
> the instructions at http://src.infinitewave.ca/faq.html.
>
> Regards, Tobias
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
Paul B Mahol Jan. 12, 2023, 2:20 p.m. UTC | #12
On 1/8/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Fri, Jan 06, 2023 at 07:04:59PM +0100, Paul B Mahol wrote:
>> On Fri, Jan 6, 2023 at 7:01 PM Paul B Mahol <onemda@gmail.com> wrote:
>>
>> >
>> >
>> > On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
>> > <michael@niedermayer.cc>
>> > wrote:
>> >
>> >> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>> >> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>> >> michael@niedermayer.cc>
>> >> > wrote:
>> >> >
>> >> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>> >> > > > Patch attached.
>> >> > >
>> >> > > >  swresample.c |    3 ++-
>> >> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >> > > > eee7a0685b44aa867562138a2e2437ecb8844612
>> >> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>> >> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17
>> >> > > > 00:00:00
>> >> 2001
>> >> > > > From: Paul B Mahol <onemda@gmail.com>
>> >> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
>> >> > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
>> >> transfer
>> >> > > format
>> >> > > >
>> >> > > > Instead use float one by default for sample rate conversions.
>> >> > > > The s16p internal transfer format produces visible and hearable
>> >> > > > quantization artifacts.
>> >> > >
>> >> > > When does this occur and why?
>> >> > >
>> >> >
>> >> > It occurs always. Just compare output with 16bit and
>> >> > int32/float/double.
>> >> > Look at other people report on internet.
>> >> > Look at src.infinitewave.ca
>> >>
>> >> src.infinitewave.ca uses 32bit none of what it shows should touch the
>> >> codepath
>> >> you change.
>> >>
>> >> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
>> >> 1. Aliassing which is at maybe -120db with the actual signal at 0db
>> >>    i would like to see some evidence that a human can hear this
>> >>
>> >
>> > For s16p<->s16p it is much lower, around -78dB thus this patch.
>> >
>> > Also for others and reports for swr its is lower than exact -120dB
>> >
>> >
>> > 2. Reflection and attenuation at the transition frequency
>> >> With linear filters there is a tradeof between attenuation of the
>> >> passband, reflection of frequencies beyond, latency and so on
>> >> You can have a perfect sharp cutoff with no attenuation and no
>> >> refelection
>> >> that requires a infinitly long filter. And while this looks best in
>> >> this
>> >> frequency plot, does it actually sound best ? If you can hear -120db
>> >> signals you surely would then also hear the ringing long before a
>> >> gunshot
>> >> from such long filter.
>> >>
>> >
>> One can always change linear FIR to be min phase FIR kernel.
>
> I certainly would welcome a wider range of filters in swr, if you want to
> add
> any low delay sinc approximation or in fact i would welcome any filter
> you want to add.

There is that afdelaysrc filter patch on ML to add FIR coefficient
generation for fractional delay audio filter that can be also used as
a interpolation FIR filter. And to me it seems better at same number
of taps than already used/available ones in soxr and swr.

Also I have done prototype of resampling filter using afir filter via
custom filters in filtergraph and it operates at similar speeds like
soxr (in these very non optimized approach) and providing better/wider
frequency output at highest band.

>
> thx
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The real ebay dictionary, page 1
> "Used only once"    - "Some unspecified defect prevented a second use"
> "In good condition" - "Can be repaird by experienced expert"
> "As is" - "You wouldnt want it even if you were payed for it, if you knew
> ..."
>
Michael Niedermayer Jan. 12, 2023, 3:37 p.m. UTC | #13
On Sun, Jan 08, 2023 at 04:18:44PM +0100, Paul B Mahol wrote:
> On 1/8/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> > On Fri, Jan 06, 2023 at 07:01:06PM +0100, Paul B Mahol wrote:
> >> On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
> >> <michael@niedermayer.cc>
> >> wrote:
> >>
> >> > On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
> >> > > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
> >> > michael@niedermayer.cc>
> >> > > wrote:
> >> > >
> >> > > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> >> > > > > Patch attached.
> >> > > >
> >> > > > >  swresample.c |    3 ++-
> >> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> > > > > eee7a0685b44aa867562138a2e2437ecb8844612
> >> > > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> >> > > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
> >> > 2001
> >> > > > > From: Paul B Mahol <onemda@gmail.com>
> >> > > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
> >> > > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
> >> > transfer
> >> > > > format
> >> > > > >
> >> > > > > Instead use float one by default for sample rate conversions.
> >> > > > > The s16p internal transfer format produces visible and hearable
> >> > > > > quantization artifacts.
> >> > > >
> >> > > > When does this occur and why?
> >> > > >
> >> > >
> >> > > It occurs always. Just compare output with 16bit and
> >> > > int32/float/double.
> >> > > Look at other people report on internet.
> >> > > Look at src.infinitewave.ca
> >> >
> >> > src.infinitewave.ca uses 32bit none of what it shows should touch the
> >> > codepath
> >> > you change.
> >> >
> >> > if we look at src.infinitewave.ca for swr we see 2 types of artifacts
> >> > 1. Aliassing which is at maybe -120db with the actual signal at 0db
> >> >    i would like to see some evidence that a human can hear this
> >> >
> >>
> >> For s16p<->s16p it is much lower, around -78dB thus this patch.
> >
> > ok but you pointed to the website that apparently uses >=32bit if i trust
> > what they write.
> > And even if they test this i cannot use that website to replicate the issue
> > and the fix
> 
> If one use pure 16bit processing sweep results are even worse.
> 
> Just resample using fltp/dblp/s32p and s16p and compare (it does not
> matter what, just not simple sine constant frequencies waves)
> 
> The s16p result is much worse and contains huge quantization noises.
> 
> They are not that obviously easy to hear, but are there, and
> difference is > -80dB for dithered 16bit input.
> 
> You can generate and display sweep with/out resampling all with
> ffmpeg/ffplay.ffplay -f lavfi -i
> aevalsrc="sin(PI*t*t*t*100):s=96000",aresample=44100:tsf=s16p,showspectrum=scale=log:color=cool:overlap=0:fps=25:drange=96:legend=1
> 
> Play with tsf=values and drange=[96-150]
> So, for 16bit input, drange=96 and tsf=s16p looks fine, but that web
> pages shows bad results,

ok, i see the issue, thanks


[...]
> >> How so?
> >> I really doubt that this patch do that.
> >
> > I could be missing something but
> > int_sample_fmt is set to before 16bit and afterwards 32bit
> > and alot of things are using this:
> >     set_audiodata_fmt(&s->postin, s->int_sample_fmt);
> >     set_audiodata_fmt(&s->midbuf, s->int_sample_fmt);
> >     set_audiodata_fmt(&s->preout, s->int_sample_fmt);
> >
> > rematrix seems using these
> >             swri_rematrix(s, preout, midbuf, out_count, preout==out);
> > ...
> >             swri_rematrix(s, midbuf, postin, in_count, midbuf==out);
> >
> > so i assumed that this patch makes a difference for it. Again i could be
> > missing
> > something
> 
> Yes, but only if sample rates differ the fltp is used instead of s16p
> for resampling.

yes, i misread the patch

so, i do agree that flt is a better default for resampling s16->s16
on modern hw using 16bit coeffs seems an optimization at the wrong place
the patch breaks fate though

also, i suggest to leave the s16p if either input or output is s8

teh default change is ok, if it doesnt cause bitexactness issues,
otherwise it also needs to be done conditional on bitexact flag

As a note, if someone is interrested, it should be possible to improve
the quantization noise by using some more fancy noise shaping when building
the coefficients. That would not reduce the quantization noise but it would
spread it out so it would be less audible and vissible. Though thats only
really usefull if one actually did hear it. 

thx

[...]
Michael Niedermayer Jan. 12, 2023, 3:41 p.m. UTC | #14
On Thu, Jan 12, 2023 at 03:20:06PM +0100, Paul B Mahol wrote:
> On 1/8/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> > On Fri, Jan 06, 2023 at 07:04:59PM +0100, Paul B Mahol wrote:
> >> On Fri, Jan 6, 2023 at 7:01 PM Paul B Mahol <onemda@gmail.com> wrote:
> >>
> >> >
> >> >
> >> > On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
> >> > <michael@niedermayer.cc>
> >> > wrote:
> >> >
> >> >> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
> >> >> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
> >> >> michael@niedermayer.cc>
> >> >> > wrote:
> >> >> >
> >> >> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> >> >> > > > Patch attached.
> >> >> > >
> >> >> > > >  swresample.c |    3 ++-
> >> >> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> >> > > > eee7a0685b44aa867562138a2e2437ecb8844612
> >> >> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> >> >> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17
> >> >> > > > 00:00:00
> >> >> 2001
> >> >> > > > From: Paul B Mahol <onemda@gmail.com>
> >> >> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
> >> >> > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
> >> >> transfer
> >> >> > > format
> >> >> > > >
> >> >> > > > Instead use float one by default for sample rate conversions.
> >> >> > > > The s16p internal transfer format produces visible and hearable
> >> >> > > > quantization artifacts.
> >> >> > >
> >> >> > > When does this occur and why?
> >> >> > >
> >> >> >
> >> >> > It occurs always. Just compare output with 16bit and
> >> >> > int32/float/double.
> >> >> > Look at other people report on internet.
> >> >> > Look at src.infinitewave.ca
> >> >>
> >> >> src.infinitewave.ca uses 32bit none of what it shows should touch the
> >> >> codepath
> >> >> you change.
> >> >>
> >> >> if we look at src.infinitewave.ca for swr we see 2 types of artifacts
> >> >> 1. Aliassing which is at maybe -120db with the actual signal at 0db
> >> >>    i would like to see some evidence that a human can hear this
> >> >>
> >> >
> >> > For s16p<->s16p it is much lower, around -78dB thus this patch.
> >> >
> >> > Also for others and reports for swr its is lower than exact -120dB
> >> >
> >> >
> >> > 2. Reflection and attenuation at the transition frequency
> >> >> With linear filters there is a tradeof between attenuation of the
> >> >> passband, reflection of frequencies beyond, latency and so on
> >> >> You can have a perfect sharp cutoff with no attenuation and no
> >> >> refelection
> >> >> that requires a infinitly long filter. And while this looks best in
> >> >> this
> >> >> frequency plot, does it actually sound best ? If you can hear -120db
> >> >> signals you surely would then also hear the ringing long before a
> >> >> gunshot
> >> >> from such long filter.
> >> >>
> >> >
> >> One can always change linear FIR to be min phase FIR kernel.
> >
> > I certainly would welcome a wider range of filters in swr, if you want to
> > add
> > any low delay sinc approximation or in fact i would welcome any filter
> > you want to add.
> 
> There is that afdelaysrc filter patch on ML to add FIR coefficient
> generation for fractional delay audio filter that can be also used as
> a interpolation FIR filter. And to me it seems better at same number
> of taps than already used/available ones in soxr and swr.

Please add improvments into swr, if you have any!


> 
> Also I have done prototype of resampling filter using afir filter via
> custom filters in filtergraph and it operates at similar speeds like
> soxr (in these very non optimized approach) and providing better/wider
> frequency output at highest band.

Can it be added into swr ?

thx

[...]
Paul B Mahol Jan. 12, 2023, 4:09 p.m. UTC | #15
On 1/12/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Thu, Jan 12, 2023 at 03:20:06PM +0100, Paul B Mahol wrote:
>> On 1/8/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
>> > On Fri, Jan 06, 2023 at 07:04:59PM +0100, Paul B Mahol wrote:
>> >> On Fri, Jan 6, 2023 at 7:01 PM Paul B Mahol <onemda@gmail.com> wrote:
>> >>
>> >> >
>> >> >
>> >> > On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
>> >> > <michael@niedermayer.cc>
>> >> > wrote:
>> >> >
>> >> >> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>> >> >> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>> >> >> michael@niedermayer.cc>
>> >> >> > wrote:
>> >> >> >
>> >> >> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>> >> >> > > > Patch attached.
>> >> >> > >
>> >> >> > > >  swresample.c |    3 ++-
>> >> >> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >> >> > > > eee7a0685b44aa867562138a2e2437ecb8844612
>> >> >> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>> >> >> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17
>> >> >> > > > 00:00:00
>> >> >> 2001
>> >> >> > > > From: Paul B Mahol <onemda@gmail.com>
>> >> >> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
>> >> >> > > > Subject: [PATCH] libswresample/swresample: avoid s16p
>> >> >> > > > internal
>> >> >> transfer
>> >> >> > > format
>> >> >> > > >
>> >> >> > > > Instead use float one by default for sample rate conversions.
>> >> >> > > > The s16p internal transfer format produces visible and
>> >> >> > > > hearable
>> >> >> > > > quantization artifacts.
>> >> >> > >
>> >> >> > > When does this occur and why?
>> >> >> > >
>> >> >> >
>> >> >> > It occurs always. Just compare output with 16bit and
>> >> >> > int32/float/double.
>> >> >> > Look at other people report on internet.
>> >> >> > Look at src.infinitewave.ca
>> >> >>
>> >> >> src.infinitewave.ca uses 32bit none of what it shows should touch
>> >> >> the
>> >> >> codepath
>> >> >> you change.
>> >> >>
>> >> >> if we look at src.infinitewave.ca for swr we see 2 types of
>> >> >> artifacts
>> >> >> 1. Aliassing which is at maybe -120db with the actual signal at 0db
>> >> >>    i would like to see some evidence that a human can hear this
>> >> >>
>> >> >
>> >> > For s16p<->s16p it is much lower, around -78dB thus this patch.
>> >> >
>> >> > Also for others and reports for swr its is lower than exact -120dB
>> >> >
>> >> >
>> >> > 2. Reflection and attenuation at the transition frequency
>> >> >> With linear filters there is a tradeof between attenuation of the
>> >> >> passband, reflection of frequencies beyond, latency and so on
>> >> >> You can have a perfect sharp cutoff with no attenuation and no
>> >> >> refelection
>> >> >> that requires a infinitly long filter. And while this looks best in
>> >> >> this
>> >> >> frequency plot, does it actually sound best ? If you can hear
>> >> >> -120db
>> >> >> signals you surely would then also hear the ringing long before a
>> >> >> gunshot
>> >> >> from such long filter.
>> >> >>
>> >> >
>> >> One can always change linear FIR to be min phase FIR kernel.
>> >
>> > I certainly would welcome a wider range of filters in swr, if you want
>> > to
>> > add
>> > any low delay sinc approximation or in fact i would welcome any filter
>> > you want to add.
>>
>> There is that afdelaysrc filter patch on ML to add FIR coefficient
>> generation for fractional delay audio filter that can be also used as
>> a interpolation FIR filter. And to me it seems better at same number
>> of taps than already used/available ones in soxr and swr.
>
> Please add improvments into swr, if you have any!
>
>
>>
>> Also I have done prototype of resampling filter using afir filter via
>> custom filters in filtergraph and it operates at similar speeds like
>> soxr (in these very non optimized approach) and providing better/wider
>> frequency output at highest band.
>
> Can it be added into swr ?

It will use libavutil/tx.h for convolution.

But I need first to research more and write actual code that does
out/in sample rate ratio factorization and do lot of benchmarks
comparing normal ratios with very big ones and make sure that approach
is always faster than swr/soxr and at same time providing better/same
quality.
Also need to add min phase version of filter and compare what
performance/latency it can bring at all.

>
> thx
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Awnsering whenever a program halts or runs forever is
> On a turing machine, in general impossible (turings halting problem).
> On any real computer, always possible as a real computer has a finite
> number
> of states N, and will either halt in less than N cycles or never halt.
>
Michael Niedermayer Jan. 12, 2023, 7:49 p.m. UTC | #16
On Thu, Jan 12, 2023 at 05:09:18PM +0100, Paul B Mahol wrote:
> On 1/12/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> > On Thu, Jan 12, 2023 at 03:20:06PM +0100, Paul B Mahol wrote:
> >> On 1/8/23, Michael Niedermayer <michael@niedermayer.cc> wrote:
> >> > On Fri, Jan 06, 2023 at 07:04:59PM +0100, Paul B Mahol wrote:
> >> >> On Fri, Jan 6, 2023 at 7:01 PM Paul B Mahol <onemda@gmail.com> wrote:
> >> >>
> >> >> >
> >> >> >
> >> >> > On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
> >> >> > <michael@niedermayer.cc>
> >> >> > wrote:
> >> >> >
> >> >> >> On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
> >> >> >> > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
> >> >> >> michael@niedermayer.cc>
> >> >> >> > wrote:
> >> >> >> >
> >> >> >> > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
> >> >> >> > > > Patch attached.
> >> >> >> > >
> >> >> >> > > >  swresample.c |    3 ++-
> >> >> >> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> >> >> > > > eee7a0685b44aa867562138a2e2437ecb8844612
> >> >> >> > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
> >> >> >> > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17
> >> >> >> > > > 00:00:00
> >> >> >> 2001
> >> >> >> > > > From: Paul B Mahol <onemda@gmail.com>
> >> >> >> > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
> >> >> >> > > > Subject: [PATCH] libswresample/swresample: avoid s16p
> >> >> >> > > > internal
> >> >> >> transfer
> >> >> >> > > format
> >> >> >> > > >
> >> >> >> > > > Instead use float one by default for sample rate conversions.
> >> >> >> > > > The s16p internal transfer format produces visible and
> >> >> >> > > > hearable
> >> >> >> > > > quantization artifacts.
> >> >> >> > >
> >> >> >> > > When does this occur and why?
> >> >> >> > >
> >> >> >> >
> >> >> >> > It occurs always. Just compare output with 16bit and
> >> >> >> > int32/float/double.
> >> >> >> > Look at other people report on internet.
> >> >> >> > Look at src.infinitewave.ca
> >> >> >>
> >> >> >> src.infinitewave.ca uses 32bit none of what it shows should touch
> >> >> >> the
> >> >> >> codepath
> >> >> >> you change.
> >> >> >>
> >> >> >> if we look at src.infinitewave.ca for swr we see 2 types of
> >> >> >> artifacts
> >> >> >> 1. Aliassing which is at maybe -120db with the actual signal at 0db
> >> >> >>    i would like to see some evidence that a human can hear this
> >> >> >>
> >> >> >
> >> >> > For s16p<->s16p it is much lower, around -78dB thus this patch.
> >> >> >
> >> >> > Also for others and reports for swr its is lower than exact -120dB
> >> >> >
> >> >> >
> >> >> > 2. Reflection and attenuation at the transition frequency
> >> >> >> With linear filters there is a tradeof between attenuation of the
> >> >> >> passband, reflection of frequencies beyond, latency and so on
> >> >> >> You can have a perfect sharp cutoff with no attenuation and no
> >> >> >> refelection
> >> >> >> that requires a infinitly long filter. And while this looks best in
> >> >> >> this
> >> >> >> frequency plot, does it actually sound best ? If you can hear
> >> >> >> -120db
> >> >> >> signals you surely would then also hear the ringing long before a
> >> >> >> gunshot
> >> >> >> from such long filter.
> >> >> >>
> >> >> >
> >> >> One can always change linear FIR to be min phase FIR kernel.
> >> >
> >> > I certainly would welcome a wider range of filters in swr, if you want
> >> > to
> >> > add
> >> > any low delay sinc approximation or in fact i would welcome any filter
> >> > you want to add.
> >>
> >> There is that afdelaysrc filter patch on ML to add FIR coefficient
> >> generation for fractional delay audio filter that can be also used as
> >> a interpolation FIR filter. And to me it seems better at same number
> >> of taps than already used/available ones in soxr and swr.
> >
> > Please add improvments into swr, if you have any!
> >
> >
> >>
> >> Also I have done prototype of resampling filter using afir filter via
> >> custom filters in filtergraph and it operates at similar speeds like
> >> soxr (in these very non optimized approach) and providing better/wider
> >> frequency output at highest band.
> >
> > Can it be added into swr ?
> 
> It will use libavutil/tx.h for convolution.

ok of course


> 
> But I need first to research more and write actual code that does
> out/in sample rate ratio factorization and do lot of benchmarks
> comparing normal ratios with very big ones and make sure that approach
> is always faster than swr/soxr and at same time providing better/same
> quality.
> Also need to add min phase version of filter and compare what
> performance/latency it can bring at all.

thx

[...]
diff mbox series

Patch

From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00 2001
From: Paul B Mahol <onemda@gmail.com>
Date: Thu, 5 Jan 2023 13:40:12 +0100
Subject: [PATCH] libswresample/swresample: avoid s16p internal transfer format

Instead use float one by default for sample rate conversions.
The s16p internal transfer format produces visible and hearable
quantization artifacts.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
---
 libswresample/swresample.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libswresample/swresample.c b/libswresample/swresample.c
index 5884f8d533..e6151bc335 100644
--- a/libswresample/swresample.c
+++ b/libswresample/swresample.c
@@ -318,7 +318,8 @@  av_cold int swr_init(struct SwrContext *s){
 
     if(s->int_sample_fmt == AV_SAMPLE_FMT_NONE){
         if(   av_get_bytes_per_sample(s-> in_sample_fmt) <= 2
-           && av_get_bytes_per_sample(s->out_sample_fmt) <= 2){
+           && av_get_bytes_per_sample(s->out_sample_fmt) <= 2
+           && s->out_sample_rate==s->in_sample_rate) {
             s->int_sample_fmt= AV_SAMPLE_FMT_S16P;
         }else if(   av_get_bytes_per_sample(s-> in_sample_fmt) <= 2
            && !s->rematrix
-- 
2.37.2