
[FFmpeg-devel] Revert "avfilter/vf_palette(gen|use): support palettes with alpha"

Message ID 20221030175813.1497020-1-u@pkh.me
State New
Series [FFmpeg-devel] Revert "avfilter/vf_palette(gen|use): support palettes with alpha"

Checks

Context Check Description
andriy/commit_msg_x86 warning The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
yinshiyou/commit_msg_loongarch64 warning The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Clément Bœsch Oct. 30, 2022, 5:58 p.m. UTC
This reverts commit dea673d0d548c864ec85f9260d8900d944ef7a2a.

This change cannot work for several reasons, the most obvious ones are:

- the alpha is part of the scoring of the color difference, even
  though we cannot interpret the alpha as part of the perception of the
  color (we don't even know if it's premultiplied or postmultiplied)
- the colors are averaged along with their alpha values, which simply
  cannot work

The command proposed in the original thread of the patch actually
produces a completely broken file:

    ffmpeg -y -loglevel verbose -i fate-suite/apng/o_sample.png -filter_complex "split[split1][split2];[split1]palettegen=max_colors=254:use_alpha=1[pal1];[split2][pal1]paletteuse=use_alpha=1" -frames:v 1 out.png

We can see that many color pixels are off, but more importantly some
colors have a random alpha value: https://imgur.com/eFQ2UK7

I don't see any easy fix for this unfortunately, the approach appears to
be flawed by design.
---
 doc/filters.texi            |   8 --
 libavfilter/vf_palettegen.c | 135 +++++++---------------
 libavfilter/vf_paletteuse.c | 225 +++++++++++++++---------------------
 3 files changed, 138 insertions(+), 230 deletions(-)

Comments

Soft Works Oct. 30, 2022, 9:19 p.m. UTC | #1
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Clément Bœsch
> Sent: Sunday, October 30, 2022 6:58 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: softworkz@hotmail.com; Clément Bœsch <u@pkh.me>
> Subject: [FFmpeg-devel] [PATCH] Revert "avfilter/vf_palette(gen|use):
> support palettes with alpha"
> 
> This reverts commit dea673d0d548c864ec85f9260d8900d944ef7a2a.
> 
> This change cannot work for several reasons, the most obvious ones are:
> 
> - the alpha is part of the scoring of the color difference, even
>   though we cannot interpret the alpha as part of the perception of the
>   color (we don't even know if it's premultiplied or postmultiplied)
> - the colors are averaged along with their alpha values, which simply
>   cannot work
> 
> The command proposed in the original thread of the patch actually
> produces a completely broken file:
> 
>     ffmpeg -y -loglevel verbose -i fate-suite/apng/o_sample.png -filter_complex "split[split1][split2];[split1]palettegen=max_colors=254:use_alpha=1[pal1];[split2][pal1]paletteuse=use_alpha=1" -frames:v 1 out.png
> 
> We can see that many color pixels are off, but more importantly some
> colors have a random alpha value: https://imgur.com/eFQ2UK7
> 
> I don't see any easy fix for this unfortunately, the approach appears
> to be flawed by design.
> ---

At the time of submission I did a lot of experiments and the results
seemed to be very useful:

https://gist.github.com/softworkz/deef5c2a43d3d629c3e17f9e21544a8f

softworkz
Clément Bœsch Oct. 30, 2022, 9:30 p.m. UTC | #2
On Sun, Oct 30, 2022 at 09:19:05PM +0000, Soft Works wrote:
[...]
> At the time of submission I did a lot of experiments and the results
> seemed to be very useful:
> 
> https://gist.github.com/softworkz/deef5c2a43d3d629c3e17f9e21544a8f
> 

On that page, the result of the command using palette* filters for the
colored "O" gives a broken output, and it's not a simple bug, it's more
fundamental. You cannot just average the colors with different levels
of alpha, it makes no sense.

Do you understand why or do you need me to elaborate?

> softworkz
Clément Bœsch Oct. 30, 2022, 9:37 p.m. UTC | #3
On Sun, Oct 30, 2022 at 10:30:06PM +0100, Clément Bœsch wrote:
> On Sun, Oct 30, 2022 at 09:19:05PM +0000, Soft Works wrote:
> [...]
> > At the time of submission I did a lot of experiments and the results
> > seemed to be very useful:
> > 
> > https://gist.github.com/softworkz/deef5c2a43d3d629c3e17f9e21544a8f
> > 
> 
> On that page, the result of the command using palette* filters for the
> colored "O" gives a broken output, and it's not a simple bug, it's more
> fundamental. You cannot just average the colors with different levels
> of alpha, it makes no sense.
> 
> Do you understand why or do you need me to elaborate?
> 

Here is the alpha channel, which is the result of averaging them with the
colors: https://imgur.com/a/50YyRGV
Soft Works Oct. 30, 2022, 9:41 p.m. UTC | #4
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Clément Bœsch
> Sent: Sunday, October 30, 2022 10:30 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] Revert
> "avfilter/vf_palette(gen|use): support palettes with alpha"
> 
> On Sun, Oct 30, 2022 at 09:19:05PM +0000, Soft Works wrote:
> [...]
> > At the time of submission I did a lot of experiments and the
> results
> > seemed to be very useful:
> >
> > https://gist.github.com/softworkz/deef5c2a43d3d629c3e17f9e21544a8f
> >
> 
> On that page, the result of the command using palette* filters for the
> colored "O" gives a broken output, and it's not a simple bug, it's more
> fundamental. You cannot just average the colors with different levels
> of alpha, it makes no sense.
> 
> Do you understand why or do you need me to elaborate?

I understand why. I know that it's not perfect. But it's the best
that's achievable within the way the filter is working.

But I wouldn't go that far as saying it would be "broken". I think
the result is quite acceptable with regards to the method being
used.

The patch I had submitted doesn't change the previous behavior
without the use_alpha parameter.
And when using the use_alpha parameter, the results are still
useful in many cases - maybe not always.

But I don't see any reason to remove this just because it's not
perfect. If you had a patch to improve the results - that
would be a different story and of course very welcome.

Thanks,
softworkz
Soft Works Oct. 30, 2022, 10:55 p.m. UTC | #5
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Soft Works
> Sent: Sunday, October 30, 2022 10:41 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] Revert
> "avfilter/vf_palette(gen|use): support palettes with alpha"
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > Clément Bœsch
> > Sent: Sunday, October 30, 2022 10:30 PM
> > To: FFmpeg development discussions and patches <ffmpeg-
> > devel@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH] Revert
> > "avfilter/vf_palette(gen|use): support palettes with alpha"
> >
> > On Sun, Oct 30, 2022 at 09:19:05PM +0000, Soft Works wrote:
> > [...]
> > > At the time of submission I did a lot of experiments and the
> > > results seemed to be very useful:
> > >
> > > https://gist.github.com/softworkz/deef5c2a43d3d629c3e17f9e21544a8f
> > >
> >
> > On that page, the result of the command using palette* filters for
> > the colored "O" gives a broken output, and it's not a simple bug, it's
> > more fundamental. You cannot just average the colors with different
> > levels of alpha, it makes no sense.
> >
> > Do you understand why or do you need me to elaborate?
> 
> I understand why. I know that it's not perfect. But it's the best
> that's achievable within the way the filter is working.
> 
> But I wouldn't go that far as saying it would be "broken". I think
> the result is quite acceptable with regards to the method being
> used.
> 
> The patch I had submitted doesn't change the previous behavior
> without the use_alpha parameter.
> And when using the use_alpha parameter, the results are still
> useful in many cases - maybe not always.
> 
> But I don't see any reason to remove this just because it's not
> perfect. If you had a patch to improve the results - that
> would be a different story and of course very welcome.

I'm really not an expert in image processing algorithms, and I 
haven't tried the following yet:
Do you think it might make sense to put more weight on the 
alpha value by tripling it? So it would be weighted equally to the 
RGB value?

softworkz
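As a purely illustrative aside on the question above: the tripled weight would amount to something like the following, modeled on the diff_alpha() that this patch removes (the function name and the factor-3 weighting are hypothetical, not code from the filter):

    #include <stdint.h>

    /* Hypothetical variant of the filter's squared-distance scoring with
     * the alpha difference weighted x3, so that alpha counts as much as
     * the three RGB channels combined. Illustrative only. */
    static int diff_alpha_weighted(uint32_t a, uint32_t b)
    {
        const int da = (int)(a >> 24 & 0xff) - (int)(b >> 24 & 0xff);
        const int dr = (int)(a >> 16 & 0xff) - (int)(b >> 16 & 0xff);
        const int dg = (int)(a >>  8 & 0xff) - (int)(b >>  8 & 0xff);
        const int db = (int)(a       & 0xff) - (int)(b       & 0xff);
        return 3 * da * da + dr * dr + dg * dg + db * db;
    }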
Clément Bœsch Oct. 31, 2022, 12:29 a.m. UTC | #6
On Sun, Oct 30, 2022 at 10:55:31PM +0000, Soft Works wrote:
[...]
> > I understand why. I know that it's not perfect. But it's the best
> > that's achievable within the way the filter is working.
> > 
> > But I wouldn't go that far as saying it would be "broken". I think
> > the result is quite acceptable with regards to the method being
> > used.

It's broken because the alpha channel in the output is really completely
random. If we blend the output of that "O" PNG somewhere, it's going to be
a real mess.

Here is a more concrete example: if your input has some red fully
transparent (00ff0000), and some red fully opaque (ffff0000) which end up
in the same box, they will be averaged to a red with an alpha of 0x80, and
I'm not even accounting for the weight and other colors with different
transparency. That non-opaque average alpha might end up being used in
areas that are expected to be opaque, and in some areas where it's supposed
to be transparent. That's exactly what I showed with the "O" png.
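A self-contained sketch of that failure mode (illustrative only, not the filter's box-averaging code; with integer division the average lands on 0x7f, i.e. the ~0x80 mentioned above):

    #include <stdint.h>
    #include <stdio.h>

    /* Averaging a fully transparent red and a fully opaque red, as can
     * happen when both land in the same box, yields a half-transparent
     * red that matches neither input pixel. */
    int main(void)
    {
        const uint32_t c1 = 0x00ff0000; /* red, fully transparent */
        const uint32_t c2 = 0xffff0000; /* red, fully opaque */
        const unsigned avg_alpha = ((c1 >> 24 & 0xff) + (c2 >> 24 & 0xff)) / 2;
        printf("averaged alpha: 0x%02x\n", avg_alpha); /* prints 0x7f */
        return 0;
    }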

In addition to that problem, since you're also accounting the alpha as a
weight to determine the proximity of 2 colors, you're going to select the
wrong colors. For example if we want to find the closest color to an
opaque green (ff00ff00), and the palette has a slightly transparent green
(fa00ff00) and an opaque blue (ff0000ff), then the algorithm will now
prefer the blue over the green.

That explains why in addition to the alpha being random in the "O" png,
the colors are also messed up.

> > The patch I had submitted doesn't change the previous behavior
> > without the use_alpha parameter.

Yes I noticed, but unfortunately I'm reworking the color distance to work
in perceptual color space, and the way that alpha is mixed up in the
equation just doesn't make any sense at all and prevents me from doing
these changes. Ignoring the alpha branches will make its output even more
terrible.

> > And when using the use_alpha parameter, the results are still
> > useful in many cases - maybe not always.

I don't think they're useful: they're unpredictable and very likely to
produce a broken output. Would you consider FFmpeg useful if half the time
the command failed to produce a valid file?

[...]

> Do you think it might make sense to put more weight on the 
> alpha value by tripling it? So it would be weighted equally to the 
> RGB value?

You cannot mix alpha with colors at all, they are separate domains and you
need to treat them as such.

From the paletteuse perspective, what you need to do is first choose the colors
in the palette that match the alpha exactly (or at least the closest if
and only if there is no exact match). Then within that set, and only within
that one, you'd pick the closest color.
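What that two-stage lookup could look like, as a hedged sketch (hypothetical helpers, not the actual vf_paletteuse code; palette entries assumed to be 0xAARRGGBB as in the palette filters):

    #include <stdint.h>
    #include <stdlib.h>
    #include <limits.h>

    /* Color-only squared distance, alpha deliberately excluded. */
    static int rgb_diff(uint32_t a, uint32_t b)
    {
        const int dr = (int)(a >> 16 & 0xff) - (int)(b >> 16 & 0xff);
        const int dg = (int)(a >>  8 & 0xff) - (int)(b >>  8 & 0xff);
        const int db = (int)(a       & 0xff) - (int)(b       & 0xff);
        return dr*dr + dg*dg + db*db;
    }

    static int find_color_alpha_first(const uint32_t *pal, int nb, uint32_t c)
    {
        const int alpha = c >> 24 & 0xff;
        int i, best = -1, best_da = INT_MAX, best_dc = INT_MAX;

        /* stage 1: find the smallest alpha distance present in the palette */
        for (i = 0; i < nb; i++) {
            const int da = abs((int)(pal[i] >> 24 & 0xff) - alpha);
            if (da < best_da)
                best_da = da;
        }

        /* stage 2: among the entries at that alpha distance only, pick
         * the closest color */
        for (i = 0; i < nb; i++) {
            if (abs((int)(pal[i] >> 24 & 0xff) - alpha) != best_da)
                continue;
            const int dc = rgb_diff(pal[i], c);
            if (dc < best_dc) {
                best_dc = dc;
                best = i;
            }
        }
        return best;
    }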

From the palettegen perspective, you need to split the colors into different
transparency domains (a first dimensional quantization), then quantize the
colors in each quantized alpha dimension. And when you have all your
quantized palettes for each level of alpha, you find an algorithm to
reduce the number of transparency dimensions or the number of colors per
dimension to make it fit inside a single palette. But you can't just do
the alpha and the colors at the same time, it cannot work, whatever
weights you choose.
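In outline, and only as a hypothetical sketch of that idea (none of this is existing filter code):

    #include <stdint.h>

    #define NB_ALPHA_DOMAINS 4  /* arbitrary number of coarse alpha levels */

    /* First dimensional quantization: bucket the alpha into coarse levels. */
    static int alpha_domain(uint32_t color)
    {
        return (color >> 24 & 0xff) / (256 / NB_ALPHA_DOMAINS);
    }

    /* The remaining steps, in outline form:
     *   1. put every color ref into buckets[alpha_domain(ref->color)];
     *   2. run the existing RGB-only median cut independently per bucket,
     *      giving each bucket a share of the palette entries;
     *   3. merge the per-bucket palettes, dropping the least-populated
     *      boxes or alpha levels until everything fits in one palette. */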
Soft Works Oct. 31, 2022, 1:43 a.m. UTC | #7
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Clément Bœsch
> Sent: Monday, October 31, 2022 1:30 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] Revert
> "avfilter/vf_palette(gen|use): support palettes with alpha"
> 
> On Sun, Oct 30, 2022 at 10:55:31PM +0000, Soft Works wrote:
> [...]
> > > I understand why. I know that it's not perfect. But it's the best
> > > that's achievable within the way the filter is working.
> > >
> > > But I wouldn't go that far as saying it would be "broken". I think
> > > the result is quite acceptable with regards to the method being
> > > used.
> 
> It's broken because the alpha channel in the output is really completely
> random. If we blend the output of that "O" PNG somewhere, it's going to be
> a real mess.
> 
> Here is a more concrete example: if your input has some red fully
> transparent (00ff0000), and some red fully opaque (ffff0000) which end up
> in the same box, they will be averaged to a red with an alpha of 0x80, and
> I'm not even accounting for the weight and other colors with different
> transparency. That non-opaque average alpha might end up being used in
> areas that are expected to be opaque, and in some areas where it's supposed
> to be transparent. That's exactly what I showed with the "O" png.
> 
> In addition to that problem, since you're also accounting the alpha as a
> weight to determine the proximity of 2 colors, you're going to select the
> wrong colors. For example if we want to find the closest color to an
> opaque green (ff00ff00), and the palette has a slightly transparent green
> (fa00ff00) and an opaque blue (ff0000ff), then the algorithm will now
> prefer the blue over the green.
> 
> That explains why in addition to the alpha being random in the "O" png,
> the colors are also messed up.
> 
> > > The patch I had submitted doesn't change the previous behavior
> > > without the use_alpha parameter.
> 
> Yes I noticed, but unfortunately I'm reworking the color distance to work
> in perceptual color space, and the way that alpha is mixed up in the
> equation just doesn't make any sense at all and prevents me from doing
> these changes.

If you want to implement a new color distance algorithm, it should 
be either a new filter or a new (switchable) mode for the existing 
filter. Photoshop has these different modes as well and it would 
surely be useful, but I don't think it should be replacing the
existing behavior.

When it turns out that the use_alpha implementation doesn't fit
with your new color distance calculation and you add it as
an additional mode, then it would be fine IMO for the filter
to error out when that mode is attempted in combination
with use_alpha.


> > Do you think it might make sense to put more weight on the
> > alpha value by tripling it? So it would be weighted equally to the
> > RGB value?
> 
> You cannot mix alpha with colors at all, they are separate domains and you
> need to treat them as such.

What's interesting is that I've followed the same (simplified)
way for adding a use_alpha option to vf_elbg and it provides excellent
results without treating alpha separately.


> From the paletteuse perspective, what you need to do is first choose the
> colors in the palette that match the alpha exactly (or at least the
> closest if and only if there is no exact match). Then within that set,
> and only within that one, you'd pick the closest color.
> 
> From the palettegen perspective, you need to split the colors into
> different transparency domains (a first dimensional quantization), then
> quantize the colors in each quantized alpha dimension. And when you have
> all your quantized palettes for each level of alpha, you find an
> algorithm to reduce the number of transparency dimensions or the number
> of colors per dimension to make it fit inside a single palette. But you
> can't just do the alpha and the colors at the same time, it cannot work,
> whatever weights you choose.

I would be curious to see how well that would work, especially
in cases when the target palettes have just a small number of colors.


But to return to the proposal of removal: if everything that is not
perfect were removed from ffmpeg, it would be lacking
quite a number of features, I suppose :-)

In the same way, one could say that palettegen/-use should be removed
because its results are wrong and colors are randomly mixed and
misplaced while the vf_elbg filter does it right.

When you look at the result under the heading

"Paletteuse/gen Regular (to 8-bit non-alpha palette; only single 
transparent color)"
https://gist.github.com/softworkz/deef5c2a43d3d629c3e17f9e21544a8f?permalink_comment_id=3905155#gistcomment-3905155

Even without the alpha, many color pixels appear to be wrong and
random, like for example the light purple pixels on the darker purple
at the bottom of the "O". That's not much different from the irregularities
in the alpha channel you've shown (https://imgur.com/a/50YyRGV).

So, I agree that it's not perfect, but the whole filter is
not perfect (vf_elbg is close to it). Do we remove the filter because
it's not perfect?

As mentioned above, if you want to add an additional mode for
calculating the color distance, it's fine IMO if it doesn't
work with use_alpha.

Thanks,
softworkz
Soft Works Oct. 31, 2022, 2:09 a.m. UTC | #8
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Clément Bœsch
> Sent: Monday, October 31, 2022 1:30 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] Revert
> "avfilter/vf_palette(gen|use): support palettes with alpha"
> 
> On Sun, Oct 30, 2022 at 10:55:31PM +0000, Soft Works wrote:
> [...]
> 
> > Do you think it might make sense to put more weight on the
> > alpha value by tripling it? So it would be weighted equally to the
> > RGB value?
> 
> You cannot mix alpha with colors at all, they are separate domains and you
> need to treat them as such.
> 
> From the paletteuse perspective, what you need to do is first choose the
> colors in the palette that match the alpha exactly (or at least the
> closest if and only if there is no exact match). Then within that set,
> and only within that one, you'd pick the closest color.
> 
> From the palettegen perspective, you need to split the colors into
> different transparency domains (a first dimensional quantization), then
> quantize the colors in each quantized alpha dimension. And when you have
> all your quantized palettes for each level of alpha, you find an
> algorithm to reduce the number of transparency dimensions or the number
> of colors per dimension to make it fit inside a single palette. But you
> can't just do the alpha and the colors at the same time, it cannot work,
> whatever weights you choose.

Interestingly, pngquant, which is supposed to have the best open-source
quantization algorithms, seems to be using weights (albeit in a more
sophisticated way) and does not handle alpha separately when calculating
color distance, variance and averaging:

https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/pam.h#L163-L182
https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L29-L49
https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L449-L476


softworkz
Clément Bœsch Oct. 31, 2022, 10:57 a.m. UTC | #9
On Mon, Oct 31, 2022 at 01:43:11AM +0000, Soft Works wrote:
[...]
> > > > The patch I had submitted doesn't change the previous behavior
> > > > without the use_alpha parameter.
> > 
> > Yes I noticed, but unfortunately I'm reworking the color distance to work
> > in perceptual color space, and the way that alpha is mixed up in the
> > equation just doesn't make any sense at all and prevents me from doing
> > these changes.
> 
> If you want to implement a new color distance algorithm, it should 
> be either a new filter or a new (switchable) mode for the existing 
> filter.

Why?

> Photoshop has these different modes as well and it would 
> surely be useful, but I don't think it should be replacing the
> existing behavior.
> 

There is no point in keeping a ton of complexity exposed as user options
for something implementation specific. We offer no guarantee over how the
quantization is expected to run.

> When it turns out that the use_alpha implementation doesn't fit
> with your new color distance calculation and you add it as
> an additional mode, then it would be fine IMO for the filter
> to error out when that mode is attempted in combination
> with use_alpha.

IMO the use_alpha option shouldn't exist in the first place, it should be
the default behaviour because honoring the alpha is the correct thing to
do. That's not what the option is currently doing though.

> > > Do you think it might make sense to put more weight on the
> > > alpha value by tripling it? So it would be weighted equally to the
> > > RGB value?
> > 
> > You cannot mix alpha with colors at all, they are separate domains
> > and you
> > need to treat them as such.
> 
> What's interesting is that I've followed the same (simplified)
> way for adding a use_alpha option to vf_elbg and it provides excellent
> results without treating alpha separately.

I don't know how the filter works and what it's supposed to do, but if
it's indeed using the same approach as the palette ones, it cannot work.

> > From the paletteuse perspective, what you need to do is first choose the
> > colors in the palette that match the alpha exactly (or at least the
> > closest if and only if there is no exact match). Then within that set,
> > and only within that one, you'd pick the closest color.
> > 
> > From the palettegen perspective, you need to split the colors into
> > different transparency domains (a first dimensional quantization), then
> > quantize the colors in each quantized alpha dimension. And when you have
> > all your quantized palettes for each level of alpha, you find an
> > algorithm to reduce the number of transparency dimensions or the number
> > of colors per dimension to make it fit inside a single palette. But you
> > can't just do the alpha and the colors at the same time, it cannot work,
> > whatever weights you choose.
> 
> I would be curious to see how well that would work, especially
> in cases when the target palettes have just a small number of colors.
> 

You have to make a call between whether you want to preserve the
transparency or the color while constructing the palette, but when
choosing a color you must absolutely not choose a color with a different
transparency, you must pick amongst the closest alpha, with particular
attention to extreme alphas: an opaque color must stay opaque, and a fully
transparent one as well:
- rounding a color with 43% alpha into 50% alpha is acceptable
- rounding a color with 100% alpha into a 99% alpha is not acceptable in
  any way because you're starting to make transparent areas that weren't
- rounding a color with 0% alpha into a 1% alpha is not acceptable because
  some areas of the images are now starting to blend into an area that was
  supposedly non-existent
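As a small sketch of those three rules (hypothetical helper, not from the filter; it assumes at least 3 alpha levels so that intermediate alphas never collapse into the extremes):

    #include <stdint.h>

    static uint8_t quantize_alpha(uint8_t a, int nb_levels)
    {
        const int step = 255 / (nb_levels - 1);
        int q;

        if (a == 0x00 || a == 0xff)
            return a; /* never create or destroy full transparency/opacity */
        q = (a + step / 2) / step * step; /* round to the nearest level */
        if (q < 1)
            q = step;        /* a 1% alpha must not become fully transparent */
        if (q > 254)
            q = 255 - step;  /* a 99% alpha must not become fully opaque */
        return q;
    }

With nb_levels=3 this rounds a 43% alpha (110) to 127, i.e. roughly the 50% case called acceptable above, while 0x00 and 0xff always map to themselves.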

> But to return to the proposal of removal: if everything that is not
> perfect were removed from ffmpeg, it would be lacking
> quite a number of features, I suppose :-)

We're not talking about perfection, we're talking about files with
artifacts. It's almost as bad as a corrupted file, because if used in a
pipeline where transparency matters, you're going to get a completely
broken output.

> In the same way, one could say that palettegen/-use should be removed
> because its results are wrong and colors are randomly mixed and 
> misplaced while the vf_elbg filter does it right.
> When you look at the result under the heading
> 
> "Paletteuse/gen Regular (to 8-bit non-alpha palette; only single 
> transparent color)"
> https://gist.github.com/softworkz/deef5c2a43d3d629c3e17f9e21544a8f?permalink_comment_id=3905155#gistcomment-3905155
> 
> Even without the alpha, many color pixels appear to be wrong and
> random like for example the light purple pixels on the darker purple
> at the bottom of the "O". That's not much different from irregularities
> in the alpha channel you've shown (https://imgur.com/a/50YyRGV).
> So, I agree that it's not perfect, but the whole filter is
> not perfect (vf_elbg is close to it). Do we remove the filter because
> it's not perfect?

The colors are indeed wrong because it has some *incorrect* computations
(working in sRGB space, and the MSE is wrong at least) and these should be
corrected (working on it). I'm also adding improvements on the quality by
working in the perceptual space.

> As mentioned above, if you want to add an additional mode for
> calculating the color distance, it's fine IMO if it doesn't
> work with use_alpha.

That's not how it's going to work, sorry; I'm not going to increase
complexity and maintenance effort for no gain. Implementing a correct
support for the alpha will likely involve a revert of that commit anyway.

Note that if I had been active at the time this patch was submitted I would
certainly have rejected it in this state. So it's my fault, but I'm
working on fixing it.
Paul B Mahol Oct. 31, 2022, 11:58 a.m. UTC | #10
On 10/31/22, Clément Bœsch <u@pkh.me> wrote:
> On Mon, Oct 31, 2022 at 01:43:11AM +0000, Soft Works wrote:
> [...]

I'm against removing useful features.
Clément Bœsch Oct. 31, 2022, 12:41 p.m. UTC | #11
On Mon, Oct 31, 2022 at 12:58:20PM +0100, Paul B Mahol wrote:
[...]
> I'm against removing useful features.

Did you just rudely dismiss all the arguments made, or is this a joke?

Regards,
Soft Works Oct. 31, 2022, 3:11 p.m. UTC | #12
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Clément Bœsch
> Sent: Monday, October 31, 2022 11:57 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] Revert
> "avfilter/vf_palette(gen|use): support palettes with alpha"
> 
> On Mon, Oct 31, 2022 at 01:43:11AM +0000, Soft Works wrote:
> [...]
> > > > > The patch I had submitted doesn't change the previous behavior
> > > > > without the use_alpha parameter.
> > >
> > > Yes I noticed, but unfortunately I'm reworking the color distance to
> > > work in perceptual color space, and the way that alpha is mixed up in
> > > the equation just doesn't make any sense at all and prevents me from
> > > doing these changes.
> >
> > If you want to implement a new color distance algorithm, it should
> > be either a new filter or a new (switchable) mode for the existing
> > filter.
> 
> Why?
> 
> > Photoshop has these different modes as well and it would
> > surely be useful, but I don't think it should be replacing the
> > existing behavior.
> >
> 
> There is no point in keeping a ton of complexity exposed as user options
> for something implementation specific. We offer no guarantee over how the
> quantization is expected to run.

Says who?

> > When it turns out that the use_alpha implementation doesn't fit
> > with your new color distance calculation and you add it as
> > an additional mode, then it would be fine IMO for the filter
> > to error out when that mode is attempted in combination
> > with use_alpha.
> 
> IMO the use_alpha option shouldn't exist in the first place, it should be
> the default behaviour because honoring the alpha is the correct thing to
> do. That's not what the option is currently doing though.
> 
> > > > Do you think it might make sense to put more weight on the
> > > > alpha value by tripling it? So it would be weighted equally to the
> > > > RGB value?
> > >
> > > You cannot mix alpha with colors at all, they are separate domains
> > > and you need to treat them as such.
> >
> > What's interesting is that I've followed the same (simplified)
> > way for adding a use_alpha option to vf_elbg and it provides excellent
> > results without treating alpha separately.
> 
> I don't know how the filter works and what it's supposed to do, but if
> it's indeed using the same approach as the palette ones, it cannot work.

Then it's magic, I guess.
The commands and results are on the same page I have posted.

And pngquant doing the impossible as well:

> Interestingly, pngquant which is supposed to have the best open source
> quantization algorithms seems to be using weights (albeit in a more 
> sophisticated way) and does not handle alpha separately for calculating 
> color distance, variance and averaging:

https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/pam.h#L163-L182 
https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L29-L49 
https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L449-L476

> That's not how it's going to work, sorry; I'm not going to increase
> complexity and maintenance effort for no gain. Implementing a correct
> support for the alpha will likely involve a revert of that commit
> anyway.

If you want to improve the way it works that's another story.

But at this point, we're talking about removal. And I disagree with that.

Best regards,
softworkz
Clément Bœsch Oct. 31, 2022, 6:51 p.m. UTC | #13
On Mon, Oct 31, 2022 at 03:11:13PM +0000, Soft Works wrote:
[...]
> > > Photoshop has these different modes as well and it would
> > > surely be useful, but I don't think it should be replacing the
> > > existing behavior.
> > >
> > 
> > There is no point in keeping a ton of complexity exposed as user options
> > for something implementation specific. We offer no guarantee over how
> > the quantization is expected to run.
> 
> Says who?

The API contract with the user is that we propose a quantization; the
implementation details do not matter, we do not document them, so we make
no commitment whatsoever in that regard.

Also, users are not looking for many options about how it works under the
hood, they just want the best they can get out of the box, that's a basic
UX rule.

[...]
> And pngquant doing the impossible as well:
> 
> > Interestingly, pngquant which is supposed to have the best open source
> > quantization algorithms

Says who?

> > seems to be using weights (albeit in a more 
> > sophisticated way) and does not handle alpha separately for calculating 
> > color distance, variance and averaging:
> 
> https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/pam.h#L163-L182 
> https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L29-L49 
> https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L449-L476
> 

I'd rather not look too much into that code. Do they mess up the alpha
channel as well? If not, feel free to investigate how they achieve that.

> > That's not how it's going to work, sorry; I'm not going to increase
> > complexity and maintenance effort for no gain. Implementing a correct
> > support for the alpha will likely involve a revert of that commit
> > anyway.
> 
> If you want to improve the way it works that's another story.
> 
> But at this point, we're talking about removal. And I disagree to that.

You may disagree but:
- the option causes many transparency artifacts
- the option is fundamentally flawed and needs to be rewritten differently
- handling the alpha should be the default if such a feature existed
- the option is preventing improvements to the code

I will send a patchset in the coming days, which depends on its removal.
You'll be free to propose again a patch to support alpha quantization
properly, but I'll ask for it to be reliable enough so that it's enabled
by default.

Regards,
Soft Works Oct. 31, 2022, 8:41 p.m. UTC | #14
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Clément Bœsch
> Sent: Monday, October 31, 2022 7:51 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] Revert
> "avfilter/vf_palette(gen|use): support palettes with alpha"
> 
> On Mon, Oct 31, 2022 at 03:11:13PM +0000, Soft Works wrote:
> [...]
> > And pngquant doing the impossible as well:
> >
> > > Interestingly, pngquant which is supposed to have the best open
> > > source quantization algorithms
> 
> Says who?

That's the result of the research I did at that time.
Feel free to do your own research.

> > https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/pam.h#L163-L182
> > https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L29-L49
> > https://github.com/ImageOptim/libimagequant/blob/a16c9ca66a24158496da02d86925cc0167831205/mediancut.c#L449-L476
> >
> 
> I'd rather not look too much into that code. Do they mess up the alpha
> channel as well?

When working on this, you should at least be familiar with the quality
of results it provides, even when you don't want to look at the code.


> > If you want to improve the way it works that's another story.
> >
> > But at this point, we're talking about removal. And I disagree with that.
> 
> You may disagree but:
> - the option causes many transparency artifacts
> - the option is fundamentally flawed and needs to be rewritten differently
> - handling the alpha should be the default if such a feature existed
> - the option is preventing improvements to the code
> 
> I will send a patchset in the coming days, which depends on its removal.
> You'll be free to propose again a patch to support alpha quantization

I'll take that as a joke...

> properly, but I'll ask for it to be reliable enough so that it's enabled
> by default.

That's not acceptable either, because the filter may also be
used to create animated GIFs where only a single palette entry
may exist which indicates full transparency and all others are
fully opaque.


I don't want to be just negative and dismissive, and I really
welcome patches that bring improvements. But I think that
a prerequisite for working on this subject is to be familiar
with state-of-the-art implementations of image color quantization.

The best implementations I'm aware of can be found in PhotoShop
(closed source) and pngquant (open source).

The quantization that vf_elbg (with use_alpha) performs gets very
close to the aforementioned, but it comes at the price of
relatively high computational cost.

If you would come up with an implementation for palettegen/-use
that provides comparable results in an efficient way like
pngquant, that would be really awesome!

> but I'll ask for it to be reliable enough so that it's

I'll ask for your patch to support alpha in the same
optional way as it is supported right now.
(actually something that should go without saying)

Best regards,
softworkz
Michael Niedermayer Oct. 31, 2022, 9:58 p.m. UTC | #15
On Mon, Oct 31, 2022 at 11:57:16AM +0100, Clément Bœsch wrote:
> On Mon, Oct 31, 2022 at 01:43:11AM +0000, Soft Works wrote:
> [...]
> You have to make a call between whether you want to preserve the
> transparency or the color while constructing the palette, but when
> choosing a color you must absolutely not choose a color with a different
> transparency, you must pick amongst the closest alpha, with particular
> attention to extreme alphas: an opaque color must stay opaque, and a fully
> transparent one as well:
> - rounding a color with 43% alpha into 50% alpha is acceptable
> - rounding a color with 100% alpha into a 99% alpha is not acceptable in
>   any way because you're starting to make transparent areas that weren't
> - rounding a color with 0% alpha into a 1% alpha is not acceptable because
>   some areas of the images are now starting to blend into an area that was
>   supposedly non-existent

really ?
so if i have all shades of green available for all transparencies from 1% to 99%
i "must" make my plants all use 0% transparency even if i only have a single color and
that is bright pink
I don't think that is the best choice

There are perceptual differences between the different areas of the RGBA hypercube
though. Hardly anyone would notice the difference between a 255 and 254 blue but
having some slight transparency might be noticeable.
These different weights in different areas could maybe be considered in palette*
and elbg, it likely would improve things. OTOH heuristics like "always" and "never"
feel like they might become a lot of work to tune. I think it's better to attempt
to achieve a similar goal with less hard and more perceptual scoring

thx

[...]
Soft Works Oct. 31, 2022, 11:34 p.m. UTC | #16
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Michael Niedermayer
> Sent: Monday, October 31, 2022 10:59 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] Revert
> "avfilter/vf_palette(gen|use): support palettes with alpha"
> 
> On Mon, Oct 31, 2022 at 11:57:16AM +0100, Clément Bœsch wrote:
> > On Mon, Oct 31, 2022 at 01:43:11AM +0000, Soft Works wrote:
> > [...]
> > You have to make a call between whether you want to preserve the
> > transparency or the color while constructing the palette, but when
> > choosing a color you must absolutely not choose a color with a different
> > transparency, you must pick amongst the closest alpha, with particular
> > attention to extreme alphas: an opaque color must stay opaque, and a fully
> > transparent one as well:
> > - rounding a color with 43% alpha into 50% alpha is acceptable
> > - rounding a color with 100% alpha into a 99% alpha is not acceptable in
> >   any way because you're starting to make transparent areas that weren't
> > - rounding a color with 0% alpha into a 1% alpha is not acceptable because
> >   some areas of the images are now starting to blend into an area that was
> >   supposedly non-existent
> 
> really ?
> so if i have all shades of green available for all transparencies from
> 1% to 99% i "must" make my plants all use 0% transparency even if i
> only have a single color and that is bright pink
> I don't think that is the best choice
> 
> There are perceptual differences between the different areas of the
> RGBA hypercube though. Hardly anyone would notice the difference
> between a 255 and 254 blue but having some slight transparency might
> be noticeable.
> These different weights in different areas could maybe be considered
> in palette* and elbg, it likely would improve things. OTOH heuristics
> like "always" and "never" feel like they might become a lot of work to
> tune. I think it's better to attempt to achieve a similar goal with
> less hard and more perceptual scoring

I agree. vf_elbg+use_alpha provides excellent results without separate
alpha handling (even though I can't explain why) and pngquant doesn't
handle alpha separately either.


The pngquant algorithm is described as follows:

pngquant uses modified version of Median Cut quantization algorithm and 
additional techniques to mitigate deficiencies of Median Cut.

Instead of splitting boxes with largest volume or number of colors, 
boxes are selected to minimize variance from their median value.

Histogram is built with addition of a basic perception model, which gives 
less weight to noisy areas of the image.

To improve color further, histogram is adjusted in a process similar to 
gradient descent (Median Cut is repeated many times with more weight on 
poorly represented colors).

Finally, colors are corrected using Voronoi iteration (K-means), which 
guarantees locally optimal palette.

pngquant works in premultiplied alpha color space to give less weight 
to transparent colors.

When remapping, error diffusion is applied only to areas where several 
neighboring pixels quantize to the same value, and which are not edges. 
This avoids adding noise to areas which have high visual quality 
without dithering.

(Reference: https://pngquant.org/#algorithm) 

softworkz
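To make the premultiplied-alpha point above concrete, a minimal sketch (illustrative only; neither pngquant's nor FFmpeg's actual code):

    #include <stdint.h>

    /* Scale each color channel by its alpha before any distance or
     * averaging computation, so that fully transparent pixels contribute
     * nothing regardless of the RGB values they carry. */
    static uint32_t premultiply(uint32_t argb)
    {
        const unsigned a = argb >> 24 & 0xff;
        const unsigned r = (argb >> 16 & 0xff) * a / 255;
        const unsigned g = (argb >>  8 & 0xff) * a / 255;
        const unsigned b = (argb       & 0xff) * a / 255;
        return a << 24 | r << 16 | g << 8 | b;
    }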
Clément Bœsch Nov. 1, 2022, 10:18 a.m. UTC | #17
On Mon, Oct 31, 2022 at 10:58:57PM +0100, Michael Niedermayer wrote:
[...]
> > You have to make a call between whether you want to preserve the
> > transparency or the color while constructing the palette, but when
> > choosing a color you must absolutely not choose a color with a different
> > transparency, you must pick amongst the closest alpha, with particular
> > attention to extreme alphas: an opaque color must stay opaque, and a fully
> > transparent one as well:
> > - rounding a color with 43% alpha into 50% alpha is acceptable
> > - rounding a color with 100% alpha into a 99% alpha is not acceptable in
> >   any way because you're starting to make transparent areas that weren't
> > - rounding a color with 0% alpha into a 1% alpha is not acceptable because
> >   some areas of the images are now starting to blend into an area that was
> >   supposedly non-existent
> 
> really ?
> so if i have all shades of green available for all transparencies from 1% to 99%
> i "must" make my plants all use 0% transparency even if i only have a single color and
> that is bright pink

I believe so, because you don't know how the alpha channel is going to be
used in the user's pipeline. The goal of the palette filters is to quantize
colors, not mess up the alpha channel. It's better for these filters
to be very bad at quantizing colors but still preserve the alpha as well
as possible, than to give the illusion that the colors are great
while massively messing up the alpha channel (which is what currently happens).

BTW, the colors are not even premultiplied, so in the current state it
just makes no sense at all: we are comparing colors with different alpha
channels even though we have no idea how they look when blended.

> There are perceptual differences between the different areas of the RGBA hypercube
> though. Hardly anyone would notice the difference between a 255 and 254 blue but
> having some slight transparency might be noticeable.

It's noticed late: that is, when your asset reaches the blending stage,
which is the worst user experience you can provide.

Just imagine: the user quantizes their files, thinking it's going to preserve
transparency. Blended with a black background it appears to be somehow ok
(see softworkz's screenshot), so the user starts using it on their website.
Everything looks fine. Then a few months later, the user decides to change
the black background to a brighter color: all the images suddenly
reveal their destroyed alpha channel, with artifacts everywhere.

> These different weights in different areas could maybe be considered in palette*
> and elbg, it likely would improve things. OTOH heuristics like "always" and "never"
> feel like they might become a lot of work to tune. I think it's better to attempt
> to achieve a similar goal with less hard and more perceptual scoring

Working on perception can only work with colors, you cannot jam in the
alpha, it's another dimension entirely. So you first have to work with
premultiplied data, and then you need to score the alpha separately.
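A hedged sketch of that combination, with all names hypothetical (this is not the announced patchset):

    #include <stdint.h>
    #include <stdlib.h>

    struct sep_dist { int dalpha, dcolor; };

    /* Premultiply: scale each channel by its alpha, as sketched earlier
     * in the thread. */
    static uint32_t premultiply(uint32_t c)
    {
        const unsigned a = c >> 24 & 0xff;
        return a << 24 | ((c >> 16 & 0xff) * a / 255) << 16
                       | ((c >>  8 & 0xff) * a / 255) <<  8
                       |  (c       & 0xff) * a / 255;
    }

    /* Compare the colors in premultiplied space, but keep the alpha
     * mismatch as its own criterion: rank candidates by dalpha first and
     * use dcolor only as a tie-breaker, instead of folding both into one
     * weighted distance. */
    static struct sep_dist scored_diff(uint32_t a, uint32_t b)
    {
        const uint32_t pa = premultiply(a), pb = premultiply(b);
        const int dr = (int)(pa >> 16 & 0xff) - (int)(pb >> 16 & 0xff);
        const int dg = (int)(pa >>  8 & 0xff) - (int)(pb >>  8 & 0xff);
        const int db = (int)(pa       & 0xff) - (int)(pb       & 0xff);
        const struct sep_dist d = {
            .dalpha = abs((int)(a >> 24 & 0xff) - (int)(b >> 24 & 0xff)),
            .dcolor = dr*dr + dg*dg + db*db,
        };
        return d;
    }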

Patch

diff --git a/doc/filters.texi b/doc/filters.texi
index bcd19cf931..f8b78ca919 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -18269,9 +18269,6 @@  Compute new histogram for each frame.
 @end table
 
 Default value is @var{full}.
-@item use_alpha
-Create a palette of colors with alpha components.
-Setting this, will automatically disable 'reserve_transparent'.
 @end table
 
 The filter also exports the frame metadata @code{lavfi.color_quant_ratio}
@@ -18350,11 +18347,6 @@  will be treated as completely opaque, and values below this threshold will be
 treated as completely transparent.
 
 The option must be an integer value in the range [0,255]. Default is @var{128}.
-
-@item use_alpha
-Apply the palette by taking alpha values into account. Only useful with
-palettes that are containing multiple colors with alpha components.
-Setting this will automatically disable 'alpha_treshold'.
 @end table
 
 @subsection Examples
diff --git a/libavfilter/vf_palettegen.c b/libavfilter/vf_palettegen.c
index 27f74fd147..d335ef91e6 100644
--- a/libavfilter/vf_palettegen.c
+++ b/libavfilter/vf_palettegen.c
@@ -59,7 +59,7 @@  enum {
 };
 
 #define NBITS 5
-#define HIST_SIZE (1<<(4*NBITS))
+#define HIST_SIZE (1<<(3*NBITS))
 
 typedef struct PaletteGenContext {
     const AVClass *class;
@@ -67,7 +67,6 @@  typedef struct PaletteGenContext {
     int max_colors;
     int reserve_transparent;
     int stats_mode;
-    int use_alpha;
 
     AVFrame *prev_frame;                    // previous frame used for the diff stats_mode
     struct hist_node histogram[HIST_SIZE];  // histogram/hashtable of the colors
@@ -89,7 +88,6 @@  static const AVOption palettegen_options[] = {
         { "full", "compute full frame histograms", 0, AV_OPT_TYPE_CONST, {.i64=STATS_MODE_ALL_FRAMES}, INT_MIN, INT_MAX, FLAGS, "mode" },
         { "diff", "compute histograms only for the part that differs from previous frame", 0, AV_OPT_TYPE_CONST, {.i64=STATS_MODE_DIFF_FRAMES}, INT_MIN, INT_MAX, FLAGS, "mode" },
         { "single", "compute new histogram for each frame", 0, AV_OPT_TYPE_CONST, {.i64=STATS_MODE_SINGLE_FRAMES}, INT_MIN, INT_MAX, FLAGS, "mode" },
-    { "use_alpha", "create a palette including alpha values", OFFSET(use_alpha), AV_OPT_TYPE_BOOL, {.i64 = 0}, 0, 1, FLAGS },
     { NULL }
 };
 
@@ -115,16 +113,15 @@  static int cmp_##name(const void *pa, const void *pb)   \
 {                                                       \
     const struct color_ref * const *a = pa;             \
     const struct color_ref * const *b = pb;             \
-    return   (int)((*a)->color >> (8 * (3 - (pos))) & 0xff)  \
-           - (int)((*b)->color >> (8 * (3 - (pos))) & 0xff); \
+    return   (int)((*a)->color >> (8 * (2 - (pos))) & 0xff)  \
+           - (int)((*b)->color >> (8 * (2 - (pos))) & 0xff); \
 }
 
-DECLARE_CMP_FUNC(a, 0)
-DECLARE_CMP_FUNC(r, 1)
-DECLARE_CMP_FUNC(g, 2)
-DECLARE_CMP_FUNC(b, 3)
+DECLARE_CMP_FUNC(r, 0)
+DECLARE_CMP_FUNC(g, 1)
+DECLARE_CMP_FUNC(b, 2)
 
-static const cmp_func cmp_funcs[] = {cmp_a, cmp_r, cmp_g, cmp_b};
+static const cmp_func cmp_funcs[] = {cmp_r, cmp_g, cmp_b};
 
 /**
  * Simple color comparison for sorting the final palette
@@ -146,17 +143,6 @@  static av_always_inline int diff(const uint32_t a, const uint32_t b)
     return dr*dr + dg*dg + db*db;
 }
 
-static av_always_inline int diff_alpha(const uint32_t a, const uint32_t b)
-{
-    const uint8_t c1[] = {a >> 24 & 0xff, a >> 16 & 0xff, a >> 8 & 0xff, a & 0xff};
-    const uint8_t c2[] = {b >> 24 & 0xff, b >> 16 & 0xff, b >> 8 & 0xff, b & 0xff};
-    const int da = c1[0] - c2[0];
-    const int dr = c1[1] - c2[1];
-    const int dg = c1[2] - c2[2];
-    const int db = c1[3] - c2[3];
-    return da*da + dr*dr + dg*dg + db*db;
-}
-
 /**
  * Find the next box to split: pick the one with the highest variance
  */
@@ -178,10 +164,7 @@  static int get_next_box_id_to_split(PaletteGenContext *s)
 
                 for (i = 0; i < box->len; i++) {
                     const struct color_ref *ref = s->refs[box->start + i];
-                    if (s->use_alpha)
-                        variance += (int64_t)diff_alpha(ref->color, box->color) * ref->count;
-                    else
-                        variance += (int64_t)diff(ref->color, box->color) * ref->count;
+                    variance += diff(ref->color, box->color) * ref->count;
                 }
                 box->variance = variance;
             }
@@ -201,31 +184,24 @@  static int get_next_box_id_to_split(PaletteGenContext *s)
  * specified box. Takes into account the weight of each color.
  */
 static uint32_t get_avg_color(struct color_ref * const *refs,
-                              const struct range_box *box, int use_alpha)
+                              const struct range_box *box)
 {
     int i;
     const int n = box->len;
-    uint64_t a = 0, r = 0, g = 0, b = 0, div = 0;
+    uint64_t r = 0, g = 0, b = 0, div = 0;
 
     for (i = 0; i < n; i++) {
         const struct color_ref *ref = refs[box->start + i];
-        if (use_alpha)
-            a += (ref->color >> 24 & 0xff) * ref->count;
-        r += (ref->color     >> 16 & 0xff) * ref->count;
-        g += (ref->color     >>  8 & 0xff) * ref->count;
-        b += (ref->color           & 0xff) * ref->count;
+        r += (ref->color >> 16 & 0xff) * ref->count;
+        g += (ref->color >>  8 & 0xff) * ref->count;
+        b += (ref->color       & 0xff) * ref->count;
         div += ref->count;
     }
 
-    if (use_alpha)
-        a = a / div;
     r = r / div;
     g = g / div;
     b = b / div;
 
-    if (use_alpha)
-        return a<<24 | r<<16 | g<<8 | b;
-
     return 0xffU<<24 | r<<16 | g<<8 | b;
 }
 
@@ -244,8 +220,8 @@  static void split_box(PaletteGenContext *s, struct range_box *box, int n)
     av_assert0(box->len     >= 1);
     av_assert0(new_box->len >= 1);
 
-    box->color     = get_avg_color(s->refs, box, s->use_alpha);
-    new_box->color = get_avg_color(s->refs, new_box, s->use_alpha);
+    box->color     = get_avg_color(s->refs, box);
+    new_box->color = get_avg_color(s->refs, new_box);
     box->variance     = -1;
     new_box->variance = -1;
 }
@@ -275,7 +251,7 @@  static void write_palette(AVFilterContext *ctx, AVFrame *out)
         pal += pal_linesize;
     }
 
-    if (s->reserve_transparent && !s->use_alpha) {
+    if (s->reserve_transparent) {
         av_assert0(s->nb_boxes < 256);
         pal[out->width - pal_linesize - 1] = AV_RB32(&s->transparency_color) >> 8;
     }
@@ -343,49 +319,40 @@  static AVFrame *get_palette_frame(AVFilterContext *ctx)
     box = &s->boxes[box_id];
     box->len = s->nb_refs;
     box->sorted_by = -1;
-    box->color = get_avg_color(s->refs, box, s->use_alpha);
+    box->color = get_avg_color(s->refs, box);
     box->variance = -1;
     s->nb_boxes = 1;
 
     while (box && box->len > 1) {
-        int i, ar, rr, gr, br, longest;
+        int i, rr, gr, br, longest;
         uint64_t median, box_weight = 0;
 
         /* compute the box weight (sum all the weights of the colors in the
          * range) and its boundings */
-        uint8_t min[4] = {0xff, 0xff, 0xff, 0xff};
-        uint8_t max[4] = {0x00, 0x00, 0x00, 0x00};
+        uint8_t min[3] = {0xff, 0xff, 0xff};
+        uint8_t max[3] = {0x00, 0x00, 0x00};
         for (i = box->start; i < box->start + box->len; i++) {
             const struct color_ref *ref = s->refs[i];
             const uint32_t rgb = ref->color;
-            const uint8_t a = rgb >> 24 & 0xff, r = rgb >> 16 & 0xff, g = rgb >> 8 & 0xff, b = rgb & 0xff;
-            min[0] = FFMIN(a, min[0]); max[0] = FFMAX(a, max[0]);
-            min[1] = FFMIN(r, min[1]); max[1] = FFMAX(r, max[1]);
-            min[2] = FFMIN(g, min[2]); max[2] = FFMAX(g, max[2]);
-            min[3] = FFMIN(b, min[3]); max[3] = FFMAX(b, max[3]);
+            const uint8_t r = rgb >> 16 & 0xff, g = rgb >> 8 & 0xff, b = rgb & 0xff;
+            min[0] = FFMIN(r, min[0]), max[0] = FFMAX(r, max[0]);
+            min[1] = FFMIN(g, min[1]), max[1] = FFMAX(g, max[1]);
+            min[2] = FFMIN(b, min[2]), max[2] = FFMAX(b, max[2]);
             box_weight += ref->count;
         }
 
         /* define the axis to sort by according to the widest range of colors */
-        ar = max[0] - min[0];
-        rr = max[1] - min[1];
-        gr = max[2] - min[2];
-        br = max[3] - min[3];
-        longest = 2; // pick green by default (the color the eye is the most sensitive to)
-        if (s->use_alpha) {
-            if (ar >= rr && ar >= br && ar >= gr) longest = 0;
-            if (br >= rr && br >= gr && br >= ar) longest = 3;
-            if (rr >= gr && rr >= br && rr >= ar) longest = 1;
-            if (gr >= rr && gr >= br && gr >= ar) longest = 2; // prefer green again
-        } else {
-            if (br >= rr && br >= gr) longest = 3;
-            if (rr >= gr && rr >= br) longest = 1;
-            if (gr >= rr && gr >= br) longest = 2; // prefer green again
-        }
-
-        ff_dlog(ctx, "box #%02X [%6d..%-6d] (%6d) w:%-6"PRIu64" ranges:[%2x %2x %2x %2x] sort by %c (already sorted:%c) ",
+        rr = max[0] - min[0];
+        gr = max[1] - min[1];
+        br = max[2] - min[2];
+        longest = 1; // pick green by default (the color the eye is the most sensitive to)
+        if (br >= rr && br >= gr) longest = 2;
+        if (rr >= gr && rr >= br) longest = 0;
+        if (gr >= rr && gr >= br) longest = 1; // prefer green again
+
+        ff_dlog(ctx, "box #%02X [%6d..%-6d] (%6d) w:%-6"PRIu64" ranges:[%2x %2x %2x] sort by %c (already sorted:%c) ",
                 box_id, box->start, box->start + box->len - 1, box->len, box_weight,
-                ar, rr, gr, br, "argb"[longest], box->sorted_by == longest ? 'y' : 'n');
+                rr, gr, br, "rgb"[longest], box->sorted_by == longest ? 'y':'n');
 
         /* sort the range by its longest axis if it's not already sorted */
         if (box->sorted_by != longest) {
@@ -427,27 +394,22 @@  static AVFrame *get_palette_frame(AVFilterContext *ctx)
  * It keeps the NBITS least significant bit of each component to make it
  * "random" even if the scene doesn't have much different colors.
  */
-static inline unsigned color_hash(uint32_t color, int use_alpha)
+static inline unsigned color_hash(uint32_t color)
 {
     const uint8_t r = color >> 16 & ((1<<NBITS)-1);
     const uint8_t g = color >>  8 & ((1<<NBITS)-1);
     const uint8_t b = color       & ((1<<NBITS)-1);
 
-    if (use_alpha) {
-        const uint8_t a = color >> 24 & ((1 << NBITS) - 1);
-        return a << (NBITS * 3) | r << (NBITS * 2) | g << NBITS | b;
-    }
-
     return r << (NBITS * 2) | g << NBITS | b;
 }
 
 /**
  * Locate the color in the hash table and increment its counter.
  */
-static int color_inc(struct hist_node *hist, uint32_t color, int use_alpha)
+static int color_inc(struct hist_node *hist, uint32_t color)
 {
     int i;
-    const unsigned hash = color_hash(color, use_alpha);
+    const unsigned hash = color_hash(color);
     struct hist_node *node = &hist[hash];
     struct color_ref *e;
 
@@ -472,7 +434,7 @@  static int color_inc(struct hist_node *hist, uint32_t color, int use_alpha)
  * Update histogram when pixels differ from previous frame.
  */
 static int update_histogram_diff(struct hist_node *hist,
-                                 const AVFrame *f1, const AVFrame *f2, int use_alpha)
+                                 const AVFrame *f1, const AVFrame *f2)
 {
     int x, y, ret, nb_diff_colors = 0;
 
@@ -483,7 +445,7 @@  static int update_histogram_diff(struct hist_node *hist,
         for (x = 0; x < f1->width; x++) {
             if (p[x] == q[x])
                 continue;
-            ret = color_inc(hist, p[x], use_alpha);
+            ret = color_inc(hist, p[x]);
             if (ret < 0)
                 return ret;
             nb_diff_colors += ret;
@@ -495,7 +457,7 @@  static int update_histogram_diff(struct hist_node *hist,
 /**
  * Simple histogram of the frame.
  */
-static int update_histogram_frame(struct hist_node *hist, const AVFrame *f, int use_alpha)
+static int update_histogram_frame(struct hist_node *hist, const AVFrame *f)
 {
     int x, y, ret, nb_diff_colors = 0;
 
@@ -503,7 +465,7 @@  static int update_histogram_frame(struct hist_node *hist, const AVFrame *f, int
         const uint32_t *p = (const uint32_t *)(f->data[0] + y*f->linesize[0]);
 
         for (x = 0; x < f->width; x++) {
-            ret = color_inc(hist, p[x], use_alpha);
+            ret = color_inc(hist, p[x]);
             if (ret < 0)
                 return ret;
             nb_diff_colors += ret;
@@ -519,8 +481,8 @@  static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
     AVFilterContext *ctx = inlink->dst;
     PaletteGenContext *s = ctx->priv;
-    int ret = s->prev_frame ? update_histogram_diff(s->histogram, s->prev_frame, in, s->use_alpha)
-                            : update_histogram_frame(s->histogram, in, s->use_alpha);
+    int ret = s->prev_frame ? update_histogram_diff(s->histogram, s->prev_frame, in)
+                            : update_histogram_frame(s->histogram, in);
 
     if (ret > 0)
         s->nb_refs += ret;
@@ -579,16 +541,6 @@  static int config_output(AVFilterLink *outlink)
     return 0;
 }
 
-static int init(AVFilterContext *ctx)
-{
-    PaletteGenContext* s = ctx->priv;
-
-    if (s->use_alpha && s->reserve_transparent)
-        s->reserve_transparent = 0;
-
-    return 0;
-}
-
 static av_cold void uninit(AVFilterContext *ctx)
 {
     int i;
@@ -621,7 +573,6 @@  const AVFilter ff_vf_palettegen = {
     .name          = "palettegen",
     .description   = NULL_IF_CONFIG_SMALL("Find the optimal palette for a given stream."),
     .priv_size     = sizeof(PaletteGenContext),
-    .init          = init,
     .uninit        = uninit,
     FILTER_INPUTS(palettegen_inputs),
     FILTER_OUTPUTS(palettegen_outputs),
diff --git a/libavfilter/vf_paletteuse.c b/libavfilter/vf_paletteuse.c
index a6b5d5a5fa..cb18329bb7 100644
--- a/libavfilter/vf_paletteuse.c
+++ b/libavfilter/vf_paletteuse.c
@@ -29,6 +29,7 @@ 
 #include "libavutil/opt.h"
 #include "libavutil/qsort.h"
 #include "avfilter.h"
+#include "filters.h"
 #include "framesync.h"
 #include "internal.h"
 
@@ -63,7 +64,7 @@  struct color_node {
 };
 
 #define NBITS 5
-#define CACHE_SIZE (1<<(4*NBITS))
+#define CACHE_SIZE (1<<(3*NBITS))
 
 struct cached_color {
     uint32_t color;
@@ -88,7 +89,6 @@  typedef struct PaletteUseContext {
     uint32_t palette[AVPALETTE_COUNT];
     int transparency_index; /* index in the palette of transparency. -1 if there is no transparency in the palette. */
     int trans_thresh;
-    int use_alpha;
     int palette_loaded;
     int dither;
     int new;
@@ -108,7 +108,7 @@  typedef struct PaletteUseContext {
 } PaletteUseContext;
 
 #define OFFSET(x) offsetof(PaletteUseContext, x)
-#define FLAGS (AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption paletteuse_options[] = {
     { "dither", "select dithering mode", OFFSET(dither), AV_OPT_TYPE_INT, {.i64=DITHERING_SIERRA2_4A}, 0, NB_DITHERING-1, FLAGS, "dithering_mode" },
         { "bayer",           "ordered 8x8 bayer dithering (deterministic)",                            0, AV_OPT_TYPE_CONST, {.i64=DITHERING_BAYER},           INT_MIN, INT_MAX, FLAGS, "dithering_mode" },
@@ -121,7 +121,6 @@  static const AVOption paletteuse_options[] = {
         { "rectangle", "process smallest different rectangle", 0, AV_OPT_TYPE_CONST, {.i64=DIFF_MODE_RECTANGLE}, INT_MIN, INT_MAX, FLAGS, "diff_mode" },
     { "new", "take new palette for each output frame", OFFSET(new), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, FLAGS },
     { "alpha_threshold", "set the alpha threshold for transparency", OFFSET(trans_thresh), AV_OPT_TYPE_INT, {.i64=128}, 0, 255, FLAGS },
-    { "use_alpha", "use alpha channel for mapping", OFFSET(use_alpha), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, FLAGS },
 
     /* following are the debug options, not part of the official API */
     { "debug_kdtree", "save Graphviz graph of the kdtree in specified file", OFFSET(dot_filename), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, FLAGS },
@@ -163,41 +162,37 @@  static av_always_inline uint32_t dither_color(uint32_t px, int er, int eg,
          | av_clip_uint8((px       & 0xff) + ((eb * scale) / (1<<shift)));
 }
 
-static av_always_inline int diff(const uint8_t *c1, const uint8_t *c2, const PaletteUseContext *s)
+static av_always_inline int diff(const uint8_t *c1, const uint8_t *c2, const int trans_thresh)
 {
     // XXX: try L*a*b with CIE76 (dL*dL + da*da + db*db)
-    const int da = c1[0] - c2[0];
     const int dr = c1[1] - c2[1];
     const int dg = c1[2] - c2[2];
     const int db = c1[3] - c2[3];
 
-    if (s->use_alpha)
-        return da*da + dr*dr + dg*dg + db*db;
-
-    if (c1[0] < s->trans_thresh && c2[0] < s->trans_thresh) {
+    if (c1[0] < trans_thresh && c2[0] < trans_thresh) {
         return 0;
-    } else if (c1[0] >= s->trans_thresh && c2[0] >= s->trans_thresh) {
+    } else if (c1[0] >= trans_thresh && c2[0] >= trans_thresh) {
         return dr*dr + dg*dg + db*db;
     } else {
         return 255*255 + 255*255 + 255*255;
     }
 }
 
-static av_always_inline uint8_t colormap_nearest_bruteforce(const PaletteUseContext *s, const uint8_t *argb)
+static av_always_inline uint8_t colormap_nearest_bruteforce(const uint32_t *palette, const uint8_t *argb, const int trans_thresh)
 {
     int i, pal_id = -1, min_dist = INT_MAX;
 
     for (i = 0; i < AVPALETTE_COUNT; i++) {
-        const uint32_t c = s->palette[i];
+        const uint32_t c = palette[i];
 
-        if (s->use_alpha || c >> 24 >= s->trans_thresh) { // ignore transparent entry
+        if (c >> 24 >= trans_thresh) { // ignore transparent entry
             const uint8_t palargb[] = {
-                s->palette[i]>>24 & 0xff,
-                s->palette[i]>>16 & 0xff,
-                s->palette[i]>> 8 & 0xff,
-                s->palette[i]     & 0xff,
+                palette[i]>>24 & 0xff,
+                palette[i]>>16 & 0xff,
+                palette[i]>> 8 & 0xff,
+                palette[i]     & 0xff,
             };
-            const int d = diff(palargb, argb, s);
+            const int d = diff(palargb, argb, trans_thresh);
             if (d < min_dist) {
                 pal_id = i;
                 min_dist = d;
@@ -213,17 +208,17 @@  struct nearest_color {
     int dist_sqd;
 };
 
-static void colormap_nearest_node(const PaletteUseContext *s,
-                                  const struct color_node *map,
+static void colormap_nearest_node(const struct color_node *map,
                                   const int node_pos,
                                   const uint8_t *target,
+                                  const int trans_thresh,
                                   struct nearest_color *nearest)
 {
     const struct color_node *kd = map + node_pos;
-    const int split = kd->split;
+    const int s = kd->split;
     int dx, nearer_kd_id, further_kd_id;
     const uint8_t *current = kd->val;
-    const int current_to_target = diff(target, current, s);
+    const int current_to_target = diff(target, current, trans_thresh);
 
     if (current_to_target < nearest->dist_sqd) {
         nearest->node_pos = node_pos;
@@ -231,23 +226,23 @@  static void colormap_nearest_node(const PaletteUseContext *s,
     }
 
     if (kd->left_id != -1 || kd->right_id != -1) {
-        dx = target[split] - current[split];
+        dx = target[s] - current[s];
 
         if (dx <= 0) nearer_kd_id = kd->left_id,  further_kd_id = kd->right_id;
         else         nearer_kd_id = kd->right_id, further_kd_id = kd->left_id;
 
         if (nearer_kd_id != -1)
-            colormap_nearest_node(s, map, nearer_kd_id, target, nearest);
+            colormap_nearest_node(map, nearer_kd_id, target, trans_thresh, nearest);
 
         if (further_kd_id != -1 && dx*dx < nearest->dist_sqd)
-            colormap_nearest_node(s, map, further_kd_id, target, nearest);
+            colormap_nearest_node(map, further_kd_id, target, trans_thresh, nearest);
     }
 }
 
-static av_always_inline uint8_t colormap_nearest_recursive(const PaletteUseContext *s, const struct color_node *node, const uint8_t *rgb)
+static av_always_inline uint8_t colormap_nearest_recursive(const struct color_node *node, const uint8_t *rgb, const int trans_thresh)
 {
     struct nearest_color res = {.dist_sqd = INT_MAX, .node_pos = -1};
-    colormap_nearest_node(s, node, 0, rgb, &res);
+    colormap_nearest_node(node, 0, rgb, trans_thresh, &res);
     return node[res.node_pos].palette_id;
 }
 
@@ -256,7 +251,7 @@  struct stack_node {
     int dx2;
 };
 
-static av_always_inline uint8_t colormap_nearest_iterative(const PaletteUseContext *s, const struct color_node *root, const uint8_t *target)
+static av_always_inline uint8_t colormap_nearest_iterative(const struct color_node *root, const uint8_t *target, const int trans_thresh)
 {
     int pos = 0, best_node_id = -1, best_dist = INT_MAX, cur_color_id = 0;
     struct stack_node nodes[16];
@@ -266,7 +261,7 @@  static av_always_inline uint8_t colormap_nearest_iterative(const PaletteUseConte
 
         const struct color_node *kd = &root[cur_color_id];
         const uint8_t *current = kd->val;
-        const int current_to_target = diff(target, current, s);
+        const int current_to_target = diff(target, current, trans_thresh);
 
         /* Compare current color node to the target and update our best node if
          * it's actually better. */
@@ -328,10 +323,10 @@  end:
     return root[best_node_id].palette_id;
 }
 
-#define COLORMAP_NEAREST(s, search, root, target)                                    \
-    search == COLOR_SEARCH_NNS_ITERATIVE ? colormap_nearest_iterative(s, root, target) :      \
-    search == COLOR_SEARCH_NNS_RECURSIVE ? colormap_nearest_recursive(s, root, target) :      \
-                                           colormap_nearest_bruteforce(s, target)
+#define COLORMAP_NEAREST(search, palette, root, target, trans_thresh)                                    \
+    search == COLOR_SEARCH_NNS_ITERATIVE ? colormap_nearest_iterative(root, target, trans_thresh) :      \
+    search == COLOR_SEARCH_NNS_RECURSIVE ? colormap_nearest_recursive(root, target, trans_thresh) :      \
+                                           colormap_nearest_bruteforce(palette, target, trans_thresh)
 
 /**
  * Check if the requested color is in the cache already. If not, find it in the
@@ -368,13 +363,13 @@  static av_always_inline int color_get(PaletteUseContext *s, uint32_t color,
     if (!e)
         return AVERROR(ENOMEM);
     e->color = color;
-    e->pal_entry = COLORMAP_NEAREST(s, search_method, s->map, argb_elts);
+    e->pal_entry = COLORMAP_NEAREST(search_method, s->palette, s->map, argb_elts, s->trans_thresh);
 
     return e->pal_entry;
 }
 
 static av_always_inline int get_dst_color_err(PaletteUseContext *s,
-                                              uint32_t c, int *ea, int *er, int *eg, int *eb,
+                                              uint32_t c, int *er, int *eg, int *eb,
                                               const enum color_search_method search_method)
 {
     const uint8_t a = c >> 24 & 0xff;
@@ -387,9 +382,8 @@  static av_always_inline int get_dst_color_err(PaletteUseContext *s,
         return dstx;
     dstc = s->palette[dstx];
     if (dstx == s->transparency_index) {
-        *ea =*er = *eg = *eb = 0;
+        *er = *eg = *eb = 0;
     } else {
-        *ea = (int)a - (int)(dstc >> 24 & 0xff);
         *er = (int)r - (int)(dstc >> 16 & 0xff);
         *eg = (int)g - (int)(dstc >>  8 & 0xff);
         *eb = (int)b - (int)(dstc       & 0xff);
@@ -413,7 +407,7 @@  static av_always_inline int set_frame(PaletteUseContext *s, AVFrame *out, AVFram
 
     for (y = y_start; y < h; y++) {
         for (x = x_start; x < w; x++) {
-            int ea, er, eg, eb;
+            int er, eg, eb;
 
             if (dither == DITHERING_BAYER) {
                 const int d = s->ordered_dither[(y & 7)<<3 | (x & 7)];
@@ -433,7 +427,7 @@  static av_always_inline int set_frame(PaletteUseContext *s, AVFrame *out, AVFram
 
             } else if (dither == DITHERING_HECKBERT) {
                 const int right = x < w - 1, down = y < h - 1;
-                const int color = get_dst_color_err(s, src[x], &ea, &er, &eg, &eb, search_method);
+                const int color = get_dst_color_err(s, src[x], &er, &eg, &eb, search_method);
 
                 if (color < 0)
                     return color;
@@ -445,7 +439,7 @@  static av_always_inline int set_frame(PaletteUseContext *s, AVFrame *out, AVFram
 
             } else if (dither == DITHERING_FLOYD_STEINBERG) {
                 const int right = x < w - 1, down = y < h - 1, left = x > x_start;
-                const int color = get_dst_color_err(s, src[x], &ea, &er, &eg, &eb, search_method);
+                const int color = get_dst_color_err(s, src[x], &er, &eg, &eb, search_method);
 
                 if (color < 0)
                     return color;
@@ -459,7 +453,7 @@  static av_always_inline int set_frame(PaletteUseContext *s, AVFrame *out, AVFram
             } else if (dither == DITHERING_SIERRA2) {
                 const int right  = x < w - 1, down  = y < h - 1, left  = x > x_start;
                 const int right2 = x < w - 2,                    left2 = x > x_start + 1;
-                const int color = get_dst_color_err(s, src[x], &ea, &er, &eg, &eb, search_method);
+                const int color = get_dst_color_err(s, src[x], &er, &eg, &eb, search_method);
 
                 if (color < 0)
                     return color;
@@ -478,7 +472,7 @@  static av_always_inline int set_frame(PaletteUseContext *s, AVFrame *out, AVFram
 
             } else if (dither == DITHERING_SIERRA2_4A) {
                 const int right = x < w - 1, down = y < h - 1, left = x > x_start;
-                const int color = get_dst_color_err(s, src[x], &ea, &er, &eg, &eb, search_method);
+                const int color = get_dst_color_err(s, src[x], &er, &eg, &eb, search_method);
 
                 if (color < 0)
                     return color;
@@ -561,7 +555,8 @@  static int disp_tree(const struct color_node *node, const char *fname)
     return 0;
 }
 
-static int debug_accuracy(const PaletteUseContext *s)
+static int debug_accuracy(const struct color_node *node, const uint32_t *palette, const int trans_thresh,
+                          const enum color_search_method search_method)
 {
     int r, g, b, ret = 0;
 
@@ -569,26 +564,19 @@  static int debug_accuracy(const PaletteUseContext *s)
         for (g = 0; g < 256; g++) {
             for (b = 0; b < 256; b++) {
                 const uint8_t argb[] = {0xff, r, g, b};
-                const int r1 = COLORMAP_NEAREST(s, s->color_search_method, s->map, argb);
-                const int r2 = colormap_nearest_bruteforce(s, argb);
+                const int r1 = COLORMAP_NEAREST(search_method, palette, node, argb, trans_thresh);
+                const int r2 = colormap_nearest_bruteforce(palette, argb, trans_thresh);
                 if (r1 != r2) {
-                    const uint32_t c1 = s->palette[r1];
-                    const uint32_t c2 = s->palette[r2];
-                    const uint8_t a1 = s->use_alpha ? c1>>24 & 0xff : 0xff;
-                    const uint8_t a2 = s->use_alpha ? c2>>24 & 0xff : 0xff;
-                    const uint8_t palargb1[] = { a1, c1>>16 & 0xff, c1>> 8 & 0xff, c1 & 0xff };
-                    const uint8_t palargb2[] = { a2, c2>>16 & 0xff, c2>> 8 & 0xff, c2 & 0xff };
-                    const int d1 = diff(palargb1, argb, s);
-                    const int d2 = diff(palargb2, argb, s);
+                    const uint32_t c1 = palette[r1];
+                    const uint32_t c2 = palette[r2];
+                    const uint8_t palargb1[] = { 0xff, c1>>16 & 0xff, c1>> 8 & 0xff, c1 & 0xff };
+                    const uint8_t palargb2[] = { 0xff, c2>>16 & 0xff, c2>> 8 & 0xff, c2 & 0xff };
+                    const int d1 = diff(palargb1, argb, trans_thresh);
+                    const int d2 = diff(palargb2, argb, trans_thresh);
                     if (d1 != d2) {
-                        if (s->use_alpha)
-                            av_log(NULL, AV_LOG_ERROR,
-                                   "/!\\ %02X%02X%02X: %d ! %d (%08"PRIX32" ! %08"PRIX32") / dist: %d ! %d\n",
-                                   r, g, b, r1, r2, c1, c2, d1, d2);
-                        else
-                            av_log(NULL, AV_LOG_ERROR,
-                                   "/!\\ %02X%02X%02X: %d ! %d (%06"PRIX32" ! %06"PRIX32") / dist: %d ! %d\n",
-                                   r, g, b, r1, r2, c1 & 0xffffff, c2 & 0xffffff, d1, d2);
+                        av_log(NULL, AV_LOG_ERROR,
+                               "/!\\ %02X%02X%02X: %d ! %d (%06"PRIX32" ! %06"PRIX32") / dist: %d ! %d\n",
+                               r, g, b, r1, r2, c1 & 0xffffff, c2 & 0xffffff, d1, d2);
                         ret = 1;
                     }
                 }
@@ -604,8 +592,8 @@  struct color {
 };
 
 struct color_rect {
-    uint8_t min[4];
-    uint8_t max[4];
+    uint8_t min[3];
+    uint8_t max[3];
 };
 
 typedef int (*cmp_func)(const void *, const void *);
@@ -626,47 +614,43 @@  DECLARE_CMP_FUNC(b, 3)
 
 static const cmp_func cmp_funcs[] = {cmp_a, cmp_r, cmp_g, cmp_b};
 
-static int get_next_color(const uint8_t *color_used, const PaletteUseContext *s,
+static int get_next_color(const uint8_t *color_used, const uint32_t *palette,
+                          const int trans_thresh,
                           int *component, const struct color_rect *box)
 {
-    int wa, wr, wg, wb;
+    int wr, wg, wb;
     int i, longest = 0;
     unsigned nb_color = 0;
     struct color_rect ranges;
     struct color tmp_pal[256];
     cmp_func cmpf;
 
-    ranges.min[0] = ranges.min[1] = ranges.min[2]  = ranges.min[3]= 0xff;
-    ranges.max[0] = ranges.max[1] = ranges.max[2]  = ranges.max[3]= 0x00;
+    ranges.min[0] = ranges.min[1] = ranges.min[2] = 0xff;
+    ranges.max[0] = ranges.max[1] = ranges.max[2] = 0x00;
 
     for (i = 0; i < AVPALETTE_COUNT; i++) {
-        const uint32_t c = s->palette[i];
+        const uint32_t c = palette[i];
         const uint8_t a = c >> 24 & 0xff;
         const uint8_t r = c >> 16 & 0xff;
         const uint8_t g = c >>  8 & 0xff;
         const uint8_t b = c       & 0xff;
 
-        if (!s->use_alpha && a < s->trans_thresh) {
+        if (a < trans_thresh) {
             continue;
         }
 
-        if (color_used[i] || (a != 0xff && !s->use_alpha) ||
-            r < box->min[1] || g < box->min[2] || b < box->min[3] ||
-            r > box->max[1] || g > box->max[2] || b > box->max[3])
+        if (color_used[i] || (a != 0xff) ||
+            r < box->min[0] || g < box->min[1] || b < box->min[2] ||
+            r > box->max[0] || g > box->max[1] || b > box->max[2])
             continue;
 
-        if (s->use_alpha && (a < box->min[0] || a > box->max[0]))
-            continue;
-
-        if (a < ranges.min[0]) ranges.min[0] = a;
-        if (r < ranges.min[1]) ranges.min[1] = r;
-        if (g < ranges.min[2]) ranges.min[2] = g;
-        if (b < ranges.min[3]) ranges.min[3] = b;
+        if (r < ranges.min[0]) ranges.min[0] = r;
+        if (g < ranges.min[1]) ranges.min[1] = g;
+        if (b < ranges.min[2]) ranges.min[2] = b;
 
-        if (a > ranges.max[0]) ranges.max[0] = a;
-        if (r > ranges.max[1]) ranges.max[1] = r;
-        if (g > ranges.max[2]) ranges.max[2] = g;
-        if (b > ranges.max[3]) ranges.max[3] = b;
+        if (r > ranges.max[0]) ranges.max[0] = r;
+        if (g > ranges.max[1]) ranges.max[1] = g;
+        if (b > ranges.max[2]) ranges.max[2] = b;
 
         tmp_pal[nb_color].value  = c;
         tmp_pal[nb_color].pal_id = i;
@@ -678,22 +662,12 @@  static int get_next_color(const uint8_t *color_used, const PaletteUseContext *s,
         return -1;
 
     /* define longest axis that will be the split component */
-    wa = ranges.max[0] - ranges.min[0];
-    wr = ranges.max[1] - ranges.min[1];
-    wg = ranges.max[2] - ranges.min[2];
-    wb = ranges.max[3] - ranges.min[3];
-
-    if (s->use_alpha) {
-        if (wa >= wr && wa >= wb && wa >= wg) longest = 0;
-        if (wr >= wg && wr >= wb && wr >= wa) longest = 1;
-        if (wg >= wr && wg >= wb && wg >= wa) longest = 2;
-        if (wb >= wr && wb >= wg && wb >= wa) longest = 3;
-    } else {
-        if (wr >= wg && wr >= wb) longest = 1;
-        if (wg >= wr && wg >= wb) longest = 2;
-        if (wb >= wr && wb >= wg) longest = 3;
-    }
-
+    wr = ranges.max[0] - ranges.min[0];
+    wg = ranges.max[1] - ranges.min[1];
+    wb = ranges.max[2] - ranges.min[2];
+    if (wr >= wg && wr >= wb) longest = 1;
+    if (wg >= wr && wg >= wb) longest = 2;
+    if (wb >= wr && wb >= wg) longest = 3;
     cmpf = cmp_funcs[longest];
     *component = longest;
 
@@ -706,7 +680,8 @@  static int get_next_color(const uint8_t *color_used, const PaletteUseContext *s,
 static int colormap_insert(struct color_node *map,
                            uint8_t *color_used,
                            int *nb_used,
-                           const PaletteUseContext *s,
+                           const uint32_t *palette,
+                           const int trans_thresh,
                            const struct color_rect *box)
 {
     uint32_t c;
@@ -714,14 +689,14 @@  static int colormap_insert(struct color_node *map,
     int node_left_id = -1, node_right_id = -1;
     struct color_node *node;
     struct color_rect box1, box2;
-    const int pal_id = get_next_color(color_used, s, &component, box);
+    const int pal_id = get_next_color(color_used, palette, trans_thresh, &component, box);
 
     if (pal_id < 0)
         return -1;
 
     /* create new node with that color */
     cur_id = (*nb_used)++;
-    c = s->palette[pal_id];
+    c = palette[pal_id];
     node = &map[cur_id];
     node->split = component;
     node->palette_id = pal_id;
@@ -734,13 +709,13 @@  static int colormap_insert(struct color_node *map,
 
     /* get the two boxes this node creates */
     box1 = box2 = *box;
-    box1.max[component] = node->val[component];
-    box2.min[component] = FFMIN(node->val[component] + 1, 255);
+    box1.max[component-1] = node->val[component];
+    box2.min[component-1] = FFMIN(node->val[component] + 1, 255);
 
-    node_left_id = colormap_insert(map, color_used, nb_used, s, &box1);
+    node_left_id = colormap_insert(map, color_used, nb_used, palette, trans_thresh, &box1);
 
-    if (box2.min[component] <= box2.max[component])
-        node_right_id = colormap_insert(map, color_used, nb_used, s, &box2);
+    if (box2.min[component-1] <= box2.max[component-1])
+        node_right_id = colormap_insert(map, color_used, nb_used, palette, trans_thresh, &box2);
 
     node->left_id  = node_left_id;
     node->right_id = node_right_id;
@@ -755,13 +730,6 @@  static int cmp_pal_entry(const void *a, const void *b)
     return c1 - c2;
 }
 
-static int cmp_pal_entry_alpha(const void *a, const void *b)
-{
-    const int c1 = *(const uint32_t *)a;
-    const int c2 = *(const uint32_t *)b;
-    return c1 - c2;
-}
-
 static void load_colormap(PaletteUseContext *s)
 {
     int i, nb_used = 0;
@@ -769,13 +737,12 @@  static void load_colormap(PaletteUseContext *s)
     uint32_t last_color = 0;
     struct color_rect box;
 
-    if (!s->use_alpha && s->transparency_index >= 0) {
+    if (s->transparency_index >= 0) {
         FFSWAP(uint32_t, s->palette[s->transparency_index], s->palette[255]);
     }
 
     /* disable transparent colors and dups */
-    qsort(s->palette, AVPALETTE_COUNT-(s->transparency_index >= 0), sizeof(*s->palette),
-        s->use_alpha ? cmp_pal_entry_alpha : cmp_pal_entry);
+    qsort(s->palette, AVPALETTE_COUNT-(s->transparency_index >= 0), sizeof(*s->palette), cmp_pal_entry);
 
     for (i = 0; i < AVPALETTE_COUNT; i++) {
         const uint32_t c = s->palette[i];
@@ -784,22 +751,22 @@  static void load_colormap(PaletteUseContext *s)
             continue;
         }
         last_color = c;
-        if (!s->use_alpha && c >> 24 < s->trans_thresh) {
+        if (c >> 24 < s->trans_thresh) {
             color_used[i] = 1; // ignore transparent color(s)
             continue;
         }
     }
 
-    box.min[0] = box.min[1] = box.min[2] = box.min[3] = 0x00;
-    box.max[0] = box.max[1] = box.max[2] = box.max[3] = 0xff;
+    box.min[0] = box.min[1] = box.min[2] = 0x00;
+    box.max[0] = box.max[1] = box.max[2] = 0xff;
 
-    colormap_insert(s->map, color_used, &nb_used, s, &box);
+    colormap_insert(s->map, color_used, &nb_used, s->palette, s->trans_thresh, &box);
 
     if (s->dot_filename)
         disp_tree(s->map, s->dot_filename);
 
     if (s->debug_accuracy) {
-        if (!debug_accuracy(s))
+        if (!debug_accuracy(s->map, s->palette, s->trans_thresh, s->color_search_method))
             av_log(NULL, AV_LOG_INFO, "Accuracy check passed\n");
     }
 }
@@ -813,18 +780,16 @@  static void debug_mean_error(PaletteUseContext *s, const AVFrame *in1,
     uint8_t  *src2 =             in2->data[0];
     const int src1_linesize = in1->linesize[0] >> 2;
     const int src2_linesize = in2->linesize[0];
-    const float div = in1->width * in1->height * (s->use_alpha ? 4 : 3);
+    const float div = in1->width * in1->height * 3;
     unsigned mean_err = 0;
 
     for (y = 0; y < in1->height; y++) {
         for (x = 0; x < in1->width; x++) {
             const uint32_t c1 = src1[x];
             const uint32_t c2 = palette[src2[x]];
-            const uint8_t a1 = s->use_alpha ? c1>>24 & 0xff : 0xff;
-            const uint8_t a2 = s->use_alpha ? c2>>24 & 0xff : 0xff;
-            const uint8_t argb1[] = {a1, c1 >> 16 & 0xff, c1 >> 8 & 0xff, c1 & 0xff};
-            const uint8_t argb2[] = {a2, c2 >> 16 & 0xff, c2 >> 8 & 0xff, c2 & 0xff};
-            mean_err += diff(argb1, argb2, s);
+            const uint8_t argb1[] = {0xff, c1 >> 16 & 0xff, c1 >> 8 & 0xff, c1 & 0xff};
+            const uint8_t argb2[] = {0xff, c2 >> 16 & 0xff, c2 >> 8 & 0xff, c2 & 0xff};
+            mean_err += diff(argb1, argb2, s->trans_thresh);
         }
         src1 += src1_linesize;
         src2 += src2_linesize;
@@ -1024,7 +989,7 @@  static void load_palette(PaletteUseContext *s, const AVFrame *palette_frame)
     for (y = 0; y < palette_frame->height; y++) {
         for (x = 0; x < palette_frame->width; x++) {
             s->palette[i] = p[x];
-            if (!s->use_alpha && p[x]>>24 < s->trans_thresh) {
+            if (p[x]>>24 < s->trans_thresh) {
                 s->transparency_index = i; // we are assuming at most one transparent color in palette
             }
             i++;