diff mbox series

[FFmpeg-devel,2/2] swscale/input: Avoid calls to av_pix_fmt_desc_get()

Message ID GV1P250MB0737CDE238A6D1F2FB8A9F4E8F409@GV1P250MB0737.EURP250.PROD.OUTLOOK.COM
State Superseded
Headers show
Series [FFmpeg-devel,1/2] swscale/input: Remove spec-incompliant '; ' | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Andreas Rheinhardt Sept. 8, 2022, 2:38 a.m. UTC
Up until now, libswscale/input.c used a macro to read
an input pixel which involved a call to av_pix_fmt_desc_get()
to find out whether the input pixel format is BE or LE
despite this being known at compile-time (there are templates
per pixfmt). Even worse, these calls are made in a loop,
so that e.g. there are six calls to av_pix_fmt_desc_get()
for every pair of UV pixel processed in
rgb64ToUV_half_c_template().

This commit modifies these macros to ensure that isBE()
is evaluated at compile-time. This saved 9743B of .text
for me (GCC 11.2, -O3).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
output.c suffers from the same issue (and according to coverage
is responsible for quite a large chunk of the calls to
av_pix_fmt_desc_get() (144699832 of 321342234)), so I will
be looking into this tomorrow.

 libswscale/input.c | 96 +++++++++++++++++++++++++---------------------
 1 file changed, 52 insertions(+), 44 deletions(-)

Comments

Michael Niedermayer Sept. 8, 2022, 5:39 p.m. UTC | #1
Hi

On Thu, Sep 08, 2022 at 04:38:11AM +0200, Andreas Rheinhardt wrote:
> Up until now, libswscale/input.c used a macro to read
> an input pixel which involved a call to av_pix_fmt_desc_get()
> to find out whether the input pixel format is BE or LE
> despite this being known at compile-time (there are templates
> per pixfmt). Even worse, these calls are made in a loop,
> so that e.g. there are six calls to av_pix_fmt_desc_get()
> for every pair of UV pixel processed in
> rgb64ToUV_half_c_template().
> 
> This commit modifies these macros to ensure that isBE()
> is evaluated at compile-time. This saved 9743B of .text
> for me (GCC 11.2, -O3).

hmm, all these functions where supposed to be optimized out
why where they not ?

iam asking as the code is simpler before your patch if that
"optimization out" thing would work

thx

[...]
Andreas Rheinhardt Sept. 8, 2022, 7:38 p.m. UTC | #2
Michael Niedermayer:
> Hi
> 
> On Thu, Sep 08, 2022 at 04:38:11AM +0200, Andreas Rheinhardt wrote:
>> Up until now, libswscale/input.c used a macro to read
>> an input pixel which involved a call to av_pix_fmt_desc_get()
>> to find out whether the input pixel format is BE or LE
>> despite this being known at compile-time (there are templates
>> per pixfmt). Even worse, these calls are made in a loop,
>> so that e.g. there are six calls to av_pix_fmt_desc_get()
>> for every pair of UV pixel processed in
>> rgb64ToUV_half_c_template().
>>
>> This commit modifies these macros to ensure that isBE()
>> is evaluated at compile-time. This saved 9743B of .text
>> for me (GCC 11.2, -O3).
> 
> hmm, all these functions where supposed to be optimized out
> why where they not ?
> 
> iam asking as the code is simpler before your patch if that
> "optimization out" thing would work
> 

Why should these functions be optimized out? What would enable the
compiler to optimize them out?
(And I really don't see why this patch would make the code more
complicated.)

- Andreas
Michael Niedermayer Sept. 8, 2022, 8:25 p.m. UTC | #3
On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > Hi
> > 
> > On Thu, Sep 08, 2022 at 04:38:11AM +0200, Andreas Rheinhardt wrote:
> >> Up until now, libswscale/input.c used a macro to read
> >> an input pixel which involved a call to av_pix_fmt_desc_get()
> >> to find out whether the input pixel format is BE or LE
> >> despite this being known at compile-time (there are templates
> >> per pixfmt). Even worse, these calls are made in a loop,
> >> so that e.g. there are six calls to av_pix_fmt_desc_get()
> >> for every pair of UV pixel processed in
> >> rgb64ToUV_half_c_template().
> >>
> >> This commit modifies these macros to ensure that isBE()
> >> is evaluated at compile-time. This saved 9743B of .text
> >> for me (GCC 11.2, -O3).
> > 
> > hmm, all these functions where supposed to be optimized out
> > why where they not ?
> > 
> > iam asking as the code is simpler before your patch if that
> > "optimization out" thing would work
> > 
> 
> Why should these functions be optimized out? What would enable the
> compiler to optimize them out?

Going back into the past, there was
6b0768e2021b90215a2ab55ed427bce91d148148

before this the code certainly did get optimized out, it was just
#define isBE(x) ((x)&1)

thats simple and clean code btw
after this it became

#define isBE(x) \
+    (av_pix_fmt_descriptors[x].flags & PIX_FMT_BE)

thats still really good, and very readable, its a const array so
one would assume that a compiler can figure that out at compile time
well, i try not to think of linking and seperate objects here ;)

next it got then replaced by a function and a call that i suspect
people thought would be inlined


> (And I really don't see why this patch would make the code more
> complicated.)

the code historically was capable to lookup any flag and detail
of a pixel format at compile time
now your code works around that not working. Introducing a 2nd
system to do this in parallel. To me if i look at the evolution
of isBE() / code checking BE-ness it become more messy over time

I think it would be interresting to think about if we can make
av_pix_fmt_desc_get(compile time constant) work at compile time.
or if we maybe can return to a simpler implementation


thx

[...]
Andreas Rheinhardt Sept. 8, 2022, 9:44 p.m. UTC | #4
Michael Niedermayer:
> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
>> Michael Niedermayer:
>>> Hi
>>>
>>> On Thu, Sep 08, 2022 at 04:38:11AM +0200, Andreas Rheinhardt wrote:
>>>> Up until now, libswscale/input.c used a macro to read
>>>> an input pixel which involved a call to av_pix_fmt_desc_get()
>>>> to find out whether the input pixel format is BE or LE
>>>> despite this being known at compile-time (there are templates
>>>> per pixfmt). Even worse, these calls are made in a loop,
>>>> so that e.g. there are six calls to av_pix_fmt_desc_get()
>>>> for every pair of UV pixel processed in
>>>> rgb64ToUV_half_c_template().
>>>>
>>>> This commit modifies these macros to ensure that isBE()
>>>> is evaluated at compile-time. This saved 9743B of .text
>>>> for me (GCC 11.2, -O3).
>>>
>>> hmm, all these functions where supposed to be optimized out
>>> why where they not ?
>>>
>>> iam asking as the code is simpler before your patch if that
>>> "optimization out" thing would work
>>>
>>
>> Why should these functions be optimized out? What would enable the
>> compiler to optimize them out?
> 
> Going back into the past, there was
> 6b0768e2021b90215a2ab55ed427bce91d148148
> 
> before this the code certainly did get optimized out, it was just
> #define isBE(x) ((x)&1)
> 
> thats simple and clean code btw

I don't really consider such magic numbers to be clean.

> after this it became
> 
> #define isBE(x) \
> +    (av_pix_fmt_descriptors[x].flags & PIX_FMT_BE)
> 
> thats still really good, and very readable, its a const array so
> one would assume that a compiler can figure that out at compile time
> well, i try not to think of linking and seperate objects here ;)
> 
> next it got then replaced by a function and a call that i suspect
> people thought would be inlined
> 
> 
>> (And I really don't see why this patch would make the code more
>> complicated.)
> 
> the code historically was capable to lookup any flag and detail
> of a pixel format at compile time
> now your code works around that not working. Introducing a 2nd
> system to do this in parallel. 

I am not introducing a second system, I am reusing the existing system,
namely our existing naming system (the fact that we use BE/LE in the
name of BE/LE pixel formats).

> To me if i look at the evolution
> of isBE() / code checking BE-ness it become more messy over time
> 
> I think it would be interresting to think about if we can make
> av_pix_fmt_desc_get(compile time constant) work at compile time.
> or if we maybe can return to a simpler implementation
> 

We could put the av_pix_fmt_descriptors array into an internal header
and use something like

static av_always_inline const AVPixFmtDescriptor
*ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
{
    if (av_builtin_constant_p(fmt))
        return &av_pix_fmt_descriptors[fmt];
    return av_pix_fmt_desc_get(fmt);
}

- Andreas
Michael Niedermayer Sept. 9, 2022, 2:55 p.m. UTC | #5
On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
> >> Michael Niedermayer:
[...]
> > To me if i look at the evolution
> > of isBE() / code checking BE-ness it become more messy over time
> > 
> > I think it would be interresting to think about if we can make
> > av_pix_fmt_desc_get(compile time constant) work at compile time.
> > or if we maybe can return to a simpler implementation
> > 
> 
> We could put the av_pix_fmt_descriptors array into an internal header
> and use something like
> 
> static av_always_inline const AVPixFmtDescriptor
> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
> {
>     if (av_builtin_constant_p(fmt))
>         return &av_pix_fmt_descriptors[fmt];
>     return av_pix_fmt_desc_get(fmt);
> }

yes thats what i was thinking of too.

thx

[...]
Michael Niedermayer Sept. 9, 2022, 3:52 p.m. UTC | #6
On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
> >> Michael Niedermayer:
> >>> Hi
> >>>
> >>> On Thu, Sep 08, 2022 at 04:38:11AM +0200, Andreas Rheinhardt wrote:
> >>>> Up until now, libswscale/input.c used a macro to read
> >>>> an input pixel which involved a call to av_pix_fmt_desc_get()
> >>>> to find out whether the input pixel format is BE or LE
> >>>> despite this being known at compile-time (there are templates
> >>>> per pixfmt). Even worse, these calls are made in a loop,
> >>>> so that e.g. there are six calls to av_pix_fmt_desc_get()
> >>>> for every pair of UV pixel processed in
> >>>> rgb64ToUV_half_c_template().
> >>>>
> >>>> This commit modifies these macros to ensure that isBE()
> >>>> is evaluated at compile-time. This saved 9743B of .text
> >>>> for me (GCC 11.2, -O3).
> >>>
> >>> hmm, all these functions where supposed to be optimized out
> >>> why where they not ?
> >>>
> >>> iam asking as the code is simpler before your patch if that
> >>> "optimization out" thing would work
> >>>
> >>
> >> Why should these functions be optimized out? What would enable the
> >> compiler to optimize them out?
> > 
> > Going back into the past, there was
> > 6b0768e2021b90215a2ab55ed427bce91d148148
> > 
> > before this the code certainly did get optimized out, it was just
> > #define isBE(x) ((x)&1)
> > 
> > thats simple and clean code btw
> 
> I don't really consider such magic numbers to be clean.

no, but that wasnt what i meant
what i meant was more that if you can structure an identifer so that it
itself can provide alot of the used information from within its structure, 
that has advanatges over requiring a descriptor table lookup.
I didnt mean to suggest that this should use hardcoded litteral numbers
and be undocumented and untested.

thx

[...]
Andreas Rheinhardt Sept. 9, 2022, 6:15 p.m. UTC | #7
Michael Niedermayer:
> On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
>> Michael Niedermayer:
>>> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
>>>> Michael Niedermayer:
> [...]
>>> To me if i look at the evolution
>>> of isBE() / code checking BE-ness it become more messy over time
>>>
>>> I think it would be interresting to think about if we can make
>>> av_pix_fmt_desc_get(compile time constant) work at compile time.
>>> or if we maybe can return to a simpler implementation
>>>
>>
>> We could put the av_pix_fmt_descriptors array into an internal header
>> and use something like
>>
>> static av_always_inline const AVPixFmtDescriptor
>> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
>> {
>>     if (av_builtin_constant_p(fmt))
>>         return &av_pix_fmt_descriptors[fmt];
>>     return av_pix_fmt_desc_get(fmt);
>> }
> 
> yes thats what i was thinking of too.
> 

Seems like Anton is away for a week or so. I am sure he has an opinion
on the above approach. I think we will wait for him or shall I apply the
patches as they are given that they do not block any later alternative
solution?
(There is one thing I already don't like about the alternative solution:
It relies on av_builtin_constant_p, which not every compiler supports.)

- Andreas
Michael Niedermayer Sept. 10, 2022, 3:29 p.m. UTC | #8
On Fri, Sep 09, 2022 at 08:15:22PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
> >> Michael Niedermayer:
> >>> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
> >>>> Michael Niedermayer:
> > [...]
> >>> To me if i look at the evolution
> >>> of isBE() / code checking BE-ness it become more messy over time
> >>>
> >>> I think it would be interresting to think about if we can make
> >>> av_pix_fmt_desc_get(compile time constant) work at compile time.
> >>> or if we maybe can return to a simpler implementation
> >>>
> >>
> >> We could put the av_pix_fmt_descriptors array into an internal header
> >> and use something like
> >>
> >> static av_always_inline const AVPixFmtDescriptor
> >> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
> >> {
> >>     if (av_builtin_constant_p(fmt))
> >>         return &av_pix_fmt_descriptors[fmt];
> >>     return av_pix_fmt_desc_get(fmt);
> >> }
> > 
> > yes thats what i was thinking of too.
> > 
> 
> Seems like Anton is away for a week or so. I am sure he has an opinion
> on the above approach. I think we will wait for him or shall I apply the
> patches as they are given that they do not block any later alternative
> solution?

please dont apply code like "IS_BE(BE_LE)" 
iam sure it makes sense to you today but it requires an additional step
for the reader to understand
simplest is a seperate endianness and isbe variable. on the wrapper
that should be less code too but quite possibly you see a better and
cleaner way


> (There is one thing I already don't like about the alternative solution:
> It relies on av_builtin_constant_p, which not every compiler supports.)

Thats why i didnt suggets to use av_builtin_constant_p() i was hoping you
saw a better way to achieve it.

But this is a problem which occurs more than once in the codebase.
mapping some identifer to some value just depending on the identifer,
something that is compile time constant, yet it calls to a function to
do a lookup

thx

[...]
Andreas Rheinhardt Sept. 10, 2022, 4:34 p.m. UTC | #9
Michael Niedermayer:
> On Fri, Sep 09, 2022 at 08:15:22PM +0200, Andreas Rheinhardt wrote:
>> Michael Niedermayer:
>>> On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
>>>> Michael Niedermayer:
>>>>> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
>>>>>> Michael Niedermayer:
>>> [...]
>>>>> To me if i look at the evolution
>>>>> of isBE() / code checking BE-ness it become more messy over time
>>>>>
>>>>> I think it would be interresting to think about if we can make
>>>>> av_pix_fmt_desc_get(compile time constant) work at compile time.
>>>>> or if we maybe can return to a simpler implementation
>>>>>
>>>>
>>>> We could put the av_pix_fmt_descriptors array into an internal header
>>>> and use something like
>>>>
>>>> static av_always_inline const AVPixFmtDescriptor
>>>> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
>>>> {
>>>>     if (av_builtin_constant_p(fmt))
>>>>         return &av_pix_fmt_descriptors[fmt];
>>>>     return av_pix_fmt_desc_get(fmt);
>>>> }
>>>
>>> yes thats what i was thinking of too.
>>>
>>
>> Seems like Anton is away for a week or so. I am sure he has an opinion
>> on the above approach. I think we will wait for him or shall I apply the
>> patches as they are given that they do not block any later alternative
>> solution?
> 
> please dont apply code like "IS_BE(BE_LE)" 
> iam sure it makes sense to you today but it requires an additional step
> for the reader to understand
> simplest is a seperate endianness and isbe variable. on the wrapper
> that should be less code too but quite possibly you see a better and
> cleaner way
> 

I actually thought that my solution is superior to the one you seem to
have in mind; after all, the parameter for isbe is redundant: It can
also be derived from the pixfmt-name.

> 
>> (There is one thing I already don't like about the alternative solution:
>> It relies on av_builtin_constant_p, which not every compiler supports.)
> 
> Thats why i didnt suggets to use av_builtin_constant_p() i was hoping you
> saw a better way to achieve it.
> 

Well, without av_builtin_constant_p() the only other way for this would
be to add two systems to get the information, one for compile-time
constants and one for not-constants. Ensuring that the former is only
used for compile-time constants will be a challenge, to say the least.

> But this is a problem which occurs more than once in the codebase.
> mapping some identifer to some value just depending on the identifer,
> something that is compile time constant, yet it calls to a function to
> do a lookup
> 

Another idea from the top of my head:
- We could provide some of the info contained in the descriptors via
separate defines/enums that parallel the actual enum, like
enum AVPixelFormatFlags {
    AV_PIX_FMT_YUV420P_FLAGS = AV_PIX_FMT_FLAG_PLANAR,
    AV_PIX_FMT_YUYV422_FLAGS = 0,
...
};
The list in pixdesc.c would then use these constants instead of defining
the flags twice (to avoid inconsistencies). These constants can be used
from macros via fmt ## _FLAGS, avoiding the av_builtin_constant_p()
issue. This could even be made public if we add a big warning that new
flags may be added in the future. Other such enums for other values may
be added, too, but honestly I don't really like this approach.

We could also use make an xmacro out of the list in lavu/pixdesc.c and
use this xmacro to query the values. E.g. isBE() would then in effect
become one big switch:
isBE(fmt)
{
    switch (fmt) {
    case AV_PIX_FMT_YUV420P: return 0;
....
    }
}
This could be made to handle non-compile-time constants as well (by
having a default that uses av_pix_fmt_desc_get()), but it has a
downside: There would be runtime checks for whether we are in the
default branch or not. So once again this function should not be used
with non-compile time constants.

- Andreas
Michael Niedermayer Sept. 10, 2022, 5:06 p.m. UTC | #10
On Sat, Sep 10, 2022 at 06:34:43PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > On Fri, Sep 09, 2022 at 08:15:22PM +0200, Andreas Rheinhardt wrote:
> >> Michael Niedermayer:
> >>> On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
> >>>> Michael Niedermayer:
> >>>>> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
> >>>>>> Michael Niedermayer:
> >>> [...]
> >>>>> To me if i look at the evolution
> >>>>> of isBE() / code checking BE-ness it become more messy over time
> >>>>>
> >>>>> I think it would be interresting to think about if we can make
> >>>>> av_pix_fmt_desc_get(compile time constant) work at compile time.
> >>>>> or if we maybe can return to a simpler implementation
> >>>>>
> >>>>
> >>>> We could put the av_pix_fmt_descriptors array into an internal header
> >>>> and use something like
> >>>>
> >>>> static av_always_inline const AVPixFmtDescriptor
> >>>> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
> >>>> {
> >>>>     if (av_builtin_constant_p(fmt))
> >>>>         return &av_pix_fmt_descriptors[fmt];
> >>>>     return av_pix_fmt_desc_get(fmt);
> >>>> }
> >>>
> >>> yes thats what i was thinking of too.
> >>>
> >>
> >> Seems like Anton is away for a week or so. I am sure he has an opinion
> >> on the above approach. I think we will wait for him or shall I apply the
> >> patches as they are given that they do not block any later alternative
> >> solution?
> > 
> > please dont apply code like "IS_BE(BE_LE)" 
> > iam sure it makes sense to you today but it requires an additional step
> > for the reader to understand
> > simplest is a seperate endianness and isbe variable. on the wrapper
> > that should be less code too but quite possibly you see a better and
> > cleaner way
> > 
> 
> I actually thought that my solution is superior to the one you seem to
> have in mind; after all, the parameter for isbe is redundant: It can
> also be derived from the pixfmt-name.

your solution is supperior in some sense. But iam concerned that it is
less readable for a small gain.
Maybe BE_LE and others can be renamed to make it immedeatly clear


> 
> > 
> >> (There is one thing I already don't like about the alternative solution:
> >> It relies on av_builtin_constant_p, which not every compiler supports.)
> > 
> > Thats why i didnt suggets to use av_builtin_constant_p() i was hoping you
> > saw a better way to achieve it.
> > 
> 
> Well, without av_builtin_constant_p() the only other way for this would
> be to add two systems to get the information, one for compile-time
> constants and one for not-constants. Ensuring that the former is only
> used for compile-time constants will be a challenge, to say the least.

I dont think its a challenge, but iam still quite unsure if this would
lead to other issues.

For example, if we add a constant lookup function that has direct access
to the table. We assue that will only be used where constants are the
arguzment. Now its easy to add a extra field to that table and then
grep the binary for that. If its found it was either used with a non
constant or the compiler failed to optimize it out.
This could be a fate test. If it works 100% thats simple but if it works
99% it will be a pain and is not a direction i would suggest, it would
be more pain than its worth.

So really this is just for discussion at this point, far from a plan
with understood consequences

thx
[...]
Anton Khirnov Sept. 16, 2022, 7 a.m. UTC | #11
Quoting Andreas Rheinhardt (2022-09-09 20:15:22)
> Michael Niedermayer:
> > On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
> >> Michael Niedermayer:
> >>> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
> >>>> Michael Niedermayer:
> > [...]
> >>> To me if i look at the evolution
> >>> of isBE() / code checking BE-ness it become more messy over time
> >>>
> >>> I think it would be interresting to think about if we can make
> >>> av_pix_fmt_desc_get(compile time constant) work at compile time.
> >>> or if we maybe can return to a simpler implementation
> >>>
> >>
> >> We could put the av_pix_fmt_descriptors array into an internal header
> >> and use something like
> >>
> >> static av_always_inline const AVPixFmtDescriptor
> >> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
> >> {
> >>     if (av_builtin_constant_p(fmt))
> >>         return &av_pix_fmt_descriptors[fmt];
> >>     return av_pix_fmt_desc_get(fmt);
> >> }
> > 
> > yes thats what i was thinking of too.
> > 
> 
> Seems like Anton is away for a week or so. I am sure he has an opinion
> on the above approach. I think we will wait for him or shall I apply the
> patches as they are given that they do not block any later alternative
> solution?
> (There is one thing I already don't like about the alternative solution:
> It relies on av_builtin_constant_p, which not every compiler supports.)

For what my opinion is worth, the patch as is with some extra
explanatory comments for the new IS_BE* macros seems like the best
solution to me. They are indeed slightly confusing at first glance, but
quite obvious if you look at the code for two minutes (less for people
smarter than me). And I think few people will need to understand how
precisely they work anyway.

The other proposals seem significantly more complex and fragile than
this one.
Michael Niedermayer Sept. 16, 2022, 10:57 a.m. UTC | #12
On Fri, Sep 16, 2022 at 09:00:24AM +0200, Anton Khirnov wrote:
> Quoting Andreas Rheinhardt (2022-09-09 20:15:22)
> > Michael Niedermayer:
> > > On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
> > >> Michael Niedermayer:
> > >>> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
> > >>>> Michael Niedermayer:
> > > [...]
> > >>> To me if i look at the evolution
> > >>> of isBE() / code checking BE-ness it become more messy over time
> > >>>
> > >>> I think it would be interresting to think about if we can make
> > >>> av_pix_fmt_desc_get(compile time constant) work at compile time.
> > >>> or if we maybe can return to a simpler implementation
> > >>>
> > >>
> > >> We could put the av_pix_fmt_descriptors array into an internal header
> > >> and use something like
> > >>
> > >> static av_always_inline const AVPixFmtDescriptor
> > >> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
> > >> {
> > >>     if (av_builtin_constant_p(fmt))
> > >>         return &av_pix_fmt_descriptors[fmt];
> > >>     return av_pix_fmt_desc_get(fmt);
> > >> }
> > > 
> > > yes thats what i was thinking of too.
> > > 
> > 
> > Seems like Anton is away for a week or so. I am sure he has an opinion
> > on the above approach. I think we will wait for him or shall I apply the
> > patches as they are given that they do not block any later alternative
> > solution?
> > (There is one thing I already don't like about the alternative solution:
> > It relies on av_builtin_constant_p, which not every compiler supports.)
> 
> For what my opinion is worth, the patch as is with some extra
> explanatory comments for the new IS_BE* macros seems like the best
> solution to me. They are indeed slightly confusing at first glance, but
> quite obvious if you look at the code for two minutes (less for people
> smarter than me). And I think few people will need to understand how
> precisely they work anyway.

i agree, maybe we can solve this by changing the IS_BE* names instead
of adding comments.
IS_BE, ENDIAN_IDENTIFER, 
(0,1)  , (LE, BE)

or something like that, may reduce the need for comments


> 
> The other proposals seem significantly more complex and fragile than
> this one.

i agree too, though allowing av_pix_fmt_desc_get() to be done at compile
time would allow its use in inlined functions and macros much broader
so these are not 1:1 comparable as they serve only overlapping goals

thx

[...]
Andreas Rheinhardt Sept. 18, 2022, 8:58 p.m. UTC | #13
Michael Niedermayer:
> On Fri, Sep 16, 2022 at 09:00:24AM +0200, Anton Khirnov wrote:
>> Quoting Andreas Rheinhardt (2022-09-09 20:15:22)
>>> Michael Niedermayer:
>>>> On Thu, Sep 08, 2022 at 11:44:51PM +0200, Andreas Rheinhardt wrote:
>>>>> Michael Niedermayer:
>>>>>> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
>>>>>>> Michael Niedermayer:
>>>> [...]
>>>>>> To me if i look at the evolution
>>>>>> of isBE() / code checking BE-ness it become more messy over time
>>>>>>
>>>>>> I think it would be interresting to think about if we can make
>>>>>> av_pix_fmt_desc_get(compile time constant) work at compile time.
>>>>>> or if we maybe can return to a simpler implementation
>>>>>>
>>>>>
>>>>> We could put the av_pix_fmt_descriptors array into an internal header
>>>>> and use something like
>>>>>
>>>>> static av_always_inline const AVPixFmtDescriptor
>>>>> *ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
>>>>> {
>>>>>     if (av_builtin_constant_p(fmt))
>>>>>         return &av_pix_fmt_descriptors[fmt];
>>>>>     return av_pix_fmt_desc_get(fmt);
>>>>> }
>>>>
>>>> yes thats what i was thinking of too.
>>>>
>>>
>>> Seems like Anton is away for a week or so. I am sure he has an opinion
>>> on the above approach. I think we will wait for him or shall I apply the
>>> patches as they are given that they do not block any later alternative
>>> solution?
>>> (There is one thing I already don't like about the alternative solution:
>>> It relies on av_builtin_constant_p, which not every compiler supports.)
>>
>> For what my opinion is worth, the patch as is with some extra
>> explanatory comments for the new IS_BE* macros seems like the best
>> solution to me. They are indeed slightly confusing at first glance, but
>> quite obvious if you look at the code for two minutes (less for people
>> smarter than me). And I think few people will need to understand how
>> precisely they work anyway.
> 
> i agree, maybe we can solve this by changing the IS_BE* names instead
> of adding comments.
> IS_BE, ENDIAN_IDENTIFER, 
> (0,1)  , (LE, BE)
> 
> or something like that, may reduce the need for comments
> 
> 

I sent a version 2 here:
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-September/301535.html Can
you take a look at it, please?

- Andreas
diff mbox series

Patch

diff --git a/libswscale/input.c b/libswscale/input.c
index 88e318e664..4144ff81a9 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -28,14 +28,19 @@ 
 #include "config.h"
 #include "swscale_internal.h"
 
-#define input_pixel(pos) (isBE(origin) ? AV_RB16(pos) : AV_RL16(pos))
+#define input_pixel(pos) (is_be ? AV_RB16(pos) : AV_RL16(pos))
+
+#define IS_BE_LE 0
+#define IS_BE_BE 1
+#define IS_BE_   0
+#define IS_BE(BE_LE) IS_BE_ ## BE_LE
 
 #define r ((origin == AV_PIX_FMT_BGR48BE || origin == AV_PIX_FMT_BGR48LE || origin == AV_PIX_FMT_BGRA64BE || origin == AV_PIX_FMT_BGRA64LE) ? b_r : r_b)
 #define b ((origin == AV_PIX_FMT_BGR48BE || origin == AV_PIX_FMT_BGR48LE || origin == AV_PIX_FMT_BGRA64BE || origin == AV_PIX_FMT_BGRA64LE) ? r_b : b_r)
 
 static av_always_inline void
 rgb64ToY_c_template(uint16_t *dst, const uint16_t *src, int width,
-                    enum AVPixelFormat origin, int32_t *rgb2yuv)
+                    enum AVPixelFormat origin, int32_t *rgb2yuv, int is_be)
 {
     int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
     int i;
@@ -51,7 +56,7 @@  rgb64ToY_c_template(uint16_t *dst, const uint16_t *src, int width,
 static av_always_inline void
 rgb64ToUV_c_template(uint16_t *dstU, uint16_t *dstV,
                     const uint16_t *src1, const uint16_t *src2,
-                    int width, enum AVPixelFormat origin, int32_t *rgb2yuv)
+                    int width, enum AVPixelFormat origin, int32_t *rgb2yuv, int is_be)
 {
     int i;
     int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
@@ -70,7 +75,7 @@  rgb64ToUV_c_template(uint16_t *dstU, uint16_t *dstV,
 static av_always_inline void
 rgb64ToUV_half_c_template(uint16_t *dstU, uint16_t *dstV,
                           const uint16_t *src1, const uint16_t *src2,
-                          int width, enum AVPixelFormat origin, int32_t *rgb2yuv)
+                          int width, enum AVPixelFormat origin, int32_t *rgb2yuv, int is_be)
 {
     int i;
     int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
@@ -92,7 +97,7 @@  static void pattern ## 64 ## BE_LE ## ToY_c(uint8_t *_dst, const uint8_t *_src,
 { \
     const uint16_t *src = (const uint16_t *) _src; \
     uint16_t *dst = (uint16_t *) _dst; \
-    rgb64ToY_c_template(dst, src, width, origin, rgb2yuv); \
+    rgb64ToY_c_template(dst, src, width, origin, rgb2yuv, IS_BE(BE_LE)); \
 } \
  \
 static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \
@@ -102,7 +107,7 @@  static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \
     const uint16_t *src1 = (const uint16_t *) _src1, \
                    *src2 = (const uint16_t *) _src2; \
     uint16_t *dstU = (uint16_t *) _dstU, *dstV = (uint16_t *) _dstV; \
-    rgb64ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv); \
+    rgb64ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, IS_BE(BE_LE)); \
 } \
  \
 static void pattern ## 64 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, \
@@ -112,7 +117,7 @@  static void pattern ## 64 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV
     const uint16_t *src1 = (const uint16_t *) _src1, \
                    *src2 = (const uint16_t *) _src2; \
     uint16_t *dstU = (uint16_t *) _dstU, *dstV = (uint16_t *) _dstV; \
-    rgb64ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv); \
+    rgb64ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, IS_BE(BE_LE)); \
 }
 
 rgb64funcs(rgb, LE, AV_PIX_FMT_RGBA64LE)
@@ -123,7 +128,7 @@  rgb64funcs(bgr, BE, AV_PIX_FMT_BGRA64BE)
 static av_always_inline void rgb48ToY_c_template(uint16_t *dst,
                                                  const uint16_t *src, int width,
                                                  enum AVPixelFormat origin,
-                                                 int32_t *rgb2yuv)
+                                                 int32_t *rgb2yuv, int is_be)
 {
     int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
     int i;
@@ -142,7 +147,7 @@  static av_always_inline void rgb48ToUV_c_template(uint16_t *dstU,
                                                   const uint16_t *src2,
                                                   int width,
                                                   enum AVPixelFormat origin,
-                                                  int32_t *rgb2yuv)
+                                                  int32_t *rgb2yuv, int is_be)
 {
     int i;
     int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
@@ -164,7 +169,7 @@  static av_always_inline void rgb48ToUV_half_c_template(uint16_t *dstU,
                                                        const uint16_t *src2,
                                                        int width,
                                                        enum AVPixelFormat origin,
-                                                       int32_t *rgb2yuv)
+                                                       int32_t *rgb2yuv, int is_be)
 {
     int i;
     int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
@@ -197,7 +202,7 @@  static void pattern ## 48 ## BE_LE ## ToY_c(uint8_t *_dst,              \
 {                                                                       \
     const uint16_t *src = (const uint16_t *)_src;                       \
     uint16_t *dst       = (uint16_t *)_dst;                             \
-    rgb48ToY_c_template(dst, src, width, origin, rgb2yuv);              \
+    rgb48ToY_c_template(dst, src, width, origin, rgb2yuv, IS_BE(BE_LE));\
 }                                                                       \
                                                                         \
 static void pattern ## 48 ## BE_LE ## ToUV_c(uint8_t *_dstU,            \
@@ -213,7 +218,7 @@  static void pattern ## 48 ## BE_LE ## ToUV_c(uint8_t *_dstU,            \
                    *src2 = (const uint16_t *)_src2;                     \
     uint16_t *dstU = (uint16_t *)_dstU,                                 \
              *dstV = (uint16_t *)_dstV;                                 \
-    rgb48ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv);        \
+    rgb48ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, IS_BE(BE_LE)); \
 }                                                                       \
                                                                         \
 static void pattern ## 48 ## BE_LE ## ToUV_half_c(uint8_t *_dstU,       \
@@ -229,7 +234,7 @@  static void pattern ## 48 ## BE_LE ## ToUV_half_c(uint8_t *_dstU,       \
                    *src2 = (const uint16_t *)_src2;                     \
     uint16_t *dstU = (uint16_t *)_dstU,                                 \
              *dstV = (uint16_t *)_dstV;                                 \
-    rgb48ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv);   \
+    rgb48ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, IS_BE(BE_LE)); \
 }
 
 rgb48funcs(rgb, LE, AV_PIX_FMT_RGB48LE)
@@ -245,7 +250,7 @@  rgb48funcs(bgr, BE, AV_PIX_FMT_BGR48BE)
                         : ((origin == AV_PIX_FMT_X2RGB10LE ||              \
                             origin == AV_PIX_FMT_X2BGR10LE)                \
                            ? AV_RL32(&src[(i) * 4])                        \
-                           : (isBE(origin) ? AV_RB16(&src[(i) * 2])        \
+                           : (is_be ? AV_RB16(&src[(i) * 2])               \
                               : AV_RL16(&src[(i) * 2]))))
 
 static av_always_inline void rgb16_32ToY_c_template(int16_t *dst,
@@ -257,7 +262,7 @@  static av_always_inline void rgb16_32ToY_c_template(int16_t *dst,
                                                     int maskr, int maskg,
                                                     int maskb, int rsh,
                                                     int gsh, int bsh, int S,
-                                                    int32_t *rgb2yuv)
+                                                    int32_t *rgb2yuv, int is_be)
 {
     const int ry       = rgb2yuv[RY_IDX]<<rsh, gy = rgb2yuv[GY_IDX]<<gsh, by = rgb2yuv[BY_IDX]<<bsh;
     const unsigned rnd = (32<<((S)-1)) + (1<<(S-7));
@@ -283,7 +288,7 @@  static av_always_inline void rgb16_32ToUV_c_template(int16_t *dstU,
                                                      int maskr, int maskg,
                                                      int maskb, int rsh,
                                                      int gsh, int bsh, int S,
-                                                     int32_t *rgb2yuv)
+                                                     int32_t *rgb2yuv, int is_be)
 {
     const int ru       = rgb2yuv[RU_IDX] * (1 << rsh), gu = rgb2yuv[GU_IDX] * (1 << gsh), bu = rgb2yuv[BU_IDX] * (1 << bsh),
               rv       = rgb2yuv[RV_IDX] * (1 << rsh), gv = rgb2yuv[GV_IDX] * (1 << gsh), bv = rgb2yuv[BV_IDX] * (1 << bsh);
@@ -311,7 +316,7 @@  static av_always_inline void rgb16_32ToUV_half_c_template(int16_t *dstU,
                                                           int maskr, int maskg,
                                                           int maskb, int rsh,
                                                           int gsh, int bsh, int S,
-                                                          int32_t *rgb2yuv)
+                                                          int32_t *rgb2yuv, int is_be)
 {
     const int ru       = rgb2yuv[RU_IDX] * (1 << rsh), gu = rgb2yuv[GU_IDX] * (1 << gsh), bu = rgb2yuv[BU_IDX] * (1 << bsh),
               rv       = rgb2yuv[RV_IDX] * (1 << rsh), gv = rgb2yuv[GV_IDX] * (1 << gsh), bv = rgb2yuv[BV_IDX] * (1 << bsh),
@@ -345,13 +350,13 @@  static av_always_inline void rgb16_32ToUV_half_c_template(int16_t *dstU,
 
 #undef input_pixel
 
-#define rgb16_32_wrapper(fmt, name, shr, shg, shb, shp, maskr,          \
-                         maskg, maskb, rsh, gsh, bsh, S)                \
+#define rgb16_32_wrapper_0(fmt, name, shr, shg, shb, shp, maskr,        \
+                           maskg, maskb, rsh, gsh, bsh, S, is_be)       \
 static void name ## ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,            \
                           int width, uint32_t *tab, void *opq)          \
 {                                                                       \
     rgb16_32ToY_c_template((int16_t*)dst, src, width, fmt, shr, shg, shb, shp,    \
-                           maskr, maskg, maskb, rsh, gsh, bsh, S, tab); \
+                           maskr, maskg, maskb, rsh, gsh, bsh, S, tab, is_be); \
 }                                                                       \
                                                                         \
 static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV,                \
@@ -360,7 +365,7 @@  static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV,                \
 {                                                                       \
     rgb16_32ToUV_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt,                \
                             shr, shg, shb, shp,                         \
-                            maskr, maskg, maskb, rsh, gsh, bsh, S, tab);\
+                            maskr, maskg, maskb, rsh, gsh, bsh, S, tab, is_be); \
 }                                                                       \
                                                                         \
 static void name ## ToUV_half_c(uint8_t *dstU, uint8_t *dstV,           \
@@ -371,27 +376,32 @@  static void name ## ToUV_half_c(uint8_t *dstU, uint8_t *dstV,           \
     rgb16_32ToUV_half_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt,           \
                                  shr, shg, shb, shp,                    \
                                  maskr, maskg, maskb,                   \
-                                 rsh, gsh, bsh, S, tab);                \
-}
-
-rgb16_32_wrapper(AV_PIX_FMT_BGR32,    bgr32,  16, 0,  0, 0, 0xFF0000, 0xFF00,   0x00FF,  8, 0,  8, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_BGR32_1,  bgr321, 16, 0,  0, 8, 0xFF0000, 0xFF00,   0x00FF,  8, 0,  8, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_RGB32,    rgb32,   0, 0, 16, 0,   0x00FF, 0xFF00, 0xFF0000,  8, 0,  8, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_RGB32_1,  rgb321,  0, 0, 16, 8,   0x00FF, 0xFF00, 0xFF0000,  8, 0,  8, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_BGR565LE, bgr16le, 0, 0,  0, 0,   0x001F, 0x07E0,   0xF800, 11, 5,  0, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_BGR555LE, bgr15le, 0, 0,  0, 0,   0x001F, 0x03E0,   0x7C00, 10, 5,  0, RGB2YUV_SHIFT + 7)
-rgb16_32_wrapper(AV_PIX_FMT_BGR444LE, bgr12le, 0, 0,  0, 0,   0x000F, 0x00F0,   0x0F00,  8, 4,  0, RGB2YUV_SHIFT + 4)
-rgb16_32_wrapper(AV_PIX_FMT_RGB565LE, rgb16le, 0, 0,  0, 0,   0xF800, 0x07E0,   0x001F,  0, 5, 11, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_RGB555LE, rgb15le, 0, 0,  0, 0,   0x7C00, 0x03E0,   0x001F,  0, 5, 10, RGB2YUV_SHIFT + 7)
-rgb16_32_wrapper(AV_PIX_FMT_RGB444LE, rgb12le, 0, 0,  0, 0,   0x0F00, 0x00F0,   0x000F,  0, 4,  8, RGB2YUV_SHIFT + 4)
-rgb16_32_wrapper(AV_PIX_FMT_BGR565BE, bgr16be, 0, 0,  0, 0,   0x001F, 0x07E0,   0xF800, 11, 5,  0, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_BGR555BE, bgr15be, 0, 0,  0, 0,   0x001F, 0x03E0,   0x7C00, 10, 5,  0, RGB2YUV_SHIFT + 7)
-rgb16_32_wrapper(AV_PIX_FMT_BGR444BE, bgr12be, 0, 0,  0, 0,   0x000F, 0x00F0,   0x0F00,  8, 4,  0, RGB2YUV_SHIFT + 4)
-rgb16_32_wrapper(AV_PIX_FMT_RGB565BE, rgb16be, 0, 0,  0, 0,   0xF800, 0x07E0,   0x001F,  0, 5, 11, RGB2YUV_SHIFT + 8)
-rgb16_32_wrapper(AV_PIX_FMT_RGB555BE, rgb15be, 0, 0,  0, 0,   0x7C00, 0x03E0,   0x001F,  0, 5, 10, RGB2YUV_SHIFT + 7)
-rgb16_32_wrapper(AV_PIX_FMT_RGB444BE, rgb12be, 0, 0,  0, 0,   0x0F00, 0x00F0,   0x000F,  0, 4,  8, RGB2YUV_SHIFT + 4)
-rgb16_32_wrapper(AV_PIX_FMT_X2RGB10LE, rgb30le, 16, 6, 0, 0, 0x3FF00000, 0xFFC00, 0x3FF, 0, 0, 4, RGB2YUV_SHIFT + 6)
-rgb16_32_wrapper(AV_PIX_FMT_X2BGR10LE, bgr30le, 0, 6, 16, 0, 0x3FF, 0xFFC00, 0x3FF00000, 4, 0, 0, RGB2YUV_SHIFT + 6)
+                                 rsh, gsh, bsh, S, tab, is_be);         \
+}
+
+#define rgb16_32_wrapper(base_fmt, endianness, name, shr, shg, shb, shp, maskr, \
+                         maskg, maskb, rsh, gsh, bsh, S) \
+    rgb16_32_wrapper_0(base_fmt ## endianness, name, shr, shg, shb, shp, maskr, \
+                       maskg, maskb, rsh, gsh, bsh, S, IS_BE(endianness))
+
+rgb16_32_wrapper(AV_PIX_FMT_BGR32,     , bgr32,  16, 0,  0, 0, 0xFF0000, 0xFF00,   0x00FF,  8, 0,  8, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_BGR32_1,   , bgr321, 16, 0,  0, 8, 0xFF0000, 0xFF00,   0x00FF,  8, 0,  8, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_RGB32,     , rgb32,   0, 0, 16, 0,   0x00FF, 0xFF00, 0xFF0000,  8, 0,  8, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_RGB32_1,   , rgb321,  0, 0, 16, 8,   0x00FF, 0xFF00, 0xFF0000,  8, 0,  8, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_BGR565,  LE, bgr16le, 0, 0,  0, 0,   0x001F, 0x07E0,   0xF800, 11, 5,  0, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_BGR555,  LE, bgr15le, 0, 0,  0, 0,   0x001F, 0x03E0,   0x7C00, 10, 5,  0, RGB2YUV_SHIFT + 7)
+rgb16_32_wrapper(AV_PIX_FMT_BGR444,  LE, bgr12le, 0, 0,  0, 0,   0x000F, 0x00F0,   0x0F00,  8, 4,  0, RGB2YUV_SHIFT + 4)
+rgb16_32_wrapper(AV_PIX_FMT_RGB565,  LE, rgb16le, 0, 0,  0, 0,   0xF800, 0x07E0,   0x001F,  0, 5, 11, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_RGB555,  LE, rgb15le, 0, 0,  0, 0,   0x7C00, 0x03E0,   0x001F,  0, 5, 10, RGB2YUV_SHIFT + 7)
+rgb16_32_wrapper(AV_PIX_FMT_RGB444,  LE, rgb12le, 0, 0,  0, 0,   0x0F00, 0x00F0,   0x000F,  0, 4,  8, RGB2YUV_SHIFT + 4)
+rgb16_32_wrapper(AV_PIX_FMT_BGR565,  BE, bgr16be, 0, 0,  0, 0,   0x001F, 0x07E0,   0xF800, 11, 5,  0, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_BGR555,  BE, bgr15be, 0, 0,  0, 0,   0x001F, 0x03E0,   0x7C00, 10, 5,  0, RGB2YUV_SHIFT + 7)
+rgb16_32_wrapper(AV_PIX_FMT_BGR444,  BE, bgr12be, 0, 0,  0, 0,   0x000F, 0x00F0,   0x0F00,  8, 4,  0, RGB2YUV_SHIFT + 4)
+rgb16_32_wrapper(AV_PIX_FMT_RGB565,  BE, rgb16be, 0, 0,  0, 0,   0xF800, 0x07E0,   0x001F,  0, 5, 11, RGB2YUV_SHIFT + 8)
+rgb16_32_wrapper(AV_PIX_FMT_RGB555,  BE, rgb15be, 0, 0,  0, 0,   0x7C00, 0x03E0,   0x001F,  0, 5, 10, RGB2YUV_SHIFT + 7)
+rgb16_32_wrapper(AV_PIX_FMT_RGB444,  BE, rgb12be, 0, 0,  0, 0,   0x0F00, 0x00F0,   0x000F,  0, 4,  8, RGB2YUV_SHIFT + 4)
+rgb16_32_wrapper(AV_PIX_FMT_X2RGB10, LE, rgb30le, 16, 6, 0, 0, 0x3FF00000, 0xFFC00, 0x3FF, 0, 0, 4, RGB2YUV_SHIFT + 6)
+rgb16_32_wrapper(AV_PIX_FMT_X2BGR10, LE, bgr30le, 0, 6, 16, 0, 0x3FF, 0xFFC00, 0x3FF00000, 4, 0, 0, RGB2YUV_SHIFT + 6)
 
 static void gbr24pToUV_half_c(uint8_t *_dstU, uint8_t *_dstV,
                          const uint8_t *gsrc, const uint8_t *bsrc, const uint8_t *rsrc,
@@ -832,8 +842,6 @@  p01x_wrapper(10, 6)
 p01x_wrapper(12, 4)
 p01x_uv_wrapper(16, 0)
 
-#define input_pixel(pos) (isBE(origin) ? AV_RB16(pos) : AV_RL16(pos))
-
 static void bgr24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,
                        int width, uint32_t *rgb2yuv, void *opq)
 {