Message ID | 20190807173907.21750-3-daniel@octaforge.org |
---|---|
State | Accepted |
Commit | e6625ca41f85ed10b3108d37bd897f0ae063250e |
Headers | show |
On 07.08.2019, at 19:39, Daniel Kolesa <daniel@octaforge.org> wrote: > The argument to vec_splat_u16 must be a literal. By making the > function always inline and marking the arguments const, gcc can > turn those into literals, and avoid build errors like: Why marking the arguments const? If it depends on that it sounds like this might be really unreliable and just work or not work with random compiler versions. It would also be nice to know if/what the impact on code size or performance would be of always inline. An alternative, uglier but likely more reliable option would be to pass the vswap/vshift vectors as arguments and have a macro that generates them (admit the multiple-evaluation risks though) e.g. #define yuv2plane1_16_vsx(s, d, w, b, o) yuv2plane1_16_vsx(s, d, w, b, o, vec_splat_u16(b ? 8 : 0))
On 11.08.2019, at 21:24, Reimar Döffinger <Reimar.Doeffinger@gmx.de> wrote: > On 07.08.2019, at 19:39, Daniel Kolesa <daniel@octaforge.org> wrote: > >> The argument to vec_splat_u16 must be a literal. By making the >> function always inline and marking the arguments const, gcc can >> turn those into literals, and avoid build errors like: > > Why marking the arguments const? > If it depends on that it sounds like this might be really unreliable and just work or not work with random compiler versions. > It would also be nice to know if/what the impact on code size or performance would be of always inline. > An alternative, uglier but likely more reliable option would be to pass the vswap/vshift vectors as arguments and have a macro that generates them (admit the multiple-evaluation risks though) > e.g. > #define yuv2plane1_16_vsx(s, d, w, b, o) yuv2plane1_16_vsx(s, d, w, b, o, vec_splat_u16(b ? 8 : 0)) I also realise now that the vec_splat_u16 is just an optimization. So maybe the first step would to be to check if there is actually a relevant performance advantage or if it wouldn't be simplest to just use the generic initialization code.
On Sun, Aug 11, 2019, at 21:39, Reimar Döffinger wrote: > On 11.08.2019, at 21:24, Reimar Döffinger <Reimar.Doeffinger@gmx.de> wrote: > > > On 07.08.2019, at 19:39, Daniel Kolesa <daniel@octaforge.org> wrote: > > > >> The argument to vec_splat_u16 must be a literal. By making the > >> function always inline and marking the arguments const, gcc can > >> turn those into literals, and avoid build errors like: > > > > Why marking the arguments const? > > If it depends on that it sounds like this might be really unreliable and just work or not work with random compiler versions. > > It would also be nice to know if/what the impact on code size or performance would be of always inline. > > An alternative, uglier but likely more reliable option would be to pass the vswap/vshift vectors as arguments and have a macro that generates them (admit the multiple-evaluation risks though) > > e.g. > > #define yuv2plane1_16_vsx(s, d, w, b, o) yuv2plane1_16_vsx(s, d, w, b, o, vec_splat_u16(b ? 8 : 0)) > > I also realise now that the vec_splat_u16 is just an optimization. > So maybe the first step would to be to check if there is actually a > relevant performance advantage or if it wouldn't be simplest to just > use the generic initialization code. Well, note that this has already been done before http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=153fcd6de6ba558a3720c64589a7e4e9e4d62420 So these changes are just in line with older ones. > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff --git a/libswscale/ppc/swscale_vsx.c b/libswscale/ppc/swscale_vsx.c index e6a35d3f78..af8b0e1fa3 100644 --- a/libswscale/ppc/swscale_vsx.c +++ b/libswscale/ppc/swscale_vsx.c @@ -154,8 +154,10 @@ static void yuv2plane1_nbps_u(const int16_t *src, uint16_t *dest, int dstW, } } -static void yuv2plane1_nbps_vsx(const int16_t *src, uint16_t *dest, int dstW, - int big_endian, int output_bits) +static av_always_inline void yuv2plane1_nbps_vsx(const int16_t *src, + uint16_t *dest, int dstW, + const int big_endian, + const int output_bits) { const int dst_u = -(uintptr_t)dest & 7; const int shift = 15 - output_bits; @@ -273,8 +275,10 @@ static void yuv2plane1_16_u(const int32_t *src, uint16_t *dest, int dstW, } } -static void yuv2plane1_16_vsx(const int32_t *src, uint16_t *dest, int dstW, - int big_endian, int output_bits) +static av_always_inline void yuv2plane1_16_vsx(const int32_t *src, + uint16_t *dest, int dstW, + const int big_endian, + int output_bits) { const int dst_u = -(uintptr_t)dest & 7; const int shift = 3;
The argument to vec_splat_u16 must be a literal. By making the function always inline and marking the arguments const, gcc can turn those into literals, and avoid build errors like: swscale_vsx.c:165:53: error: argument 1 must be a 5-bit signed literal Fixes #7861. Signed-off-by: Daniel Kolesa <daniel@octaforge.org> --- libswscale/ppc/swscale_vsx.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)