diff mbox

[FFmpeg-devel,V2] Fixes compiler bug - replace vec_lvsl/vec_perm with vec_xl

Message ID 3c8116811516c04d11397f392f9a6008@linux.vnet.ibm.com
State Accepted
Commit 3a557c5d88b7b15b5954ba2743febb055549b536
Headers show

Commit Message

ckerchne Aug. 2, 2019, 3:27 p.m. UTC
A bug exist with the gcc compilers for Power in versions 6.x and 7.x 
(verified with 6.3 and 7.4). It was fixed in version 8.x (verified with 
8.3). I was using a Power 9 ppc64le machine for building and testing.
This is to address ticket #7124.

It appears the compiler is generating the wrong code for little endian 
machines for the vec_lvsl/vec_perm instruction combos in some cases. If 
these instructions are replaced with vec_xl, the problem goes away for 
all versions of the compilers.

Comments

ckerchne Aug. 12, 2019, 2:21 p.m. UTC | #1
How do I get someone to look at this patch?

> A bug exist with the gcc compilers for Power in versions 6.x and 7.x
> (verified with 6.3 and 7.4). It was fixed in version 8.x (verified
> with 8.3). I was using a Power 9 ppc64le machine for building and
> testing.
> This is to address ticket #7124.
> 
> It appears the compiler is generating the wrong code for little endian
> machines for the vec_lvsl/vec_perm instruction combos in some cases.
> If these instructions are replaced with vec_xl, the problem goes away
> for all versions of the compilers.
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Carl Eugen Hoyos Aug. 12, 2019, 3:14 p.m. UTC | #2
> Am 12.08.2019 um 16:21 schrieb ckerchne <ckerchne@linux.vnet.ibm.com>:
> 
> How do I get someone to look at this patch?
> 
>> A bug exist with the gcc compilers for Power in versions 6.x and 7.x
>> (verified with 6.3 and 7.4). It was fixed in version 8.x (verified
>> with 8.3). I was using a Power 9 ppc64le machine for building and
>> testing.
>> This is to address ticket #7124.
>> It appears the compiler is generating the wrong code for little endian
>> machines for the vec_lvsl/vec_perm instruction combos in some cases.
>> If these instructions are replaced with vec_xl, the problem goes away
>> for all versions of the compilers.

As said, this commit message is not ok, please add information on how the issue can actually be reproduced.

And please avoid top-posting here.

Carl Eugen
Carl Eugen Hoyos Aug. 13, 2019, 12:26 a.m. UTC | #3
Am Fr., 2. Aug. 2019 um 17:24 Uhr schrieb ckerchne
<ckerchne@linux.vnet.ibm.com>:
>
> A bug exist with the gcc compilers for Power in versions 6.x and 7.x
> (verified with 6.3 and 7.4). It was fixed in version 8.x (verified with
> 8.3). I was using a Power 9 ppc64le machine for building and testing.
> This is to address ticket #7124.
>
> It appears the compiler is generating the wrong code for little endian
> machines for the vec_lvsl/vec_perm instruction combos in some cases. If
> these instructions are replaced with vec_xl, the problem goes away for
> all versions of the compilers

I tested with gcc 7 (gcc 6 fails compilation for ppc64le) and was unable
to see a speed difference on be, so pushed.

For future patches: Please provide patches as made by git format-patch,
not diff files.

Thank you for the fix, Carl Eugen
Carl Eugen Hoyos April 15, 2020, 3:16 p.m. UTC | #4
Am Fr., 2. Aug. 2019 um 17:24 Uhr schrieb ckerchne
<ckerchne@linux.vnet.ibm.com>:
>
> A bug exist with the gcc compilers for Power in versions 6.x and 7.x
> (verified with 6.3 and 7.4). It was fixed in version 8.x (verified with
> 8.3). I was using a Power 9 ppc64le machine for building and testing.
> This is to address ticket #7124.
>
> It appears the compiler is generating the wrong code for little endian
> machines for the vec_lvsl/vec_perm instruction combos in some cases. If
> these instructions are replaced with vec_xl, the problem goes away for
> all versions of the compilers.

This patch broke compilation on ppc64be with gcc 9.3 (Debian sid):
src/libswscale/ppc/yuv2rgb_altivec.c: In function ‘altivec_yuv2_abgr’:
src/libswscale/ppc/yuv2rgb_altivec.c:338:18: error: implicit
declaration of function ‘vec_xl’; did you mean ‘vec_rl’?
[-Werror=implicit-function-declaration]
  338 |             y0 = vec_xl(0, y1i);

Carl Eugen
diff mbox

Patch

diff --git a/libswscale/ppc/yuv2rgb_altivec.c b/libswscale/ppc/yuv2rgb_altivec.c
index c1e2852adb..536545293d 100644
--- a/libswscale/ppc/yuv2rgb_altivec.c
+++ b/libswscale/ppc/yuv2rgb_altivec.c
@@ -305,9 +305,6 @@  static int altivec_ ## name(SwsContext *c, const unsigned char **in,          \
     vector signed short R1, G1, B1;                                           \
     vector unsigned char R, G, B;                                             \
                                                                               \
-    const vector unsigned char *y1ivP, *y2ivP, *uivP, *vivP;                  \
-    vector unsigned char align_perm;                                          \
-                                                                              \
     vector signed short lCY       = c->CY;                                    \
     vector signed short lOY       = c->OY;                                    \
     vector signed short lCRV      = c->CRV;                                   \
@@ -338,26 +335,13 @@  static int altivec_ ## name(SwsContext *c, const unsigned char **in,          \
         vec_dstst(oute, (0x02000002 | (((w * 3 + 32) / 32) << 16)), 1);       \
                                                                               \
         for (j = 0; j < w / 16; j++) {                                        \
-            y1ivP = (const vector unsigned char *) y1i;                       \
-            y2ivP = (const vector unsigned char *) y2i;                       \
-            uivP  = (const vector unsigned char *) ui;                        \
-            vivP  = (const vector unsigned char *) vi;                        \
-                                                                              \
-            align_perm = vec_lvsl(0, y1i);                                    \
-            y0 = (vector unsigned char)                                       \
-                     vec_perm(y1ivP[0], y1ivP[1], align_perm);                \
+            y0 = vec_xl(0, y1i);                                              \
                                                                               \
-            align_perm = vec_lvsl(0, y2i);                                    \
-            y1 = (vector unsigned char)                                       \
-                     vec_perm(y2ivP[0], y2ivP[1], align_perm);                \
+            y1 = vec_xl(0, y2i);                                              \
                                                                               \
-            align_perm = vec_lvsl(0, ui);                                     \
-            u = (vector signed char)                                          \
-                    vec_perm(uivP[0], uivP[1], align_perm);                   \
+            u = (vector signed char) vec_xl(0, ui);                           \
                                                                               \
-            align_perm = vec_lvsl(0, vi);                                     \
-            v = (vector signed char)                                          \
-                    vec_perm(vivP[0], vivP[1], align_perm);                   \
+            v = (vector signed char) vec_xl(0, vi);                           \
                                                                               \
             u = (vector signed char)                                          \
                     vec_sub(u,                                                \