From patchwork Tue Jul 23 12:51:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: ckerchne X-Patchwork-Id: 14045 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id ADA41449418 for ; Tue, 23 Jul 2019 15:48:19 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 96E4868A8CA; Tue, 23 Jul 2019 15:48:19 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 2155C68A6C7 for ; Tue, 23 Jul 2019 15:48:13 +0300 (EEST) Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x6NClPlu066019 for ; Tue, 23 Jul 2019 08:48:11 -0400 Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.153]) by mx0a-001b2d01.pphosted.com with ESMTP id 2tx0vtwc69-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 23 Jul 2019 08:48:11 -0400 Received: from localhost by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 23 Jul 2019 13:48:10 +0100 Received: from b03cxnp08025.gho.boulder.ibm.com (9.17.130.17) by e35.co.us.ibm.com (192.168.1.135) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 23 Jul 2019 13:48:08 +0100 Received: from b03ledav002.gho.boulder.ibm.com (b03ledav002.gho.boulder.ibm.com [9.17.130.233]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id x6NCm7Xv53805534 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 23 Jul 2019 12:48:07 GMT Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6B18B13605D for ; Tue, 23 Jul 2019 12:48:07 +0000 (GMT) Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4FB0113604F for ; Tue, 23 Jul 2019 12:48:07 +0000 (GMT) Received: from ltc.linux.ibm.com (unknown [9.16.170.189]) by b03ledav002.gho.boulder.ibm.com (Postfix) with ESMTP for ; Tue, 23 Jul 2019 12:48:07 +0000 (GMT) MIME-Version: 1.0 Date: Tue, 23 Jul 2019 08:51:10 -0400 From: ckerchne To: ffmpeg-devel@ffmpeg.org X-Sender: ckerchne@linux.vnet.ibm.com User-Agent: Roundcube Webmail/1.0.1 X-TM-AS-GCONF: 00 x-cbid: 19072312-0012-0000-0000-000017555ECC X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00011480; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000287; SDB=6.01236292; UDB=6.00651587; IPR=6.01017646; MB=3.00027852; MTD=3.00000008; XFM=3.00000015; UTC=2019-07-23 12:48:09 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19072312-0013-0000-0000-0000582EF9C7 Message-Id: X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2019-07-23_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=833 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907230127 Subject: [FFmpeg-devel] [PATCH] Ticket #7124: Fixes compiler bug - replace vec_lvsl/vec_perm with vec_xl X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" A bug exist with the gcc compilers for Power in versions 6.x and 7.x (verified with 6.3 and 7.4). It was fixed in version 8.x (verified with 8.3). I was using a Power 9 ppc64le machine for building and testing. It appears the compiler is generating the wrong code for little endian machines for the vec_lvsl/vec_perm instruction combos in some cases. If these instructions are replaced with vec_xl, the problem goes away for all versions of the compilers. --- libswscale/ppc/yuv2rgb_altivec.c | 24 ++++-------------------- 1 file changed, 4 insertions(+), 20 deletions(-) - vector unsigned char align_perm; \ - \ vector signed short lCY = c->CY; \ vector signed short lOY = c->OY; \ vector signed short lCRV = c->CRV; \ @@ -338,26 +335,13 @@ static int altivec_ ## name(SwsContext *c, const unsigned char **in, \ vec_dstst(oute, (0x02000002 | (((w * 3 + 32) / 32) << 16)), 1); \ \ for (j = 0; j < w / 16; j++) { \ - y1ivP = (const vector unsigned char *) y1i; \ - y2ivP = (const vector unsigned char *) y2i; \ - uivP = (const vector unsigned char *) ui; \ - vivP = (const vector unsigned char *) vi; \ - \ - align_perm = vec_lvsl(0, y1i); \ - y0 = (vector unsigned char) \ - vec_perm(y1ivP[0], y1ivP[1], align_perm); \ + y0 = vec_xl(0, y1i); \ \ - align_perm = vec_lvsl(0, y2i); \ - y1 = (vector unsigned char) \ - vec_perm(y2ivP[0], y2ivP[1], align_perm); \ + y1 = vec_xl(0, y2i); \ \ - align_perm = vec_lvsl(0, ui); \ - u = (vector signed char) \ - vec_perm(uivP[0], uivP[1], align_perm); \ + u = (vector signed char) vec_xl(0, ui); \ \ - align_perm = vec_lvsl(0, vi); \ - v = (vector signed char) \ - vec_perm(vivP[0], vivP[1], align_perm); \ + v = (vector signed char) vec_xl(0, vi); \ \ u = (vector signed char) \ vec_sub(u, \ -- 2.17.1 diff --git a/libswscale/ppc/yuv2rgb_altivec.c b/libswscale/ppc/yuv2rgb_altivec.c index c1e2852adb..536545293d 100644 --- a/libswscale/ppc/yuv2rgb_altivec.c +++ b/libswscale/ppc/yuv2rgb_altivec.c @@ -305,9 +305,6 @@ static int altivec_ ## name(SwsContext *c, const unsigned char **in, \ vector signed short R1, G1, B1; \ vector unsigned char R, G, B; \ \ - const vector unsigned char *y1ivP, *y2ivP, *uivP, *vivP; \