From patchwork Sun Nov 25 23:45:26 2018
X-Patchwork-Submitter: Carl Eugen Hoyos
X-Patchwork-Id: 11160
From: Carl Eugen Hoyos
Date: Mon, 26 Nov 2018 00:45:26 +0100
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] fate-rv20-1239 failure on power8, aliasing bug
In-Reply-To: <20181125171758.3444b0543efb98e51eda94cd@gmx.com>
References: <20181125171758.3444b0543efb98e51eda94cd@gmx.com>

2018-11-25 16:17 GMT+01:00, Lauri Kasanen :
> Hi,
>
> The lone power8 fate failing test seems like an aliasing issue.
> I've isolated it into the attached standalone test case. Compiling it
> with
> gcc -std=c11 -maltivec -mabi=altivec -mvsx -O3 -fno-tree-vectorize
> -o test test.c
>
> reproduces on gcc 8.2.0, dropping the optimization level fixes it. This
> was one of the "adding a printf made it work" things too.
>
> -Wstrict-aliasing=1 complains about the "register int *idataptr =
> (int*)dataptr;" cast. If I put "typedef int __attribute__((may_alias))
> int_alias;" at the top and change the cast and type to int_alias, the
> results become correct.

Thank you for the analysis!

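For reference, the may_alias workaround described in the quoted mail can be sketched roughly as below. This is only an illustration, not the code from the test case: the names u32_alias and fill_row_may_alias are made up here, the sketch uses uint32_t instead of int to keep the bit packing well defined, and it assumes a GCC-compatible compiler and a row that is 4-byte aligned.

```c
#include <stdint.h>

/* GCC/Clang extension: accesses through this type are allowed to alias
 * objects of any other type, much like accesses through char.
 * (Hypothetical name, chosen for this sketch.) */
typedef uint32_t __attribute__((may_alias)) u32_alias;

/* Hypothetical helper: fill one row of eight int16_t coefficients with
 * dcval, two coefficients per 32-bit store.
 * Assumption: row is at least 4-byte aligned. */
static void fill_row_may_alias(int16_t row[8], int16_t dcval)
{
    uint32_t lo = (uint16_t)dcval;      /* low 16 bits of the value   */
    uint32_t v  = lo | (lo << 16);      /* same value in both halves  */
    u32_alias *p = (u32_alias *)row;

    for (int i = 0; i < 4; i++)
        p[i] = v;
}
```

With a plain "uint32_t *" instead of u32_alias, the compiler may assume the stores cannot modify int16_t objects and reorder them against later 16-bit loads, which would match the "adding a printf made it work" symptom.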
Patch attached, Carl Eugen

From e5403b832f2bcd360128d9986b602484e576c931 Mon Sep 17 00:00:00 2001
From: Carl Eugen Hoyos
Date: Mon, 26 Nov 2018 00:43:46 +0100
Subject: [PATCH] lavc/jrevdct: Avoid an aliasing violation.

Fixes fate on different PowerPC systems with some compilers.
Analyzed-by: Lauri Kasanen
---
 libavcodec/jrevdct.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/libavcodec/jrevdct.c b/libavcodec/jrevdct.c
index 3b15a52..c749468 100644
--- a/libavcodec/jrevdct.c
+++ b/libavcodec/jrevdct.c
@@ -63,6 +63,7 @@
  */
 
 #include "libavutil/common.h"
+#include "libavutil/intreadwrite.h"
 #include "dct.h"
 #include "idctdsp.h"
 
@@ -234,7 +235,7 @@ void ff_j_rev_dct(DCTBLOCK data)
      * row DCT calculations can be simplified this way.
      */
 
-    register int *idataptr = (int*)dataptr;
+    register uint8_t *idataptr = (uint8_t*)dataptr;
 
     /* WARNING: we do the same permutation as MMX idct to simplify the
        video core */
@@ -254,10 +255,10 @@ void ff_j_rev_dct(DCTBLOCK data)
           int16_t dcval = (int16_t) (d0 * (1 << PASS1_BITS));
           register int v = (dcval & 0xffff) | ((dcval * (1 << 16)) & 0xffff0000);
 
-          idataptr[0] = v;
-          idataptr[1] = v;
-          idataptr[2] = v;
-          idataptr[3] = v;
+          AV_WN32(&idataptr[ 0], v);
+          AV_WN32(&idataptr[ 4], v);
+          AV_WN32(&idataptr[ 8], v);
+          AV_WN32(&idataptr[12], v);
       }
 
       dataptr += DCTSIZE;       /* advance pointer to next row */
@@ -974,7 +975,7 @@ void ff_j_rev_dct4(DCTBLOCK data)
      * row DCT calculations can be simplified this way.
      */
 
-    register int *idataptr = (int*)dataptr;
+    register uint8_t *idataptr = (uint8_t*)dataptr;
 
     d0 = dataptr[0];
     d2 = dataptr[1];
@@ -988,8 +989,8 @@ void ff_j_rev_dct4(DCTBLOCK data)
           int16_t dcval = (int16_t) (d0 << PASS1_BITS);
           register int v = (dcval & 0xffff) | ((dcval << 16) & 0xffff0000);
 
-          idataptr[0] = v;
-          idataptr[1] = v;
+          AV_WN32(&idataptr[0], v);
+          AV_WN32(&idataptr[4], v);
       }
 
       dataptr += DCTSTRIDE;     /* advance pointer to next row */
-- 
1.7.10.4
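
A note on the indices in the patch: once idataptr is a byte pointer, the old int index i becomes byte offset 4*i, hence 0, 4, 8, 12. The standalone sketch below shows the same idea without FFmpeg headers. store32_native is a hypothetical stand-in, assumed here to behave like a native-endian 32-bit store in the way AV_WN32 is used above; it is not the libavutil macro, and fill_row_dc is likewise invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a native-endian 32-bit store (not FFmpeg
 * API). memcpy accesses the destination as bytes, so it does not run
 * into the strict-aliasing rule and has no alignment requirement. */
static void store32_native(void *dst, uint32_t v)
{
    memcpy(dst, &v, sizeof(v));
}

/* Hypothetical helper: fill one row of eight int16_t coefficients with
 * dcval, two coefficients per 32-bit store. */
static void fill_row_dc(int16_t row[8], int16_t dcval)
{
    uint32_t lo = (uint16_t)dcval;          /* low 16 bits of the value  */
    uint32_t v  = lo | (lo << 16);          /* same value in both halves */
    unsigned char *p = (unsigned char *)row;

    /* int index i in the old code corresponds to byte offset 4 * i */
    for (int i = 0; i < 4; i++)
        store32_native(p + 4 * i, v);
}
```

Because both 16-bit halves of v hold the same value, the result does not depend on byte order. Compilers typically turn a fixed-size memcpy like this into a single store, so the change should not cost anything at -O2 or above, though that is worth checking on the target in question.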