From patchwork Thu May 18 12:52:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Lynne
X-Patchwork-Id: 41712
Date: Thu, 18 May 2023 14:52:52 +0200 (CEST)
From: Lynne
To: Ffmpeg Devel
Subject: [FFmpeg-devel] [PATCH] swscale/ppc: remove hScale8To19_vsx
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

Fails checkasm on a Power9 DD2.2 02CY771 system.
The assembly doesn't seem to have been independently tested at all.

https://paste.sr.ht/~ky0ko/fe255ff73fab49b0c6d335437d894c1db626289e

Patch attached.

From 0ba39b07e85d866ef43c38e1bcf352af2bedacb9 Mon Sep 17 00:00:00 2001
From: Lynne
Date: Thu, 18 May 2023 14:42:14 +0200
Subject: [PATCH] swscale/ppc: remove hScale8To19_vsx

Fails checkasm on a Power9 system.
---
 libswscale/ppc/swscale_vsx.c | 60 ------------------------------------
 1 file changed, 60 deletions(-)

diff --git a/libswscale/ppc/swscale_vsx.c b/libswscale/ppc/swscale_vsx.c
index 8152ce7f10..7080a16aee 100644
--- a/libswscale/ppc/swscale_vsx.c
+++ b/libswscale/ppc/swscale_vsx.c
@@ -1858,64 +1858,6 @@ static void hcscale_fast_vsx(SwsContext *c, int16_t *dst1, int16_t *dst2,
 
 #undef HCSCALE
 
-static void hScale8To19_vsx(SwsContext *c, int16_t *_dst, int dstW,
-                            const uint8_t *src, const int16_t *filter,
-                            const int32_t *filterPos, int filterSize)
-{
-    int i, j;
-    int32_t *dst = (int32_t *) _dst;
-    vec_s16 vfilter, vin;
-    vec_u8 vin8;
-    vec_s32 vout;
-    const vec_u8 vzero = vec_splat_u8(0);
-    const vec_u8 vunusedtab[8] = {
-        (vec_u8) {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
-                  0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf},
-        (vec_u8) {0x0, 0x1, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10,
-                  0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10},
-        (vec_u8) {0x0, 0x1, 0x2, 0x3, 0x10, 0x10, 0x10, 0x10,
-                  0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10},
-        (vec_u8) {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x10, 0x10,
-                  0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10},
-        (vec_u8) {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
-                  0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10},
-        (vec_u8) {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
-                  0x8, 0x9, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10},
-        (vec_u8) {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
-                  0x8, 0x9, 0xa, 0xb, 0x10, 0x10, 0x10, 0x10},
-        (vec_u8) {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
-                  0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0x10, 0x10},
-    };
-    const vec_u8 vunused = vunusedtab[filterSize % 8];
-
-    if (filterSize == 1) {
-        for (i = 0; i < dstW; i++) {
-            int srcPos = filterPos[i];
-            int val = 0;
-            for (j = 0; j < filterSize; j++) {
-                val += ((int)src[srcPos + j]) * filter[filterSize * i + j];
-            }
-            dst[i] = FFMIN(val >> 3, (1 << 19) - 1); // the cubic equation does overflow ...
-        }
-    } else {
-        for (i = 0; i < dstW; i++) {
-            const int srcPos = filterPos[i];
-            vout = vec_splat_s32(0);
-            for (j = 0; j < filterSize; j += 8) {
-                vin8 = vec_vsx_ld(0, &src[srcPos + j]);
-                vin = (vec_s16) vec_mergeh(vin8, vzero);
-                if (j + 8 > filterSize) // Remove the unused elements on the last round
-                    vin = vec_perm(vin, (vec_s16) vzero, vunused);
-
-                vfilter = vec_vsx_ld(0, &filter[filterSize * i + j]);
-                vout = vec_msums(vin, vfilter, vout);
-            }
-            vout = vec_sums(vout, (vec_s32) vzero);
-            dst[i] = FFMIN(vout[3] >> 3, (1 << 19) - 1);
-        }
-    }
-}
-
 static void hScale16To19_vsx(SwsContext *c, int16_t *_dst, int dstW,
                              const uint8_t *_src, const int16_t *filter,
                              const int32_t *filterPos, int filterSize)
@@ -2092,8 +2034,6 @@ av_cold void ff_sws_init_swscale_vsx(SwsContext *c)
                 c->hyscale_fast = hyscale_fast_vsx;
                 c->hcscale_fast = hcscale_fast_vsx;
             }
-        } else {
-            c->hyScale = c->hcScale = hScale8To19_vsx;
         }
     } else {
         if (power8) {
-- 
2.40.0
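
Reviewer note: for anyone checking what the removed routine was supposed to
compute, here is a minimal scalar sketch. It follows the scalar branch of the
deleted function (the filterSize == 1 path, whose inner loop already handles
any filterSize). The name hscale8to19_ref and its signature are hypothetical
and not FFmpeg symbols; the SwsContext argument is dropped and dst is taken
directly as int32_t, and FFMIN is written out by hand. This is not the
reference that checkasm itself uses, only an illustration of the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone helper (not an FFmpeg function): scalar form of the
 * horizontal 8-bit to 19-bit scale performed by the removed hScale8To19_vsx. */
static void hscale8to19_ref(int32_t *dst, int dstW, const uint8_t *src,
                            const int16_t *filter, const int32_t *filterPos,
                            int filterSize)
{
    assert(filterSize > 0);

    for (int i = 0; i < dstW; i++) {
        const int srcPos = filterPos[i];
        int val = 0;

        /* Dot product of filterSize source bytes with this output's taps. */
        for (int j = 0; j < filterSize; j++)
            val += (int)src[srcPos + j] * filter[filterSize * i + j];

        /* Drop 3 fractional bits, then clamp to the 19-bit maximum,
         * as the removed code did with FFMIN(val >> 3, (1 << 19) - 1). */
        val >>= 3;
        dst[i] = val < (1 << 19) - 1 ? val : (1 << 19) - 1;
    }
}
```

A vectorised version has to produce the same dst values for every filterSize,
including sizes that are not a multiple of 8, which is the case the removed
code handled through its vunusedtab permute masks.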