From patchwork Mon May 17 09:55:37 2021
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 27828
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Cc: Josh Dekker
Date: Mon, 17 May 2021 12:55:37 +0300
Message-Id: <20210517095537.318311-1-martin@martin.st>
X-Mailer: git-send-email 2.25.1
Subject: [FFmpeg-devel] [PATCH] aarch64: hevc_idct: Fix overflows in idct_dc

This is marginally slower, but correct for all input values.
The previous implementation failed with certain input seeds, e.g.
"checkasm --test=hevc_idct 98".
---
 libavcodec/aarch64/hevcdsp_idct_neon.S | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/libavcodec/aarch64/hevcdsp_idct_neon.S b/libavcodec/aarch64/hevcdsp_idct_neon.S
index 28c11e632c..0869431294 100644
--- a/libavcodec/aarch64/hevcdsp_idct_neon.S
+++ b/libavcodec/aarch64/hevcdsp_idct_neon.S
@@ -573,14 +573,13 @@ idct_16x16 10
 // void ff_hevc_idct_NxN_dc_DEPTH_neon(int16_t *coeffs)
 .macro idct_dc size, bitdepth
 function ff_hevc_idct_\size\()x\size\()_dc_\bitdepth\()_neon, export=1
-        movi            v1.8h,  #((1 << (14 - \bitdepth))+1)
         ld1r            {v4.8h}, [x0]
-        add             v4.8h,  v4.8h,  v1.8h
-        sshr            v0.8h,  v4.8h,  #(15 - \bitdepth)
-        sshr            v1.8h,  v4.8h,  #(15 - \bitdepth)
+        srshr           v4.8h,  v4.8h,  #1
+        srshr           v0.8h,  v4.8h,  #(14 - \bitdepth)
+        srshr           v1.8h,  v4.8h,  #(14 - \bitdepth)
 .if \size > 4
-        sshr            v2.8h,  v4.8h,  #(15 - \bitdepth)
-        sshr            v3.8h,  v4.8h,  #(15 - \bitdepth)
+        srshr           v2.8h,  v4.8h,  #(14 - \bitdepth)
+        srshr           v3.8h,  v4.8h,  #(14 - \bitdepth)
 .if \size > 16 /* dc 32x32 */
         mov             x2,  #4
 1: