From patchwork Wed Nov 15 03:53:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?5b6Q56aP6ZqG?= <839789740@qq.com>
X-Patchwork-Id: 44668
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 15 Nov 2023 11:53:12 +0800
X-OQ-MSGID: <20231115035312.59577-1-839789740@qq.com>
X-Mailer: git-send-email 2.32.0 (Apple Git-132)
Subject: [FFmpeg-devel] [PATCH] lavc/hevcdsp_qpel_neon: using movi.16b instead of movi.2d
X-Patchwork-Original-From: xufuji456 via ffmpeg-devel
From: =?utf-8?b?5b6Q56aP6ZqG?= <839789740@qq.com>
Reply-To: FFmpeg development discussions and patches
Cc: xufuji456 <839789740@qq.com>

When building for iOS on arm64, the compiler emits this warning:
"instruction movi.2d with immediate #0 may not function correctly on this CPU, converting to movi.16b"

Use movi.16b directly; both forms zero the full 128-bit register.

Signed-off-by: xufuji456 <839789740@qq.com>
---
 libavcodec/aarch64/hevcdsp_epel_neon.S | 136 ++++++++++++-------------
 libavcodec/aarch64/hevcdsp_qpel_neon.S |  52 +++++-----
 libavcodec/aarch64/mpegaudiodsp_neon.S |   8 +-
 libswscale/aarch64/hscale.S            |  72 ++++++-------
 4 files changed, 134 insertions(+), 134 deletions(-)

diff --git a/libavcodec/aarch64/hevcdsp_epel_neon.S b/libavcodec/aarch64/hevcdsp_epel_neon.S
index a2a051210f..c077c204cc 100644
--- a/libavcodec/aarch64/hevcdsp_epel_neon.S
+++ b/libavcodec/aarch64/hevcdsp_epel_neon.S
@@ -694,7 +694,7 @@ function ff_hevc_put_hevc_epel_h4_8_neon_i8mm, export=1
         trn1            v4.2s, v4.2s, v5.2s
         trn1            v6.2s, v6.2s, v7.2s
         trn1            v4.2d, v4.2d, v6.2d
-        movi            v16.2d, #0
+        movi            v16.16b, #0
         usdot           v16.4s, v4.16b, v30.16b
         xtn             v16.4h, v16.4s
         st1             {v16.4h}, [x0], x10
@@ -714,8 +714,8 @@ function ff_hevc_put_hevc_epel_h6_8_neon_i8mm, export=1
         trn2            v17.2s, v4.2s, v5.2s
         trn1            v6.2s, v6.2s, v7.2s
         trn1            v16.2d, v16.2d, v6.2d
-        movi            v18.2d, #0
-        movi            v19.2d, #0
+        movi            v18.16b, #0
+        movi            v19.16b, #0
         usdot           v18.4s, v16.16b, v30.16b
         usdot           v19.2s, v17.8b, v30.8b
         xtn             v18.4h, v18.4s
@@ -736,8 +736,8 @@ function ff_hevc_put_hevc_epel_h8_8_neon_i8mm, export=1
         ext             v7.16b, v4.16b, v4.16b, #3
         zip1            v20.4s, v4.4s, v6.4s
         zip1            v21.4s, v5.4s, v7.4s
-        movi            v16.2d, #0
-        movi            v17.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
         usdot           v16.4s, v20.16b, v30.16b
         usdot           v17.4s, v21.16b, v30.16b
         xtn             v16.4h, v16.4s
@@ -761,9 +761,9 @@ function ff_hevc_put_hevc_epel_h12_8_neon_i8mm, export=1
         trn1            v4.4s, v20.4s, v21.4s
         trn2            v5.4s, v20.4s, v21.4s
         trn1            v6.4s, v22.4s, v23.4s
-        movi            v16.2d, #0
-        movi            v17.2d, #0
-        movi            v18.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
+        movi            v18.16b, #0
         usdot           v16.4s, v4.16b, v30.16b
         usdot           v17.4s, v5.16b, v30.16b
         usdot           v18.4s, v6.16b, v30.16b
@@ -788,10 +788,10 @@ function ff_hevc_put_hevc_epel_h16_8_neon_i8mm, export=1
         zip2            v22.4s, v0.4s, v6.4s
         zip1            v21.4s, v5.4s, v7.4s
         zip2            v23.4s, v5.4s, v7.4s
-        movi            v16.2d, #0
-        movi            v17.2d, #0
-        movi            v18.2d, #0
-        movi            v19.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
+        movi            v18.16b, #0
+        movi            v19.16b, #0
         usdot           v16.4s, v20.16b, v30.16b
         usdot           v17.4s, v21.16b, v30.16b
         usdot           v18.4s, v22.16b, v30.16b
@@ -815,14 +815,14 @@ function ff_hevc_put_hevc_epel_h24_8_neon_i8mm, export=1
         ext             v26.16b, v1.16b, v1.16b, #1
         ext             v27.16b, v1.16b, v1.16b, #2
         ext             v28.16b, v1.16b, v1.16b, #3
-        movi            v16.2d, #0
-        movi            v17.2d, #0
-        movi            v18.2d, #0
-        movi            v19.2d, #0
-        movi            v20.2d, #0
-        movi            v21.2d, #0
-        movi            v22.2d, #0
-        movi            v23.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
+        movi            v18.16b, #0
+        movi            v19.16b, #0
+        movi            v20.16b, #0
+        movi            v21.16b, #0
+        movi            v22.16b, #0
+        movi            v23.16b, #0
         usdot           v16.4s, v0.16b, v30.16b
         usdot           v17.4s, v5.16b, v30.16b
         usdot           v18.4s, v6.16b, v30.16b
@@ -861,14 +861,14 @@ function ff_hevc_put_hevc_epel_h32_8_neon_i8mm, export=1
         ext             v26.16b, v1.16b, v2.16b, #1
         ext             v27.16b, v1.16b, v2.16b, #2
         ext             v28.16b, v1.16b, v2.16b, #3
-        movi            v16.2d, #0
-        movi            v17.2d, #0
-        movi            v18.2d, #0
-        movi            v19.2d, #0
-        movi            v20.2d, #0
-        movi            v21.2d, #0
-        movi            v22.2d, #0
-        movi            v23.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
+        movi            v18.16b, #0
+        movi            v19.16b, #0
+        movi            v20.16b, #0
+        movi            v21.16b, #0
+        movi            v22.16b, #0
+        movi            v23.16b, #0
         usdot           v16.4s, v0.16b, v30.16b
         usdot           v17.4s, v5.16b, v30.16b
         usdot           v18.4s, v6.16b, v30.16b
@@ -900,18 +900,18 @@ function ff_hevc_put_hevc_epel_h48_8_neon_i8mm, export=1
         ext             v16.16b, v1.16b, v2.16b, #1
         ext             v17.16b, v1.16b, v2.16b, #2
         ext             v18.16b, v1.16b, v2.16b, #3
-        movi            v20.2d, #0
-        movi            v21.2d, #0
-        movi            v22.2d, #0
-        movi            v23.2d, #0
+        movi            v20.16b, #0
+        movi            v21.16b, #0
+        movi            v22.16b, #0
+        movi            v23.16b, #0
         usdot           v20.4s, v0.16b, v30.16b
         usdot           v21.4s, v4.16b, v30.16b
         usdot           v22.4s, v5.16b, v30.16b
         usdot           v23.4s, v6.16b, v30.16b
-        movi            v24.2d, #0
-        movi            v25.2d, #0
-        movi            v26.2d, #0
-        movi            v27.2d, #0
+        movi            v24.16b, #0
+        movi            v25.16b, #0
+        movi            v26.16b, #0
+        movi            v27.16b, #0
         usdot           v24.4s, v1.16b, v30.16b
         usdot           v25.4s, v16.16b, v30.16b
         usdot           v26.4s, v17.16b, v30.16b
@@ -928,10 +928,10 @@ function ff_hevc_put_hevc_epel_h48_8_neon_i8mm, export=1
         ext             v4.16b, v2.16b, v3.16b, #1
         ext             v5.16b, v2.16b, v3.16b, #2
         ext             v6.16b, v2.16b, v3.16b, #3
-        movi            v20.2d, #0
-        movi            v21.2d, #0
-        movi            v22.2d, #0
-        movi            v23.2d, #0
+        movi            v20.16b, #0
+        movi            v21.16b, #0
+        movi            v22.16b, #0
+        movi            v23.16b, #0
         usdot           v20.4s, v2.16b, v30.16b
         usdot           v21.4s, v4.16b, v30.16b
         usdot           v22.4s, v5.16b, v30.16b
@@ -957,18 +957,18 @@ function ff_hevc_put_hevc_epel_h64_8_neon_i8mm, export=1
         ext             v16.16b, v1.16b, v2.16b, #1
         ext             v17.16b, v1.16b, v2.16b, #2
         ext             v18.16b, v1.16b, v2.16b, #3
-        movi            v20.2d, #0
-        movi            v21.2d, #0
-        movi            v22.2d, #0
-        movi            v23.2d, #0
+        movi            v20.16b, #0
+        movi            v21.16b, #0
+        movi            v22.16b, #0
+        movi            v23.16b, #0
         usdot           v20.4s, v0.16b, v30.16b
         usdot           v21.4s, v4.16b, v30.16b
         usdot           v22.4s, v5.16b, v30.16b
         usdot           v23.4s, v6.16b, v30.16b
-        movi            v24.2d, #0
-        movi            v25.2d, #0
-        movi            v26.2d, #0
-        movi            v27.2d, #0
+        movi            v24.16b, #0
+        movi            v25.16b, #0
+        movi            v26.16b, #0
+        movi            v27.16b, #0
         usdot           v24.4s, v1.16b, v30.16b
         usdot           v25.4s, v16.16b, v30.16b
         usdot           v26.4s, v17.16b, v30.16b
@@ -989,18 +989,18 @@ function ff_hevc_put_hevc_epel_h64_8_neon_i8mm, export=1
         ext             v16.16b, v3.16b, v7.16b, #1
         ext             v17.16b, v3.16b, v7.16b, #2
         ext             v18.16b, v3.16b, v7.16b, #3
-        movi            v20.2d, #0
-        movi            v21.2d, #0
-        movi            v22.2d, #0
-        movi            v23.2d, #0
+        movi            v20.16b, #0
+        movi            v21.16b, #0
+        movi            v22.16b, #0
+        movi            v23.16b, #0
         usdot           v20.4s, v2.16b, v30.16b
         usdot           v21.4s, v4.16b, v30.16b
         usdot           v22.4s, v5.16b, v30.16b
         usdot           v23.4s, v6.16b, v30.16b
-        movi            v24.2d, #0
-        movi            v25.2d, #0
-        movi            v26.2d, #0
-        movi            v27.2d, #0
+        movi            v24.16b, #0
+        movi            v25.16b, #0
+        movi            v26.16b, #0
+        movi            v27.16b, #0
         usdot           v24.4s, v3.16b, v30.16b
         usdot           v25.4s, v16.16b, v30.16b
         usdot           v26.4s, v17.16b, v30.16b
@@ -1593,7 +1593,7 @@ function ff_hevc_put_hevc_epel_uni_w_h4_8_neon_i8mm, export=1
         trn1            v0.2s, v0.2s, v2.2s
         trn1            v1.2s, v1.2s, v3.2s
         zip1            v0.4s, v0.4s, v1.4s
-        movi            v16.2d, #0
+        movi            v16.16b, #0
         usdot           v16.4s, v0.16b, v28.16b
         mul             v16.4s, v16.4s, v30.4s
         sqrshl          v16.4s, v16.4s, v31.4s
@@ -1620,8 +1620,8 @@ function ff_hevc_put_hevc_epel_uni_w_h6_8_neon_i8mm, export=1
         trn2            v6.2s, v0.2s, v1.2s
         trn1            v5.2s, v2.2s, v3.2s
         zip1            v4.2d, v4.2d, v5.2d
-        movi            v16.2d, #0
-        movi            v17.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
         usdot           v16.4s, v4.16b, v28.16b
         usdot           v17.2s, v6.8b, v28.8b
         mul             v16.4s, v16.4s, v30.4s
@@ -1640,8 +1640,8 @@ function ff_hevc_put_hevc_epel_uni_w_h6_8_neon_i8mm, export=1
 endfunc
 
 .macro EPEL_UNI_W_H_CALC s0, s1, d0, d1
-        movi            \d0\().2d, #0
-        movi            \d1\().2d, #0
+        movi            \d0\().16b, #0
+        movi            \d1\().16b, #0
         usdot           \d0\().4s, \s0\().16b, v28.16b
         usdot           \d1\().4s, \s1\().16b, v28.16b
         mul             \d0\().4s, \d0\().4s, v30.4s
@@ -1687,7 +1687,7 @@ function ff_hevc_put_hevc_epel_uni_w_h12_8_neon_i8mm, export=1
         zip2            v7.4s, v1.4s, v3.4s
         zip1            v6.4s, v6.4s, v7.4s
         EPEL_UNI_W_H_CALC v4, v5, v16, v17
-        movi            v18.2d, #0
+        movi            v18.16b, #0
         usdot           v18.4s, v6.16b, v28.16b
         mul             v18.4s, v18.4s, v30.4s
         sqrshl          v18.4s, v18.4s, v31.4s
@@ -2575,7 +2575,7 @@ DISABLE_I8MM
 .endm
 
 .macro EPEL_UNI_W_V4_CALC d0, s0, s1, s2, s3
-        movi            \d0\().2d, #0
+        movi            \d0\().16b, #0
         umlsl           \d0\().8h, \s0\().8b, v0.8b
         umlal           \d0\().8h, \s1\().8b, v1.8b
         umlal           \d0\().8h, \s2\().8b, v2.8b
@@ -2626,7 +2626,7 @@ function ff_hevc_put_hevc_epel_uni_w_v4_8_neon, export=1
 endfunc
 
 .macro EPEL_UNI_W_V8_CALC d0, s0, s1, s2, s3, t0, t1
-        movi            \d0\().2d, #0
+        movi            \d0\().16b, #0
         umlsl           \d0\().8h, \s0\().8b, v0.8b
         umlal           \d0\().8h, \s1\().8b, v1.8b
         umlal           \d0\().8h, \s2\().8b, v2.8b
@@ -2720,8 +2720,8 @@ function ff_hevc_put_hevc_epel_uni_w_v8_8_neon, export=1
 endfunc
 
 .macro EPEL_UNI_W_V12_CALC d0, d1, s0, s1, s2, s3, t0, t1, t2, t3
-        movi            \d0\().2d, #0
-        movi            \d1\().2d, #0
+        movi            \d0\().16b, #0
+        movi            \d1\().16b, #0
         umlsl           \d0\().8h, \s0\().8b, v0.8b
         umlsl2          \d1\().8h, \s0\().16b, v0.16b
         umlal           \d0\().8h, \s1\().8b, v1.8b
@@ -2793,8 +2793,8 @@ function ff_hevc_put_hevc_epel_uni_w_v12_8_neon, export=1
 endfunc
 
 .macro EPEL_UNI_W_V16_CALC d0, d1, s0, s1, s2, s3, t0, t1, t2, t3
-        movi            \d0\().2d, #0
-        movi            \d1\().2d, #0
+        movi            \d0\().16b, #0
+        movi            \d1\().16b, #0
         umlsl           \d0\().8h, \s0\().8b, v0.8b
         umlsl2          \d1\().8h, \s0\().16b, v0.16b
         umlal           \d0\().8h, \s1\().8b, v1.8b
diff --git a/libavcodec/aarch64/hevcdsp_qpel_neon.S b/libavcodec/aarch64/hevcdsp_qpel_neon.S
index 8adfa38ccf..bcee627cba 100644
--- a/libavcodec/aarch64/hevcdsp_qpel_neon.S
+++ b/libavcodec/aarch64/hevcdsp_qpel_neon.S
@@ -2180,8 +2180,8 @@ function ff_hevc_put_hevc_qpel_uni_w_h4_8_neon_i8mm, export=1
         ext             v3.16b, v0.16b, v0.16b, #3
         zip1            v0.2d, v0.2d, v1.2d
         zip1            v2.2d, v2.2d, v3.2d
-        movi            v16.2d, #0
-        movi            v17.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
         usdot           v16.4s, v0.16b, v28.16b
         usdot           v17.4s, v2.16b, v28.16b
         addp            v16.4s, v16.4s, v17.4s
@@ -2210,9 +2210,9 @@ function ff_hevc_put_hevc_qpel_uni_w_h6_8_neon_i8mm, export=1
         zip1            v0.2d, v0.2d, v1.2d
         zip1            v2.2d, v2.2d, v3.2d
         zip1            v4.2d, v4.2d, v5.2d
-        movi            v16.2d, #0
-        movi            v17.2d, #0
-        movi            v18.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
+        movi            v18.16b, #0
         usdot           v16.4s, v0.16b, v28.16b
         usdot           v17.4s, v2.16b, v28.16b
         usdot           v18.4s, v4.16b, v28.16b
@@ -2236,10 +2236,10 @@ endfunc
 
 
 .macro QPEL_UNI_W_H_CALC s0, s1, s2, s3, d0, d1, d2, d3
-        movi            \d0\().2d, #0
-        movi            \d1\().2d, #0
-        movi            \d2\().2d, #0
-        movi            \d3\().2d, #0
+        movi            \d0\().16b, #0
+        movi            \d1\().16b, #0
+        movi            \d2\().16b, #0
+        movi            \d3\().16b, #0
         usdot           \d0\().4s, \s0\().16b, v28.16b
         usdot           \d1\().4s, \s1\().16b, v28.16b
         usdot           \d2\().4s, \s2\().16b, v28.16b
@@ -2255,8 +2255,8 @@ endfunc
 .endm
 
 .macro QPEL_UNI_W_H_CALC_HALF s0, s1, d0, d1
-        movi            \d0\().2d, #0
-        movi            \d1\().2d, #0
+        movi            \d0\().16b, #0
+        movi            \d1\().16b, #0
         usdot           \d0\().4s, \s0\().16b, v28.16b
         usdot           \d1\().4s, \s1\().16b, v28.16b
         addp            \d0\().4s, \d0\().4s, \d1\().4s
@@ -2606,8 +2606,8 @@ function ff_hevc_put_hevc_qpel_h4_8_neon_i8mm, export=1
         ext             v3.16b, v0.16b, v0.16b, #3
         zip1            v0.2d, v0.2d, v1.2d
         zip1            v2.2d, v2.2d, v3.2d
-        movi            v16.2d, #0
-        movi            v17.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
         usdot           v16.4s, v0.16b, v31.16b
         usdot           v17.4s, v2.16b, v31.16b
         addp            v16.4s, v16.4s, v17.4s
@@ -2633,9 +2633,9 @@ function ff_hevc_put_hevc_qpel_h6_8_neon_i8mm, export=1
         zip1            v0.2d, v0.2d, v1.2d
         zip1            v2.2d, v2.2d, v3.2d
         zip1            v4.2d, v4.2d, v5.2d
-        movi            v16.2d, #0
-        movi            v17.2d, #0
-        movi            v18.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
+        movi            v18.16b, #0
         usdot           v16.4s, v0.16b, v31.16b
         usdot           v17.4s, v2.16b, v31.16b
         usdot           v18.4s, v4.16b, v31.16b
@@ -2668,10 +2668,10 @@ function ff_hevc_put_hevc_qpel_h8_8_neon_i8mm, export=1
         zip1            v2.2d, v2.2d, v3.2d
         zip1            v4.2d, v4.2d, v5.2d
         zip1            v6.2d, v6.2d, v7.2d
-        movi            v16.2d, #0
-        movi            v17.2d, #0
-        movi            v18.2d, #0
-        movi            v19.2d, #0
+        movi            v16.16b, #0
+        movi            v17.16b, #0
+        movi            v18.16b, #0
+        movi            v19.16b, #0
         usdot           v16.4s, v0.16b, v31.16b
         usdot           v17.4s, v2.16b, v31.16b
         usdot           v18.4s, v4.16b, v31.16b
@@ -2688,10 +2688,10 @@ function ff_hevc_put_hevc_qpel_h8_8_neon_i8mm, export=1
 endfunc
 
 .macro QPEL_H_CALC s0, s1, s2, s3, d0, d1, d2, d3
-        movi            \d0\().2d, #0
-        movi            \d1\().2d, #0
-        movi            \d2\().2d, #0
-        movi            \d3\().2d, #0
+        movi            \d0\().16b, #0
+        movi            \d1\().16b, #0
+        movi            \d2\().16b, #0
+        movi            \d3\().16b, #0
         usdot           \d0\().4s, \s0\().16b, v31.16b
         usdot           \d1\().4s, \s1\().16b, v31.16b
         usdot           \d2\().4s, \s2\().16b, v31.16b
@@ -2716,8 +2716,8 @@ function ff_hevc_put_hevc_qpel_h12_8_neon_i8mm, export=1
         QPEL_H_CALC v16, v1, v2, v3, v20, v21, v22, v23
         addp            v20.4s, v20.4s, v22.4s
         addp            v21.4s, v21.4s, v23.4s
-        movi            v24.2d, #0
-        movi            v25.2d, #0
+        movi            v24.16b, #0
+        movi            v25.16b, #0
         usdot           v24.4s, v18.16b, v31.16b
         usdot           v25.4s, v19.16b, v31.16b
         addp            v24.4s, v24.4s, v25.4s
diff --git a/libavcodec/aarch64/mpegaudiodsp_neon.S b/libavcodec/aarch64/mpegaudiodsp_neon.S
index b6ef131228..9799c271d0 100644
--- a/libavcodec/aarch64/mpegaudiodsp_neon.S
+++ b/libavcodec/aarch64/mpegaudiodsp_neon.S
@@ -56,7 +56,7 @@ function ff_mpadsp_apply_window_\type\()_neon, export=1
 .ifc \type, fixed
         ld1r            {v16.2s}, [x2] // dither_state
         sxtl            v16.2d, v16.2s
-        movi            v29.2d, #0
+        movi            v29.16b, #0
         movi            v30.2d, #(1<