From patchwork Wed Sep 18 12:44:34 2024
X-Patchwork-Submitter: Zhao Zhili
X-Patchwork-Id: 51646
From: Zhao Zhili
To: ffmpeg-devel@ffmpeg.org
Cc: ramiro.polla@gmail.com, Zhao Zhili, johzzy
Date: Wed, 18 Sep 2024 20:44:34 +0800
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH] swscale/aarch64: Fix rgb24toyv12 only works
 with aligned width

From: Zhao Zhili

Since c0666d8b, rgb24toyv12 is broken for width non-aligned to 16. Add a
simple wrapper to handle the non-aligned part.
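
For readers who want the idea in isolation, here is a minimal sketch of the
split the commit message describes: run the fast routine on the part of the
row that is a multiple of 16, then hand the remainder to a generic routine.
Everything in it is an assumption made for illustration. gray_row_fast(),
gray_row_generic() and gray_row() are hypothetical names, not FFmpeg
functions, and the per-pixel arithmetic (a plain channel average) is only a
placeholder for the real RGB to YUV conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a SIMD routine. It only accepts widths that are
 * a multiple of 16, which is the restriction the patch works around. */
static void gray_row_fast(const uint8_t *src, uint8_t *dst, int width)
{
    assert(width % 16 == 0);
    for (int x = 0; x < width; x++)
        dst[x] = (uint8_t)((src[3 * x] + src[3 * x + 1] + src[3 * x + 2]) / 3);
}

/* Hypothetical generic routine: same arithmetic, any width. */
static void gray_row_generic(const uint8_t *src, uint8_t *dst, int width)
{
    for (int x = 0; x < width; x++)
        dst[x] = (uint8_t)((src[3 * x] + src[3 * x + 1] + src[3 * x + 2]) / 3);
}

/* Wrapper: fast path for the aligned prefix, generic path for the tail. */
static void gray_row(const uint8_t *src, uint8_t *dst, int width)
{
    /* Largest multiple of 16 that is <= width (width assumed non-negative). */
    int width_align = width & ~15;

    if (width_align > 0)
        gray_row_fast(src, dst, width_align);
    if (width_align < width)
        /* 3 source bytes per pixel, 1 destination byte per pixel. */
        gray_row_generic(src + width_align * 3, dst + width_align,
                         width - width_align);
}
```

With a width of 20, gray_row() sends the first 16 pixels to the fast routine
and the last 4 to the generic one. With a width of 4 the fast routine is not
called at all.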

Signed-off-by: Zhao Zhili
Co-authored-by: johzzy
---
 libswscale/aarch64/rgb2rgb.c | 23 ++++++++++++++++++++++-
 tests/checkasm/sw_rgb.c      |  2 +-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/libswscale/aarch64/rgb2rgb.c b/libswscale/aarch64/rgb2rgb.c
index d978a6f173..20a25033cb 100644
--- a/libswscale/aarch64/rgb2rgb.c
+++ b/libswscale/aarch64/rgb2rgb.c
@@ -27,9 +27,30 @@
 #include "libswscale/swscale.h"
 #include "libswscale/swscale_internal.h"
 
+// Only handle width aligned to 16
 void ff_rgb24toyv12_neon(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
                          uint8_t *vdst, int width, int height, int lumStride,
                          int chromStride, int srcStride, int32_t *rgb2yuv);
+
+static void rgb24toyv12(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+                        uint8_t *vdst, int width, int height, int lumStride,
+                        int chromStride, int srcStride, int32_t *rgb2yuv)
+{
+    int width_align = width & (~15);
+
+    if (width_align > 0)
+        ff_rgb24toyv12_neon(src, ydst, udst, vdst, width_align, height,
+                            lumStride, chromStride, srcStride, rgb2yuv);
+    if (width_align < width) {
+        src += width_align * 3;
+        ydst += width_align;
+        udst += width_align / 2;
+        vdst += width_align / 2;
+        ff_rgb24toyv12_c(src, ydst, udst, vdst, width - width_align, height,
+                         lumStride, chromStride, srcStride, rgb2yuv);
+    }
+}
+
 void ff_interleave_bytes_neon(const uint8_t *src1, const uint8_t *src2,
                               uint8_t *dest, int width, int height,
                               int src1Stride, int src2Stride, int dstStride);
@@ -42,7 +63,7 @@ av_cold void rgb2rgb_init_aarch64(void)
     int cpu_flags = av_get_cpu_flags();
 
     if (have_neon(cpu_flags)) {
-        ff_rgb24toyv12 = ff_rgb24toyv12_neon;
+        ff_rgb24toyv12 = rgb24toyv12;
         interleaveBytes = ff_interleave_bytes_neon;
         deinterleaveBytes = ff_deinterleave_bytes_neon;
     }
diff --git a/tests/checkasm/sw_rgb.c b/tests/checkasm/sw_rgb.c
index af9434073a..a57c471e3b 100644
--- a/tests/checkasm/sw_rgb.c
+++ b/tests/checkasm/sw_rgb.c
@@ -129,7 +129,7 @@ static int cmp_off_by_n(const uint8_t *ref, const uint8_t *test, size_t n, int a
 
 static void check_rgb24toyv12(struct SwsContext *ctx)
 {
-    static const int input_sizes[] = {16, 128, 512, MAX_LINE_SIZE, -MAX_LINE_SIZE};
+    static const int input_sizes[] = {4, 16, 128, 512, MAX_LINE_SIZE, -MAX_LINE_SIZE};
     LOCAL_ALIGNED_32(uint8_t, src, [BUFSIZE * 3]);
     LOCAL_ALIGNED_32(uint8_t, buf_y_0, [BUFSIZE]);