From patchwork Wed Sep 18 13:11:44 2024
X-Patchwork-Submitter: Zhao Zhili
X-Patchwork-Id: 51647
From: Zhao Zhili
To: ffmpeg-devel@ffmpeg.org
Cc: Zhao Zhili, johzzy
Date: Wed, 18 Sep 2024 21:11:44 +0800
Subject: [FFmpeg-devel] [PATCH v2] swscale/aarch64: Fix rgb24toyv12 only works with aligned width

From: Zhao Zhili

Since c0666d8b, rgb24toyv12 is broken for width non-aligned to 16.
Add a simple wrapper to handle the non-aligned part.

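As an illustration only, and not part of the patch, the split between the aligned part and the non-aligned tail can be sketched on a single row. The names convert_c, convert_fast16 and convert are hypothetical stand-ins (a scalar kernel, a kernel that assumes a width that is a multiple of 16, and the wrapper). The toy arithmetic is not the real RGB to YUV conversion, and height, strides and the rgb2yuv table are left out on purpose. The chroma plane advances by half the aligned width because it holds one sample per two pixels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar kernel (stand-in for a C fallback): any width.
 * Writes one "luma" byte per pixel and one "chroma" byte per 2 pixels. */
static void convert_c(const uint8_t *src, uint8_t *y, uint8_t *c, int width)
{
    for (int x = 0; x < width; x++) {
        y[x] = (uint8_t)((src[3 * x] + src[3 * x + 1] + src[3 * x + 2]) / 3);
        if ((x & 1) == 0)
            c[x / 2] = src[3 * x + 2];
    }
}

/* Hypothetical fast kernel (stand-in for a SIMD routine): it is only
 * valid when width is a multiple of 16, which the assert enforces. */
static void convert_fast16(const uint8_t *src, uint8_t *y, uint8_t *c,
                           int width)
{
    assert(width % 16 == 0);
    convert_c(src, y, c, width); /* same result, only the contract differs */
}

/* Wrapper: fast kernel on the aligned part, scalar kernel on the tail. */
static void convert(const uint8_t *src, uint8_t *y, uint8_t *c, int width)
{
    int width_align = width & ~15; /* largest multiple of 16 <= width */

    if (width_align > 0)
        convert_fast16(src, y, c, width_align);
    if (width_align < width) {
        src += width_align * 3;    /* 3 bytes per packed RGB pixel */
        y   += width_align;        /* 1 luma byte per pixel */
        c   += width_align / 2;    /* 1 chroma byte per 2 pixels */
        convert_c(src, y, c, width - width_align);
    }
}
```

For width 540 the aligned part is 528 pixels and the tail is 12; for width 2 the aligned part is empty and only the scalar kernel runs. These are the two widths added to the checkasm test in v2.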
Signed-off-by: Zhao Zhili
Co-authored-by: johzzy
---
v2: test width 2 and 540

 libswscale/aarch64/rgb2rgb.c | 23 ++++++++++++++++++++++-
 tests/checkasm/sw_rgb.c      |  2 +-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/libswscale/aarch64/rgb2rgb.c b/libswscale/aarch64/rgb2rgb.c
index d978a6f173..20a25033cb 100644
--- a/libswscale/aarch64/rgb2rgb.c
+++ b/libswscale/aarch64/rgb2rgb.c
@@ -27,9 +27,30 @@
 #include "libswscale/swscale.h"
 #include "libswscale/swscale_internal.h"
 
+// Only handle width aligned to 16
 void ff_rgb24toyv12_neon(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
                          uint8_t *vdst, int width, int height, int lumStride,
                          int chromStride, int srcStride, int32_t *rgb2yuv);
+
+static void rgb24toyv12(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+                        uint8_t *vdst, int width, int height, int lumStride,
+                        int chromStride, int srcStride, int32_t *rgb2yuv)
+{
+    int width_align = width & (~15);
+
+    if (width_align > 0)
+        ff_rgb24toyv12_neon(src, ydst, udst, vdst, width_align, height,
+                            lumStride, chromStride, srcStride, rgb2yuv);
+    if (width_align < width) {
+        src += width_align * 3;
+        ydst += width_align;
+        udst += width_align / 2;
+        vdst += width_align / 2;
+        ff_rgb24toyv12_c(src, ydst, udst, vdst, width - width_align, height,
+                         lumStride, chromStride, srcStride, rgb2yuv);
+    }
+}
+
 void ff_interleave_bytes_neon(const uint8_t *src1, const uint8_t *src2,
                               uint8_t *dest, int width, int height,
                               int src1Stride, int src2Stride, int dstStride);
@@ -42,7 +63,7 @@ av_cold void rgb2rgb_init_aarch64(void)
     int cpu_flags = av_get_cpu_flags();
 
     if (have_neon(cpu_flags)) {
-        ff_rgb24toyv12 = ff_rgb24toyv12_neon;
+        ff_rgb24toyv12 = rgb24toyv12;
         interleaveBytes = ff_interleave_bytes_neon;
         deinterleaveBytes = ff_deinterleave_bytes_neon;
     }
diff --git a/tests/checkasm/sw_rgb.c b/tests/checkasm/sw_rgb.c
index af9434073a..7a6d621375 100644
--- a/tests/checkasm/sw_rgb.c
+++ b/tests/checkasm/sw_rgb.c
@@ -129,7 +129,7 @@ static int cmp_off_by_n(const uint8_t *ref, const uint8_t *test, size_t n, int a
 
 static void check_rgb24toyv12(struct SwsContext *ctx)
 {
-    static const int input_sizes[] = {16, 128, 512, MAX_LINE_SIZE, -MAX_LINE_SIZE};
+    static const int input_sizes[] = {2, 16, 128, 540, MAX_LINE_SIZE, -MAX_LINE_SIZE};
     LOCAL_ALIGNED_32(uint8_t, src, [BUFSIZE * 3]);
     LOCAL_ALIGNED_32(uint8_t, buf_y_0, [BUFSIZE]);