From patchwork Wed Aug 21 14:55:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramiro Polla X-Patchwork-Id: 51108 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:612c:4062:b0:48e:c0f8:d0de with SMTP id kz34csp688912vqb; Wed, 21 Aug 2024 14:31:19 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCWh0iQIYiyoOGfxgooctTQs0nV9UxDkfjq2ujkCzl8SFFH+khZDGWhqDM03zuIAYgtVplm6QtfZohTeMY6GuWft@gmail.com X-Google-Smtp-Source: AGHT+IEQObJDxjn9ryL50h5VeFz2AWzLka/oz9i7oZq2fdftzncSSJ97hzhnlfO94CsIABdwkps0 X-Received: by 2002:a05:651c:212a:b0:2ef:2373:5f90 with SMTP id 38308e7fff4ca-2f3f8720e22mr11934681fa.0.1724275878795; Wed, 21 Aug 2024 14:31:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1724275878; cv=none; d=google.com; s=arc-20160816; b=SZiHBX4PYR/TURCs+qOROLs3ZbcYT/YSv2p1ld8+nsIWDLF/x1kZcNi7EnVIQ67cZe /YRUO48IqOqdoiDsJAK/KSran+Oo3G2q5DjBPPBoazB9F1FO7o6z2lVataZCGUvWmPRW suIPFooBKKs+TcjyYxAp0eS+wz8r6YHryFw0fTv8Qcus0jixrhCMqhnJZgJ2cSUauiSz LhRYhMRO7IJCGqM2dxo54VqU1QndFru7plX4XiTZd2xDR4bQM5b/KmPZVPWR4a8OOIie B3GIDq163kRgnuFeRfvBoO/nsZhYubxn2b2rwS9dTyrw09U2uCtOLqDZKtRc+g09j37n JbEw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=FjqhPYmnFQtygoztVVIL7OClGzN2nVwhd96PMa/mL4M=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=dHuvjEohIw2LxJuXoN5WZuNv4/SajtzclWwif26nW/qa/3Fnp14tOaJ2q/J8fIq7I8 Pma2oLvUjJn/Y/w3eSX8HOHLY5HKtRlpDYSNQ7wWKBna79urqUcON7UVdYI1cL0dPPJn g2WRBlfB4lUdLxOHTvM9SVKPB8IBoW7/Y6r9sKfw9lMeyI1e7KRqEsloahLcDLnRiTNv RmE6Uy5qT3YfV3RkCp9M0HwVf2HsSFKkJtrETt3N+IFWcf/TbI3d5t8YlfYxgzCflDpv GxVtMqWJGupbcCad1SQZ95RVkRy/1AZVVIjXXPbhliVEf40MHvcZuQcf/FjKwken8g9w QvZw==; dara=google.com ARC-Authentication-Results: i=1; 
mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="Zjgq/Z4/"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 4fb4d7f45d1cf-5c04a3d28c1si80782a12.171.2024.08.21.14.31.18; Wed, 21 Aug 2024 14:31:18 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="Zjgq/Z4/"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E6FB968DC2A; Wed, 21 Aug 2024 17:56:13 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-wr1-f53.google.com (mail-wr1-f53.google.com [209.85.221.53]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 22A6968DAF8 for ; Wed, 21 Aug 2024 17:56:01 +0300 (EEST) Received: by mail-wr1-f53.google.com with SMTP id ffacd0b85a97d-3719753d365so3916192f8f.2 for ; Wed, 21 Aug 2024 07:56:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724252160; x=1724856960; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=7P6ur6hdfqL32kqkL2wiEZ+qGRxSD1ITEguf7RH4LYs=; 
b=Zjgq/Z4/w+uzCn/UDoo+9FkfZRLQdXCAtkpeI2Af8+02ys8hiGwKg7qaMQKaAbW32q lCDa6IkZHKwfpvMa3hkOGoyhudgLdZECOb/CQ5shpD0VGZW5pKP7Agv7EtQ7zargDSuW CSL28FtDJUuF3t6a9z2UgDahr4uNpI1CXExlCBgfWgYNrvI0iLP7zkks6VcyumGaJhpR IBtNOPVRGbme2Kc06AXQS93oO22mdg3KK7u7uTRSW+d3lJrWqkrGAZJJvqC0DJmg+92f U4HCIUknkqqjTFtx6RuyZQ4F2VA1FLa6nVSAQ11tpa5ndC+sIdBp3Hc8BLvISdyS2M6S SIgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724252160; x=1724856960; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7P6ur6hdfqL32kqkL2wiEZ+qGRxSD1ITEguf7RH4LYs=; b=OKCbg5vcHydx7AJMsXCbkxMxFYNPl9gcawPJgPdQdbBidLtHTpO8iom4n8nBh5PhGm Jwqz3FQ9PFIxsDjBIxEtnyTLX8JXKeoupwwk6hi+7ZGPrApWbngjOdPRAOKsU6FaJi2w kzMsa3fSsaHai0MliWQUPt5P33KCqRqmUIlrO6ODFTgbQBDGpAQo7KfTK6izVsaYZCbL H2MUxogpMeEq49tC+3L6wuy1zDaLHykWKJ0qF0nErQQ6WuW5jDKenQdYhGjpg2MifMKd +kAhTLIbfPcEkrao2OLKmR5TTKcq8PP7OTishp2mP4ayW3m8+mrn63udmLacQLacgigB 12Zg== X-Gm-Message-State: AOJu0YxCHGXKmPKQsJ3ImFkgaCBVzdo4vdFeSyJNZfQLZyxJ5UflHErR zVyWvtDI7Ih7Ums+NHuq8eBmuMaIoTOBRS1PWA2X3L61XLONDHLiXx2lPUP2 X-Received: by 2002:a5d:634c:0:b0:36b:a3c7:b9fd with SMTP id ffacd0b85a97d-372fd7297cbmr1715575f8f.56.1724252159515; Wed, 21 Aug 2024 07:55:59 -0700 (PDT) Received: from localhost.localdomain (196.105-180-91.adsl-dyn.isp.belgacom.be. 
[91.180.105.196]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-37189896c50sm15873554f8f.85.2024.08.21.07.55.58 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Aug 2024 07:55:58 -0700 (PDT)
From: Ramiro Polla
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 21 Aug 2024 16:55:50 +0200
Message-Id: <20240821145555.235323-3-ramiro.polla@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20240821145555.235323-1-ramiro.polla@gmail.com>
References: <20240821145555.235323-1-ramiro.polla@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH v2 2/7] checkasm/mpegvideoencdsp: add pix_sum, pix_norm1, and draw_edges
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: jiCI9PMemayN

---
 tests/checkasm/Makefile          |   1 +
 tests/checkasm/checkasm.c        |   3 +
 tests/checkasm/checkasm.h        |   1 +
 tests/checkasm/mpegvideoencdsp.c | 147 +++++++++++++++++++++++++++++++
 tests/fate/checkasm.mak          |   1 +
 5 files changed, 153 insertions(+)
 create mode 100644 tests/checkasm/mpegvideoencdsp.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 3a7670e24b..7da58c14c4 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -18,6 +18,7 @@ AVCODECOBJS-$(CONFIG_LLVIDDSP)          += llviddsp.o
 AVCODECOBJS-$(CONFIG_LLVIDENCDSP)       += llviddspenc.o
 AVCODECOBJS-$(CONFIG_LPC)               += lpc.o
 AVCODECOBJS-$(CONFIG_ME_CMP)            += motion.o
+AVCODECOBJS-$(CONFIG_MPEGVIDEOENC)      += mpegvideoencdsp.o
 AVCODECOBJS-$(CONFIG_VC1DSP)            += vc1dsp.o
 AVCODECOBJS-$(CONFIG_VP8DSP)            += vp8dsp.o
 AVCODECOBJS-$(CONFIG_VIDEODSP)          += videodsp.o
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 58597d3888..0bba4fb295 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -170,6 +170,9 @@ static const struct {
     #if CONFIG_ME_CMP
         { "motion", checkasm_check_motion },
     #endif
+    #if CONFIG_MPEGVIDEOENC
+        { "mpegvideoencdsp", checkasm_check_mpegvideoencdsp },
+    #endif
     #if CONFIG_OPUS_DECODER
         { "opusdsp", checkasm_check_opusdsp },
     #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 4d5f3e387e..ba7e8c1ea0 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -110,6 +110,7 @@ void checkasm_check_llviddsp(void);
 void checkasm_check_llviddspenc(void);
 void checkasm_check_lpc(void);
 void checkasm_check_motion(void);
+void checkasm_check_mpegvideoencdsp(void);
 void checkasm_check_nlmeans(void);
 void checkasm_check_opusdsp(void);
 void checkasm_check_pixblockdsp(void);
diff --git a/tests/checkasm/mpegvideoencdsp.c b/tests/checkasm/mpegvideoencdsp.c
new file mode 100644
index 0000000000..9d000b93a6
--- /dev/null
+++ b/tests/checkasm/mpegvideoencdsp.c
@@ -0,0 +1,147 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem.h"
+#include "libavutil/mem_internal.h"
+
+#include "libavcodec/mpegvideoencdsp.h"
+
+#include "checkasm.h"
+
+#define randomize_buffers(buf, size)      \
+    do {                                  \
+        for (int j = 0; j < size; j += 4) \
+            AV_WN32(buf + j, rnd());      \
+    } while (0)
+
+static void check_pix_sum(MpegvideoEncDSPContext *c)
+{
+    LOCAL_ALIGNED_16(uint8_t, src, [16 * 16]);
+
+    declare_func(int, const uint8_t *pix, int line_size);
+
+    randomize_buffers(src, 16 * 16);
+
+    for (int n = 0; n < 2; n++) {
+        const char *negstride_str = n ? "_negstride" : "";
+        if (check_func(c->pix_sum, "pix_sum%s", negstride_str)) {
+            int sum0, sum1;
+            const uint8_t *pix = src + (n ? (15 * 16) : 0);
+            int line_size = 16 * (n ? -1 : 1);
+            sum0 = call_ref(pix, line_size);
+            sum1 = call_new(pix, line_size);
+            if (sum0 != sum1)
+                fail();
+            bench_new(pix, line_size);
+        }
+    }
+}
+
+static void check_pix_norm1(MpegvideoEncDSPContext *c)
+{
+    LOCAL_ALIGNED_16(uint8_t, src, [16 * 16]);
+
+    declare_func(int, const uint8_t *pix, int line_size);
+
+    randomize_buffers(src, 16 * 16);
+
+    for (int n = 0; n < 2; n++) {
+        const char *negstride_str = n ? "_negstride" : "";
+        if (check_func(c->pix_norm1, "pix_norm1%s", negstride_str)) {
+            int sum0, sum1;
+            const uint8_t *pix = src + (n ? (15 * 16) : 0);
+            int line_size = 16 * (n ? -1 : 1);
+            sum0 = call_ref(pix, line_size);
+            sum1 = call_new(pix, line_size);
+            if (sum0 != sum1)
+                fail();
+            bench_new(pix, line_size);
+        }
+    }
+}
+
+#define NUM_LINES 4
+#define MAX_LINE_SIZE 1920
+#define EDGE_WIDTH 16
+#define LINESIZE (EDGE_WIDTH + MAX_LINE_SIZE + EDGE_WIDTH)
+#define BUFSIZE ((EDGE_WIDTH + NUM_LINES + EDGE_WIDTH) * LINESIZE)
+
+static void check_draw_edges(MpegvideoEncDSPContext *c)
+{
+    static const int input_sizes[] = {8, 128, 1080, MAX_LINE_SIZE, -MAX_LINE_SIZE};
+    LOCAL_ALIGNED_16(uint8_t, buf0, [BUFSIZE]);
+    LOCAL_ALIGNED_16(uint8_t, buf1, [BUFSIZE]);
+
+    declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *buf, int wrap, int width, int height,
+                      int w, int h, int sides);
+
+    for (int isi = 0; isi < FF_ARRAY_ELEMS(input_sizes); isi++) {
+        int input_size = input_sizes[isi];
+        int negstride = input_size < 0;
+        const char *negstride_str = negstride ? "_negstride" : "";
+        int width = FFABS(input_size);
+        int linesize = EDGE_WIDTH + width + EDGE_WIDTH;
+        /* calculate height based on specified width to use the entire buffer. */
+        int height = (BUFSIZE / linesize) - (2 * EDGE_WIDTH);
+        uint8_t *dst0 = buf0 + EDGE_WIDTH * linesize + EDGE_WIDTH;
+        uint8_t *dst1 = buf1 + EDGE_WIDTH * linesize + EDGE_WIDTH;
+
+        if (negstride) {
+            dst0 += (height - 1) * linesize;
+            dst1 += (height - 1) * linesize;
+            linesize *= -1;
+        }
+
+        for (int shift = 0; shift < 3; shift++) {
+            int edge = EDGE_WIDTH >> shift;
+            if (check_func(c->draw_edges, "draw_edges_%d_%d_%d%s", width, height, edge, negstride_str)) {
+                randomize_buffers(buf0, BUFSIZE);
+                memcpy(buf1, buf0, BUFSIZE);
+                call_ref(dst0, linesize, width, height, edge, edge, EDGE_BOTTOM | EDGE_TOP);
+                call_new(dst1, linesize, width, height, edge, edge, EDGE_BOTTOM | EDGE_TOP);
+                if (memcmp(buf0, buf1, BUFSIZE))
+                    fail();
+                bench_new(dst1, linesize, width, height, edge, edge, EDGE_BOTTOM | EDGE_TOP);
+            }
+        }
+    }
+}
+
+#undef NUM_LINES
+#undef MAX_LINE_SIZE
+#undef EDGE_WIDTH
+#undef LINESIZE
+#undef BUFSIZE
+
+void checkasm_check_mpegvideoencdsp(void)
+{
+    AVCodecContext avctx = {
+        .bits_per_raw_sample = 8,
+    };
+    MpegvideoEncDSPContext c = { 0 };
+
+    ff_mpegvideoencdsp_init(&c, &avctx);
+
+    check_pix_sum(&c);
+    report("pix_sum");
+    check_pix_norm1(&c);
+    report("pix_norm1");
+    check_draw_edges(&c);
+    report("draw_edges");
+}
diff --git a/tests/fate/checkasm.mak b/tests/fate/checkasm.mak
index 49832b09bf..5f8ed7584f 100644
--- a/tests/fate/checkasm.mak
+++ b/tests/fate/checkasm.mak
@@ -33,6 +33,7 @@ FATE_CHECKASM = fate-checkasm-aacencdsp                                 \
                 fate-checkasm-llviddspenc                               \
                 fate-checkasm-lpc                                       \
                 fate-checkasm-motion                                    \
+                fate-checkasm-mpegvideoencdsp                           \
                 fate-checkasm-opusdsp                                   \
                 fate-checkasm-pixblockdsp                               \
                 fate-checkasm-sbrdsp                                    \

From patchwork Wed Aug 21 14:55:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ramiro Polla
X-Patchwork-Id: 51103
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:612c:4062:b0:48e:c0f8:d0de with SMTP id kz34csp558541vqb; Wed, 21 Aug 2024 09:51:21 -0700
(PDT) X-Forwarded-Encrypted: i=2; AJvYcCXSS7FJEQHZSRydbKTa6lol8aJ4NlBcPMXNev3fiNKG0yH+EGqTieJeEW+K8IhsrXP6ewI7BOOCIDrk5ggzXPPs@gmail.com X-Google-Smtp-Source: AGHT+IEp3JjsQx+jgopwBwh+jruJacqpma4HXc+74bAYLqa/BBcEJS7eKhME2ZW5avYbu/hMU8vj X-Received: by 2002:a05:6402:354a:b0:5be:df28:f6e3 with SMTP id 4fb4d7f45d1cf-5bf1f0d675bmr1950574a12.13.1724259080978; Wed, 21 Aug 2024 09:51:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1724259080; cv=none; d=google.com; s=arc-20160816; b=zDptwiPLoL30adp9ZHPGixlKnUOXpM4nCfq0q9KeN2EQwJUlalaxpNJSX12aSMvCC0 wyjueVxiPq5v835lD4+7ZG87RAjpYOKCTlUPT/6KZdh6uhOJ1Lae4IhBzO0a1mK8GQhf 0sNatXNKriOPeUo6Hjq0Z/HD5ZgxFz3O+PRiWA3MkFnYySb4d+F+Vdc/X3ydukpRyhv3 +0DBoGGf88r+aKc/Fks/jsDOEJN/o1BLf+TOt+jVTgVoVnfTBOCc+O+Ra6qnLZ6Qk2zH otX9Zmx1RWAGWaIutpNp/+R5CrnL2/FWVVUzlvymuQQ0EkfANWITsNGceQYisasVEfus lv2A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=lTfkphaXje1pVm8LmWj4c5g5H2pSt7SEM0O5VhNzfdc=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=Sru9MIMehUnKFuex3P8Ks+suWz18CX0h00RMUEpJxZXJLlcFUhDQA/inyJGIvGJa/v 93KhJMEUzgdV8FycVZ91szf9qfZYZpY0vccFMMHr451zPFPRAHeWnKhZ+2bnaw0u0VKS B808dZ2/BXfToneGT5wXcabT+/1ieOZHGyXBFDU9xFrDY6eiaSHg8ABS172qDSY9fq0T HXWgiEMxf8IskDhPsuDiEGgoHsOGsjHzCOCMf39aWh4Yt78+NMa7N0wHoh2/i1SeSYBH kXE5xUkOSG+OiQQXToZAmvKwP/bjf5wldUehJj7d2PDZxIJWKmZjCj7jVUAcJ9aGz1qu ygaQ==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=M8NXnq36; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; 
dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 4fb4d7f45d1cf-5bebc0b6476si8627738a12.564.2024.08.21.09.51.20; Wed, 21 Aug 2024 09:51:20 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=M8NXnq36; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6A1AF68D714; Wed, 21 Aug 2024 17:56:17 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 3D61768DC19 for ; Wed, 21 Aug 2024 17:56:02 +0300 (EEST) Received: by mail-wr1-f44.google.com with SMTP id ffacd0b85a97d-3717de33d58so4341700f8f.1 for ; Wed, 21 Aug 2024 07:56:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724252161; x=1724856961; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=+fW+omCYAb6Y90JIBkMoPN3DD8yuqsenOfAVrNoVDRo=; b=M8NXnq367+RmqsFYM5XInHG3+GDxYTIivwp7fWbjkeIYJ/IICWvkLDzUezk0OPjaa+ 9B9K5tgtKQhQ09Y8PjD9jJ6942s7VIwudBpv0nzMXsFFi9nxzYZpfFniVCng0px8eDhx ni42ZOfsztXx8jlDV53cuIa+sbjls979SBnIYi9FmThbM5knKCniERhxMoaFVGOcWiur AAjcvrFMc5RCW0xJ0evEgB9VhhvKcN5a7AXoOErjtkpOl51J7+ftcEpOha+G2OxChPY/ 
91TKLWSsifGGeTDB4+2MBTHzdOBXdswhOB0l3qgw66HQuvbulmJzLXL31Bi/aVSoTaHo 8q8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724252161; x=1724856961; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+fW+omCYAb6Y90JIBkMoPN3DD8yuqsenOfAVrNoVDRo=; b=cX9wN6xYiaY40agyoKgCJGRLFpDOPdLIcpL23vawpkjHu+IGCkTxkrMcCyS4mVbuD+ 8LFuD8gE17k4SnXvGJDlSA1F4ovLhJN3ExPvJDdakSg3vAuB6Eb+640IP5VKD4fiPjLV 9x6kCOEEw4sbGy+jb6VR87IErSk5rOB7qPXRNN3tnezJhsQltQljR1wX/dlJbkLC+kXi ThIhK1jdXhuJEwROdnWiPw1DNWpPx2whEoDdm30xNaHLZVmXctS3SOS60vJt/n3zrXwr SLIidcukwytsRLqp7Qi8aq0a3vbwrHSPVYYAskyRchVG5U8o0WeOF/WxxnEbH1E4n7a/ hRqg== X-Gm-Message-State: AOJu0YxEB1jJfCNe3wqId9XuVeWph5KNLcMqsoZxF1d0wJJ0BS/XlgkJ Mtu7KizCPwtn3tiCbTibObQCzoiqF26TFKhUHjWdW20yiQVKE79vW5IDLq/G X-Received: by 2002:a5d:584d:0:b0:367:8847:5bf4 with SMTP id ffacd0b85a97d-372fd57fafdmr2049354f8f.10.1724252160883; Wed, 21 Aug 2024 07:56:00 -0700 (PDT) Received: from localhost.localdomain (196.105-180-91.adsl-dyn.isp.belgacom.be. 
[91.180.105.196]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-37189896c50sm15873554f8f.85.2024.08.21.07.55.59 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Aug 2024 07:56:00 -0700 (PDT)
From: Ramiro Polla
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 21 Aug 2024 16:55:51 +0200
Message-Id: <20240821145555.235323-4-ramiro.polla@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20240821145555.235323-1-ramiro.polla@gmail.com>
References: <20240821145555.235323-1-ramiro.polla@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH v2 3/7] avcodec/aarch64/mpegvideoencdsp: add neon implementations for pix_sum and pix_norm1
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 16KDi+cWWYSu

                        A55               A76
pix_norm1_c:          484.3             235.2
pix_norm1_neon:       193.8 ( 2.50x)     44.7 ( 5.26x)
pix_sum_c:            302.8             243.7
pix_sum_neon:          81.6 ( 3.71x)     26.0 ( 9.37x)
---
 libavcodec/aarch64/Makefile               |  2 +
 libavcodec/aarch64/mpegvideoencdsp_init.c | 39 +++++++++++++
 libavcodec/aarch64/mpegvideoencdsp_neon.S | 69 +++++++++++++++++++++++
 libavcodec/mpegvideoencdsp.c              |  4 +-
 libavcodec/mpegvideoencdsp.h              |  2 +
 5 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/aarch64/mpegvideoencdsp_init.c
 create mode 100644 libavcodec/aarch64/mpegvideoencdsp_neon.S

diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index a3256bb1cc..de0653ebbc 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -10,6 +10,7 @@ OBJS-$(CONFIG_HPELDSP)                  += aarch64/hpeldsp_init_aarch64.o
 OBJS-$(CONFIG_IDCTDSP)                  += aarch64/idctdsp_init_aarch64.o
 OBJS-$(CONFIG_ME_CMP)                   += aarch64/me_cmp_init_aarch64.o
 OBJS-$(CONFIG_MPEGAUDIODSP)             += aarch64/mpegaudiodsp_init.o
+OBJS-$(CONFIG_MPEGVIDEOENC)             += aarch64/mpegvideoencdsp_init.o
 OBJS-$(CONFIG_NEON_CLOBBER_TEST)        += aarch64/neontest.o
 OBJS-$(CONFIG_PIXBLOCKDSP)              += aarch64/pixblockdsp_init_aarch64.o
 OBJS-$(CONFIG_VIDEODSP)                 += aarch64/videodsp_init.o
@@ -51,6 +52,7 @@ NEON-OBJS-$(CONFIG_IDCTDSP)             += aarch64/idctdsp_neon.o              \
                                            aarch64/simple_idct_neon.o
 NEON-OBJS-$(CONFIG_ME_CMP)              += aarch64/me_cmp_neon.o
 NEON-OBJS-$(CONFIG_MPEGAUDIODSP)        += aarch64/mpegaudiodsp_neon.o
+NEON-OBJS-$(CONFIG_MPEGVIDEOENC)        += aarch64/mpegvideoencdsp_neon.o
 NEON-OBJS-$(CONFIG_PIXBLOCKDSP)         += aarch64/pixblockdsp_neon.o
 NEON-OBJS-$(CONFIG_VC1DSP)              += aarch64/vc1dsp_neon.o
 NEON-OBJS-$(CONFIG_VP8DSP)              += aarch64/vp8dsp_neon.o
diff --git a/libavcodec/aarch64/mpegvideoencdsp_init.c b/libavcodec/aarch64/mpegvideoencdsp_init.c
new file mode 100644
index 0000000000..7eb632ed1b
--- /dev/null
+++ b/libavcodec/aarch64/mpegvideoencdsp_init.c
@@ -0,0 +1,39 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include "libavutil/attributes.h"
+#include "libavutil/aarch64/cpu.h"
+#include "libavcodec/mpegvideoencdsp.h"
+#include "config.h"
+
+int ff_pix_sum16_neon(const uint8_t *pix, int line_size);
+int ff_pix_norm1_neon(const uint8_t *pix, int line_size);
+
+av_cold void ff_mpegvideoencdsp_init_aarch64(MpegvideoEncDSPContext *c,
+                                             AVCodecContext *avctx)
+{
+    int cpu_flags = av_get_cpu_flags();
+
+    if (have_neon(cpu_flags)) {
+        c->pix_sum = ff_pix_sum16_neon;
+        c->pix_norm1 = ff_pix_norm1_neon;
+    }
+}
diff --git a/libavcodec/aarch64/mpegvideoencdsp_neon.S b/libavcodec/aarch64/mpegvideoencdsp_neon.S
new file mode 100644
index 0000000000..6e7a9319ba
--- /dev/null
+++ b/libavcodec/aarch64/mpegvideoencdsp_neon.S
@@ -0,0 +1,69 @@
+/*
+ * Copyright (c) 2024 Ramiro Polla
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/aarch64/asm.S"
+
+function ff_pix_sum16_neon, export=1
+// x0  const uint8_t *pix
+// x1  int line_size
+
+        add             x2, x0, w1, sxtw
+        sbfiz           x1, x1, #1, #32
+        movi            v0.16b, #0
+        mov             w3, #16
+
+1:
+        ld1             {v1.16b}, [x0], x1
+        ld1             {v2.16b}, [x2], x1
+        subs            w3, w3, #2
+        uadalp          v0.8h, v1.16b
+        uadalp          v0.8h, v2.16b
+        b.ne            1b
+
+        uaddlv          s0, v0.8h
+        fmov            w0, s0
+
+        ret
+endfunc
+
+function ff_pix_norm1_neon, export=1
+// x0  const uint8_t *pix
+// x1  int line_size
+
+        sxtw            x1, w1
+        movi            v4.16b, #0
+        movi            v5.16b, #0
+        mov             w2, #16
+
+1:
+        ld1             {v1.16b}, [x0], x1
+        subs            w2, w2, #1
+        umull           v2.8h, v1.8b, v1.8b
+        umull2          v3.8h, v1.16b, v1.16b
+        uadalp          v4.4s, v2.8h
+        uadalp          v5.4s, v3.8h
+        b.ne            1b
+
+        add             v0.4s, v4.4s, v5.4s
+        uaddlv          d0, v0.4s
+        fmov            w0, s0
+
+        ret
+endfunc
diff --git a/libavcodec/mpegvideoencdsp.c b/libavcodec/mpegvideoencdsp.c
index 9ccf1c302e..1091c94574 100644
--- a/libavcodec/mpegvideoencdsp.c
+++ b/libavcodec/mpegvideoencdsp.c
@@ -245,7 +245,9 @@ av_cold void ff_mpegvideoencdsp_init(MpegvideoEncDSPContext *c,
 
     c->draw_edges = draw_edges_8_c;
 
-#if ARCH_ARM
+#if ARCH_AARCH64
+    ff_mpegvideoencdsp_init_aarch64(c, avctx);
+#elif ARCH_ARM
     ff_mpegvideoencdsp_init_arm(c, avctx);
 #elif ARCH_PPC
     ff_mpegvideoencdsp_init_ppc(c, avctx);
diff --git a/libavcodec/mpegvideoencdsp.h b/libavcodec/mpegvideoencdsp.h
index 3925d87dab..f437bc4e4e 100644
--- a/libavcodec/mpegvideoencdsp.h
+++ b/libavcodec/mpegvideoencdsp.h
@@ -46,6 +46,8 @@ typedef struct MpegvideoEncDSPContext {
 
 void ff_mpegvideoencdsp_init(MpegvideoEncDSPContext *c,
                              AVCodecContext *avctx);
+void ff_mpegvideoencdsp_init_aarch64(MpegvideoEncDSPContext *c,
+                                     AVCodecContext *avctx);
 void ff_mpegvideoencdsp_init_arm(MpegvideoEncDSPContext *c,
                                  AVCodecContext *avctx);
 void ff_mpegvideoencdsp_init_ppc(MpegvideoEncDSPContext *c,

From patchwork Wed Aug 21 14:55:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ramiro Polla
X-Patchwork-Id: 51104
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:612c:4062:b0:48e:c0f8:d0de with SMTP id kz34csp558579vqb; Wed, 21 Aug 2024 09:51:24 -0700 (PDT)
X-Forwarded-Encrypted: i=2; AJvYcCWSizn8J6e8x6SmWWv2BFkKP+K/4RX8/Zgonzjb/7Em49FaTprMHmqqjBjIJiNgpFaYtaccxjv7Oxs3JZE6ytHg@gmail.com
X-Google-Smtp-Source: AGHT+IFWmlr1ZV8+UKbqTuLMbtcBm7XwI6F2k6uHRoRPCVJxFIkYamMknDJQ2KDVLbVf1WI6/S/7
X-Received: by 2002:a05:6402:380c:b0:5be:dab8:1bb3 with SMTP id 4fb4d7f45d1cf-5bf1f0dc628mr2463868a12.13.1724259084328; Wed, 21 Aug 2024 09:51:24 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1724259084; cv=none; d=google.com; s=arc-20160816; b=itr9SmJt8y3bo84VtyymivnGuzD1RTsu8YVgZkRZXssEK8q/BDVDuV1lsSzzW9zXIY kAOrQXurSkyI+8rT3ju02Z8s+GA9fwZ2vYrv4EjeUR9O0AYIGSFLd9F0xer1Bs5LMxVJ Y+PnvLn+Mvz0UkFVtcIxfkKVS362WtwbENN1HLVk2lvQYe3u09ySuUdVxiRHoZZqKKqb Cm1IwV66BoiV+y0dPClsKgE/R3mNwRWWGCLpHWIUNVjl+2vXT+QaE27L1htqhHAHEG5r 3W6l4JHczDXxiZcJhODacRUDJDk0CGEYNZp+xayXUpwVukKIZQX/NZm2AJ76TWwrWjRD Gt7w==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=085Z7a9P7EYk7awyqiJu2iM/LezWrmqrSN3mNJ53Ubk=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=TZfyazy5j+pYx5IUTHU536HBUS5q/JTCWsWp9zf17JJhjNq5Pvp1zoH94bVq4wNqjh lRghz0pbCmr+1B7RSHM4N0SPOgotWVnEcTlb2MWgtWIDtpuKcgvPnNcs2ucQQYqIB0rv ae+efVG0QFpl1U7RrDDY/EqXX/RIpA3EDK5FuZA0i9IR4A/FU4HqPwwPMD8G8ivRue5i C6BWKFxq5ErqnLzrS7qj80PnICGKcr4NOxjqhrquTo1Z4MNO+U618VSZHYxbZGc+bYDK jKTIcvRKSNcM5ro1xr2SoyaY1olni9ggufjh7soYFb3mR/ypoTYR2GTH/MeQ5stAM+qr
r/6w==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=mWuFrPky; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 4fb4d7f45d1cf-5bebc08a191si9134586a12.431.2024.08.21.09.51.23; Wed, 21 Aug 2024 09:51:24 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=mWuFrPky; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 09C7768DC3E; Wed, 21 Aug 2024 17:56:24 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-wm1-f53.google.com (mail-wm1-f53.google.com [209.85.128.53]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 585BF68DC22 for ; Wed, 21 Aug 2024 17:56:03 +0300 (EEST) Received: by mail-wm1-f53.google.com with SMTP id 5b1f17b1804b1-42ab99fb45dso20259495e9.1 for ; Wed, 21 Aug 2024 07:56:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724252162; x=1724856962; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; 
bh=ACm4Tuvqi694HqoJKH+BaiU2vMVvfMh8bxavo7472hU=; b=mWuFrPkyrpfEFxcakzSQJBlp2UslTiidEwJYxVY7pms3jboF0YBJtJoXs4DFIsZiC2 wOsZaet6Hl8PJ6klm8yASUDl1DIVkQ4CgB79ifQW2airhuqJcx1WY2xwBz3zJPj7n8v4 9FE4v6L9+7YeA4iEhY+2P6uAii0ekU6Dz8G5K0mRDUYjdmmE2+xz3xlKE406FWoeL/bH wW8AyPgERF6a4OLe116t1ezS16KhquOCPkCzrgaicQ7F5EIMqdUard9ui/J0WDVaBK2+ c8WG1+RRBQTXxVeXG6NsBH9BG0Xb4CI1L++ctflvlIoBZoqoDdf+jF35uk8/m37QdysM ynpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724252162; x=1724856962; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ACm4Tuvqi694HqoJKH+BaiU2vMVvfMh8bxavo7472hU=; b=jJBlivfe1B13ycMcKPK+tQgqEHbUIRZSFwE9KboEgQMTCuP2sN7I4S2vSZ37cLVVBe 0bOms561LHr/L8ky7EcoEDoMM3Sfr0Rnal/uVqboxWDO3VigsKxzgFHHYyjvMPMU00Mb axDBODEdS4IGGf1O2+CA5yfnqHbAdwg83oC8b+FO5LKPJObBIhJAufIbqyBkboQq0n1I 43OHOHlry5umjtYu9Q/DwRh/QDyNntEKIVhb6vnQp+pZItxAf28MM//fIVyYkvZWfhzy SLqaAq0EQZXKke768yn7LmvnPD+CYsXmMaZ94vdD5fc0xPGh3OcMLl0wsaybesO8qnhH iF0A== X-Gm-Message-State: AOJu0YxzeqMHJ1P2pdnqLzJsQr7w0X+kr/ARz2hf2sB8Q5ODxZLP+XXU 5pZEntfHOFZoLiXRV2O0PTSgA0eBKMmvOM9LaU25T22/TlqYGoJGKOxcAsrQ X-Received: by 2002:a5d:4b82:0:b0:371:93d1:428b with SMTP id ffacd0b85a97d-372fd826cb4mr2092984f8f.58.1724252162138; Wed, 21 Aug 2024 07:56:02 -0700 (PDT) Received: from localhost.localdomain (196.105-180-91.adsl-dyn.isp.belgacom.be. 
[91.180.105.196]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-37189896c50sm15873554f8f.85.2024.08.21.07.56.00 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Aug 2024 07:56:01 -0700 (PDT)
From: Ramiro Polla
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 21 Aug 2024 16:55:52 +0200
Message-Id: <20240821145555.235323-5-ramiro.polla@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20240821145555.235323-1-ramiro.polla@gmail.com>
References: <20240821145555.235323-1-ramiro.polla@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH v2 4/7] avcodec/aarch64/mpegvideoencdsp: add dotprod implementation for pix_norm1
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 9SGPvPTrkmfz

                        A55               A76
pix_norm1_c:          484.3             235.2
pix_norm1_neon:       193.8 ( 2.50x)     44.7 ( 5.26x)
pix_norm1_dotprod:     91.8 ( 5.28x)     21.2 (11.09x)
---
 libavcodec/aarch64/mpegvideoencdsp_init.c | 10 ++++++++
 libavcodec/aarch64/mpegvideoencdsp_neon.S | 28 +++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/libavcodec/aarch64/mpegvideoencdsp_init.c b/libavcodec/aarch64/mpegvideoencdsp_init.c
index 7eb632ed1b..d0ce07e178 100644
--- a/libavcodec/aarch64/mpegvideoencdsp_init.c
+++ b/libavcodec/aarch64/mpegvideoencdsp_init.c
@@ -27,6 +27,10 @@
 int ff_pix_sum16_neon(const uint8_t *pix, int line_size);
 int ff_pix_norm1_neon(const uint8_t *pix, int line_size);
 
+#if HAVE_DOTPROD
+int ff_pix_norm1_neon_dotprod(const uint8_t *pix, int line_size);
+#endif
+
 av_cold void ff_mpegvideoencdsp_init_aarch64(MpegvideoEncDSPContext *c,
                                              AVCodecContext *avctx)
 {
@@ -36,4 +40,10 @@ av_cold void ff_mpegvideoencdsp_init_aarch64(MpegvideoEncDSPContext *c,
         c->pix_sum = ff_pix_sum16_neon;
         c->pix_norm1 = ff_pix_norm1_neon;
     }
+
+#if HAVE_DOTPROD
+    if (have_dotprod(cpu_flags)) {
+        c->pix_norm1 = ff_pix_norm1_neon_dotprod;
+    }
+#endif
 }
diff --git a/libavcodec/aarch64/mpegvideoencdsp_neon.S b/libavcodec/aarch64/mpegvideoencdsp_neon.S
index 6e7a9319ba..0dbafef87b 100644
--- a/libavcodec/aarch64/mpegvideoencdsp_neon.S
+++ b/libavcodec/aarch64/mpegvideoencdsp_neon.S
@@ -67,3 +67,31 @@ function ff_pix_norm1_neon, export=1
 
         ret
 endfunc
+
+#if HAVE_DOTPROD
+ENABLE_DOTPROD
+
+function ff_pix_norm1_neon_dotprod, export=1
+// x0  const uint8_t *pix
+// x1  int line_size
+
+        sxtw            x1, w1
+        movi            v0.16b, #0
+        mov             w2, #16
+
+1:
+        ld1             {v1.16b}, [x0], x1
+        ld1             {v2.16b}, [x0], x1
+        udot            v0.4s, v1.16b, v1.16b
+        subs            w2, w2, #2
+        udot            v0.4s, v2.16b, v2.16b
+        b.ne            1b
+
+        uaddlv          d0, v0.4s
+        fmov            w0, s0
+
+        ret
+endfunc
+
+DISABLE_DOTPROD
+#endif

From patchwork Wed Aug 21 14:55:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ramiro Polla
X-Patchwork-Id: 51102
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:612c:4062:b0:48e:c0f8:d0de with SMTP id kz34csp520574vqb; Wed, 21 Aug 2024 08:51:24 -0700 (PDT)
X-Forwarded-Encrypted: i=2; AJvYcCXcabCe+4FNfJHHo0Ip1V4KJeZ2WvYCOMbQ/UinmAtKeWkJ17xH1nkvlzuodYA3LdWL33o2sX1bQE2/QiD4uky6@gmail.com
X-Google-Smtp-Source: AGHT+IHjQ6HkT2un7PKbZh86kj7WxdkMUPndHATp+/6TVphVbNU7DldledOVrUPWNdGF5JmuNNcn
X-Received: by 2002:a05:6512:696:b0:52e:f77b:bb58 with SMTP id 2adb3069b0e04-5334858e2dbmr1503122e87.36.1724255484580; Wed, 21 Aug 2024 08:51:24 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1724255484; cv=none; d=google.com; s=arc-20160816; b=V5reRpsZ05rGe9B4iTHJlIwQrOf0UC/keLmmIEdsZWGEgHJfXx9zdrfDavi1Xj6OqX m/N0D+ClNBhYOOcLMtxLhigf9lAQoPvgiq2bNYkNYJ0SCNMrNDKX0BObmKD3imUxPmWQ FyCFqkvau/ya3BXfuIfBER8rqD7OtPO0Mn0SlEjMEagYSt5FbErdZqJoo1am2KT8uYpm ATOnPzGwHA/Dm08h6aKT8VkZe2H+uv2NC1h4qiPCHOrRihS3TKud49kM9fZTPKvzjd6t wj+RKC/lWmb8g5e4A02PypTTreu+GPjFD5Ejlv1BBUNaML0JXsVfFo9F86S7DnrGf7ii 9gjw==
From: Ramiro Polla
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 21 Aug 2024 16:55:55 +0200
Message-Id: <20240821145555.235323-8-ramiro.polla@gmail.com>
In-Reply-To: <20240821145555.235323-1-ramiro.polla@gmail.com>
References: <20240821145555.235323-1-ramiro.polla@gmail.com>
Subject: [FFmpeg-devel] [PATCH v2 7/7] avcodec/mpegvideoencdsp: speed up draw_edges_8_c by inlining it for all used edge widths
List-Id: FFmpeg development discussions and patches

This commit also
restricts w to 4, 8, or 16.

Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz:
                                   before    after
draw_edges_8_1724_4_c:            46796.5   7141.7 ( 6.55x)
draw_edges_8_1724_8_c:            43584.5   7216.5 ( 6.04x)
draw_edges_8_1724_16_c:           47007.2  10080.5 ( 4.66x)
draw_edges_128_407_4_c:           11199.0   4185.0 ( 2.68x)
draw_edges_128_407_8_c:           10660.2   4418.0 ( 2.41x)
draw_edges_128_407_16_c:          11800.2   4634.5 ( 2.55x)
draw_edges_1080_31_4_c:            1356.5    634.7 ( 2.14x)
draw_edges_1080_31_8_c:            1972.0   1430.2 ( 1.38x)
draw_edges_1080_31_16_c:           4621.0   4009.7 ( 1.15x)
draw_edges_1920_4_4_c:              834.5    795.2 ( 1.05x)
draw_edges_1920_4_4_negstride_c:    821.7    802.0 ( 1.02x)
draw_edges_1920_4_8_c:             2782.2   2650.7 ( 1.05x)
draw_edges_1920_4_8_negstride_c:   2724.7   2670.0 ( 1.02x)
draw_edges_1920_4_16_c:            6437.5   6327.7 ( 1.02x)
draw_edges_1920_4_16_negstride_c:  6395.2   6349.5 ( 1.01x)

A55:
                                   before    after
draw_edges_8_1724_4_c:            52540.4  19739.2 ( 2.66x)
draw_edges_8_1724_8_c:            45386.9  19847.4 ( 2.29x)
draw_edges_8_1724_16_c:           51995.4  23284.7 ( 2.23x)
draw_edges_128_407_4_c:           13401.1   6988.2 ( 1.92x)
draw_edges_128_407_8_c:           12218.4   7527.9 ( 1.62x)
draw_edges_128_407_16_c:          13695.9   8207.2 ( 1.67x)
draw_edges_1080_31_4_c:            3702.9   3110.4 ( 1.19x)
draw_edges_1080_31_8_c:            6015.6   5643.2 ( 1.07x)
draw_edges_1080_31_16_c:          12281.9  11901.4 ( 1.03x)
draw_edges_1920_4_4_c:             3957.9   3970.2 ( 1.00x)
draw_edges_1920_4_4_negstride_c:   3964.1   3825.2 ( 1.04x)
draw_edges_1920_4_8_c:             7757.9   7676.4 ( 1.01x)
draw_edges_1920_4_8_negstride_c:   7923.6   7812.4 ( 1.01x)
draw_edges_1920_4_16_c:           14791.6  15143.9 ( 0.98x)
draw_edges_1920_4_16_negstride_c: 14788.6  15163.4 ( 0.98x)

A76:
                                   before    after
draw_edges_8_1724_4_c:            39786.0   4968.5 ( 8.01x)
draw_edges_8_1724_8_c:            32971.5   5069.5 ( 6.50x)
draw_edges_8_1724_16_c:           40056.0   6017.2 ( 6.66x)
draw_edges_128_407_4_c:            9517.2   1210.5 ( 7.86x)
draw_edges_128_407_8_c:            8035.7   1346.2 ( 5.97x)
draw_edges_128_407_16_c:           9946.5   1648.2 ( 6.03x)
draw_edges_1080_31_4_c:            1308.0    660.7 ( 1.98x)
draw_edges_1080_31_8_c:            1785.5   1270.7 ( 1.41x)
draw_edges_1080_31_16_c:           3266.7   2591.5 ( 1.26x)
draw_edges_1920_4_4_c:             1151.0   1090.7 ( 1.06x)
draw_edges_1920_4_4_negstride_c:   1153.7   1096.5 ( 1.05x)
draw_edges_1920_4_8_c:             2220.7   2186.5 ( 1.02x)
draw_edges_1920_4_8_negstride_c:   2218.5   2193.5 ( 1.01x)
draw_edges_1920_4_16_c:            4324.2   4230.0 ( 1.02x)
draw_edges_1920_4_16_negstride_c:  4310.7   4233.0 ( 1.02x)
---
 libavcodec/mpegvideoencdsp.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/libavcodec/mpegvideoencdsp.c b/libavcodec/mpegvideoencdsp.c
index 1091c94574..00a2c4ba71 100644
--- a/libavcodec/mpegvideoencdsp.c
+++ b/libavcodec/mpegvideoencdsp.c
@@ -114,19 +114,31 @@ static int pix_norm1_c(const uint8_t *pix, int line_size)
     return s;
 }
 
+static av_always_inline void draw_edges_lr(uint8_t *ptr, int wrap, int width, int height, int w)
+{
+    for (int i = 0; i < height; i++) {
+        memset(ptr - w, ptr[0], w);
+        memset(ptr + width, ptr[width - 1], w);
+        ptr += wrap;
+    }
+}
+
 /* draw the edges of width 'w' of an image of size width, height */
 // FIXME: Check that this is OK for MPEG-4 interlaced.
 static void draw_edges_8_c(uint8_t *buf, int wrap, int width, int height,
                            int w, int h, int sides)
 {
-    uint8_t *ptr = buf, *last_line;
+    uint8_t *last_line;
     int i;
 
     /* left and right */
-    for (i = 0; i < height; i++) {
-        memset(ptr - w, ptr[0], w);
-        memset(ptr + width, ptr[width - 1], w);
-        ptr += wrap;
+    if (w == 16) {
+        draw_edges_lr(buf, wrap, width, height, 16);
+    } else if (w == 8) {
+        draw_edges_lr(buf, wrap, width, height, 8);
+    } else {
+        av_assert1(w == 4);
+        draw_edges_lr(buf, wrap, width, height, 4);
     }
 
     /* top and bottom + corners */
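
Not part of the patch: a standalone sketch of the left/right specialization in the hunk above, for readers who want to try it outside the FFmpeg tree. The names pad_lr, pad_lr_dispatch, pad_lr_selfcheck and the ALWAYS_INLINE macro are hypothetical stand-ins (the patch uses draw_edges_lr, av_always_inline and av_assert1), and plain assert() replaces av_assert1. The idea, as far as I can tell from the patch, is that each call site passes a literal width, so every inlined copy of the helper sees a compile-time constant memset() length. Whether the compiler then expands those memset() calls inline is up to the compiler and is not guaranteed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: stand-in for av_always_inline, GCC/Clang spelling only. */
#if defined(__GNUC__)
#define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE inline
#endif

/* Replicate the first and last pixel of each row into w bytes of
 * padding on the left and right. Same loop body as draw_edges_lr. */
static ALWAYS_INLINE void pad_lr(uint8_t *ptr, int wrap, int width,
                                 int height, int w)
{
    for (int i = 0; i < height; i++) {
        memset(ptr - w, ptr[0], w);
        memset(ptr + width, ptr[width - 1], w);
        ptr += wrap;
    }
}

/* Hypothetical wrapper: every branch passes a literal, so each inlined
 * copy of pad_lr() is specialized for one constant width. */
static void pad_lr_dispatch(uint8_t *buf, int wrap, int width,
                            int height, int w)
{
    if (w == 16) {
        pad_lr(buf, wrap, width, height, 16);
    } else if (w == 8) {
        pad_lr(buf, wrap, width, height, 8);
    } else {
        assert(w == 4); /* only 4, 8 and 16 are supported */
        pad_lr(buf, wrap, width, height, 4);
    }
}

/* Hypothetical self-check: pad a 3x2 image stored with a 4-byte border
 * on each side. Returns 1 if both borders hold the edge pixels. */
static int pad_lr_selfcheck(void)
{
    enum { W = 3, H = 2, PAD = 4, WRAP = W + 2 * PAD };
    uint8_t img[H * WRAP];

    memset(img, 0xEE, sizeof(img));
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            img[y * WRAP + PAD + x] = (uint8_t)(10 * (y + 1) + x);

    pad_lr_dispatch(img + PAD, WRAP, W, H, PAD);

    for (int y = 0; y < H; y++) {
        const uint8_t *row = img + y * WRAP;
        for (int k = 0; k < PAD; k++) {
            if (row[k] != row[PAD])
                return 0;
            if (row[PAD + W + k] != row[PAD + W - 1])
                return 0;
        }
    }
    return 1;
}
```

Calling pad_lr_selfcheck() from a test harness exercises the w == 4 path. The caller must leave at least w bytes of headroom on both sides of every row, as in the original function.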