From patchwork Tue Feb 6 15:55:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 46079
From: flow gg
Date: Tue, 6 Feb 2024 23:55:32 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 1/7] lavc/me_cmp: R-V V pix_abs

From d4d6b3ea040f3f7997463b4452813bc75d1c9f9d Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Sat, 3 Feb 2024 10:58:13 +0800
Subject: [PATCH 1/7] lavc/me_cmp: R-V V pix_abs

C908:
pix_abs_0_0_c: 534.0
pix_abs_0_0_rvv_i32: 136.2
pix_abs_1_0_c: 287.7
pix_abs_1_0_rvv_i32: 125.2
sad_0_c: 534.0
sad_0_rvv_i32: 136.2
sad_1_c: 287.7
sad_1_rvv_i32: 125.2
---
 libavcodec/me_cmp.c            |  2 +
 libavcodec/me_cmp.h            |  1 +
 libavcodec/riscv/Makefile      |  2 +
 libavcodec/riscv/me_cmp_init.c | 46 +++++++++++++++++++++++
 libavcodec/riscv/me_cmp_rvv.S  | 67 ++++++++++++++++++++++++++++++++++
 5 files changed, 118 insertions(+)
 create mode 100644 libavcodec/riscv/me_cmp_init.c
 create mode 100644 libavcodec/riscv/me_cmp_rvv.S

diff --git a/libavcodec/me_cmp.c b/libavcodec/me_cmp.c
index fecd70d723..8f4b3d0ad5 100644
--- a/libavcodec/me_cmp.c
+++ b/libavcodec/me_cmp.c
@@ -1136,6 +1136,8 @@ av_cold void ff_me_cmp_init(MECmpContext *c, AVCodecContext *avctx)
     ff_me_cmp_init_arm(c, avctx);
 #elif ARCH_PPC
     ff_me_cmp_init_ppc(c, avctx);
+#elif ARCH_RISCV
+    ff_me_cmp_init_riscv(c, avctx);
 #elif ARCH_X86
     ff_me_cmp_init_x86(c, avctx);
 #elif ARCH_MIPS
diff --git a/libavcodec/me_cmp.h b/libavcodec/me_cmp.h
index aefd32a7dc..fee0ecb28e 100644
--- a/libavcodec/me_cmp.h
+++ b/libavcodec/me_cmp.h
@@ -86,6 +86,7 @@ void ff_me_cmp_init_aarch64(MECmpContext *c, AVCodecContext *avctx);
 void ff_me_cmp_init_alpha(MECmpContext *c, AVCodecContext *avctx);
 void ff_me_cmp_init_arm(MECmpContext *c, AVCodecContext *avctx);
 void ff_me_cmp_init_ppc(MECmpContext *c, AVCodecContext *avctx);
+void ff_me_cmp_init_riscv(MECmpContext *c, AVCodecContext *avctx);
 void ff_me_cmp_init_x86(MECmpContext *c, AVCodecContext *avctx);
 void ff_me_cmp_init_mips(MECmpContext *c, AVCodecContext *avctx);
 
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 97067558d8..dff8784102 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -41,6 +41,8 @@ OBJS-$(CONFIG_LLVIDENCDSP) += riscv/llvidencdsp_init.o
 RVV-OBJS-$(CONFIG_LLVIDENCDSP) += riscv/llvidencdsp_rvv.o
 OBJS-$(CONFIG_LPC) += riscv/lpc_init.o
 RVV-OBJS-$(CONFIG_LPC) += riscv/lpc_rvv.o
+OBJS-$(CONFIG_ME_CMP) += riscv/me_cmp_init.o
+RVV-OBJS-$(CONFIG_ME_CMP) += riscv/me_cmp_rvv.o
 OBJS-$(CONFIG_OPUS_DECODER) += riscv/opusdsp_init.o
 RVV-OBJS-$(CONFIG_OPUS_DECODER) += riscv/opusdsp_rvv.o
 OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o
diff --git a/libavcodec/riscv/me_cmp_init.c b/libavcodec/riscv/me_cmp_init.c
new file mode 100644
index 0000000000..9228f74cfd
--- /dev/null
+++ b/libavcodec/riscv/me_cmp_init.c
@@ -0,0 +1,46 @@
+/*
+ * Copyright (c) 2024 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/riscv/cpu.h"
+#include "libavcodec/me_cmp.h"
+#include "libavcodec/mpegvideo.h"
+
+int ff_pix_abs16_rvv(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
+                     ptrdiff_t stride, int h);
+int ff_pix_abs8_rvv(MpegEncContext *v, const uint8_t *pix1, const uint8_t *pix2,
+                    ptrdiff_t stride, int h);
+
+av_cold void ff_me_cmp_init_riscv(MECmpContext *c, AVCodecContext *avctx)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
+        c->pix_abs[0][0] = ff_pix_abs16_rvv;
+        c->sad[0] = ff_pix_abs16_rvv;
+        c->pix_abs[1][0] = ff_pix_abs8_rvv;
+        c->sad[1] = ff_pix_abs8_rvv;
+    }
+#endif
+}
diff --git a/libavcodec/riscv/me_cmp_rvv.S b/libavcodec/riscv/me_cmp_rvv.S
new file mode 100644
index 0000000000..8dadf39bc7
--- /dev/null
+++ b/libavcodec/riscv/me_cmp_rvv.S
@@ -0,0 +1,67 @@
+/*
+ * Copyright (c) 2024 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+.macro pix_abs_ret
+        vsetivli        zero, 1, e32, m1, ta, ma
+        vmv.x.s         a0, v0
+        ret
+.endm
+
+func ff_pix_abs16_rvv, zve32x
+        vsetivli        zero, 1, e32, m1, ta, ma
+        vmv.s.x         v0, zero
+1:
+        vsetivli        zero, 16, e8, m1, tu, ma
+        vle8.v          v4, (a1)
+        vle8.v          v12, (a2)
+        addi            a4, a4, -1
+        vwsubu.vv       v16, v4, v12
+        add             a1, a1, a3
+        vwsubu.vv       v20, v12, v4
+        vsetvli         zero, zero, e16, m2, tu, ma
+        vmax.vv         v16, v16, v20
+        add             a2, a2, a3
+        vwredsum.vs     v0, v16, v0
+        bnez            a4, 1b
+
+        pix_abs_ret
+endfunc
+
+func ff_pix_abs8_rvv, zve32x
+        vsetivli        zero, 1, e32, m1, ta, ma
+        vmv.s.x         v0, zero
+1:
+        vsetivli        zero, 8, e8, mf2, tu, ma
+        vle8.v          v4, (a1)
+        vle8.v          v12, (a2)
+        addi            a4, a4, -1
+        vwsubu.vv       v16, v4, v12
+        add             a1, a1, a3
+        vwsubu.vv       v20, v12, v4
+        vsetvli         zero, zero, e16, m1, tu, ma
+        vmax.vv         v16, v16, v20
+        add             a2, a2, a3
+        vwredsum.vs     v0, v16, v0
+        bnez            a4, 1b
+
+        pix_abs_ret
+endfunc
-- 
2.43.0
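
For readers checking the assembly above: a minimal scalar sketch of what the two routines appear to compute, namely a sum of absolute differences over a 16-wide or 8-wide block of h rows. The names sad_ref and abs_diff_via_max are hypothetical and not part of this patch or of FFmpeg. The second helper mirrors the per-pixel step, where the vector code takes the signed maximum of the two widened differences (a - b and b - a) in place of an explicit absolute value.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference, not part of the patch: sum of absolute
 * differences over a block `width` pixels wide and `h` rows tall, with
 * both inputs advancing by `stride` bytes per row. */
static int sad_ref(const uint8_t *pix1, const uint8_t *pix2,
                   ptrdiff_t stride, int width, int h)
{
    int sum = 0;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < width; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Per-pixel step as the vector loop seems to do it: widen both
 * differences to 16 bits and keep the larger one as a signed value.
 * Exactly one of the two is non-negative, so the maximum is |a - b|. */
static int abs_diff_via_max(uint8_t a, uint8_t b)
{
    int16_t d0 = (int16_t)(a - b);
    int16_t d1 = (int16_t)(b - a);

    return d0 > d1 ? d0 : d1;
}
```

Under that reading, sad_ref(pix1, pix2, stride, 16, h) would correspond to ff_pix_abs16_rvv and width 8 to ff_pix_abs8_rvv; the unused MpegEncContext argument is left out of the sketch.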