From patchwork Tue Nov 28 17:00:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 44842
From: flow gg
Date: Wed, 29 Nov 2023 01:00:37 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/aacencdsp: R-V V abs_pow34

c910:
abs_pow34_c: 24610.7
abs_pow34_rvv_f32: 6177.7

(Apply "[FFmpeg-devel] [PATCH 1/2] checkasm: test for abs_pow34" first.)

From 86577c2d40d29422c4b769c854df99a88c7b3c77 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Tue, 28 Nov 2023 20:14:14 +0800
Subject: [PATCH 2/2] lavc/aacencdsp: R-V V abs_pow34

c910:
abs_pow34_c: 24610.7
abs_pow34_rvv_f32: 6177.7
---
 libavcodec/aacenc.c               |  4 +++
 libavcodec/aacenc.h               |  1 +
 libavcodec/riscv/Makefile         |  1 +
 libavcodec/riscv/aacencdsp_init.c | 42 +++++++++++++++++++++++++++++++
 libavcodec/riscv/aacencdsp_rvv.S  | 38 ++++++++++++++++++++++++++++
 5 files changed, 86 insertions(+)
 create mode 100644 libavcodec/riscv/aacencdsp_init.c
 create mode 100644 libavcodec/riscv/aacencdsp_rvv.S

diff --git a/libavcodec/aacenc.c b/libavcodec/aacenc.c
index 443b25e25a..55c4bf55ce 100644
--- a/libavcodec/aacenc.c
+++ b/libavcodec/aacenc.c
@@ -1440,6 +1440,10 @@ void ff_aac_dsp_init(AACEncContext *s){
     s->abs_pow34 = abs_pow34_v;
     s->quant_bands = quantize_bands;
 
+#if ARCH_RISCV
+    ff_aac_dsp_init_riscv(s);
+#endif
+
 #if ARCH_X86
     ff_aac_dsp_init_x86(s);
 #endif
diff --git a/libavcodec/aacenc.h b/libavcodec/aacenc.h
index 09dd8639be..18b424736d 100644
--- a/libavcodec/aacenc.h
+++ b/libavcodec/aacenc.h
@@ -155,6 +155,7 @@ typedef struct AACEncContext {
 } AACEncContext;
 
 void ff_aac_dsp_init(AACEncContext *s);
+void ff_aac_dsp_init_riscv(AACEncContext *s);
 void ff_aac_dsp_init_x86(AACEncContext *s);
 void ff_aac_coder_init_mips(AACEncContext *c);
 void ff_quantize_band_cost_cache_init(struct AACEncContext *s);
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 2d0e6c19c8..6028f23b58 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,4 +1,5 @@
 OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o riscv/sbrdsp_init.o
+OBJS-$(CONFIG_AAC_ENCODER) += riscv/aacencdsp_init.o riscv/aacencdsp_rvv.o
 RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o riscv/sbrdsp_rvv.o
 OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_init.o \
                          riscv/ac3dsp_rvb.o
diff --git a/libavcodec/riscv/aacencdsp_init.c b/libavcodec/riscv/aacencdsp_init.c
new file mode 100644
index 0000000000..83ae16f46b
--- /dev/null
+++ b/libavcodec/riscv/aacencdsp_init.c
@@ -0,0 +1,42 @@
+/*
+ * AAC encoder assembly optimizations
+ * Copyright (c) 2023 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+#include "libavutil/attributes.h"
+#include "libavutil/float_dsp.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/aacenc.h"
+
+void ff_abs_pow34_rvv(float *out, const float *in, const int size);
+
+av_cold void ff_aac_dsp_init_riscv(AACEncContext *s)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_F32) {
+        if (flags & AV_CPU_FLAG_RVB_ADDR) {
+            s->abs_pow34 = ff_abs_pow34_rvv;
+        }
+    }
+#endif
+}
diff --git a/libavcodec/riscv/aacencdsp_rvv.S b/libavcodec/riscv/aacencdsp_rvv.S
new file mode 100644
index 0000000000..07f9e7228d
--- /dev/null
+++ b/libavcodec/riscv/aacencdsp_rvv.S
@@ -0,0 +1,38 @@
+/*
+ * Copyright (c) 2023 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_abs_pow34_rvv, zve32f
+1:
+        vsetvli   t0, a2, e32, m4, ta, ma
+        sub       a2, a2, t0
+        vle32.v   v0, (a1)
+        sh2add    a1, t0, a1
+        vfabs.v   v0, v0
+        vfsqrt.v  v4, v0
+        vfmul.vv  v4, v4, v0
+        vfsqrt.v  v4, v4
+        vse32.v   v4, (a0)
+        sh2add    a0, t0, a0
+        bnez      a2, 1b
+
+        ret
+endfunc
-- 
2.43.0