From patchwork Sat May 25 19:09:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49263
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 22:09:55 +0300
Message-ID: <20240525190955.96364-1-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
Subject: [FFmpeg-devel] [PATCH] lavc/sbrdsp: add support for 256-bit vectors
List-Id: FFmpeg development discussions and patches

hf_apply_noise_0_c:       35.7
hf_apply_noise_0_rvv_f32:  9.5
hf_apply_noise_1_c:       38.5
hf_apply_noise_1_rvv_f32: 10.0
hf_apply_noise_2_c:       35.5
hf_apply_noise_2_rvv_f32:  9.7
hf_apply_noise_3_c:       38.5
hf_apply_noise_3_rvv_f32: 10.0

Maybe extending the noise table manually is not such a great idea, but
I am not quite sure how else to deal with that. Allocating the table
dynamically is possible, but it would require an ELF destructor to
clean up.
---
 libavcodec/riscv/sbrdsp_init.c | 2 +-
 libavcodec/sbrdsp_template.c   | 8 ++++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c
index 6c17b12ae0..562c75e6b4 100644
--- a/libavcodec/riscv/sbrdsp_init.c
+++ b/libavcodec/riscv/sbrdsp_init.c
@@ -52,7 +52,7 @@ av_cold void ff_sbrdsp_init_riscv(SBRDSPContext *c)
             c->sum_square = ff_sbr_sum_square_rvv;
             c->hf_gen = ff_sbr_hf_gen_rvv;
             c->hf_g_filt = ff_sbr_hf_g_filt_rvv;
-            if (ff_get_rv_vlenb() <= 16) {
+            if (ff_get_rv_vlenb() <= 32) {
                 c->hf_apply_noise[0] = ff_sbr_hf_apply_noise_0_rvv;
                 c->hf_apply_noise[2] = ff_sbr_hf_apply_noise_2_rvv;
                 if (flags & AV_CPU_FLAG_RVB_BASIC) {
diff --git a/libavcodec/sbrdsp_template.c b/libavcodec/sbrdsp_template.c
index 0f731ba50d..9a94af8670 100644
--- a/libavcodec/sbrdsp_template.c
+++ b/libavcodec/sbrdsp_template.c
@@ -376,5 +376,13 @@ const attribute_visibility_hidden DECLARE_ALIGNED(16, INTFLOAT, AAC_RENAME(ff_sb
 {Q31(-0.99867974711855f), Q31(-0.88147068645358f)}, {Q31(-0.95531076805040f), Q31( 0.90908757154593f)},
 {Q31(-0.45725933317144f), Q31(-0.56716323646760f)}, {Q31(-0.72929675029275f), Q31(-0.98008272727324f)},
 {Q31( 0.75622801399036f), Q31( 0.20950329995549f)}, {Q31( 0.07069442601050f), Q31(-0.78247898470706f)},
+{Q31( 0.74496252926055f), Q31(-0.91169004445807f)}, {Q31(-0.96440182703856f), Q31(-0.94739918296622f)},
+{Q31( 0.30424629369539f), Q31(-0.49438267012479f)}, {Q31( 0.66565033746925f), Q31( 0.64652935542491f)},
+{Q31( 0.91697008020594f), Q31( 0.17514097332009f)}, {Q31(-0.70774918760427f), Q31( 0.52548653416543f)},
+{Q31(-0.70051415345560f), Q31(-0.45340028808763f)}, {Q31(-0.99496513054797f), Q31(-0.90071908066973f)},
+{Q31( 0.98164490790123f), Q31(-0.77463155528697f)}, {Q31(-0.54671580548181f), Q31(-0.02570928536004f)},
+{Q31(-0.01689629065389f), Q31( 0.00287506445732f)}, {Q31(-0.86110349531986f), Q31( 0.42548583726477f)},
+{Q31(-0.98892980586032f), Q31(-0.87881132267556f)}, {Q31( 0.51756627678691f), Q31( 0.66926784710139f)},
+{Q31(-0.99635026409640f), Q31(-0.58107730574765f)}, {Q31(-0.99969370862163f), Q31( 0.98369989360250f)},
 #endif
 };
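
On the open question about the noise table, here is a minimal sketch in plain C (not the RVV assembly) of one alternative to both hand-extending the table and allocating it dynamically: a statically sized buffer filled once at init time. It assumes the extra entries exist so that a contiguous read starting near the end of the circular table stays in bounds. That motivation is an assumption, since the patch does not spell it out. `TABLE_SIZE`, `MAX_RUN`, `padded`, `fill_padded`, `sum_wrapping` and `sum_padded` are hypothetical names for illustration, not FFmpeg symbols.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 512 /* logical length of the circular table (power of two) */
#define MAX_RUN     32 /* hypothetical longest contiguous read, in entries */

/* Static storage: the logical table followed by a tail that repeats its
 * first MAX_RUN entries. Nothing is heap-allocated, so nothing needs to be
 * freed at unload time. */
static float padded[TABLE_SIZE + MAX_RUN];

/* Fill the padded buffer from the unpadded source table (run once). */
static void fill_padded(const float *src)
{
    for (size_t i = 0; i < TABLE_SIZE + MAX_RUN; i++)
        padded[i] = src[i % TABLE_SIZE];
}

/* Reference: sum `len` circular entries, masking the index on every access. */
static float sum_wrapping(const float *src, size_t start, size_t len)
{
    float acc = 0.0f;

    for (size_t i = 0; i < len; i++)
        acc += src[(start + i) & (TABLE_SIZE - 1)];
    return acc;
}

/* Same sum using one contiguous run from the padded buffer; this is the
 * access pattern a unit-stride vector load would want. */
static float sum_padded(size_t start, size_t len)
{
    const float *p = &padded[start];
    float acc = 0.0f;

    assert(start < TABLE_SIZE && len <= MAX_RUN);
    for (size_t i = 0; i < len; i++)
        acc += p[i];
    return acc;
}
```

The trade-off in this sketch is that the buffer can no longer be `const` and needs a one-time initialisation step, whereas the hand-extended table in the patch stays read-only.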