From patchwork Sun May 5 16:45:36 2024
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48551
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: sunyuechi
Date: Mon, 6 May 2024 00:45:36 +0800
X-OQ-MSGID: <20240505164536.872683-10-uk7b@foxmail.com>
In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com>
References: <20240505164536.872683-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 10/10] lavc/vp8dsp: R-V V loop_filter

From: sunyuechi

C908:
vp8_loop_filter8uv_v_c: 745.5
vp8_loop_filter8uv_v_rvv_i32: 467.2
vp8_loop_filter16y_h_c: 674.2
vp8_loop_filter16y_h_rvv_i32: 553.0
vp8_loop_filter16y_v_c: 732.7
vp8_loop_filter16y_v_rvv_i32: 324.5
---
 libavcodec/riscv/vp8dsp_init.c |  4 +++
 libavcodec/riscv/vp8dsp_rvv.S  | 57 ++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 4f38abba93..35c1646dab 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -130,6 +130,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
             c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv;
         }
 
+        c->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_rvv;
+        c->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_rvv;
+        c->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_rvv;
+
         c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv;
         c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv;
         c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv;
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index d7e8b6ae58..360d79bc22 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -230,6 +230,33 @@ endfunc
         vsra.vi         v24, v24, 1 // (f1 + 1) >> 1;
         vadd.vv         v8, v18, v24
         vsub.vv         v10, v20, v24
+        .else
+        li              t5, 27
+        li              t3, 9
+        li              a7, 18
+        vwmul.vx        v2, v11, t5
+        vwmul.vx        v6, v11, t3
+        vwmul.vx        v4, v11, a7
+        vsetvlstatic16  \len
+        li              a7, 63
+        vzext.vf2       v14, v15 // p2
+        vzext.vf2       v24, v10 // q2
+        vadd.vx         v2, v2, a7
+        vadd.vx         v4, v4, a7
+        vadd.vx         v6, v6, a7
+        vsra.vi         v2, v2, 7 // a0
+        vsra.vi         v12, v4, 7 // a1
+        vsra.vi         v6, v6, 7 // a2
+        vadd.vv         v14, v14, v6 // p2 + a2
+        vsub.vv         v22, v24, v6 // q2 - a2
+        vsub.vv         v10, v20, v12 // q1 - a1
+        vadd.vv         v4, v8, v2 // p0 + a0
+        vsub.vv         v6, v16, v2 // q0 - a0
+        vadd.vv         v8, v12, v18 // a1 + p1
+        vmax.vx         v4, v4, zero
+        vmax.vx         v6, v6, zero
+        vmax.vx         v14, v14, zero
+        vmax.vx         v16, v22, zero
         .endif
 
         vmax.vx         v8, v8, zero
@@ -250,6 +277,17 @@ endfunc
         vsse8.v         v6, (a6), \stride, v0.t
         vsse8.v         v7, (t4), \stride, v0.t
         .endif
+        .if !\inner
+        vnclipu.wi      v14, v14, 0
+        vnclipu.wi      v16, v16, 0
+        .ifc \type,v
+        vse8.v          v14, (t0), v0.t
+        vse8.v          v16, (t6), v0.t
+        .else
+        vsse8.v         v14, (t0), \stride, v0.t
+        vsse8.v         v16, (t6), \stride, v0.t
+        .endif
+        .endif
         .endif
 .endm
 
@@ -284,6 +322,25 @@ func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x
         ret
 endfunc
 
+func ff_vp8_v_loop_filter16_rvv, zve32x
+        vsetvlstatic8   16
+        filter 16 v 1 0 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_h_loop_filter16_rvv, zve32x
+        vsetvlstatic8   16
+        filter 16 h 1 0 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_v_loop_filter8uv_rvv, zve32x
+        vsetvlstatic8   8
+        filter 8 v 1 0 a0 a2 a3 a4 a5
+        filter 8 v 1 0 a1 a2 a3 a4 a5
+        ret
+endfunc
+
 .macro bilin_h_load dst len
         vsetvlstatic8   \len + 1
         vle8.v          \dst, (a2)
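
Note for readers of the new non-inner branch: going by the comments in the
hunk (a0, a1, a2, p2 + a2, q2 - a2, and so on), the added vector code appears
to compute three weighted corrections from one filter value and apply them
symmetrically to six pixels across the edge. Below is a minimal scalar sketch
of that arithmetic in C, under stated assumptions: the filter value w (held in
v11 in the assembly) is taken as an input already clipped to the int8 range,
since its computation lies outside this hunk, and the names clamp_u8 and
mbedge_taps are hypothetical and not part of FFmpeg.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp to 0..255, the scalar counterpart of
 * vmax.vx with zero followed by the saturating narrow vnclipu.wi. */
static uint8_t clamp_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical scalar model of the added branch.
 * px[0..5] holds p2, p1, p0, q0, q1, q2 (three pixels on each side of the
 * edge). w is assumed to be the already-clipped filter value.
 * The >> on a negative int is assumed to be an arithmetic shift, matching
 * vsra.vi; that is implementation-defined in C but holds on common compilers. */
static void mbedge_taps(uint8_t px[6], int w)
{
    assert(w >= -128 && w <= 127);

    int a0 = (27 * w + 63) >> 7;
    int a1 = (18 * w + 63) >> 7;
    int a2 = ( 9 * w + 63) >> 7;

    px[0] = clamp_u8(px[0] + a2); /* p2 + a2 */
    px[1] = clamp_u8(px[1] + a1); /* p1 + a1 */
    px[2] = clamp_u8(px[2] + a0); /* p0 + a0 */
    px[3] = clamp_u8(px[3] - a0); /* q0 - a0 */
    px[4] = clamp_u8(px[4] - a1); /* q1 - a1 */
    px[5] = clamp_u8(px[5] - a2); /* q2 - a2 */
}
```

As a worked case, w = 20 gives a0 = 4, a1 = 3, a2 = 1, so pixels
{100, 100, 100, 120, 120, 120} become {101, 103, 104, 116, 117, 119}: the
correction is largest next to the edge and tapers off with distance.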