From patchwork Sat Apr 20 15:55:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 48195
From: flow gg
Date: Sat, 20 Apr 2024 23:55:50 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 3/3] lavc/vp8dsp: R-V V loop_filter

From cff79c9500b94f4c0abdd9cd68c91cc736366c78 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Sat, 20 Apr 2024 23:26:58 +0800
Subject: [PATCH 3/3] lavc/vp8dsp: R-V V loop_filter

C908:
vp8_loop_filter8uv_v_c: 745.5
vp8_loop_filter8uv_v_rvv_i32: 467.2
vp8_loop_filter16y_h_c: 674.2
vp8_loop_filter16y_h_rvv_i32: 553.0
vp8_loop_filter16y_v_c: 732.7
vp8_loop_filter16y_v_rvv_i32: 324.5
---
 libavcodec/riscv/vp8dsp_init.c |  4 +++
 libavcodec/riscv/vp8dsp_rvv.S  | 63 ++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index aa95021df5..597e6acec8 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -123,6 +123,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
             c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv;
         }
 
+        c->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_rvv;
+        c->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_rvv;
+        c->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_rvv;
+
         c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv;
         c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv;
         c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv;
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index f10e269d9d..af28ea5258 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -229,6 +229,39 @@ endfunc
         vsra.vi         v24, v24, 1           // (f1 + 1) >> 1;
         vadd.vv         v8, v18, v24
         vsub.vv         v10, v20, v24
+        .else
+        li              t5, 27
+        li              t3, 9
+        li              a7, 18
+        vwmul.vx        v2, v11, t5
+        vwmul.vx        v6, v11, t3
+        vwmul.vx        v4, v11, a7
+
+.ifc \len,16
+        vsetvli         zero, zero, e16, m2, ta, ma
+.else
+        vsetvli         zero, zero, e16, m1, ta, ma
+.endif
+
+        li              a7, 63
+        vzext.vf2       v14, v15              // p2
+        vzext.vf2       v24, v10              // q2
+        vadd.vx         v2, v2, a7
+        vadd.vx         v4, v4, a7
+        vadd.vx         v6, v6, a7
+        vsra.vi         v2, v2, 7             // a0
+        vsra.vi         v12, v4, 7            // a1
+        vsra.vi         v6, v6, 7             // a2
+        vadd.vv         v14, v14, v6          // p2 + a2
+        vsub.vv         v22, v24, v6          // q2 - a2
+        vsub.vv         v10, v20, v12         // q1 - a1
+        vadd.vv         v4, v8, v2            // p0 + a0
+        vsub.vv         v6, v16, v2           // q0 - a0
+        vadd.vv         v8, v12, v18          // a1 + p1
+        vmax.vx         v4, v4, zero
+        vmax.vx         v6, v6, zero
+        vmax.vx         v14, v14, zero
+        vmax.vx         v16, v22, zero
         .endif
 
         vmax.vx         v8, v8, zero
@@ -253,6 +286,17 @@ endfunc
         vsse8.v         v6, (a6), \stride, v0.t
         vsse8.v         v7, (t4), \stride, v0.t
         .endif
+        .if !\inner
+        vnclipu.wi      v14, v14, 0
+        vnclipu.wi      v16, v16, 0
+        .ifc \type,v
+        vse8.v          v14, (t0), v0.t
+        vse8.v          v16, (t6), v0.t
+        .else
+        vsse8.v         v14, (t0), \stride, v0.t
+        vsse8.v         v16, (t6), \stride, v0.t
+        .endif
+        .endif
         .endif
 .endm
 
@@ -275,6 +319,25 @@ func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x
         ret
 endfunc
 
+func ff_vp8_v_loop_filter16_rvv, zve32x
+        vsetivli        zero, 16, e8, m1, ta, ma
+        filter 16 v 1 0 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_h_loop_filter16_rvv, zve32x
+        vsetivli        zero, 16, e8, m1, ta, ma
+        filter 16 h 1 0 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_v_loop_filter8uv_rvv, zve32x
+        vsetivli        zero, 8, e8, mf2, ta, ma
+        filter 8 v 1 0 a0 a2 a3 a4 a5
+        filter 8 v 1 0 a1 a2 a3 a4 a5
+        ret
+endfunc
+
 func ff_vp8_v_loop_filter16_simple_rvv, zve32x
         vsetivli        zero, 16, e8, m1, ta, ma
         filter 16 v 0 0 a0 a1 a2 a3 a4
-- 
2.44.0
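
For reviewers: the new non-inner branch computes three taps from the filter value held in v11, a0 = (27 * w + 63) >> 7, a1 = (18 * w + 63) >> 7 and a2 = (9 * w + 63) >> 7, applies them to p2..q2, and clamps the results to 8 bits (vmax.vx against zero, then vnclipu.wi). Below is a rough scalar sketch of that arithmetic for a single pixel column. It is only an illustration read off the assembly comments, not the C reference in libavcodec. The names clip_u8 and mbedge_taps and the px[] layout are made up for this note, and the sketch leaves out the v0.t mask that decides which columns are written at all.

```c
#include <stdint.h>

/* Clamp a widened intermediate back to an unsigned 8-bit pixel
 * (what vmax.vx with zero followed by vnclipu.wi does per lane). */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical helper: apply the three edge taps to one column.
 * px[0..5] holds p2, p1, p0, q0, q1, q2 (layout assumed for this sketch).
 * w is the already-clamped signed filter value (v11 in the assembly).
 * Note: >> on a negative int is implementation-defined in C; like the
 * vsra.vi instruction, this sketch assumes an arithmetic shift.
 */
static void mbedge_taps(uint8_t px[6], int w)
{
    int a0 = (27 * w + 63) >> 7;
    int a1 = (18 * w + 63) >> 7;
    int a2 = ( 9 * w + 63) >> 7;

    px[0] = clip_u8(px[0] + a2);   /* p2 + a2 */
    px[1] = clip_u8(px[1] + a1);   /* p1 + a1 */
    px[2] = clip_u8(px[2] + a0);   /* p0 + a0 */
    px[3] = clip_u8(px[3] - a0);   /* q0 - a0 */
    px[4] = clip_u8(px[4] - a1);   /* q1 - a1 */
    px[5] = clip_u8(px[5] - a2);   /* q2 - a2 */
}
```

With w = 10 the taps come out as a0 = 2, a1 = 1, a2 = 1, so a flat column of 100s is pulled one or two steps toward the edge on each side. Larger w values exercise the clamping at 0 and 255.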