From patchwork Fri Mar 22 06:01:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47297
From: flow gg
Date: Fri, 22 Mar 2024 14:01:00 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 1/3] lavc/vp8dsp: R-V V put_epel h

(This should be used after applying these 4 patches)
```
[FFmpeg-devel] [PATCH] lavc/vp8dsp: R-V V put_vp8_pixels
[FFmpeg-devel] [PATCH 1/3] lavc/vp8dsp: R-V V put_bilin_h 1-3
```
From
201274b32ef49fdeb6782498634ed78491a9519a Mon Sep 17 00:00:00 2001 From: sunyuechi Date: Sat, 9 Mar 2024 08:41:31 +0800 Subject: [PATCH 1/3] lavc/vp8dsp: R-V V put_epel h C908: vp8_put_epel4_h4_c: 10.7 vp8_put_epel4_h4_rvv_i32: 5.0 vp8_put_epel4_h6_c: 15.0 vp8_put_epel4_h6_rvv_i32: 6.2 vp8_put_epel8_h4_c: 43.2 vp8_put_epel8_h4_rvv_i32: 11.2 vp8_put_epel8_h6_c: 57.5 vp8_put_epel8_h6_rvv_i32: 13.5 vp8_put_epel16_h4_c: 92.5 vp8_put_epel16_h4_rvv_i32: 13.7 vp8_put_epel16_h6_c: 139.0 vp8_put_epel16_h6_rvv_i32: 16.5 --- libavcodec/riscv/vp8dsp_init.c | 7 +++ libavcodec/riscv/vp8dsp_rvv.S | 104 +++++++++++++++++++++++++++++++++ 2 files changed, 111 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 02dbda979e..6614d661f7 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -78,6 +78,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv; c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv; c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv; + + c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv; + c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv; + c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv; + c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv; + c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv; + c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv; } #endif } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 9d4ffed255..a0dd46e3a8 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -223,3 +223,107 @@ endfunc func ff_put_vp8_bilin4_hv_rvv, zve32x put_vp8_bilin_hv 4 endfunc + +subpel_filters: + .byte 0, -6, 123, 12, -1, 0 + .byte 2, -11, 108, 36, -8, 1 + .byte 0, -9, 93, 50, -6, 0 + .byte 3, -16, 77, 77, -16, 3 + .byte 0, -6, 50, 93, -9, 0 + .byte 1, -8, 36, 
108, -11, 2 + .byte 0, -1, 12, 123, -6, 0 + +.macro epel_filter size + lla t2, subpel_filters + addi t0, a5, -1 + li t1, 6 + mul t0, t0, t1 + add t0, t0, t2 + .irp n 1,2,3,4 + lb t\n, \n(t0) + .endr +.ifc \size,6 + lb t5, 5(t0) + lb t0, (t0) +.endif +.endm + +.macro epel_load dst len size + addi t6, a2, -1 + addi a7, a2, 1 + vle8.v v24, (a2) + vle8.v v22, (t6) + vle8.v v26, (a7) + addi a7, a7, 1 + vle8.v v28, (a7) + vwmulu.vx v16, v24, t2 + vwmulu.vx v20, v26, t3 +.ifc \size,6 + addi t6, t6, -1 + addi a7, a7, 1 + vle8.v v24, (t6) + vle8.v v26, (a7) + vwmaccu.vx v16, t0, v24 + vwmaccu.vx v16, t5, v26 +.endif + li t6, 64 + vwmaccsu.vx v16, t1, v22 + vwmaccsu.vx v16, t4, v28 + vwadd.wx v16, v16, t6 + +.ifc \len,4 + vsetvli zero, zero, e16, mf2, ta, ma +.elseif \len == 8 + vsetvli zero, zero, e16, m1, ta, ma +.else + vsetvli zero, zero, e16, m2, ta, ma +.endif + + vwadd.vv v24, v16, v20 + vnsra.wi v24, v24, 7 + vmax.vx v24, v24, zero +.ifc \len,4 + vsetvli zero, zero, e8, mf4, ta, ma +.elseif \len == 8 + vsetvli zero, zero, e8, mf2, ta, ma +.else + vsetvli zero, zero, e8, m1, ta, ma +.endif + vnclipu.wi \dst, v24, 0 +.endm + +.macro epel_load_inc dst len size + epel_load \dst \len \size + add a2, a2, a3 +.endm + +.macro epel len size + epel_filter \size + +.ifc \len,4 + vsetivli zero, 4, e8, mf4, ta, ma +.elseif \len == 8 + vsetivli zero, 8, e8, mf2, ta, ma +.else + vsetivli zero, 16, e8, m1, ta, ma +.endif + +1: + addi a4, a4, -1 + epel_load_inc v30 \len \size + vse8.v v30, (a0) + add a0, a0, a1 + bnez a4, 1b + + ret +.endm + +.irp len 16,8,4 +func ff_put_vp8_epel\len\()_h6_rvv, zve32x + epel \len 6 +endfunc + +func ff_put_vp8_epel\len\()_h4_rvv, zve32x + epel \len 4 +endfunc +.endr -- 2.44.0
From patchwork Fri Mar 22 06:01:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47298
From: flow gg
Date: Fri, 22 Mar 2024 14:01:21 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 2/3] lavc/vp8dsp: R-V V put_epel v

From a59509c554a319f8271ad4175da40788445f7a56 Mon Sep 17 00:00:00 2001 From: sunyuechi Date: Thu, 21 Mar 2024 17:49:54 +0800 Subject: [PATCH 2/3] lavc/vp8dsp: R-V V put_epel v C908: vp8_put_epel4_v4_c: 11.0 vp8_put_epel4_v4_rvv_i32: 5.0 vp8_put_epel4_v6_c: 16.5 vp8_put_epel4_v6_rvv_i32: 6.2 vp8_put_epel8_v4_c: 43.7 vp8_put_epel8_v4_rvv_i32: 11.2 vp8_put_epel8_v6_c: 68.7 vp8_put_epel8_v6_rvv_i32: 13.2 vp8_put_epel16_v4_c: 92.5 vp8_put_epel16_v4_rvv_i32: 13.7 vp8_put_epel16_v6_c: 135.7
vp8_put_epel16_v6_rvv_i32: 16.5 --- libavcodec/riscv/vp8dsp_init.c | 7 ++++++ libavcodec/riscv/vp8dsp_rvv.S | 44 +++++++++++++++++++++++++++------- 2 files changed, 42 insertions(+), 9 deletions(-) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 6614d661f7..2f123b67fe 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -85,6 +85,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv; c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv; c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv; + + c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv; + c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv; + c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv; + c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv; + c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv; + c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv; } #endif } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index a0dd46e3a8..134154acfc 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -233,9 +233,13 @@ subpel_filters: .byte 1, -8, 36, 108, -11, 2 .byte 0, -1, 12, 123, -6, 0 -.macro epel_filter size +.macro epel_filter size type lla t2, subpel_filters +.ifc \type,v + addi t0, a6, -1 +.elseif \type == h addi t0, a5, -1 +.endif li t1, 6 mul t0, t0, t1 add t0, t0, t2 @@ -248,19 +252,33 @@ subpel_filters: .endif .endm -.macro epel_load dst len size +.macro epel_load dst len size type +.ifc \type,v + sub t6, a2, a3 + add a7, a2, a3 +.elseif \type == h addi t6, a2, -1 addi a7, a2, 1 +.endif vle8.v v24, (a2) vle8.v v22, (t6) vle8.v v26, (a7) +.ifc \type,v + add a7, a7, a3 +.elseif \type == h addi a7, a7, 1 +.endif vle8.v v28, (a7) vwmulu.vx v16, v24, t2 vwmulu.vx v20, v26, t3 .ifc \size,6 +.ifc \type,v + sub t6, t6, a3 + add a7, a7, a3 
+.elseif \type == h addi t6, t6, -1 addi a7, a7, 1 +.endif vle8.v v24, (t6) vle8.v v26, (a7) vwmaccu.vx v16, t0, v24 @@ -292,13 +310,13 @@ subpel_filters: vnclipu.wi \dst, v24, 0 .endm -.macro epel_load_inc dst len size - epel_load \dst \len \size +.macro epel_load_inc dst len size type + epel_load \dst \len \size \type add a2, a2, a3 .endm -.macro epel len size - epel_filter \size +.macro epel len size type + epel_filter \size \type .ifc \len,4 vsetivli zero, 4, e8, mf4, ta, ma @@ -310,7 +328,7 @@ subpel_filters: 1: addi a4, a4, -1 - epel_load_inc v30 \len \size + epel_load_inc v30 \len \size \type vse8.v v30, (a0) add a0, a0, a1 bnez a4, 1b @@ -320,10 +338,18 @@ subpel_filters: .irp len 16,8,4 func ff_put_vp8_epel\len\()_h6_rvv, zve32x - epel \len 6 + epel \len 6 h endfunc func ff_put_vp8_epel\len\()_h4_rvv, zve32x - epel \len 4 + epel \len 4 h +endfunc + +func ff_put_vp8_epel\len\()_v6_rvv, zve32x + epel \len 6 v +endfunc + +func ff_put_vp8_epel\len\()_v4_rvv, zve32x + epel \len 4 v endfunc .endr -- 2.44.0
From patchwork Fri Mar 22 06:01:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47299
From: flow gg
Date: Fri, 22 Mar 2024 14:01:37 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 3/3] lavc/vp8dsp: R-V V put_epel hv

From 278e473681eddaf24977e47c88f715620105c6b3 Mon Sep 17 00:00:00 2001 From: sunyuechi Date: Thu, 21 Mar 2024 17:50:58 +0800 Subject: [PATCH 3/3] lavc/vp8dsp: R-V V put_epel hv C908: vp8_put_epel4_h4v4_c: 20.0 vp8_put_epel4_h4v4_rvv_i32: 11.0 vp8_put_epel4_h4v6_c: 25.2 vp8_put_epel4_h4v6_rvv_i32: 13.5 vp8_put_epel4_h6v4_c: 22.2 vp8_put_epel4_h6v4_rvv_i32: 14.5 vp8_put_epel4_h6v6_c: 29.0 vp8_put_epel4_h6v6_rvv_i32: 15.7 vp8_put_epel8_h4v4_c: 73.0 vp8_put_epel8_h4v4_rvv_i32: 22.2 vp8_put_epel8_h4v6_c: 90.5 vp8_put_epel8_h4v6_rvv_i32:
26.7 vp8_put_epel8_h6v4_c: 85.0 vp8_put_epel8_h6v4_rvv_i32: 27.2 vp8_put_epel8_h6v6_c: 104.7 vp8_put_epel8_h6v6_rvv_i32: 29.5 vp8_put_epel16_h4v4_c: 145.5 vp8_put_epel16_h4v4_rvv_i32: 26.5 vp8_put_epel16_h4v6_c: 190.7 vp8_put_epel16_h4v6_rvv_i32: 47.5 vp8_put_epel16_h6v4_c: 173.7 vp8_put_epel16_h6v4_rvv_i32: 33.2 vp8_put_epel16_h6v6_c: 222.2 vp8_put_epel16_h6v6_rvv_i32: 35.5 --- libavcodec/riscv/vp8dsp_init.c | 13 ++++ libavcodec/riscv/vp8dsp_rvv.S | 125 +++++++++++++++++++++++++++------ 2 files changed, 117 insertions(+), 21 deletions(-) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 2f123b67fe..2dd583d079 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -92,6 +92,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv; c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv; c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv; + + c->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_rvv; + c->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_rvv; + c->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_rvv; + c->put_vp8_epel_pixels_tab[0][2][1] = ff_put_vp8_epel16_h4v6_rvv; + c->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_rvv; + c->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_rvv; + c->put_vp8_epel_pixels_tab[0][1][1] = ff_put_vp8_epel16_h4v4_rvv; + c->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_rvv; + c->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_rvv; + c->put_vp8_epel_pixels_tab[0][1][2] = ff_put_vp8_epel16_h6v4_rvv; + c->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_rvv; + c->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_rvv; } #endif } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 134154acfc..701557a808 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S 
@@ -233,26 +233,26 @@ subpel_filters: .byte 1, -8, 36, 108, -11, 2 .byte 0, -1, 12, 123, -6, 0 -.macro epel_filter size type - lla t2, subpel_filters +.macro epel_filter size type regtype + lla \regtype\()2, subpel_filters .ifc \type,v - addi t0, a6, -1 + addi \regtype\()0, a6, -1 .elseif \type == h - addi t0, a5, -1 + addi \regtype\()0, a5, -1 .endif - li t1, 6 - mul t0, t0, t1 - add t0, t0, t2 + li \regtype\()1, 6 + mul \regtype\()0, \regtype\()0, \regtype\()1 + add \regtype\()0, \regtype\()0, \regtype\()2 .irp n 1,2,3,4 - lb t\n, \n(t0) + lb \regtype\n, \n(\regtype\()0) .endr .ifc \size,6 - lb t5, 5(t0) - lb t0, (t0) + lb \regtype\()5, 5(\regtype\()0) + lb \regtype\()0, (\regtype\()0) .endif .endm -.macro epel_load dst len size type +.macro epel_load dst len size type from_mem regtype .ifc \type,v sub t6, a2, a3 add a7, a2, a3 @@ -260,6 +260,7 @@ subpel_filters: addi t6, a2, -1 addi a7, a2, 1 .endif +.if \from_mem vle8.v v24, (a2) vle8.v v22, (t6) vle8.v v26, (a7) @@ -269,8 +270,8 @@ subpel_filters: addi a7, a7, 1 .endif vle8.v v28, (a7) - vwmulu.vx v16, v24, t2 - vwmulu.vx v20, v26, t3 + vwmulu.vx v16, v24, \regtype\()2 + vwmulu.vx v20, v26, \regtype\()3 .ifc \size,6 .ifc \type,v sub t6, t6, a3 @@ -281,12 +282,22 @@ subpel_filters: .endif vle8.v v24, (t6) vle8.v v26, (a7) - vwmaccu.vx v16, t0, v24 - vwmaccu.vx v16, t5, v26 + vwmaccu.vx v16, \regtype\()0, v24 + vwmaccu.vx v16, \regtype\()5, v26 +.endif + vwmaccsu.vx v16, \regtype\()1, v22 + vwmaccsu.vx v16, \regtype\()4, v28 +.else + vwmulu.vx v16, v4 , \regtype\()2 + vwmulu.vx v20, v6 , \regtype\()3 + .ifc \size,6 + vwmaccu.vx v16, \regtype\()0, v0 + vwmaccu.vx v16, \regtype\()5, v10 + .endif + vwmaccsu.vx v16, \regtype\()1, v2 + vwmaccsu.vx v16, \regtype\()4, v8 .endif li t6, 64 - vwmaccsu.vx v16, t1, v22 - vwmaccsu.vx v16, t4, v28 vwadd.wx v16, v16, t6 .ifc \len,4 @@ -310,13 +321,13 @@ subpel_filters: vnclipu.wi \dst, v24, 0 .endm -.macro epel_load_inc dst len size type - epel_load \dst \len \size \type 
+.macro epel_load_inc dst len size type from_mem regtype + epel_load \dst \len \size \type \from_mem \regtype add a2, a2, a3 .endm .macro epel len size type - epel_filter \size \type + epel_filter \size \type t .ifc \len,4 vsetivli zero, 4, e8, mf4, ta, ma @@ -328,10 +339,66 @@ subpel_filters: 1: addi a4, a4, -1 - epel_load_inc v30 \len \size \type + epel_load_inc v30 \len \size \type 1 t + vse8.v v30, (a0) + add a0, a0, a1 + bnez a4, 1b + + ret +.endm + +.macro epel_hv len hsize vsize + addi sp, sp, -48 + .irp n 0,1,2,3,4,5 + sd s\n, \n\()<<3(sp) + .endr + sub a2, a2, a3 + epel_filter \hsize h t + epel_filter \vsize v s +.ifc \len,4 + vsetivli zero, 4, e8, mf4, ta, ma +.elseif \len == 8 + vsetivli zero, 8, e8, mf2, ta, ma +.else + vsetivli zero, 16, e8, m1, ta, ma +.endif +.if \hsize == 6 || \vsize == 6 + sub a2, a2, a3 + epel_load_inc v0 \len \hsize h 1 t +.endif + epel_load_inc v2 \len \hsize h 1 t + epel_load_inc v4 \len \hsize h 1 t + epel_load_inc v6 \len \hsize h 1 t + epel_load_inc v8 \len \hsize h 1 t +.if \hsize == 6 || \vsize == 6 + epel_load_inc v10 \len \hsize h 1 t +.endif + addi a4, a4, -1 +1: + addi a4, a4, -1 + epel_load v30 \len \vsize v 0 s vse8.v v30, (a0) +.if \hsize == 6 || \vsize == 6 + vmv.v.v v0, v2 +.endif + vmv.v.v v2, v4 + vmv.v.v v4, v6 + vmv.v.v v6, v8 +.if \hsize == 6 || \vsize == 6 + vmv.v.v v8, v10 + epel_load_inc v10 \len \hsize h 1 t +.else + epel_load_inc v8 \len 4 h 1 t +.endif add a0, a0, a1 bnez a4, 1b + epel_load v30 \len \vsize v 0 s + vse8.v v30, (a0) + + .irp n 0,1,2,3,4,5 + ld s\n, \n\()<<3(sp) + .endr + addi sp, sp, 48 ret .endm @@ -352,4 +419,20 @@ endfunc func ff_put_vp8_epel\len\()_v4_rvv, zve32x epel \len 4 v endfunc + +func ff_put_vp8_epel\len\()_h6v6_rvv, zve32x + epel_hv \len 6 6 +endfunc + +func ff_put_vp8_epel\len\()_h4v4_rvv, zve32x + epel_hv \len 4 4 +endfunc + +func ff_put_vp8_epel\len\()_h6v4_rvv, zve32x + epel_hv \len 6 4 +endfunc + +func ff_put_vp8_epel\len\()_h4v6_rvv, zve32x + epel_hv \len 4 6 +endfunc 
.endr -- 2.44.0
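
For readers checking the arithmetic, the filter that the three patches vectorise can be sketched in scalar C. This is a minimal reference sketch, not the code from FFmpeg's C implementation: the tap table is copied from `subpel_filters` in the patch, and the rounding (`+ 64`, shift right by 7, clamp to 0..255) follows the `li t6, 64` / `vnsra.wi` / `vmax.vx` / `vnclipu.wi` sequence in `epel_load`. The helper names `clip_u8`, `epel_px`, `put_epel` and `put_epel_hv`, their parameters, and the 16x16 block limit are assumptions made for illustration only. The caller is assumed to provide the extra rows and columns around the block that the taps read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed taps copied from the subpel_filters data in the patch:
 * one row per fractional position (mx or my) 1..7. */
static const int8_t subpel_filters[7][6] = {
    { 0,  -6, 123,  12,  -1, 0 },
    { 2, -11, 108,  36,  -8, 1 },
    { 0,  -9,  93,  50,  -6, 0 },
    { 3, -16,  77,  77, -16, 3 },
    { 0,  -6,  50,  93,  -9, 0 },
    { 1,  -8,  36, 108, -11, 2 },
    { 0,  -1,  12, 123,  -6, 0 },
};

/* Hypothetical helper: clamp an int to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* One output pixel. `p` points at the centre sample; `step` is 1 for a
 * horizontal filter and the line stride for a vertical one. */
static uint8_t epel_px(const uint8_t *p, ptrdiff_t step, int frac, int taps)
{
    assert(frac >= 1 && frac <= 7 && (taps == 4 || taps == 6));
    const int8_t *f = subpel_filters[frac - 1];
    int sum = f[1] * p[-step] + f[2] * p[0] + f[3] * p[step] + f[4] * p[2 * step];
    if (taps == 6)                      /* outer taps only for the 6-tap case */
        sum += f[0] * p[-2 * step] + f[5] * p[3 * step];
    sum += 64;                          /* rounding term */
    if (sum < 0)                        /* clamp first, so the shift is on a non-negative value */
        sum = 0;
    return clip_u8(sum >> 7);
}

/* Horizontal-only (vertical == 0) or vertical-only (vertical != 0) filter. */
static void put_epel(uint8_t *dst, ptrdiff_t dst_stride,
                     const uint8_t *src, ptrdiff_t src_stride,
                     int w, int h, int frac, int taps, int vertical)
{
    ptrdiff_t step = vertical ? src_stride : 1;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * dst_stride + x] =
                epel_px(src + y * src_stride + x, step, frac, taps);
}

/* Two-pass hv filter: rows are filtered horizontally into an 8-bit
 * temporary, which is then filtered vertically. */
static void put_epel_hv(uint8_t *dst, ptrdiff_t dst_stride,
                        const uint8_t *src, ptrdiff_t src_stride,
                        int w, int h, int mx, int my, int htaps, int vtaps)
{
    uint8_t tmp[(16 + 5) * 16];         /* up to 16x16 plus 5 extra rows */
    int above = vtaps == 6 ? 2 : 1;     /* rows read above the block */
    int below = vtaps == 6 ? 3 : 2;     /* rows read below the block */
    assert(w <= 16 && h <= 16);
    put_epel(tmp, 16, src - above * src_stride, src_stride,
             w, h + above + below, mx, htaps, 0);
    put_epel(dst, dst_stride, tmp + above * 16, 16, w, h, my, vtaps, 1);
}
```

Every row of the table sums to 128, so a constant input block passes through unchanged, which is a convenient sanity check when comparing a vector implementation against a scalar one. The sketch keeps the intermediate rows of the hv case in a temporary buffer, whereas the patch keeps them in a sliding window of vector registers (`v0` to `v10`); the arithmetic per pixel is intended to be the same.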