From patchwork Sat Mar 2 12:05:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 46697
From: flow gg
Date: Sat, 2 Mar 2024 20:05:42 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 1/2] checkasm/vc1dsp: add mspel_pixels test

From efcb91959cb373145f2fc9fcbfcc6659610172cc Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Fri, 1 Mar 2024 19:45:53 +0800
Subject: [PATCH 1/2] checkasm/vc1dsp: add mspel_pixels test

---
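Note on the test in this patch: each entry in tests[] carries a byte offset into VC1DSPContext, and the loop recovers the function pointer from that offset. The sketch below illustrates the idea in isolation. DemoDSPContext, demo_test, DEMO_TEST and demo_lookup are hypothetical names made up for this note; the real VC1DSP_SIZED_TEST macro is not shown in this patch and is only assumed to record a name and an offsetof() value in a similar way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same signature as the mspel_pixels functions exercised by the test. */
typedef void (*pix_fn)(uint8_t *dst, const uint8_t *src,
                       ptrdiff_t stride, int rnd);

/* Hypothetical stand-in for VC1DSPContext: tables of function pointers. */
typedef struct DemoDSPContext {
    pix_fn put_tab[2][16];
    pix_fn avg_tab[2][16];
} DemoDSPContext;

/* Hypothetical table entry: printable name plus the member's byte offset. */
typedef struct demo_test {
    const char *name;
    size_t      offset;
} demo_test;

/* Assumed shape of a test-table macro: stringify the member, store its offset. */
#define DEMO_TEST(member) { #member, offsetof(DemoDSPContext, member) }

/* Read the function pointer stored at the recorded offset inside the context. */
static pix_fn demo_lookup(const DemoDSPContext *ctx, const demo_test *t)
{
    pix_fn fn;

    assert(t->offset + sizeof(fn) <= sizeof(*ctx));
    memcpy(&fn, (const char *)ctx + t->offset, sizeof(fn));
    return fn;
}

/* Trivial 8x8 copy used only to have something to store in the table. */
static void demo_put8(uint8_t *dst, const uint8_t *src,
                      ptrdiff_t stride, int rnd)
{
    (void)rnd;
    for (int y = 0; y < 8; y++)
        memcpy(dst + y * stride, src + y * stride, 8);
}
```

The sketch uses memcpy rather than the *(void **) cast so that it does not depend on converting between object and function pointer types; the lookup is otherwise the same pattern as in the test loop.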
 tests/checkasm/vc1dsp.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/tests/checkasm/vc1dsp.c b/tests/checkasm/vc1dsp.c
index 570785776f..42e9c626e0 100644
--- a/tests/checkasm/vc1dsp.c
+++ b/tests/checkasm/vc1dsp.c
@@ -438,6 +438,40 @@ static void check_unescape(void)
     }
 }
 
+static void check_mspel_pixels(void)
+{
+    LOCAL_ALIGNED_8(uint8_t, src0, [32 * 32]);
+    LOCAL_ALIGNED_8(uint8_t, src1, [32 * 32]);
+    LOCAL_ALIGNED_8(uint8_t, dst0, [32 * 32]);
+    LOCAL_ALIGNED_8(uint8_t, dst1, [32 * 32]);
+
+    VC1DSPContext h;
+
+    const test tests[] = {
+        VC1DSP_SIZED_TEST(put_vc1_mspel_pixels_tab[0][0], 16, 16)
+        VC1DSP_SIZED_TEST(put_vc1_mspel_pixels_tab[1][0], 8, 8)
+        VC1DSP_SIZED_TEST(avg_vc1_mspel_pixels_tab[0][0], 16, 16)
+        VC1DSP_SIZED_TEST(avg_vc1_mspel_pixels_tab[1][0], 8, 8)
+    };
+
+    ff_vc1dsp_init(&h);
+
+    for (size_t t = 0; t < FF_ARRAY_ELEMS(tests); ++t) {
+        void (*func)(uint8_t *, const uint8_t*, ptrdiff_t, int) = *(void **)((intptr_t) &h + tests[t].offset);
+        if (check_func(func, "vc1dsp.%s", tests[t].name)) {
+            declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *, const uint8_t*, ptrdiff_t, int);
+            RANDOMIZE_BUFFER8(dst, 32 * 32);
+            RANDOMIZE_BUFFER8(src, 32 * 32);
+            call_ref(dst0, src0, 32, 0);
+            call_new(dst1, src1, 32, 0);
+            if (memcmp(dst0, dst1, 32 * 32)) {
+                fail();
+            }
+            bench_new(dst1, src0, 32, 0);
+        }
+    }
+}
+
 void checkasm_check_vc1dsp(void)
 {
     check_inv_trans_inplace();
@@ -449,4 +483,7 @@ void checkasm_check_vc1dsp(void)
 
     check_unescape();
     report("unescape_buffer");
+
+    check_mspel_pixels();
+    report("mspel_pixels");
 }
-- 
2.44.0

From patchwork Sat Mar 2 12:06:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 46698
From: flow gg
Date: Sat, 2 Mar 2024 20:06:13 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/vc1dsp: R-V V mspel_pixels

Adjusting the order here, rather than simply using .rept, is 13%-24% faster.

From 07aa3e2eff0fe1660ac82dec5d06d50fa4c433a4 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Wed, 28 Feb 2024 16:32:39 +0800
Subject: [PATCH 2/2] lavc/vc1dsp: R-V V mspel_pixels

vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_c: 869.2
vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_rvv_i32: 147.7
vc1dsp.avg_vc1_mspel_pixels_tab[1][0]_c: 220.0
vc1dsp.avg_vc1_mspel_pixels_tab[1][0]_rvv_i64: 56.2
vc1dsp.put_vc1_mspel_pixels_tab[0][0]_c: 523.2
vc1dsp.put_vc1_mspel_pixels_tab[0][0]_rvv_i32: 82.0
vc1dsp.put_vc1_mspel_pixels_tab[1][0]_c: 138.0
vc1dsp.put_vc1_mspel_pixels_tab[1][0]_rvv_i64: 24.0
---
 libavcodec/riscv/vc1dsp_init.c |  8 ++++
 libavcodec/riscv/vc1dsp_rvv.S  | 76 ++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/libavcodec/riscv/vc1dsp_init.c b/libavcodec/riscv/vc1dsp_init.c
index e47b644f80..610c43a1a3 100644
--- a/libavcodec/riscv/vc1dsp_init.c
+++ b/libavcodec/riscv/vc1dsp_init.c
@@ -29,6 +29,10 @@ void ff_vc1_inv_trans_8x8_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block
 void ff_vc1_inv_trans_4x8_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
 void ff_vc1_inv_trans_8x4_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
 void ff_vc1_inv_trans_4x4_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_put_pixels16x16_rvv(uint8_t *dst, const uint8_t *src, ptrdiff_t line_size, int rnd);
+void ff_put_pixels8x8_rvv(uint8_t *dst, const uint8_t *src, ptrdiff_t line_size, int rnd);
+void ff_avg_pixels16x16_rvv(uint8_t *dst, const uint8_t *src, ptrdiff_t line_size, int rnd);
+void ff_avg_pixels8x8_rvv(uint8_t *dst, const uint8_t *src, ptrdiff_t line_size, int rnd);
 
 av_cold void ff_vc1dsp_init_riscv(VC1DSPContext *dsp)
 {
@@ -38,9 +42,13 @@ av_cold void ff_vc1dsp_init_riscv(VC1DSPContext *dsp)
     if (flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
         dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_rvv;
         dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_rvv;
+        dsp->put_vc1_mspel_pixels_tab[0][0] = ff_put_pixels16x16_rvv;
+        dsp->avg_vc1_mspel_pixels_tab[0][0] = ff_avg_pixels16x16_rvv;
         if (flags & AV_CPU_FLAG_RVV_I64) {
             dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_rvv;
             dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_rvv;
+            dsp->put_vc1_mspel_pixels_tab[1][0] = ff_put_pixels8x8_rvv;
+            dsp->avg_vc1_mspel_pixels_tab[1][0] = ff_avg_pixels8x8_rvv;
         }
     }
 #endif
diff --git a/libavcodec/riscv/vc1dsp_rvv.S b/libavcodec/riscv/vc1dsp_rvv.S
index 4a00945ead..af1df85403 100644
--- a/libavcodec/riscv/vc1dsp_rvv.S
+++ b/libavcodec/riscv/vc1dsp_rvv.S
@@ -111,3 +111,79 @@ func ff_vc1_inv_trans_4x4_dc_rvv, zve32x
         vsse32.v      v0, (a0), a1
         ret
 endfunc
+
+func ff_put_pixels16x16_rvv, zve32x
+        vsetivli      zero, 16, e8, m1, ta, ma
+        .irp n 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
+        vle8.v        v\n, (a1)
+        add           a1, a1, a2
+        .endr
+        vle8.v        v31, (a1)
+        .irp n 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
+        vse8.v        v\n, (a0)
+        add           a0, a0, a2
+        .endr
+        vse8.v        v31, (a0)
+
+        ret
+endfunc
+
+func ff_put_pixels8x8_rvv, zve64x
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vlse64.v      v8, (a1), a2
+        vsse64.v      v8, (a0), a2
+
+        ret
+endfunc
+
+func ff_avg_pixels16x16_rvv, zve32x
+        csrwi         vxrm, 0
+        vsetivli      zero, 16, e8, m1, ta, ma
+        .irp n 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
+        vle8.v        v\n, (a1)
+        add           a1, a1, a2
+        .endr
+        vle8.v        v31, (a1)
+        .irp n 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
+        vle8.v        v\n, (a0)
+        add           a0, a0, a2
+        .endr
+        vle8.v        v15, (a0)
+        vaaddu.vv     v0, v0, v16
+        vaaddu.vv     v1, v1, v17
+        vaaddu.vv     v2, v2, v18
+        vaaddu.vv     v3, v3, v19
+        vaaddu.vv     v4, v4, v20
+        vaaddu.vv     v5, v5, v21
+        vaaddu.vv     v6, v6, v22
+        vaaddu.vv     v7, v7, v23
+        vaaddu.vv     v8, v8, v24
+        vaaddu.vv     v9, v9, v25
+        vaaddu.vv     v10, v10, v26
+        vaaddu.vv     v11, v11, v27
+        vaaddu.vv     v12, v12, v28
+        vaaddu.vv     v13, v13, v29
+        vaaddu.vv     v14, v14, v30
+        vaaddu.vv     v15, v15, v31
+        .irp n 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1
+        vse8.v        v\n, (a0)
+        sub           a0, a0, a2
+        .endr
+        vse8.v        v0, (a0)
+
+        ret
+endfunc
+
+func ff_avg_pixels8x8_rvv, zve64x
+        csrwi         vxrm, 0
+        li            t0, 64
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vlse64.v      v16, (a1), a2
+        vlse64.v      v8, (a0), a2
+        vsetvli       zero, t0, e8, m4, ta, ma
+        vaaddu.vv     v16, v16, v8
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vsse64.v      v16, (a0), a2
+
+        ret
+endfunc
-- 
2.44.0
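
As a reading aid for the avg functions above: with vxrm set to 0 (round-to-nearest-up), vaaddu.vv computes (a + b + 1) >> 1 per element, so each avg function replaces dst with the rounded average of dst and src. The following scalar C model is a sketch of that behaviour, not code from FFmpeg. avg_rnu and avg_pixels_model are hypothetical names, and the size parameter is an illustrative assumption (the functions in the patch have a fixed block size and take a rnd argument that the assembly does not read).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one vaaddu.vv lane with vxrm = 0: (a + b + 1) >> 1.
 * The sum is formed in int, so it cannot overflow for 8-bit inputs. */
static uint8_t avg_rnu(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Hypothetical scalar model of an avg block function:
 * dst[y][x] = avg_rnu(dst[y][x], src[y][x]) over a size x size block,
 * with both buffers sharing the same row stride. */
static void avg_pixels_model(uint8_t *dst, const uint8_t *src,
                             ptrdiff_t stride, int size)
{
    assert(size == 8 || size == 16);
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = avg_rnu(dst[x], src[x]);
        dst += stride;
        src += stride;
    }
}
```

A model like this can serve as a quick cross-check when reading the vector code: only the size x size block is written, and bytes outside it within each 32-byte row are left untouched, which is also what the checkasm memcmp over the whole 32 * 32 buffer relies on.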