[FFmpeg-devel,1/4] lavc/vp9dsp: R-V V ipred dc

Message ID CAEa-L+t=EBmDtCu3Y+ezvUp7VLR_R0OC22XkJ5TheyasQk_2nA@mail.gmail.com
State New

Checks

Context                          Check    Description
yinshiyou/make_loongarch64       success  Make finished
yinshiyou/make_fate_loongarch64  fail     Make fate failed
andriy/make_x86                  success  Make finished
andriy/make_fate_x86             fail     Make fate failed

Commit Message

flow gg March 2, 2024, 7:42 a.m. UTC

Comments

Rémi Denis-Courmont March 2, 2024, 9:03 a.m. UTC | #1
On Saturday, 2 March 2024, 9:42:06 EET, flow gg wrote:
> 

You would need a lot fewer if/else if you passed the order/bit-width instead 
of the size as macro parameter.

Similarly, this can be folded as a single .else:

+.elseif \type == 127
+        li           t1, 127
+.elseif \type == 128
+        li           t1, 128
+.elseif \type == 129
+        li           t1, 129
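
For illustration only (this is not code from the patch): the rounding applied to the edge sum depends solely on the base-2 logarithm of the number of summed pixels, which is why passing the order can replace the per-size branches. A minimal C sketch, where `dc_round` and `log2_count` are hypothetical names chosen here:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rounded average of `sum` over (1 << log2_count)
 * pixels. Both the rounding bias and the shift are derived from the order,
 * so no branch on the block size is needed. */
static inline unsigned dc_round(unsigned sum, unsigned log2_count)
{
    assert(log2_count >= 1 && log2_count <= 6);
    return (sum + (1u << (log2_count - 1))) >> log2_count;
}
```

With `log2_count` set to 5, 4 and 3 this reproduces the add/shift pairs (16, 5), (8, 4) and (4, 3) that the patch spells out for the 32, 16 and 8 pixel edges.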
flow gg March 2, 2024, 9:48 a.m. UTC | #2
Okay, I reduced the if/else in this reply.

Rémi Denis-Courmont <remi@remlab.net> wrote on Sat, Mar 2, 2024 at 17:03:

> On Saturday, 2 March 2024, 9:42:06 EET, flow gg wrote:
> >
>
> You would need a lot fewer if/else if you passed the order/bit-width
> instead
> of the size as macro parameter.
>
> Similarly, this can be folded as a single .else:
>
> +.elseif \type == 127
> +        li           t1, 127
> +.elseif \type == 128
> +        li           t1, 128
> +.elseif \type == 129
> +        li           t1, 129
>
> --
> レミ・デニ-クールモン
> http://www.remlab.net/
>
>
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
flow gg March 3, 2024, 1:59 a.m. UTC | #3
Updated with a small improvement in this reply.



Rémi Denis-Courmont March 3, 2024, 2:46 p.m. UTC | #4
On Sunday, 3 March 2024, 3:59:00 EET, flow gg wrote:
> Updated with a small improvement in this reply.

As noted earlier, I don't understand why you have two size parameters. It 
seems that \size is always either the same as (1 << (\size2 - 1)) a.k.a. ((1 
<< \size2) / 2), or unused. The assembler *can* compute arithmetic constants.

Similarly, you can use \restore as a truth value directly: `.if \restore`.

FWIW, it seems that you could just as well include func/endfunc inside the 
macros.
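
As a loose C analogue of the last suggestion (an illustration, not the patch's assembly), a macro can stamp out each complete function, so the call sites shrink to one line per variant. `DEFINE_DC_CONST` and the generated names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro: expands to a whole function that fills a size x size
 * block with the constant `val`, similar in spirit to an assembler macro
 * that emits func/endfunc itself. */
#define DEFINE_DC_CONST(val, size)                                         \
static void dc_##val##_##size##x##size(uint8_t *dst, ptrdiff_t stride)     \
{                                                                          \
    assert(dst != NULL);                                                   \
    for (int y = 0; y < (size); y++)                                       \
        memset(dst + y * stride, (val), (size));                           \
}

/* One line per variant instead of one hand-written function each. */
DEFINE_DC_CONST(127, 8)
DEFINE_DC_CONST(128, 16)
DEFINE_DC_CONST(129, 32)
```

Here `DEFINE_DC_CONST(127, 8)` defines `dc_127_8x8(dst, stride)`, and the other two lines define `dc_128_16x16` and `dc_129_32x32`.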
flow gg March 3, 2024, 3:31 p.m. UTC | #5
> As noted earlier, I don't understand why you have two size parameters. It
> seems that \size is always either the same as (1 << (\size2 - 1)) a.k.a. ((1
> << \size2) / 2), or unused. The assembler *can* compute arithmetic
> constants.

Thanks, I didn't know that before.

> Similarly, you can use \restore as a truth value directly: `.if \restore`.

Okay.

> FWIW, it seems that you could just as well include func/endfunc inside the
> macros.

Do you mean to generate func/endfunc using macros?

flow gg March 7, 2024, 11:20 a.m. UTC | #6
Updated it in this reply.

flow gg March 22, 2024, 6:02 a.m. UTC | #7
I used macros to shorten the function definitions; updated in this reply.

Rémi Denis-Courmont March 27, 2024, 3:41 p.m. UTC | #8
On Friday, 22 March 2024, 8:02:08 EET, flow gg wrote:
> Using macros to shorten function definitions, updated in this response

Did you try to share the common code after getdc and see how much slower it is? If 
an extra static branch has negligible overhead, it would reduce binary size 
quite a bit here, AFAICT.
flow gg March 28, 2024, 2:44 a.m. UTC | #9
I don't quite understand. I think 8x8 is not suitable for sharing here because
it uses zve64x. That leaves sharing between dc16x16 and dc32x32, where there
isn't much common code, and it would require adding 3 if-else statements and
function parameters. It feels okay not to extract it.

Rémi Denis-Courmont April 3, 2024, 8:21 p.m. UTC | #10
On Thursday, 28 March 2024, 4:44:33 EEST, flow gg wrote:
> I don't quite understand. I think 8x8 is not suitable for sharing here because
> it uses zve64x. That leaves sharing between dc16x16 and dc32x32, where there
> isn't much common code, and it would require adding 3 if-else statements and
> function parameters. It feels okay not to extract it.

I agree that we can't realistically share code between the different block 
sizes. My point was that the code after getdc is lengthy (after expansion) and 
fixed for a given block size, so *that* code could be shared and jumped to as a 
common function tail.
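
The shared-tail idea can be sketched in C as follows. This is only an analogue of the suggestion, not the patch's assembly: each entry point computes its DC value and then hands over to one fill routine per block size. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shared tail: depends only on the block size, not on how dc was obtained. */
static void fill16x16(uint8_t *dst, ptrdiff_t stride, unsigned dc)
{
    assert(dc <= 255);
    for (int y = 0; y < 16; y++)
        memset(dst + y * stride, (int)dc, 16);
}

/* Sum of one 16-pixel edge. */
static unsigned sum16(const uint8_t *p)
{
    unsigned sum = 0;
    for (int i = 0; i < 16; i++)
        sum += p[i];
    return sum;
}

/* Entry points differ only in the DC computation, then share the tail. */
static void dc_top_16x16(uint8_t *dst, ptrdiff_t stride,
                         const uint8_t *left, const uint8_t *top)
{
    (void)left;
    fill16x16(dst, stride, (sum16(top) + 8) >> 4);
}

static void dc_left_16x16(uint8_t *dst, ptrdiff_t stride,
                          const uint8_t *left, const uint8_t *top)
{
    (void)top;
    fill16x16(dst, stride, (sum16(left) + 8) >> 4);
}

static void dc_128_16x16(uint8_t *dst, ptrdiff_t stride,
                         const uint8_t *left, const uint8_t *top)
{
    (void)left;
    (void)top;
    fill16x16(dst, stride, 128);
}
```

The trade-off raised in the review is the same here: one extra jump per call in exchange for a single copy of the unrolled store sequence per block size.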
flow gg April 7, 2024, 5:38 a.m. UTC | #11
Okay, updated it in this reply and on GitHub
(https://github.com/hleft/FFmpeg/tree/vp8vp9).


Patch

From adaae06a3e18bccec1772a3134334cbea652ae77 Mon Sep 17 00:00:00 2001
From: sunyuechi <sunyuechi@iscas.ac.cn>
Date: Mon, 26 Feb 2024 14:42:17 +0800
Subject: [PATCH 1/4] lavc/vp9dsp: R-V V ipred dc

C908:
vp9_dc_8x8_8bpp_c: 46.0
vp9_dc_8x8_8bpp_rvv_i64: 41.0
vp9_dc_16x16_8bpp_c: 109.2
vp9_dc_16x16_8bpp_rvv_i32: 72.7
vp9_dc_32x32_8bpp_c: 365.2
vp9_dc_32x32_8bpp_rvv_i32: 165.5
vp9_dc_127_8x8_8bpp_c: 23.0
vp9_dc_127_8x8_8bpp_rvv_i64: 22.0
vp9_dc_127_16x16_8bpp_c: 70.2
vp9_dc_127_16x16_8bpp_rvv_i32: 51.7
vp9_dc_127_32x32_8bpp_c: 295.2
vp9_dc_127_32x32_8bpp_rvv_i32: 140.2
vp9_dc_128_8x8_8bpp_c: 23.0
vp9_dc_128_8x8_8bpp_rvv_i64: 22.0
vp9_dc_128_16x16_8bpp_c: 70.2
vp9_dc_128_16x16_8bpp_rvv_i32: 51.7
vp9_dc_128_32x32_8bpp_c: 295.2
vp9_dc_128_32x32_8bpp_rvv_i32: 140.2
vp9_dc_129_8x8_8bpp_c: 23.0
vp9_dc_129_8x8_8bpp_rvv_i64: 22.0
vp9_dc_129_16x16_8bpp_c: 70.2
vp9_dc_129_16x16_8bpp_rvv_i32: 51.7
vp9_dc_129_32x32_8bpp_c: 295.2
vp9_dc_129_32x32_8bpp_rvv_i32: 140.2
vp9_dc_left_8x8_8bpp_c: 38.0
vp9_dc_left_8x8_8bpp_rvv_i64: 36.0
vp9_dc_left_16x16_8bpp_c: 93.2
vp9_dc_left_16x16_8bpp_rvv_i32: 67.7
vp9_dc_left_32x32_8bpp_c: 333.2
vp9_dc_left_32x32_8bpp_rvv_i32: 158.5
vp9_dc_top_8x8_8bpp_c: 38.7
vp9_dc_top_8x8_8bpp_rvv_i64: 36.0
vp9_dc_top_16x16_8bpp_c: 93.2
vp9_dc_top_16x16_8bpp_rvv_i32: 67.7
vp9_dc_top_32x32_8bpp_c: 333.2
vp9_dc_top_32x32_8bpp_rvv_i32: 156.2
---
 libavcodec/riscv/Makefile        |   2 +
 libavcodec/riscv/vp9_intra_rvv.S | 201 +++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  64 ++++++++++
 libavcodec/riscv/vp9dsp_init.c   |  61 ++++++++++
 libavcodec/vp9dsp.c              |   2 +
 libavcodec/vp9dsp.h              |   1 +
 6 files changed, 331 insertions(+)
 create mode 100644 libavcodec/riscv/vp9_intra_rvv.S
 create mode 100644 libavcodec/riscv/vp9dsp.h
 create mode 100644 libavcodec/riscv/vp9dsp_init.c

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index dff8784102..c237e60800 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -60,5 +60,7 @@  OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_init.o
 RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_init.o
 RVV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvv.o
+OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9dsp_init.o
+RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o
 OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
 RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o
diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
new file mode 100644
index 0000000000..b3b0470cfc
--- /dev/null
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -0,0 +1,201 @@ 
+/*
+ * Copyright (c) 2024 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+.macro getdc type size
+        vmv.v.x      v16, zero
+.ifc \type,top
+        vle8.v       v8, (a3)
+        vwredsumu.vs v16, v8, v16
+        vsetivli     zero, 1, e16, m1, ta, ma
+        vmv.x.s      t1, v16
+        .ifc \size,32
+                addi         t1, t1, 16
+                srai         t1, t1, 5
+        .elseif \size == 16
+                addi         t1, t1, 8
+                srai         t1, t1, 4
+        .elseif \size == 8
+                addi         t1, t1, 4
+                srai         t1, t1, 3
+        .endif
+.elseif \type == left
+        vle8.v       v8, (a2)
+        vwredsumu.vs v16, v8, v16
+        vsetivli     zero, 1, e16, m1, ta, ma
+        vmv.x.s      t1, v16
+        .ifc \size,32
+                addi         t1, t1, 16
+                srai         t1, t1, 5
+        .elseif \size == 16
+                addi         t1, t1, 8
+                srai         t1, t1, 4
+        .elseif \size == 8
+                addi         t1, t1, 4
+                srai         t1, t1, 3
+        .endif
+.elseif \type == 127
+        li           t1, 127
+.elseif \type == 128
+        li           t1, 128
+.elseif \type == 129
+        li           t1, 129
+.elseif \type == none
+        vle8.v       v8, (a2)
+        vwredsumu.vs v16, v8, v16
+        vle8.v       v8, (a3)
+        vwredsumu.vs v16, v8, v16
+        vsetivli     zero, 1, e16, m1, ta, ma
+        vmv.x.s      t1, v16
+        .ifc \size,32
+                addi         t1, t1, 32
+                srai         t1, t1, 6
+        .elseif \size == 16
+                addi         t1, t1, 16
+                srai         t1, t1, 5
+        .elseif \size == 8
+                addi         t1, t1, 8
+                srai         t1, t1, 4
+        .endif
+.endif
+.endm
+
+.macro dc32x32 type restore
+        li           t0, 32
+        vsetvli      zero, t0, e8, m2, ta, ma
+        getdc        \type 32
+
+        .ifc \restore,1
+        vsetvli      zero, t0, e8, m2, ta, ma
+        .endif
+        vmv.v.x      v0, t1
+
+        .rept 31
+        vse8.v       v0, (a0)
+        add          a0, a0, a1
+        .endr
+        vse8.v       v0, (a0)
+
+        ret
+.endm
+
+.macro dc16x16 type restore
+        vsetivli     zero, 16, e8, m1, ta, ma
+        getdc        \type 16
+
+        .ifc \restore,1
+        vsetivli     zero, 16, e8, m1, ta, ma
+        .endif
+        vmv.v.x      v0, t1
+
+        .rept 15
+        vse8.v       v0, (a0)
+        add          a0, a0, a1
+        .endr
+        vse8.v       v0, (a0)
+
+        ret
+.endm
+
+.macro dc8x8 type
+        vsetivli     zero, 8, e8, mf2, ta, ma
+        getdc        \type 8
+
+        li           t0, 64
+        vsetvli      zero, t0, e8, m4, ta, ma
+        vmv.v.x      v0, t1
+        vsetivli     zero, 8, e8, mf2, ta, ma
+        vsse64.v     v0, (a0), a1
+
+        ret
+.endm
+
+func ff_dc_127_32x32_rvv, zve32x
+        dc32x32 127 0
+endfunc
+
+func ff_dc_127_16x16_rvv, zve32x
+        dc16x16 127 0
+endfunc
+
+func ff_dc_127_8x8_rvv, zve64x
+        dc8x8 127
+endfunc
+
+func ff_dc_128_32x32_rvv, zve32x
+        dc32x32 128 0
+endfunc
+
+func ff_dc_128_16x16_rvv, zve32x
+        dc16x16 128 0
+endfunc
+
+func ff_dc_128_8x8_rvv, zve64x
+        dc8x8 128
+endfunc
+
+func ff_dc_129_32x32_rvv, zve32x
+        dc32x32 129 0
+endfunc
+
+func ff_dc_129_16x16_rvv, zve32x
+        dc16x16 129 0
+endfunc
+
+func ff_dc_129_8x8_rvv, zve64x
+        dc8x8 129
+endfunc
+
+func ff_dc_32x32_rvv, zve32x
+        dc32x32 none 1
+endfunc
+
+func ff_dc_16x16_rvv, zve32x
+        dc16x16 none 1
+endfunc
+
+func ff_dc_8x8_rvv, zve64x
+        dc8x8 none
+endfunc
+
+func ff_dc_left_32x32_rvv, zve32x
+        dc32x32 left 1
+endfunc
+
+func ff_dc_left_16x16_rvv, zve32x
+        dc16x16 left 1
+endfunc
+
+func ff_dc_left_8x8_rvv, zve64x
+        dc8x8 left
+endfunc
+
+func ff_dc_top_32x32_rvv, zve32x
+        dc32x32 top 1
+endfunc
+
+func ff_dc_top_16x16_rvv, zve32x
+        dc16x16 top 1
+endfunc
+
+func ff_dc_top_8x8_rvv, zve64x
+        dc8x8 top
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
new file mode 100644
index 0000000000..abd57bd836
--- /dev/null
+++ b/libavcodec/riscv/vp9dsp.h
@@ -0,0 +1,64 @@ 
+/*
+ * Copyright (c) 2024 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_RISCV_VP9DSP_RISCV_H
+#define AVCODEC_RISCV_VP9DSP_RISCV_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+void ff_dc_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                     const uint8_t *a);
+void ff_dc_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                     const uint8_t *a);
+void ff_dc_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                   const uint8_t *a);
+void ff_dc_top_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_top_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_top_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                       const uint8_t *a);
+void ff_dc_left_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                          const uint8_t *a);
+void ff_dc_left_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                          const uint8_t *a);
+void ff_dc_left_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                        const uint8_t *a);
+void ff_dc_127_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_127_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_127_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                       const uint8_t *a);
+void ff_dc_128_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_128_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_128_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                       const uint8_t *a);
+void ff_dc_129_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_129_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                         const uint8_t *a);
+void ff_dc_129_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                       const uint8_t *a);
+
+#endif  // #ifndef AVCODEC_RISCV_VP9DSP_RISCV_H
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
new file mode 100644
index 0000000000..69ab39004c
--- /dev/null
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -0,0 +1,61 @@ 
+/*
+ * Copyright (c) 2024 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/riscv/cpu.h"
+#include "libavcodec/vp9dsp.h"
+#include "vp9dsp.h"
+
+static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
+{
+    #if HAVE_RVV
+        int flags = av_get_cpu_flags();
+
+        if (bpp == 8 && flags & AV_CPU_FLAG_RVV_I64 && ff_get_rv_vlenb() >= 16) {
+            dsp->intra_pred[TX_8X8][DC_PRED] = ff_dc_8x8_rvv;
+            dsp->intra_pred[TX_8X8][LEFT_DC_PRED] = ff_dc_left_8x8_rvv;
+            dsp->intra_pred[TX_8X8][DC_127_PRED] = ff_dc_127_8x8_rvv;
+            dsp->intra_pred[TX_8X8][DC_128_PRED] = ff_dc_128_8x8_rvv;
+            dsp->intra_pred[TX_8X8][DC_129_PRED] = ff_dc_129_8x8_rvv;
+            dsp->intra_pred[TX_8X8][TOP_DC_PRED] = ff_dc_top_8x8_rvv;
+        }
+
+        if (bpp == 8 && flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
+            dsp->intra_pred[TX_32X32][DC_PRED] = ff_dc_32x32_rvv;
+            dsp->intra_pred[TX_16X16][DC_PRED] = ff_dc_16x16_rvv;
+            dsp->intra_pred[TX_32X32][LEFT_DC_PRED] = ff_dc_left_32x32_rvv;
+            dsp->intra_pred[TX_16X16][LEFT_DC_PRED] = ff_dc_left_16x16_rvv;
+            dsp->intra_pred[TX_32X32][DC_127_PRED] = ff_dc_127_32x32_rvv;
+            dsp->intra_pred[TX_16X16][DC_127_PRED] = ff_dc_127_16x16_rvv;
+            dsp->intra_pred[TX_32X32][DC_128_PRED] = ff_dc_128_32x32_rvv;
+            dsp->intra_pred[TX_16X16][DC_128_PRED] = ff_dc_128_16x16_rvv;
+            dsp->intra_pred[TX_32X32][DC_129_PRED] = ff_dc_129_32x32_rvv;
+            dsp->intra_pred[TX_16X16][DC_129_PRED] = ff_dc_129_16x16_rvv;
+            dsp->intra_pred[TX_32X32][TOP_DC_PRED] = ff_dc_top_32x32_rvv;
+            dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
+        }
+    #endif
+}
+
+av_cold void ff_vp9dsp_init_riscv(VP9DSPContext *dsp, int bpp, int bitexact)
+{
+    vp9dsp_intrapred_init_rvv(dsp, bpp);
+}
diff --git a/libavcodec/vp9dsp.c b/libavcodec/vp9dsp.c
index d8ddf74d4f..967e6e1e1a 100644
--- a/libavcodec/vp9dsp.c
+++ b/libavcodec/vp9dsp.c
@@ -100,6 +100,8 @@  av_cold void ff_vp9dsp_init(VP9DSPContext *dsp, int bpp, int bitexact)
     ff_vp9dsp_init_aarch64(dsp, bpp);
 #elif ARCH_ARM
     ff_vp9dsp_init_arm(dsp, bpp);
+#elif ARCH_RISCV
+    ff_vp9dsp_init_riscv(dsp, bpp, bitexact);
 #elif ARCH_X86
     ff_vp9dsp_init_x86(dsp, bpp, bitexact);
 #elif ARCH_MIPS
diff --git a/libavcodec/vp9dsp.h b/libavcodec/vp9dsp.h
index be0ac0b181..772848e349 100644
--- a/libavcodec/vp9dsp.h
+++ b/libavcodec/vp9dsp.h
@@ -131,6 +131,7 @@  void ff_vp9dsp_init_12(VP9DSPContext *dsp);
 
 void ff_vp9dsp_init_aarch64(VP9DSPContext *dsp, int bpp);
 void ff_vp9dsp_init_arm(VP9DSPContext *dsp, int bpp);
+void ff_vp9dsp_init_riscv(VP9DSPContext *dsp, int bpp, int bitexact);
 void ff_vp9dsp_init_x86(VP9DSPContext *dsp, int bpp, int bitexact);
 void ff_vp9dsp_init_mips(VP9DSPContext *dsp, int bpp);
 void ff_vp9dsp_init_loongarch(VP9DSPContext *dsp, int bpp);
-- 
2.44.0
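
For readers checking the assembly, a scalar C model of what the `none` case of `getdc` plus the row stores compute may help. It is a sketch written for this page under the assumption of 8-bit pixels, not FFmpeg's own reference code; `dc_pred_ref` and `log2_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model: average the left and top edges with rounding,
 * then fill a (1 << log2_size) square block with that value. */
static void dc_pred_ref(uint8_t *dst, ptrdiff_t stride,
                        const uint8_t *left, const uint8_t *top,
                        int log2_size)
{
    assert(log2_size >= 2 && log2_size <= 5);

    int size = 1 << log2_size;
    unsigned sum = 0;

    for (int i = 0; i < size; i++)
        sum += left[i] + top[i];

    /* 2 * size pixels were summed: bias is size, shift is log2_size + 1. */
    unsigned dc = (sum + (unsigned)size) >> (log2_size + 1);

    for (int y = 0; y < size; y++)
        memset(dst + y * stride, (int)dc, (size_t)size);
}
```

For `log2_size` of 5, 4 and 3 the bias/shift pairs are (32, 6), (16, 5) and (8, 4), matching the constants in the `none` branch of the macro.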