[FFmpeg-devel] libavcodec/mips: Improve avc idct8 msa function

Submitted by kaustubh.raste@imgtec.com on Aug. 11, 2017, 9:44 a.m.

Details

Message ID 7AC45BA9D7010549B1787997D11B8C30E086514E@PUMAIL01.pu.imgtec.org
State New

Commit Message

kaustubh.raste@imgtec.com Aug. 11, 2017, 9:44 a.m.
Please review the patch.

-----Original Message-----
From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of Kaustubh Raste

Sent: Friday, August 4, 2017 5:24 PM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] libavcodec/mips: Improve avc idct8 msa function

Ping.

-----Original Message-----
From: Manojkumar Bhosale

Sent: Monday, July 31, 2017 3:43 PM
To: FFmpeg development discussions and patches
Cc: Kaustubh Raste
Subject: RE: [FFmpeg-devel] [PATCH] libavcodec/mips: Improve avc idct8 msa function

LGTM

thx

-----Original Message-----
From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of kaustubh.raste@imgtec.com

Sent: Monday, July 31, 2017 12:07 PM
To: ffmpeg-devel@ffmpeg.org
Cc: Kaustubh Raste
Subject: [FFmpeg-devel] [PATCH] libavcodec/mips: Improve avc idct8 msa function

From: Kaustubh Raste <kaustubh.raste@imgtec.com>


Replace memset call with msa stores.

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>

---
 libavcodec/mips/h264idct_msa.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Patch

diff --git a/libavcodec/mips/h264idct_msa.c b/libavcodec/mips/h264idct_msa.c
index 861befe..1e1a5c8 100644
--- a/libavcodec/mips/h264idct_msa.c
+++ b/libavcodec/mips/h264idct_msa.c
@@ -120,11 +120,12 @@  static void avc_idct8_addblk_msa(uint8_t *dst, int16_t *src, int32_t dst_stride)
     v4i32 res0_r, res1_r, res2_r, res3_r, res4_r, res5_r, res6_r, res7_r;
     v4i32 res0_l, res1_l, res2_l, res3_l, res4_l, res5_l, res6_l, res7_l;
     v16i8 dst0, dst1, dst2, dst3, dst4, dst5, dst6, dst7;
-    v16i8 zeros = { 0 };
+    v8i16 zeros = { 0 };
 
     src[0] += 32;
 
     LD_SH8(src, 8, src0, src1, src2, src3, src4, src5, src6, src7);
+    ST_SH8(zeros, zeros, zeros, zeros, zeros, zeros, zeros, zeros, src, 8);
 
     vec0 = src0 + src4;
     vec1 = src0 - src4;
@@ -318,7 +319,6 @@  void ff_h264_idct8_addblk_msa(uint8_t *dst, int16_t *src,
                               int32_t dst_stride)
 {
     avc_idct8_addblk_msa(dst, src, dst_stride);
-    memset(src, 0, 64 * sizeof(dctcoef));
 }
 
 void ff_h264_idct4x4_addblk_dc_msa(uint8_t *dst, int16_t *src,
--
1.7.9.5
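
For reviewers without a MIPS MSA toolchain at hand, the effect of the change can be sketched in portable C. This is only an illustration under stated assumptions, not the code from h264idct_msa.c: plain int16_t arrays stand in for v8i16 vectors, memcpy stands in for the LD_SH8/ST_SH8 macros, and the names load_rows_and_clear, self_check, ROWS and LANES are hypothetical. It shows that clearing the 8x8 coefficient block right after the loads leaves src in the same state as the removed memset(src, 0, 64 * sizeof(dctcoef)), while the loaded rows keep the original values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ROWS  8   /* an 8x8 coefficient block has 8 rows              */
#define LANES 8   /* assumption: one v8i16 vector holds 8 int16_t     */

/* Hypothetical stand-in for LD_SH8 followed by ST_SH8 of zero vectors:
 * copy the eight rows out of src, then overwrite each row with zeros. */
static void load_rows_and_clear(int16_t *src, int16_t rows[ROWS][LANES])
{
    static const int16_t zeros[LANES] = { 0 };
    int r;

    for (r = 0; r < ROWS; r++)   /* "load" eight rows, stride 8 elements */
        memcpy(rows[r], src + r * LANES, sizeof(zeros));
    for (r = 0; r < ROWS; r++)   /* "store" eight zero rows back to src  */
        memcpy(src + r * LANES, zeros, sizeof(zeros));
}

/* Checks that the in-function clear matches a separate memset and that
 * the loaded rows still hold the original coefficients. */
static void self_check(void)
{
    int16_t a[ROWS * LANES], b[ROWS * LANES], rows[ROWS][LANES];
    int i;

    for (i = 0; i < ROWS * LANES; i++)
        a[i] = b[i] = (int16_t)(i - 20);

    load_rows_and_clear(a, rows);   /* new behaviour: clear next to the loads */
    memset(b, 0, sizeof(b));        /* old behaviour: memset in the wrapper   */

    assert(memcmp(a, b, sizeof(a)) == 0);
    for (i = 0; i < ROWS * LANES; i++)
        assert(rows[i / LANES][i % LANES] == (int16_t)(i - 20));
}
```

Calling self_check() from a small test program exercises both paths. The sketch says nothing about speed on real hardware; it only models the ordering of loads and stores that the patch introduces.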