[FFmpeg-devel,3/3] avcodec/utils: Optimize ff_color_frame()

Submitted by Michael Niedermayer on Dec. 24, 2018, 12:14 a.m.

Details

Message ID 20181224001451.8853-3-michael@niedermayer.cc
State Superseded
Headers show

Commit Message

Michael Niedermayer Dec. 24, 2018, 12:14 a.m.
Fixes: Timeout
Fixes: 10709/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 libavcodec/utils.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Derek Buitenhuis Dec. 24, 2018, 4:41 p.m.
On 24/12/2018 00:14, Michael Niedermayer wrote:
> Fixes: Timeout
> Fixes: 10709/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136
> 
> Found-by: continuous fuzzing processhttps://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer<michael@niedermayer.cc>
> ---
>   libavcodec/utils.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)

Comments and stats please...

- Derek
Paul B Mahol Dec. 24, 2018, 4:44 p.m.
On 12/24/18, Michael Niedermayer <michael@niedermayer.cc> wrote:
> Fixes: Timeout
> Fixes:
> 10709/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136
>
> Found-by: continuous fuzzing process
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavcodec/utils.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)

Ugly, set first row and use memcpy for rest of rows.
Michael Niedermayer Dec. 24, 2018, 10:34 p.m.
On Mon, Dec 24, 2018 at 05:44:26PM +0100, Paul B Mahol wrote:
> On 12/24/18, Michael Niedermayer <michael@niedermayer.cc> wrote:
> > Fixes: Timeout
> > Fixes:
> > 10709/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136
> >
> > Found-by: continuous fuzzing process
> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> >  libavcodec/utils.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> Ugly, set first row and use memcpy for rest of rows.

That would increase the load on the data cache
memcpy has to read the line each time and write it each time
otherwise the code just writes the line.
This should become especially noticable if something else accesses
the data cache like a 2nd core/thread
with just a single thread mempcpy is faster than this simple method
a slightly more complex implementation beats memcpy even single thread
though (at least in my quick test here)

of course if people prefer memcpy() (because its simpler) i can do that
what is preferred ?
code as in the patch ?
memcpy ?

even more messy but slightly beating memcpy ?
(i have to actually clean this up and retest as it does a aliasing violation
 if people want this one)
 
and of course there are more options like with asm to implement that

memcpy:
    5585847 decicycles in A,       4 runs,      0 skips

ugly:
    5696222 decicycles in A,       4 runs,      0 skips

ugly and messy:
    5527612 decicycles in A,       4 runs,      0 skips

original:
    14953532 decicycles in A,       4 runs,      0 skips

tested with: fate-suite//h264/SonyXAVC_LongGOP_green_pixelation_early_Frames.MXF

thanks


[...]
Michael Niedermayer Dec. 24, 2018, 10:37 p.m.
On Mon, Dec 24, 2018 at 04:41:02PM +0000, Derek Buitenhuis wrote:
> On 24/12/2018 00:14, Michael Niedermayer wrote:
> > Fixes: Timeout
> > Fixes: 10709/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136
> > 
> > Found-by: continuous fuzzing processhttps://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > Signed-off-by: Michael Niedermayer<michael@niedermayer.cc>
> > ---
> >   libavcodec/utils.c | 9 +++++++--
> >   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> Comments and stats please...

I can change the commit message as below?
or add the timing data from fate from the other mail

was that what you had in mind ?

    avcodec/utils: Optimize ff_color_frame()
    
    Fixes: Timeout
    Fixes: 10709/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136
    
    Before change: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136 in 28285 ms
    After change:  Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5630617975259136 in 11830 ms
    
    Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>


[...]

Patch hide | download patch | download mbox

diff --git a/libavcodec/utils.c b/libavcodec/utils.c
index d519b16092..5f8bec6df4 100644
--- a/libavcodec/utils.c
+++ b/libavcodec/utils.c
@@ -423,8 +423,13 @@  void ff_color_frame(AVFrame *frame, const int c[4])
         int height = is_chroma ? AV_CEIL_RSHIFT(frame->height, desc->log2_chroma_h) : frame->height;
         for (y = 0; y < height; y++) {
             if (desc->comp[0].depth >= 9) {
-                for (x = 0; x<bytes; x++)
-                    ((uint16_t*)dst)[x] = c[p];
+                uint64_t v = c[p] * 0x0001000100010001ULL;
+                for (x = 0; x<2*bytes-15; x+=16) {
+                    AV_WN64(dst+x  , v);
+                    AV_WN64(dst+x+8, v);
+                }
+                for (; x<bytes; x++)
+                    ((uint16_t*)dst)[x] = v;
             }else
                 memset(dst, c[p], bytes);
             dst += frame->linesize[p];