diff mbox series

[FFmpeg-devel,4/6] avcodec/mpegpicture: Don't copy unnecessarily, fix race

Message ID DB6PR0101MB22143C8069D6AA5EF488CD5E8F669@DB6PR0101MB2214.eurprd01.prod.exchangelabs.com
State Accepted
Commit b645138a34321fb1d1b7988cd0d78b897e4d65ca
Headers show
Series [FFmpeg-devel,1/6] avcodec/avcodec: Move initializing frame-thrd encoder to encode_preinit | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Andreas Rheinhardt Aug. 13, 2022, 3:03 p.m. UTC
mpegvideo uses an array of Pictures and when it is done with using
them, it only unreferences them incompletely: Some buffers are kept
so that they can be reused lateron if the same slot in the Picture
array is reused, making this a sort of a bufferpool.
(Basically, a Picture is considered used if the AVFrame's buf is set.)
Yet given that other pieces of the decoder may have a reference to
these buffers, they need not be writable and are made writable using
av_buffer_make_writable() when preparing a new Picture. This involves
reading the buffer's data, although the old content of the buffer
need not be retained.

Worse, this read can be racy, because the buffer can be used by another
thread at the same time. This happens for Real Video 3 and 4.

This commit fixes this race by no longer copying the data;
instead the old buffer is replaced by a new, zero-allocated buffer.

(Here are the details of what happens with three or more decoding threads
when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
The first decoding thread uses the first slot of its picture array
to store its current pic; update_thread_context copies this for the
second thread that decodes a P-frame. It uses the second slot in its
Picture array to store its P-frame. This arrangement is then copied
to the third decode thread, which decodes a B-frame. It uses the third
slot in its Picture array for its current frame.
update_thread_context copies this to the next thread. It unreferences
the third slot containing the other B-frame and then it reuses this
slot for its current frame. Because the pic array slots are only
incompletely unreferenced, the buffers of the previous B-frame are
still in there and they are not writable; in fact the previous
thread is concurrently writing to them, causing races when making
the buffer writable.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
 libavcodec/mpegpicture.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Michael Niedermayer Aug. 16, 2022, 7:48 p.m. UTC | #1
On Sat, Aug 13, 2022 at 05:03:04PM +0200, Andreas Rheinhardt wrote:
> mpegvideo uses an array of Pictures and when it is done with using
> them, it only unreferences them incompletely: Some buffers are kept
> so that they can be reused lateron if the same slot in the Picture
> array is reused, making this a sort of a bufferpool.
> (Basically, a Picture is considered used if the AVFrame's buf is set.)
> Yet given that other pieces of the decoder may have a reference to
> these buffers, they need not be writable and are made writable using
> av_buffer_make_writable() when preparing a new Picture. This involves
> reading the buffer's data, although the old content of the buffer
> need not be retained.
> 
> Worse, this read can be racy, because the buffer can be used by another
> thread at the same time. This happens for Real Video 3 and 4.
> 
> This commit fixes this race by no longer copying the data;
> instead the old buffer is replaced by a new, zero-allocated buffer.
> 
> (Here are the details of what happens with three or more decoding threads
> when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
> The first decoding thread uses the first slot of its picture array
> to store its current pic; update_thread_context copies this for the
> second thread that decodes a P-frame. It uses the second slot in its
> Picture array to store its P-frame. This arrangement is then copied
> to the third decode thread, which decodes a B-frame. It uses the third
> slot in its Picture array for its current frame.
> update_thread_context copies this to the next thread. It unreferences
> the third slot containing the other B-frame and then it reuses this
> slot for its current frame. Because the pic array slots are only
> incompletely unreferenced, the buffers of the previous B-frame are
> still in there and they are not writable; in fact the previous
> thread is concurrently writing to them, causing races when making
> the buffer writable.)
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
> ---
>  libavcodec/mpegpicture.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)

This causes a difference in some frames of: (i have not investigates why
just reporting as i noticed)
quite possibly thats just showing your bugfix in action

./ffmpeg -y -bitexact -i fate/svq3/Vertical400kbit.sorenson3.mov -ps 50 -bf 1 -bitexact -an -qscale 5  -ss 40 -error_rate 4 -threads 1 /tmp/out4.avi 
./ffmpeg -y -bitexact -v -1 -loglevel 0 -i /tmp/out4.avi -bitexact -vsync drop -f framecrc -

@@ -141,7 +141,7 @@
 0,         22,         22,        1,   115200, 0x4cc933e9
 0,         23,         23,        1,   115200, 0x693a40a9
 0,         24,         24,        1,   115200, 0x956e3b15
-0,         25,         25,        1,   115200, 0x61763ff4
+0,         25,         25,        1,   115200, 0xc9e53d1c
 0,         26,         26,        1,   115200, 0x5c5c3dfc
 0,         27,         27,        1,   115200, 0x7de641ea
 0,         28,         28,        1,   115200, 0xe6cc4136
@@ -187,7 +187,7 @@
 0,         68,         68,        1,   115200, 0x49dcbf4e
 0,         69,         69,        1,   115200, 0x1ea1c7d1
 0,         70,         70,        1,   115200, 0xdf77c67b
-0,         71,         71,        1,   115200, 0x7f6bd16d
+0,         71,         71,        1,   115200, 0x33d9d206
 0,         72,         72,        1,   115200, 0x5e37cb3a
 0,         73,         73,        1,   115200, 0x15abcda3
 0,         74,         74,        1,   115200, 0xbf4dcbd4
@@ -199,7 +199,7 @@
 0,         80,         80,        1,   115200, 0x17d1d667
 0,         81,         81,        1,   115200, 0x0c1fdf9c
 0,         82,         82,        1,   115200, 0x7eabde6b
-0,         83,         83,        1,   115200, 0xe623e7af
+0,         83,         83,        1,   115200, 0x3bf6e873
 0,         84,         84,        1,   115200, 0xf480dc82
 0,         85,         85,        1,   115200, 0x5fd6e098
 0,         86,         86,        1,   115200, 0xf520de95
@@ -211,7 +211,7 @@
 0,         92,         92,        1,   115200, 0x34cfe1c2
 0,         93,         93,        1,   115200, 0x1d94e1c3
 0,         94,         94,        1,   115200, 0x6d32e147
-0,         95,         95,        1,   115200, 0x7e40ee91
+0,         95,         95,        1,   115200, 0x09fbefd0
 0,         96,         96,        1,   115200, 0xa5f5eb43
 0,         97,         97,        1,   115200, 0x39b9ec3d
 0,         98,         98,        1,   115200, 0x3256ec18

[...]
Andreas Rheinhardt Aug. 16, 2022, 8:38 p.m. UTC | #2
Michael Niedermayer:
> On Sat, Aug 13, 2022 at 05:03:04PM +0200, Andreas Rheinhardt wrote:
>> mpegvideo uses an array of Pictures and when it is done with using
>> them, it only unreferences them incompletely: Some buffers are kept
>> so that they can be reused lateron if the same slot in the Picture
>> array is reused, making this a sort of a bufferpool.
>> (Basically, a Picture is considered used if the AVFrame's buf is set.)
>> Yet given that other pieces of the decoder may have a reference to
>> these buffers, they need not be writable and are made writable using
>> av_buffer_make_writable() when preparing a new Picture. This involves
>> reading the buffer's data, although the old content of the buffer
>> need not be retained.
>>
>> Worse, this read can be racy, because the buffer can be used by another
>> thread at the same time. This happens for Real Video 3 and 4.
>>
>> This commit fixes this race by no longer copying the data;
>> instead the old buffer is replaced by a new, zero-allocated buffer.
>>
>> (Here are the details of what happens with three or more decoding threads
>> when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
>> The first decoding thread uses the first slot of its picture array
>> to store its current pic; update_thread_context copies this for the
>> second thread that decodes a P-frame. It uses the second slot in its
>> Picture array to store its P-frame. This arrangement is then copied
>> to the third decode thread, which decodes a B-frame. It uses the third
>> slot in its Picture array for its current frame.
>> update_thread_context copies this to the next thread. It unreferences
>> the third slot containing the other B-frame and then it reuses this
>> slot for its current frame. Because the pic array slots are only
>> incompletely unreferenced, the buffers of the previous B-frame are
>> still in there and they are not writable; in fact the previous
>> thread is concurrently writing to them, causing races when making
>> the buffer writable.)
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
>> ---
>>  libavcodec/mpegpicture.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> This causes a difference in some frames of: (i have not investigates why
> just reporting as i noticed)
> quite possibly thats just showing your bugfix in action
> 
> ./ffmpeg -y -bitexact -i fate/svq3/Vertical400kbit.sorenson3.mov -ps 50 -bf 1 -bitexact -an -qscale 5  -ss 40 -error_rate 4 -threads 1 /tmp/out4.avi 
> ./ffmpeg -y -bitexact -v -1 -loglevel 0 -i /tmp/out4.avi -bitexact -vsync drop -f framecrc -
> 
> @@ -141,7 +141,7 @@
>  0,         22,         22,        1,   115200, 0x4cc933e9
>  0,         23,         23,        1,   115200, 0x693a40a9
>  0,         24,         24,        1,   115200, 0x956e3b15
> -0,         25,         25,        1,   115200, 0x61763ff4
> +0,         25,         25,        1,   115200, 0xc9e53d1c
>  0,         26,         26,        1,   115200, 0x5c5c3dfc
>  0,         27,         27,        1,   115200, 0x7de641ea
>  0,         28,         28,        1,   115200, 0xe6cc4136
> @@ -187,7 +187,7 @@
>  0,         68,         68,        1,   115200, 0x49dcbf4e
>  0,         69,         69,        1,   115200, 0x1ea1c7d1
>  0,         70,         70,        1,   115200, 0xdf77c67b
> -0,         71,         71,        1,   115200, 0x7f6bd16d
> +0,         71,         71,        1,   115200, 0x33d9d206
>  0,         72,         72,        1,   115200, 0x5e37cb3a
>  0,         73,         73,        1,   115200, 0x15abcda3
>  0,         74,         74,        1,   115200, 0xbf4dcbd4
> @@ -199,7 +199,7 @@
>  0,         80,         80,        1,   115200, 0x17d1d667
>  0,         81,         81,        1,   115200, 0x0c1fdf9c
>  0,         82,         82,        1,   115200, 0x7eabde6b
> -0,         83,         83,        1,   115200, 0xe623e7af
> +0,         83,         83,        1,   115200, 0x3bf6e873
>  0,         84,         84,        1,   115200, 0xf480dc82
>  0,         85,         85,        1,   115200, 0x5fd6e098
>  0,         86,         86,        1,   115200, 0xf520de95
> @@ -211,7 +211,7 @@
>  0,         92,         92,        1,   115200, 0x34cfe1c2
>  0,         93,         93,        1,   115200, 0x1d94e1c3
>  0,         94,         94,        1,   115200, 0x6d32e147
> -0,         95,         95,        1,   115200, 0x7e40ee91
> +0,         95,         95,        1,   115200, 0x09fbefd0
>  0,         96,         96,        1,   115200, 0xa5f5eb43
>  0,         97,         97,        1,   115200, 0x39b9ec3d
>  0,         98,         98,        1,   115200, 0x3256ec18
> 
> [...]
> 

Decoding this sample uses earlier values of mbskip_table. If I zero
mbskip_table_buf in every ff_alloc_picture(), nothing changes due to
this patch and (most importantly) the output of decoding does not depend
upon the number of threads used (which it currently does -- with and
without this patch).
I don't know exactly where the bug is or whether there is a cheaper way
to mitigate it than just to zero the buffer every time.
I guess there are more such bugs affecting damaged samples than we are
currently aware of.

- Andreas
Michael Niedermayer Aug. 16, 2022, 9:09 p.m. UTC | #3
On Tue, Aug 16, 2022 at 10:38:55PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > On Sat, Aug 13, 2022 at 05:03:04PM +0200, Andreas Rheinhardt wrote:
> >> mpegvideo uses an array of Pictures and when it is done with using
> >> them, it only unreferences them incompletely: Some buffers are kept
> >> so that they can be reused lateron if the same slot in the Picture
> >> array is reused, making this a sort of a bufferpool.
> >> (Basically, a Picture is considered used if the AVFrame's buf is set.)
> >> Yet given that other pieces of the decoder may have a reference to
> >> these buffers, they need not be writable and are made writable using
> >> av_buffer_make_writable() when preparing a new Picture. This involves
> >> reading the buffer's data, although the old content of the buffer
> >> need not be retained.
> >>
> >> Worse, this read can be racy, because the buffer can be used by another
> >> thread at the same time. This happens for Real Video 3 and 4.
> >>
> >> This commit fixes this race by no longer copying the data;
> >> instead the old buffer is replaced by a new, zero-allocated buffer.
> >>
> >> (Here are the details of what happens with three or more decoding threads
> >> when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
> >> The first decoding thread uses the first slot of its picture array
> >> to store its current pic; update_thread_context copies this for the
> >> second thread that decodes a P-frame. It uses the second slot in its
> >> Picture array to store its P-frame. This arrangement is then copied
> >> to the third decode thread, which decodes a B-frame. It uses the third
> >> slot in its Picture array for its current frame.
> >> update_thread_context copies this to the next thread. It unreferences
> >> the third slot containing the other B-frame and then it reuses this
> >> slot for its current frame. Because the pic array slots are only
> >> incompletely unreferenced, the buffers of the previous B-frame are
> >> still in there and they are not writable; in fact the previous
> >> thread is concurrently writing to them, causing races when making
> >> the buffer writable.)
> >>
> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
> >> ---
> >>  libavcodec/mpegpicture.c | 16 +++++++++++++++-
> >>  1 file changed, 15 insertions(+), 1 deletion(-)
> > 
> > This causes a difference in some frames of: (i have not investigates why
> > just reporting as i noticed)
> > quite possibly thats just showing your bugfix in action
> > 
> > ./ffmpeg -y -bitexact -i fate/svq3/Vertical400kbit.sorenson3.mov -ps 50 -bf 1 -bitexact -an -qscale 5  -ss 40 -error_rate 4 -threads 1 /tmp/out4.avi 
> > ./ffmpeg -y -bitexact -v -1 -loglevel 0 -i /tmp/out4.avi -bitexact -vsync drop -f framecrc -
> > 
> > @@ -141,7 +141,7 @@
> >  0,         22,         22,        1,   115200, 0x4cc933e9
> >  0,         23,         23,        1,   115200, 0x693a40a9
> >  0,         24,         24,        1,   115200, 0x956e3b15
> > -0,         25,         25,        1,   115200, 0x61763ff4
> > +0,         25,         25,        1,   115200, 0xc9e53d1c
> >  0,         26,         26,        1,   115200, 0x5c5c3dfc
> >  0,         27,         27,        1,   115200, 0x7de641ea
> >  0,         28,         28,        1,   115200, 0xe6cc4136
> > @@ -187,7 +187,7 @@
> >  0,         68,         68,        1,   115200, 0x49dcbf4e
> >  0,         69,         69,        1,   115200, 0x1ea1c7d1
> >  0,         70,         70,        1,   115200, 0xdf77c67b
> > -0,         71,         71,        1,   115200, 0x7f6bd16d
> > +0,         71,         71,        1,   115200, 0x33d9d206
> >  0,         72,         72,        1,   115200, 0x5e37cb3a
> >  0,         73,         73,        1,   115200, 0x15abcda3
> >  0,         74,         74,        1,   115200, 0xbf4dcbd4
> > @@ -199,7 +199,7 @@
> >  0,         80,         80,        1,   115200, 0x17d1d667
> >  0,         81,         81,        1,   115200, 0x0c1fdf9c
> >  0,         82,         82,        1,   115200, 0x7eabde6b
> > -0,         83,         83,        1,   115200, 0xe623e7af
> > +0,         83,         83,        1,   115200, 0x3bf6e873
> >  0,         84,         84,        1,   115200, 0xf480dc82
> >  0,         85,         85,        1,   115200, 0x5fd6e098
> >  0,         86,         86,        1,   115200, 0xf520de95
> > @@ -211,7 +211,7 @@
> >  0,         92,         92,        1,   115200, 0x34cfe1c2
> >  0,         93,         93,        1,   115200, 0x1d94e1c3
> >  0,         94,         94,        1,   115200, 0x6d32e147
> > -0,         95,         95,        1,   115200, 0x7e40ee91
> > +0,         95,         95,        1,   115200, 0x09fbefd0
> >  0,         96,         96,        1,   115200, 0xa5f5eb43
> >  0,         97,         97,        1,   115200, 0x39b9ec3d
> >  0,         98,         98,        1,   115200, 0x3256ec18
> > 
> > [...]
> > 
> 
> Decoding this sample uses earlier values of mbskip_table. If I zero
> mbskip_table_buf in every ff_alloc_picture(), nothing changes due to
> this patch and (most importantly) the output of decoding does not depend
> upon the number of threads used (which it currently does -- with and
> without this patch).
> I don't know exactly where the bug is or whether there is a cheaper way
> to mitigate it than just to zero the buffer every time.
> I guess there are more such bugs affecting damaged samples than we are
> currently aware of.

Theres code in ff_er_frame_end() which cleans the mbskip_table
I dont remember it very much but it looks like that should not be optional
If this will not be run mbskip_table should maybe be zeroed on "alloc"

not sure this is the issue, iam just looking at the code 

thx

[...]
diff mbox series

Patch

diff --git a/libavcodec/mpegpicture.c b/libavcodec/mpegpicture.c
index 2192f74cea..ed96abbe2d 100644
--- a/libavcodec/mpegpicture.c
+++ b/libavcodec/mpegpicture.c
@@ -47,11 +47,25 @@  static void av_noinline free_picture_tables(Picture *pic)
     }
 }
 
+static int make_table_writable(AVBufferRef **ref)
+{
+    AVBufferRef *old = *ref, *new;
+
+    if (av_buffer_is_writable(old))
+        return 0;
+    new = av_buffer_allocz(old->size);
+    if (!new)
+        return AVERROR(ENOMEM);
+    av_buffer_unref(ref);
+    *ref = new;
+    return 0;
+}
+
 static int make_tables_writable(Picture *pic)
 {
 #define MAKE_WRITABLE(table) \
 do {\
-    int ret = av_buffer_make_writable(&pic->table); \
+    int ret = make_table_writable(&pic->table); \
     if (ret < 0) \
         return ret; \
 } while (0)