Message ID | DB6PR0101MB22143C8069D6AA5EF488CD5E8F669@DB6PR0101MB2214.eurprd01.prod.exchangelabs.com |
---|---|
State | Accepted |
Commit | b645138a34321fb1d1b7988cd0d78b897e4d65ca |
Headers | show |
Series | [FFmpeg-devel,1/6] avcodec/avcodec: Move initializing frame-thrd encoder to encode_preinit | expand |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Sat, Aug 13, 2022 at 05:03:04PM +0200, Andreas Rheinhardt wrote: > mpegvideo uses an array of Pictures and when it is done with using > them, it only unreferences them incompletely: Some buffers are kept > so that they can be reused lateron if the same slot in the Picture > array is reused, making this a sort of a bufferpool. > (Basically, a Picture is considered used if the AVFrame's buf is set.) > Yet given that other pieces of the decoder may have a reference to > these buffers, they need not be writable and are made writable using > av_buffer_make_writable() when preparing a new Picture. This involves > reading the buffer's data, although the old content of the buffer > need not be retained. > > Worse, this read can be racy, because the buffer can be used by another > thread at the same time. This happens for Real Video 3 and 4. > > This commit fixes this race by no longer copying the data; > instead the old buffer is replaced by a new, zero-allocated buffer. > > (Here are the details of what happens with three or more decoding threads > when decoding rv30.rm from the FATE-suite as happens in the rv30 test: > The first decoding thread uses the first slot of its picture array > to store its current pic; update_thread_context copies this for the > second thread that decodes a P-frame. It uses the second slot in its > Picture array to store its P-frame. This arrangement is then copied > to the third decode thread, which decodes a B-frame. It uses the third > slot in its Picture array for its current frame. > update_thread_context copies this to the next thread. It unreferences > the third slot containing the other B-frame and then it reuses this > slot for its current frame. Because the pic array slots are only > incompletely unreferenced, the buffers of the previous B-frame are > still in there and they are not writable; in fact the previous > thread is concurrently writing to them, causing races when making > the buffer writable.) > > Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> > --- > libavcodec/mpegpicture.c | 16 +++++++++++++++- > 1 file changed, 15 insertions(+), 1 deletion(-) This causes a difference in some frames of: (i have not investigates why just reporting as i noticed) quite possibly thats just showing your bugfix in action ./ffmpeg -y -bitexact -i fate/svq3/Vertical400kbit.sorenson3.mov -ps 50 -bf 1 -bitexact -an -qscale 5 -ss 40 -error_rate 4 -threads 1 /tmp/out4.avi ./ffmpeg -y -bitexact -v -1 -loglevel 0 -i /tmp/out4.avi -bitexact -vsync drop -f framecrc - @@ -141,7 +141,7 @@ 0, 22, 22, 1, 115200, 0x4cc933e9 0, 23, 23, 1, 115200, 0x693a40a9 0, 24, 24, 1, 115200, 0x956e3b15 -0, 25, 25, 1, 115200, 0x61763ff4 +0, 25, 25, 1, 115200, 0xc9e53d1c 0, 26, 26, 1, 115200, 0x5c5c3dfc 0, 27, 27, 1, 115200, 0x7de641ea 0, 28, 28, 1, 115200, 0xe6cc4136 @@ -187,7 +187,7 @@ 0, 68, 68, 1, 115200, 0x49dcbf4e 0, 69, 69, 1, 115200, 0x1ea1c7d1 0, 70, 70, 1, 115200, 0xdf77c67b -0, 71, 71, 1, 115200, 0x7f6bd16d +0, 71, 71, 1, 115200, 0x33d9d206 0, 72, 72, 1, 115200, 0x5e37cb3a 0, 73, 73, 1, 115200, 0x15abcda3 0, 74, 74, 1, 115200, 0xbf4dcbd4 @@ -199,7 +199,7 @@ 0, 80, 80, 1, 115200, 0x17d1d667 0, 81, 81, 1, 115200, 0x0c1fdf9c 0, 82, 82, 1, 115200, 0x7eabde6b -0, 83, 83, 1, 115200, 0xe623e7af +0, 83, 83, 1, 115200, 0x3bf6e873 0, 84, 84, 1, 115200, 0xf480dc82 0, 85, 85, 1, 115200, 0x5fd6e098 0, 86, 86, 1, 115200, 0xf520de95 @@ -211,7 +211,7 @@ 0, 92, 92, 1, 115200, 0x34cfe1c2 0, 93, 93, 1, 115200, 0x1d94e1c3 0, 94, 94, 1, 115200, 0x6d32e147 -0, 95, 95, 1, 115200, 0x7e40ee91 +0, 95, 95, 1, 115200, 0x09fbefd0 0, 96, 96, 1, 115200, 0xa5f5eb43 0, 97, 97, 1, 115200, 0x39b9ec3d 0, 98, 98, 1, 115200, 0x3256ec18 [...]
Michael Niedermayer: > On Sat, Aug 13, 2022 at 05:03:04PM +0200, Andreas Rheinhardt wrote: >> mpegvideo uses an array of Pictures and when it is done with using >> them, it only unreferences them incompletely: Some buffers are kept >> so that they can be reused lateron if the same slot in the Picture >> array is reused, making this a sort of a bufferpool. >> (Basically, a Picture is considered used if the AVFrame's buf is set.) >> Yet given that other pieces of the decoder may have a reference to >> these buffers, they need not be writable and are made writable using >> av_buffer_make_writable() when preparing a new Picture. This involves >> reading the buffer's data, although the old content of the buffer >> need not be retained. >> >> Worse, this read can be racy, because the buffer can be used by another >> thread at the same time. This happens for Real Video 3 and 4. >> >> This commit fixes this race by no longer copying the data; >> instead the old buffer is replaced by a new, zero-allocated buffer. >> >> (Here are the details of what happens with three or more decoding threads >> when decoding rv30.rm from the FATE-suite as happens in the rv30 test: >> The first decoding thread uses the first slot of its picture array >> to store its current pic; update_thread_context copies this for the >> second thread that decodes a P-frame. It uses the second slot in its >> Picture array to store its P-frame. This arrangement is then copied >> to the third decode thread, which decodes a B-frame. It uses the third >> slot in its Picture array for its current frame. >> update_thread_context copies this to the next thread. It unreferences >> the third slot containing the other B-frame and then it reuses this >> slot for its current frame. Because the pic array slots are only >> incompletely unreferenced, the buffers of the previous B-frame are >> still in there and they are not writable; in fact the previous >> thread is concurrently writing to them, causing races when making >> the buffer writable.) >> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> >> --- >> libavcodec/mpegpicture.c | 16 +++++++++++++++- >> 1 file changed, 15 insertions(+), 1 deletion(-) > > This causes a difference in some frames of: (i have not investigates why > just reporting as i noticed) > quite possibly thats just showing your bugfix in action > > ./ffmpeg -y -bitexact -i fate/svq3/Vertical400kbit.sorenson3.mov -ps 50 -bf 1 -bitexact -an -qscale 5 -ss 40 -error_rate 4 -threads 1 /tmp/out4.avi > ./ffmpeg -y -bitexact -v -1 -loglevel 0 -i /tmp/out4.avi -bitexact -vsync drop -f framecrc - > > @@ -141,7 +141,7 @@ > 0, 22, 22, 1, 115200, 0x4cc933e9 > 0, 23, 23, 1, 115200, 0x693a40a9 > 0, 24, 24, 1, 115200, 0x956e3b15 > -0, 25, 25, 1, 115200, 0x61763ff4 > +0, 25, 25, 1, 115200, 0xc9e53d1c > 0, 26, 26, 1, 115200, 0x5c5c3dfc > 0, 27, 27, 1, 115200, 0x7de641ea > 0, 28, 28, 1, 115200, 0xe6cc4136 > @@ -187,7 +187,7 @@ > 0, 68, 68, 1, 115200, 0x49dcbf4e > 0, 69, 69, 1, 115200, 0x1ea1c7d1 > 0, 70, 70, 1, 115200, 0xdf77c67b > -0, 71, 71, 1, 115200, 0x7f6bd16d > +0, 71, 71, 1, 115200, 0x33d9d206 > 0, 72, 72, 1, 115200, 0x5e37cb3a > 0, 73, 73, 1, 115200, 0x15abcda3 > 0, 74, 74, 1, 115200, 0xbf4dcbd4 > @@ -199,7 +199,7 @@ > 0, 80, 80, 1, 115200, 0x17d1d667 > 0, 81, 81, 1, 115200, 0x0c1fdf9c > 0, 82, 82, 1, 115200, 0x7eabde6b > -0, 83, 83, 1, 115200, 0xe623e7af > +0, 83, 83, 1, 115200, 0x3bf6e873 > 0, 84, 84, 1, 115200, 0xf480dc82 > 0, 85, 85, 1, 115200, 0x5fd6e098 > 0, 86, 86, 1, 115200, 0xf520de95 > @@ -211,7 +211,7 @@ > 0, 92, 92, 1, 115200, 0x34cfe1c2 > 0, 93, 93, 1, 115200, 0x1d94e1c3 > 0, 94, 94, 1, 115200, 0x6d32e147 > -0, 95, 95, 1, 115200, 0x7e40ee91 > +0, 95, 95, 1, 115200, 0x09fbefd0 > 0, 96, 96, 1, 115200, 0xa5f5eb43 > 0, 97, 97, 1, 115200, 0x39b9ec3d > 0, 98, 98, 1, 115200, 0x3256ec18 > > [...] > Decoding this sample uses earlier values of mbskip_table. If I zero mbskip_table_buf in every ff_alloc_picture(), nothing changes due to this patch and (most importantly) the output of decoding does not depend upon the number of threads used (which it currently does -- with and without this patch). I don't know exactly where the bug is or whether there is a cheaper way to mitigate it than just to zero the buffer every time. I guess there are more such bugs affecting damaged samples than we are currently aware of. - Andreas
On Tue, Aug 16, 2022 at 10:38:55PM +0200, Andreas Rheinhardt wrote: > Michael Niedermayer: > > On Sat, Aug 13, 2022 at 05:03:04PM +0200, Andreas Rheinhardt wrote: > >> mpegvideo uses an array of Pictures and when it is done with using > >> them, it only unreferences them incompletely: Some buffers are kept > >> so that they can be reused lateron if the same slot in the Picture > >> array is reused, making this a sort of a bufferpool. > >> (Basically, a Picture is considered used if the AVFrame's buf is set.) > >> Yet given that other pieces of the decoder may have a reference to > >> these buffers, they need not be writable and are made writable using > >> av_buffer_make_writable() when preparing a new Picture. This involves > >> reading the buffer's data, although the old content of the buffer > >> need not be retained. > >> > >> Worse, this read can be racy, because the buffer can be used by another > >> thread at the same time. This happens for Real Video 3 and 4. > >> > >> This commit fixes this race by no longer copying the data; > >> instead the old buffer is replaced by a new, zero-allocated buffer. > >> > >> (Here are the details of what happens with three or more decoding threads > >> when decoding rv30.rm from the FATE-suite as happens in the rv30 test: > >> The first decoding thread uses the first slot of its picture array > >> to store its current pic; update_thread_context copies this for the > >> second thread that decodes a P-frame. It uses the second slot in its > >> Picture array to store its P-frame. This arrangement is then copied > >> to the third decode thread, which decodes a B-frame. It uses the third > >> slot in its Picture array for its current frame. > >> update_thread_context copies this to the next thread. It unreferences > >> the third slot containing the other B-frame and then it reuses this > >> slot for its current frame. Because the pic array slots are only > >> incompletely unreferenced, the buffers of the previous B-frame are > >> still in there and they are not writable; in fact the previous > >> thread is concurrently writing to them, causing races when making > >> the buffer writable.) > >> > >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> > >> --- > >> libavcodec/mpegpicture.c | 16 +++++++++++++++- > >> 1 file changed, 15 insertions(+), 1 deletion(-) > > > > This causes a difference in some frames of: (i have not investigates why > > just reporting as i noticed) > > quite possibly thats just showing your bugfix in action > > > > ./ffmpeg -y -bitexact -i fate/svq3/Vertical400kbit.sorenson3.mov -ps 50 -bf 1 -bitexact -an -qscale 5 -ss 40 -error_rate 4 -threads 1 /tmp/out4.avi > > ./ffmpeg -y -bitexact -v -1 -loglevel 0 -i /tmp/out4.avi -bitexact -vsync drop -f framecrc - > > > > @@ -141,7 +141,7 @@ > > 0, 22, 22, 1, 115200, 0x4cc933e9 > > 0, 23, 23, 1, 115200, 0x693a40a9 > > 0, 24, 24, 1, 115200, 0x956e3b15 > > -0, 25, 25, 1, 115200, 0x61763ff4 > > +0, 25, 25, 1, 115200, 0xc9e53d1c > > 0, 26, 26, 1, 115200, 0x5c5c3dfc > > 0, 27, 27, 1, 115200, 0x7de641ea > > 0, 28, 28, 1, 115200, 0xe6cc4136 > > @@ -187,7 +187,7 @@ > > 0, 68, 68, 1, 115200, 0x49dcbf4e > > 0, 69, 69, 1, 115200, 0x1ea1c7d1 > > 0, 70, 70, 1, 115200, 0xdf77c67b > > -0, 71, 71, 1, 115200, 0x7f6bd16d > > +0, 71, 71, 1, 115200, 0x33d9d206 > > 0, 72, 72, 1, 115200, 0x5e37cb3a > > 0, 73, 73, 1, 115200, 0x15abcda3 > > 0, 74, 74, 1, 115200, 0xbf4dcbd4 > > @@ -199,7 +199,7 @@ > > 0, 80, 80, 1, 115200, 0x17d1d667 > > 0, 81, 81, 1, 115200, 0x0c1fdf9c > > 0, 82, 82, 1, 115200, 0x7eabde6b > > -0, 83, 83, 1, 115200, 0xe623e7af > > +0, 83, 83, 1, 115200, 0x3bf6e873 > > 0, 84, 84, 1, 115200, 0xf480dc82 > > 0, 85, 85, 1, 115200, 0x5fd6e098 > > 0, 86, 86, 1, 115200, 0xf520de95 > > @@ -211,7 +211,7 @@ > > 0, 92, 92, 1, 115200, 0x34cfe1c2 > > 0, 93, 93, 1, 115200, 0x1d94e1c3 > > 0, 94, 94, 1, 115200, 0x6d32e147 > > -0, 95, 95, 1, 115200, 0x7e40ee91 > > +0, 95, 95, 1, 115200, 0x09fbefd0 > > 0, 96, 96, 1, 115200, 0xa5f5eb43 > > 0, 97, 97, 1, 115200, 0x39b9ec3d > > 0, 98, 98, 1, 115200, 0x3256ec18 > > > > [...] > > > > Decoding this sample uses earlier values of mbskip_table. If I zero > mbskip_table_buf in every ff_alloc_picture(), nothing changes due to > this patch and (most importantly) the output of decoding does not depend > upon the number of threads used (which it currently does -- with and > without this patch). > I don't know exactly where the bug is or whether there is a cheaper way > to mitigate it than just to zero the buffer every time. > I guess there are more such bugs affecting damaged samples than we are > currently aware of. Theres code in ff_er_frame_end() which cleans the mbskip_table I dont remember it very much but it looks like that should not be optional If this will not be run mbskip_table should maybe be zeroed on "alloc" not sure this is the issue, iam just looking at the code thx [...]
diff --git a/libavcodec/mpegpicture.c b/libavcodec/mpegpicture.c index 2192f74cea..ed96abbe2d 100644 --- a/libavcodec/mpegpicture.c +++ b/libavcodec/mpegpicture.c @@ -47,11 +47,25 @@ static void av_noinline free_picture_tables(Picture *pic) } } +static int make_table_writable(AVBufferRef **ref) +{ + AVBufferRef *old = *ref, *new; + + if (av_buffer_is_writable(old)) + return 0; + new = av_buffer_allocz(old->size); + if (!new) + return AVERROR(ENOMEM); + av_buffer_unref(ref); + *ref = new; + return 0; +} + static int make_tables_writable(Picture *pic) { #define MAKE_WRITABLE(table) \ do {\ - int ret = av_buffer_make_writable(&pic->table); \ + int ret = make_table_writable(&pic->table); \ if (ret < 0) \ return ret; \ } while (0)
mpegvideo uses an array of Pictures and when it is done with using them, it only unreferences them incompletely: Some buffers are kept so that they can be reused lateron if the same slot in the Picture array is reused, making this a sort of a bufferpool. (Basically, a Picture is considered used if the AVFrame's buf is set.) Yet given that other pieces of the decoder may have a reference to these buffers, they need not be writable and are made writable using av_buffer_make_writable() when preparing a new Picture. This involves reading the buffer's data, although the old content of the buffer need not be retained. Worse, this read can be racy, because the buffer can be used by another thread at the same time. This happens for Real Video 3 and 4. This commit fixes this race by no longer copying the data; instead the old buffer is replaced by a new, zero-allocated buffer. (Here are the details of what happens with three or more decoding threads when decoding rv30.rm from the FATE-suite as happens in the rv30 test: The first decoding thread uses the first slot of its picture array to store its current pic; update_thread_context copies this for the second thread that decodes a P-frame. It uses the second slot in its Picture array to store its P-frame. This arrangement is then copied to the third decode thread, which decodes a B-frame. It uses the third slot in its Picture array for its current frame. update_thread_context copies this to the next thread. It unreferences the third slot containing the other B-frame and then it reuses this slot for its current frame. Because the pic array slots are only incompletely unreferenced, the buffers of the previous B-frame are still in there and they are not writable; in fact the previous thread is concurrently writing to them, causing races when making the buffer writable.) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> --- libavcodec/mpegpicture.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)