
[FFmpeg-devel,v2] avfilter/pthread: rewrite implementation

Message ID 20170707234506.GU4727@nb4
State New

Commit Message

Michael Niedermayer July 7, 2017, 11:45 p.m. UTC
On Fri, Jul 07, 2017 at 09:04:37PM +0700, Muhammad Faiz wrote:
> Avoid pthread_cond_broadcast, which wakes up all workers. Make each of them
> use a distinct mutex/cond. Also let the main thread help running jobs.
> 
> Benchmark using afir with threads=5 and 4096 taps fir:
> channels=1:
> old:
>     1849650 decicycles in afir_execute,       2 runs,      0 skips
>     1525719 decicycles in afir_execute,    1024 runs,      0 skips
>     1546032 decicycles in afir_execute,   16356 runs,     28 skips
> new:
>     1495525 decicycles in afir_execute,       2 runs,      0 skips
>      968897 decicycles in afir_execute,    1024 runs,      0 skips
>      941286 decicycles in afir_execute,   16384 runs,      0 skips
> 
> channels=2:
> old:
>     3135485 decicycles in afir_execute,       2 runs,      0 skips
>     1967158 decicycles in afir_execute,    1024 runs,      0 skips
>     1802430 decicycles in afir_execute,   16364 runs,     20 skips
> new:
>     1864750 decicycles in afir_execute,       2 runs,      0 skips
>     1437792 decicycles in afir_execute,    1024 runs,      0 skips
>     1183963 decicycles in afir_execute,   16382 runs,      2 skips
> 
> channels=4:
> old:
>     4879925 decicycles in afir_execute,       2 runs,      0 skips
>     3557950 decicycles in afir_execute,    1022 runs,      2 skips
>     3206843 decicycles in afir_execute,   16379 runs,      5 skips
> new:
>     2962320 decicycles in afir_execute,       2 runs,      0 skips
>     2450430 decicycles in afir_execute,    1024 runs,      0 skips
>     2446219 decicycles in afir_execute,   16383 runs,      1 skips
> 
> channels=8:
> old:
>     6032455 decicycles in afir_execute,       2 runs,      0 skips
>     4838614 decicycles in afir_execute,    1023 runs,      1 skips
>     4720760 decicycles in afir_execute,   16369 runs,     15 skips
> new:
>     5228150 decicycles in afir_execute,       2 runs,      0 skips
>     4592129 decicycles in afir_execute,    1023 runs,      1 skips
>     4469067 decicycles in afir_execute,   16383 runs,      1 skips

this causes a strange change:

./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg  -vcodec libxavs  -vf scale=80x60 -t 1 file3.nut

results in different files before and after this patch. Neither plays.
I suspect this is not a bug in the patch but something odd elsewhere,
but I don't know.

-rw-r----- 1 michael michael 57671 Jul  8 00:48 file3.nut
-rw-r----- 1 michael michael 62162 Jul  8 00:48 file3p.nut

framecrc difference of video with -vcodec copy -copyinkf -f framecrc




[...]

Comments

Muhammad Faiz July 8, 2017, 2:06 a.m. UTC | #1
On Sat, Jul 8, 2017 at 6:45 AM, Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Fri, Jul 07, 2017 at 09:04:37PM +0700, Muhammad Faiz wrote:
>> [...]
>
> this causes a strange change:
>
> ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg  -vcodec libxavs  -vf scale=80x60 -t 1 file3.nut
>
> results in different files before and after this patch. Neither plays.
> I suspect this is not a bug in the patch but something odd elsewhere,
> but I don't know.
>
> -rw-r----- 1 michael michael 57671 Jul  8 00:48 file3.nut
> -rw-r----- 1 michael michael 62162 Jul  8 00:48 file3p.nut
>
> framecrc difference of video with -vcodec copy -copyinkf -f framecrc
>
> --- a   2017-07-08 01:41:49.717555033 +0200
> +++ b   2017-07-08 01:42:09.877555273 +0200
> @@ -5,28 +5,28 @@
>  #dimensions 0: 80x60
>  #sar 0: 1/1
>  0,          0,          0,     2048,     1860, 0xaa19412e, F=0x0
> -0,       2048,       2048,     2048,     1261, 0xd0bd2d34, F=0x0
> -0,       4096,       4096,     2048,     1261, 0x30083a11, F=0x0
> -0,       6144,       6144,     2048,     1347, 0xfd5b5c17, F=0x0
> -0,       8192,       8192,     2048,      933, 0x3e95a0aa, F=0x0
> -0,      10240,      10240,     2048,     1299, 0x5fd141e1, F=0x0
> -0,      12288,      12288,     2048,     1311, 0xcb90563e, F=0x0
> -0,      14336,      14336,     2048,     1288, 0x55553cee, F=0x0
> -0,      16384,      16384,     2048,     1295, 0x68d34476, F=0x0
> -0,      18432,      18432,     2048,     1397, 0xf0646699, F=0x0
> -0,      20480,      20480,     2048,     1353, 0xbd0557f9, F=0x0
> -0,      22528,      22528,     2048,     1358, 0x90095601, F=0x0
> -0,      24576,      24576,     2048,     2004, 0x8de57d88, F=0x0
> -0,      26624,      26624,     2048,     1477, 0x6c099b28, F=0x0
> -0,      28672,      28672,     2048,     1515, 0x2fd78855, F=0x0
> +0,       2048,       2048,     2048,     1827, 0x1a47f795, F=0x0
> +0,       4096,       4096,     2048,     1379, 0x50435dbb, F=0x0
> +0,       6144,       6144,     2048,     1912, 0x9b2529a8, F=0x0
> +0,       8192,       8192,     2048,     1052, 0x6fe1ce3b, F=0x0
> +0,      10240,      10240,     2048,     1862, 0x10e30eae, F=0x0
> +0,      12288,      12288,     2048,     1432, 0x93858555, F=0x0
> +0,      14336,      14336,     2048,     1850, 0xaf3b039d, F=0x0
> +0,      16384,      16384,     2048,     1408, 0x049e668a, F=0x0
> +0,      18432,      18432,     2048,     1956, 0x751c36c6, F=0x0
> +0,      20480,      20480,     2048,     1465, 0xb6e58045, F=0x0
> +0,      22528,      22528,     2048,     1916, 0x22dc1fe7, F=0x0
> +0,      24576,      24576,     2048,     2038, 0x56548c7c, F=0x0
> +0,      26624,      26624,     2048,     1490, 0x7e42a072, F=0x0
> +0,      28672,      28672,     2048,     1521, 0x6e128b71, F=0x0
>  0,      30720,      30720,     2048,     1523, 0xa5819af8, F=0x0
>  0,      32768,      32768,     2048,     1528, 0x9898a156, F=0x0
> -0,      34816,      34816,     2048,     1601, 0x9873cdf4, F=0x0
> +0,      34816,      34816,     2048,     1613, 0x5e97d399, F=0x0
>  0,      36864,      36864,     2048,     1597, 0xf02ad0e6, F=0x0
> -0,      38912,      38912,     2048,     1620, 0x4da2da72, F=0x0
> -0,      40960,      40960,     2048,     1668, 0xb794dc64, F=0x0
> +0,      38912,      38912,     2048,     1625, 0x9bdddcbb, F=0x0
> +0,      40960,      40960,     2048,     1673, 0x0a11de8f, F=0x0
>  0,      43008,      43008,     2048,     1678, 0xd81de01a, F=0x0
>  0,      45056,      45056,     2048,     1647, 0x5ca1d51c, F=0x0
> -0,      47104,      47104,     2048,     1643, 0x8010d916, F=0x0
> -0,      49152,      49152,     2048,     2118, 0x985ea130, F=0x0
> +0,      47104,      47104,     2048,     1655, 0xca23ddf2, F=0x0
> +0,      49152,      49152,     2048,     2158, 0x4619aaac, F=0x0
>  0,      49153,      49153,     2048,        4, 0x00b300b2, F=0x0

I cannot reproduce this. I tested it and the two files are not different.

Thanks.
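For anyone trying to reproduce the comparison Michael describes, the workflow is roughly the following (binary names and the sample path are placeholders for pre- and post-patch builds):

```shell
# Encode the same input once with each build.
./ffmpeg_old -i matrixbench_mpeg2.mpg -vcodec libxavs -vf scale=80x60 -t 1 file3.nut
./ffmpeg_new -i matrixbench_mpeg2.mpg -vcodec libxavs -vf scale=80x60 -t 1 file3p.nut

# Dump per-frame CRCs of the video stream without re-encoding, then diff.
./ffmpeg -i file3.nut  -vcodec copy -copyinkf -f framecrc - > a
./ffmpeg -i file3p.nut -vcodec copy -copyinkf -f framecrc - > b
diff -u a b
```

The framecrc muxer prints one line per packet (stream, pts, dts, duration, size, CRC), which is what the diff above compares.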
Michael Niedermayer July 8, 2017, 2:26 a.m. UTC | #2
On Sat, Jul 08, 2017 at 09:06:05AM +0700, Muhammad Faiz wrote:
> On Sat, Jul 8, 2017 at 6:45 AM, Michael Niedermayer
> <michael@niedermayer.cc> wrote:
> > [...]
> 
> I cannot reproduce this. I tested it and the two files are not different.

OK, then I would guess it's unrelated to this patch and just
occurred by chance due to changed timing or something.


[...]
wm4 July 10, 2017, 8:53 a.m. UTC | #3
On Sat, 8 Jul 2017 01:45:06 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> On Fri, Jul 07, 2017 at 09:04:37PM +0700, Muhammad Faiz wrote:
> > [...]
> 
> this causes a strange change:
> 
> ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg  -vcodec libxavs  -vf scale=80x60 -t 1 file3.nut
> 
> results in different files before and after this patch. Neither plays.
> I suspect this is not a bug in the patch but something odd elsewhere,
> but I don't know.

OK, so you're saying there's no bug. Something changed, and you're too
lazy to investigate, but I guess he has all the time in the world.

So why should he care?
Michael Niedermayer July 10, 2017, 9:40 p.m. UTC | #4
On Mon, Jul 10, 2017 at 10:53:42AM +0200, wm4 wrote:
> On Sat, 8 Jul 2017 01:45:06 +0200
> Michael Niedermayer <michael@niedermayer.cc> wrote:
> 
> > [...]
> 

> OK, so you're saying there's no bug.

No, I didn't say that.


> Something changed,

yes


> and you're too
> lazy to investigate,

No, I didn't say that either, nor is that true.


> but I guess he has all the time in the world.
> 
> So why should he care?

You seem to be trying to imply that I asked Muhammad to look into this
or take care of or fix the issue.

Please read my reply again, it's just 3 lines. I have not asked
Muhammad to do anything, nor was it intended to imply that.
It is just a report of some odd findings which were triggered by
the patch but which quite possibly are unrelated to the patch except
by coincidence.

Thanks

[...]
James Almer July 10, 2017, 9:42 p.m. UTC | #5
On 7/10/2017 5:53 AM, wm4 wrote:
> On Sat, 8 Jul 2017 01:45:06 +0200
> Michael Niedermayer <michael@niedermayer.cc> wrote:
> 
>> [...]
> 
> OK, so you're saying there's no bug. Something changed, and you're too
> lazy to investigate, but I guess he has all the time in the world.
> 
> So why should he care?

Tone it down already. You're being unnecessarily aggressive in a lot of
your recent emails.
Muhammad Faiz July 11, 2017, 4:18 a.m. UTC | #6
On Tue, Jul 11, 2017 at 4:40 AM, Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Mon, Jul 10, 2017 at 10:53:42AM +0200, wm4 wrote:
>> On Sat, 8 Jul 2017 01:45:06 +0200
>> Michael Niedermayer <michael@niedermayer.cc> wrote:
>>
> [...]
> You seem to be trying to imply that I asked Muhammad to look into this
> or take care of or fix the issue.
>
> Please read my reply again, it's just 3 lines. I have not asked
> Muhammad to do anything, nor was it intended to imply that.
> It is just a report of some odd findings which were triggered by
> the patch but which quite possibly are unrelated to the patch except
> by coincidence.

This patch probably introduces a race condition. The silence of tsan is
artificial; a similar patch in avcodec cannot silence the tsan warnings.
I'm currently reworking this.
My plan is to merge the slice threading implementations into avutil.

Thanks.
Clément Bœsch July 11, 2017, 6:24 a.m. UTC | #7
On Tue, Jul 11, 2017 at 11:18:42AM +0700, Muhammad Faiz wrote:
[...]
> My plan is to merge the slice threading implementations into avutil.
> 

As a public API? With the .c include "trick"? avpriv_? I don't know what
the current consensus is, but people never seem to agree about the path to
take in that regard. You'll also get the discussion about merging all the
libs into one.

If you don't want to lose motivation, you're probably better off sending
working patches with duplication first. But maybe that's what you had in
mind anyway.
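For illustration only, one hypothetical shape a shared avutil slice-threading API could take. Every name below is invented for this sketch and does not correspond to an actual FFmpeg header:

```c
/* Hypothetical header sketch: a shared slice-threading pool that both
 * avcodec and avfilter could sit on top of. Invented names throughout. */
typedef struct AVSliceThread AVSliceThread;

/* Create a pool of nb_threads workers. worker_func is invoked once per
 * job with the job index and the index of the thread running it. */
int av_slicethread_create(AVSliceThread **ctx, void *priv,
                          void (*worker_func)(void *priv, int jobnr,
                                              int threadnr, int nb_jobs,
                                              int nb_threads),
                          int nb_threads);

/* Run nb_jobs jobs and block until all complete; the calling thread
 * also executes jobs, as in the patch under discussion. */
void av_slicethread_execute(AVSliceThread *ctx, int nb_jobs);

void av_slicethread_free(AVSliceThread **ctx);
```

Whether such an API should be public, avpriv_-prefixed, or shared via a .c include is exactly the open question raised above.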

Thanks for your work on this, the numbers are pretty cool.

Note on tsan: you may want to increase the number of threads to increase
your "chances" of triggering some races.
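A rough sketch of that tsan workflow, assuming FFmpeg's configure toolchain presets are available in your tree (command and file names below are placeholders; adjust to the case you are testing):

```shell
# Build with ThreadSanitizer instrumentation.
./configure --toolchain=clang-tsan
# or pass the flag directly:
#   ./configure --extra-cflags=-fsanitize=thread --extra-ldflags=-fsanitize=thread
make -j"$(nproc)"

# Run with more threads than cores to widen the set of interleavings,
# increasing the chance that a latent race is actually exercised.
./ffmpeg -threads 16 -i input.mkv -vf scale=1280:720 output.mkv
```

tsan only reports races on interleavings that actually occur during the run, which is why oversubscribing threads helps.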
wm4 July 11, 2017, 8:08 a.m. UTC | #8
On Mon, 10 Jul 2017 23:40:10 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> [...]
> You seem to be trying to imply that I asked Muhammad to look into this
> or take care of or fix the issue.
>
> Please read my reply again, it's just 3 lines. I have not asked
> Muhammad to do anything, nor was it intended to imply that.
> It is just a report of some odd findings which were triggered by
> the patch but which quite possibly are unrelated to the patch except
> by coincidence.

So he can just ignore your mail? What was your mail supposed to
achieve? You basically wrote that something happened and that it
probably has nothing to do with the patch. Why didn't you investigate
whether it did before writing that? Why post something completely
unrelated? It's not helpful.
Marton Balint July 11, 2017, 9:08 a.m. UTC | #9
On Tue, 11 Jul 2017, wm4 wrote:

> On Mon, 10 Jul 2017 23:40:10 +0200
> Michael Niedermayer <michael@niedermayer.cc> wrote:
>
>> On Mon, Jul 10, 2017 at 10:53:42AM +0200, wm4 wrote:
>> > On Sat, 8 Jul 2017 01:45:06 +0200
>> > Michael Niedermayer <michael@niedermayer.cc> wrote:
>> > 
>> > > On Fri, Jul 07, 2017 at 09:04:37PM +0700, Muhammad Faiz wrote: 
>> > > > Avoid pthread_cond_broadcast that wakes up all workers. Make each of them
>> > > > use a distinct mutex/cond. Also let the main thread help running jobs.
>> > > > 
>> > > > [benchmark numbers snipped; quoted in full earlier in the thread]
>> > > 
>> [rest of the exchange snipped; quoted in full earlier in the thread]

I really don't see what your problem is. It is completely understandable
if somebody does not track down every strange behaviour; we all have
limited time. Reporting it can still be useful if the issue may be
related to the patch, because investigating it might help improve the
patch or uncover corner cases that are otherwise hard to find.

Regards,
Marton
Nicolas George July 11, 2017, 9:56 a.m. UTC | #10
On tridi, 23 Messidor CCXXV, Clement Boesch wrote:
> As a public API? With the .c include "trick"? avpriv_? I don't know what's
> the current consensus, but people never seem to agree about the path to
> take in that regard. You'll also get the discussion about merging all the
> libs into one.

Well, you brought it up, not me :-)

> If you don't want to loose motivation, you're probably better of sending

Nit: to loose =~ to set free != to lose =~ to misplace.

> working patches with duplication first. But maybe that's what you had in
> mind anyway.

I will not make difficulties for these changes, as they address code
that is already somewhat duplicated and quite tricky. But at some point
we will need to address the issue.

I think that even if nobody starts to implement anything right away, we
should discuss the question and decide the direction we want to go. At
the very least, it would alleviate bikeshedding for future patches and
make it safer for anybody to start working on something without fear of
being eventually rejected.

So I would like to put the following motion to discussion:

  Starting roughly three months from now, linking (including dynamically
  at run time) with different versions of the libraries will not be
  supported. Only linking with the exact same versions (source code and
  build options) will be supported. This applies even if nothing is
  actually implemented to enforce it (but it would be better to have
  something).

It is far from a complete solution, but at least it gives a direction.

Regards,
Muhammad Faiz July 11, 2017, 2:13 p.m. UTC | #11
On Tue, Jul 11, 2017 at 1:24 PM, Clément Bœsch <u@pkh.me> wrote:
> On Tue, Jul 11, 2017 at 11:18:42AM +0700, Muhammad Faiz wrote:
> [...]
>> My plan is to merge the implementation of slice threading into avutil.
>>
>
> As a public API? With the .c include "trick"? avpriv_? I don't know what's
> the current consensus, but people never seem to agree about the path to
> take in that regard. You'll also get the discussion about merging all the
> libs into one.

avpriv_

>
> If you don't want to loose motivation, you're probably better of sending
> working patches with duplication first. But maybe that's what you had in
> mind anyway.

Probably I will send avpriv_; it simplifies my current work.

>
> Thanks for your work on this, the numbers are pretty cool.
>
> Note on tsan: you may want to increase the number of threads to increase
> your "chances" to trigger some races.
>
> --
> Clément B.
>
diff mbox

Patch

--- a   2017-07-08 01:41:49.717555033 +0200
+++ b   2017-07-08 01:42:09.877555273 +0200
@@ -5,28 +5,28 @@ 
 #dimensions 0: 80x60
 #sar 0: 1/1
 0,          0,          0,     2048,     1860, 0xaa19412e, F=0x0
-0,       2048,       2048,     2048,     1261, 0xd0bd2d34, F=0x0
-0,       4096,       4096,     2048,     1261, 0x30083a11, F=0x0
-0,       6144,       6144,     2048,     1347, 0xfd5b5c17, F=0x0
-0,       8192,       8192,     2048,      933, 0x3e95a0aa, F=0x0
-0,      10240,      10240,     2048,     1299, 0x5fd141e1, F=0x0
-0,      12288,      12288,     2048,     1311, 0xcb90563e, F=0x0
-0,      14336,      14336,     2048,     1288, 0x55553cee, F=0x0
-0,      16384,      16384,     2048,     1295, 0x68d34476, F=0x0
-0,      18432,      18432,     2048,     1397, 0xf0646699, F=0x0
-0,      20480,      20480,     2048,     1353, 0xbd0557f9, F=0x0
-0,      22528,      22528,     2048,     1358, 0x90095601, F=0x0
-0,      24576,      24576,     2048,     2004, 0x8de57d88, F=0x0
-0,      26624,      26624,     2048,     1477, 0x6c099b28, F=0x0
-0,      28672,      28672,     2048,     1515, 0x2fd78855, F=0x0
+0,       2048,       2048,     2048,     1827, 0x1a47f795, F=0x0
+0,       4096,       4096,     2048,     1379, 0x50435dbb, F=0x0
+0,       6144,       6144,     2048,     1912, 0x9b2529a8, F=0x0
+0,       8192,       8192,     2048,     1052, 0x6fe1ce3b, F=0x0
+0,      10240,      10240,     2048,     1862, 0x10e30eae, F=0x0
+0,      12288,      12288,     2048,     1432, 0x93858555, F=0x0
+0,      14336,      14336,     2048,     1850, 0xaf3b039d, F=0x0
+0,      16384,      16384,     2048,     1408, 0x049e668a, F=0x0
+0,      18432,      18432,     2048,     1956, 0x751c36c6, F=0x0
+0,      20480,      20480,     2048,     1465, 0xb6e58045, F=0x0
+0,      22528,      22528,     2048,     1916, 0x22dc1fe7, F=0x0
+0,      24576,      24576,     2048,     2038, 0x56548c7c, F=0x0
+0,      26624,      26624,     2048,     1490, 0x7e42a072, F=0x0
+0,      28672,      28672,     2048,     1521, 0x6e128b71, F=0x0
 0,      30720,      30720,     2048,     1523, 0xa5819af8, F=0x0
 0,      32768,      32768,     2048,     1528, 0x9898a156, F=0x0
-0,      34816,      34816,     2048,     1601, 0x9873cdf4, F=0x0
+0,      34816,      34816,     2048,     1613, 0x5e97d399, F=0x0
 0,      36864,      36864,     2048,     1597, 0xf02ad0e6, F=0x0
-0,      38912,      38912,     2048,     1620, 0x4da2da72, F=0x0
-0,      40960,      40960,     2048,     1668, 0xb794dc64, F=0x0
+0,      38912,      38912,     2048,     1625, 0x9bdddcbb, F=0x0
+0,      40960,      40960,     2048,     1673, 0x0a11de8f, F=0x0
 0,      43008,      43008,     2048,     1678, 0xd81de01a, F=0x0
 0,      45056,      45056,     2048,     1647, 0x5ca1d51c, F=0x0
-0,      47104,      47104,     2048,     1643, 0x8010d916, F=0x0
-0,      49152,      49152,     2048,     2118, 0x985ea130, F=0x0
+0,      47104,      47104,     2048,     1655, 0xca23ddf2, F=0x0
+0,      49152,      49152,     2048,     2158, 0x4619aaac, F=0x0
 0,      49153,      49153,     2048,        4, 0x00b300b2, F=0x0