Message ID | 20160830190818.GA1729416@phare.normalesup.org |
---|---|
State | RFC |
Headers | show |
On Tue, Aug 30, 2016 at 09:08:18PM +0200, Nicolas George wrote: > Le quartidi 14 fructidor, an CCXXIV, Paul B Mahol a écrit : > > the filter frame multithreading would just internally, in filter context > > cache frames, once enough frames are in cache - call workers and be done, > > repeat. At eof call workers on remaining frames in cache. > > I have no idea how much thought you have already given to it, but I am > pretty sure it is not as simple as that with the current architecture. By > far. > > In the meantime, I finally got the non-recursive version passing FATE. Here > are the raw patch, so that people can have an idea what this is all about. > There is still a lot of cleanup and documentation to do, as you can see. > > Regards, > > -- > Nicolas George ./ffmpeg -i tickets//679/oversized_pgs_subtitles.mkv -filter_complex '[0:s:1]scale=848x480,[0:v]overlay=shortest=1' test.avi fails assertion: Assertion progress failed at libavfilter/avfilter.c:1391 https://trac.ffmpeg.org/attachment/ticket/679/oversized_pgs_subtitles.mkv ffmpeg -v 0 -i tickets/3539/crash.swf -map 0 -t 1 -f framecrc - output changes, not sure this is a bug but reporting it anyway as i noticed http://samples.ffmpeg.org/ffmpeg-bugs/trac/ticket3539/ doc/examples/filtering_video matrixbench_mpeg2.mpg also breaks [...]
On 8/30/16, Nicolas George <george@nsup.org> wrote: > Le quartidi 14 fructidor, an CCXXIV, Paul B Mahol a ecrit : >> the filter frame multithreading would just internally, in filter context >> cache frames, once enough frames are in cache - call workers and be done, >> repeat. At eof call workers on remaining frames in cache. > > I have no idea how much thought you have already given to it, but I am > pretty sure it is not as simple as that with the current architecture. By > far. Yes, it is very simple. You just need to allocate own buffers that would be needed by filter. Than you just call ctx->internal->execute()... > In the meantime, I finally got the non-recursive version passing FATE. Here > are the raw patch, so that people can have an idea what this is all about. > There is still a lot of cleanup and documentation to do, as you can see. Does this fixes buffer queue overflow for specific scenarios?
On Wed, Aug 31, 2016 at 10:18:31AM +0200, Paul B Mahol wrote: > On 8/30/16, Nicolas George <george@nsup.org> wrote: > > Le quartidi 14 fructidor, an CCXXIV, Paul B Mahol a ecrit : > >> the filter frame multithreading would just internally, in filter context > >> cache frames, once enough frames are in cache - call workers and be done, > >> repeat. At eof call workers on remaining frames in cache. > > > > I have no idea how much thought you have already given to it, but I am > > pretty sure it is not as simple as that with the current architecture. By > > far. > > Yes, it is very simple. You just need to allocate own buffers that > would be needed > by filter. Than you just call ctx->internal->execute()... I would have thought that filter_frame() would insert frames into a input fifo and remove & output frames from a output fifo and pass them on to the next filters ff_filter_frame() if the output fifo is empty and the input full it would block (implicitly gving its CPU time to the workers) and there would be backgound threads continuosly running that pick frames out of the input fifo and process them into an output fifo and wake up the main thread if the fifo state changes using execute() would IIUC (please correct me if i misunderstand) only execute when ff_filter_frame() is executed, this would restrict what can execute at the same time, also execute() needs to wait for all threads to finish before it can return, this too limits paralelism compared to continuously running workers that more independantly pick tasks and work on them but maybe i misuderstood how you intend to use execute() [...]
On 8/31/16, Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Aug 31, 2016 at 10:18:31AM +0200, Paul B Mahol wrote: >> On 8/30/16, Nicolas George <george@nsup.org> wrote: >> > Le quartidi 14 fructidor, an CCXXIV, Paul B Mahol a ecrit : >> >> the filter frame multithreading would just internally, in filter >> >> context >> >> cache frames, once enough frames are in cache - call workers and be >> >> done, >> >> repeat. At eof call workers on remaining frames in cache. >> > >> > I have no idea how much thought you have already given to it, but I am >> > pretty sure it is not as simple as that with the current architecture. >> > By >> > far. >> >> Yes, it is very simple. You just need to allocate own buffers that >> would be needed >> by filter. Than you just call ctx->internal->execute()... > > I would have thought that > filter_frame() would insert frames into a input fifo and remove & > output frames from a output fifo and pass them on to the next filters > ff_filter_frame() > if the output fifo is empty and the input full it would block > (implicitly gving its CPU time to the workers) > > and there would be backgound threads continuosly running that pick > frames out of the input fifo and process them into an output fifo > and wake up the main thread if the fifo state changes > > using execute() would IIUC (please correct me if i misunderstand) > only execute when ff_filter_frame() is executed, this would restrict > what can execute at the same time, also execute() needs to > wait for all threads to finish before it can return, this too limits > paralelism compared to continuously running workers that more > independantly pick tasks and work on them > but maybe i misuderstood how you intend to use execute() How this one would work if for example first frame needs 10 seconds to generate and all other needs 1 second? How would you know from your output fifo that you got first frame, so you can pass first frame to next filter(s)? How do you know that your solution is always optimal? (Not saying that mine is anything better) How do you limit number of threads that will specifically work for this filter instance?
Le quintidi 15 fructidor, an CCXXIV, Michael Niedermayer a écrit : > ./ffmpeg -i tickets//679/oversized_pgs_subtitles.mkv -filter_complex '[0:s:1]scale=848x480,[0:v]overlay=shortest=1' test.avi > fails assertion: > Assertion progress failed at libavfilter/avfilter.c:1391 > > https://trac.ffmpeg.org/attachment/ticket/679/oversized_pgs_subtitles.mkv > > ffmpeg -v 0 -i tickets/3539/crash.swf -map 0 -t 1 -f framecrc - > output changes, not sure this is a bug but reporting it anyway as i > noticed > > http://samples.ffmpeg.org/ffmpeg-bugs/trac/ticket3539/ > > doc/examples/filtering_video matrixbench_mpeg2.mpg > also breaks Thanks for the testing. I reproduced the first one (does not fail with "-f null -") and will try to fix all these. Regards,
Le quintidi 15 fructidor, an CCXXIV, Paul B Mahol a écrit : > Yes, it is very simple. You just need to allocate own buffers that > would be needed > by filter. Than you just call ctx->internal->execute()... I am sorry, but with the current API, to have something that really works with threads, not just pseudo-threading that is so full of blocking that it is mostly sequential, it is much more complicated than that. I think. Please prove me wrong. > Does this fixes buffer queue overflow for specific scenarios? Not by itself, but then rewriting framesync to use the link's fifo becomes very easy and fix all the cases that can possibly work. Regards,
On 8/31/16, Nicolas George <george@nsup.org> wrote: > Le quintidi 15 fructidor, an CCXXIV, Paul B Mahol a ecrit : >> Yes, it is very simple. You just need to allocate own buffers that >> would be needed >> by filter. Than you just call ctx->internal->execute()... > > I am sorry, but with the current API, to have something that really works > with threads, not just pseudo-threading that is so full of blocking that it > is mostly sequential, it is much more complicated than that. I think. > Please prove me wrong. How would one use already available workers from lavfi/pthread.c ? If somebody wrote better implementation than me and its faster I will drop mine. > >> Does this fixes buffer queue overflow for specific scenarios? > > Not by itself, but then rewriting framesync to use the link's fifo becomes > very easy and fix all the cases that can possibly work. > > Regards, > > -- > Nicolas George >
On Wed, Aug 31, 2016 at 02:20:27PM +0200, Paul B Mahol wrote: > On 8/31/16, Michael Niedermayer <michael@niedermayer.cc> wrote: > > On Wed, Aug 31, 2016 at 10:18:31AM +0200, Paul B Mahol wrote: > >> On 8/30/16, Nicolas George <george@nsup.org> wrote: > >> > Le quartidi 14 fructidor, an CCXXIV, Paul B Mahol a ecrit : > >> >> the filter frame multithreading would just internally, in filter > >> >> context > >> >> cache frames, once enough frames are in cache - call workers and be > >> >> done, > >> >> repeat. At eof call workers on remaining frames in cache. > >> > > >> > I have no idea how much thought you have already given to it, but I am > >> > pretty sure it is not as simple as that with the current architecture. > >> > By > >> > far. > >> > >> Yes, it is very simple. You just need to allocate own buffers that > >> would be needed > >> by filter. Than you just call ctx->internal->execute()... > > > > I would have thought that > > filter_frame() would insert frames into a input fifo and remove & > > output frames from a output fifo and pass them on to the next filters > > ff_filter_frame() > > if the output fifo is empty and the input full it would block > > (implicitly gving its CPU time to the workers) > > > > and there would be backgound threads continuosly running that pick > > frames out of the input fifo and process them into an output fifo > > and wake up the main thread if the fifo state changes > > > > using execute() would IIUC (please correct me if i misunderstand) > > only execute when ff_filter_frame() is executed, this would restrict > > what can execute at the same time, also execute() needs to > > wait for all threads to finish before it can return, this too limits > > paralelism compared to continuously running workers that more > > independantly pick tasks and work on them > > but maybe i misuderstood how you intend to use execute() > > How this one would work if for example first frame needs 10 seconds to > generate and all other needs 1 second? How would you know from your output > fifo that you got first frame, so you can pass first frame to next filter(s)? one possibility: frame 1 comes in first worker adds a output entry to the end of the output linked list and starts working on frame 1 frame 2 comes in second worker adds a output entry to the end of the output linked list and starts working on frame 2 second worker finishes and replaces/adds its output to its output entry frame 3 comes in, the next output entry (from worker 1) is not finished so nothing can be output yet second worker adds a output entry to the end of the output linked list and starts working on frame 3 first worker finishes and replaces/adds its output to its output entry frame 4 comes in, the next output entry (from worker 1) is ready and is sent to the next filter > > How do you know that your solution is always optimal? (Not saying that > mine is anything better) i dont know if its always optimal, in fact it likely is not always optimal. It seemed simple and reasonably good. > How do you limit number of threads that will specifically work for > this filter instance? i had assumed that a fixed number of worker threads would be used for each filter, some of these may be idle and consume no CPU if there is nothing to do I assumed a fixed number for simplicity, nothing in the design should depend on the number being fixed [...]
Le quintidi 15 fructidor, an CCXXIV, Michael Niedermayer a écrit : > ./ffmpeg -i tickets//679/oversized_pgs_subtitles.mkv -filter_complex '[0:s:1]scale=848x480,[0:v]overlay=shortest=1' test.avi > fails assertion: > Assertion progress failed at libavfilter/avfilter.c:1391 > > https://trac.ffmpeg.org/attachment/ticket/679/oversized_pgs_subtitles.mkv This one was an easy fix. > ffmpeg -v 0 -i tickets/3539/crash.swf -map 0 -t 1 -f framecrc - > output changes, not sure this is a bug but reporting it anyway as i > noticed This one is much more tricky. When a format change is detected, ffmpeg resets the graphs but does not drain them first. With this big change, frames inserted do not reach as far as previously: less immediate processing, more possibility to work in a thread. But the issue also happens with the current code: ffmpeg -i a%02d.png -vf fps=50 -f framecrc - with two input PNGs of different size, will only output the second one, because the first one is buffered by vf_fps and discarded when the graph is reconfigured. This is very tricky, because flushing the graph on format change is not the same thing as flushing it at EOF. Imagine the input format for an overlay change: flushing would mean finishing the background video with the last image before the format change, this is not at all what expected. I do not see a clean way of getting this particular example and all similar ones working. I can see an ugly solution, but I would rather avoid it and wait for a cleaner implementation of handling format changes. Regards,
On 9/4/16, Nicolas George <george@nsup.org> wrote: > Le quintidi 15 fructidor, an CCXXIV, Michael Niedermayer a ecrit : >> ./ffmpeg -i tickets//679/oversized_pgs_subtitles.mkv -filter_complex >> '[0:s:1]scale=848x480,[0:v]overlay=shortest=1' test.avi >> fails assertion: >> Assertion progress failed at libavfilter/avfilter.c:1391 >> >> https://trac.ffmpeg.org/attachment/ticket/679/oversized_pgs_subtitles.mkv > > This one was an easy fix. > >> ffmpeg -v 0 -i tickets/3539/crash.swf -map 0 -t 1 -f framecrc - >> output changes, not sure this is a bug but reporting it anyway as i >> noticed > > This one is much more tricky. When a format change is detected, ffmpeg > resets the graphs but does not drain them first. With this big change, > frames inserted do not reach as far as previously: less immediate > processing, more possibility to work in a thread. > > But the issue also happens with the current code: > > ffmpeg -i a%02d.png -vf fps=50 -f framecrc - > > with two input PNGs of different size, will only output the second one, > because the first one is buffered by vf_fps and discarded when the graph is > reconfigured. > > This is very tricky, because flushing the graph on format change is not the > same thing as flushing it at EOF. Imagine the input format for an overlay > change: flushing would mean finishing the background video with the last > image before the format change, this is not at all what expected. > > I do not see a clean way of getting this particular example and all similar > ones working. I can see an ugly solution, but I would rather avoid it and > wait for a cleaner implementation of handling format changes. And what would that cleaner implementation do? At current rate your lavfi changes will never get in, which sucks.
Le nonidi 19 fructidor, an CCXXIV, Paul B Mahol a écrit : > And what would that cleaner implementation do? There is a rather simple implementation of format change in lavfi: have on each input a boolean flag "can_deal_with_format_change". If a frame with a different format arrives on a filter that does not have the flag, just insert the scale/aresample filter on the link to force the format to be the same. It is not entirely optimal, but it is quite simple and does the work. And it could probably be implemented independently of my changes. But I do not want to spend time on it before finishing this, or it will never end. (Actually, I think we had something like that for buffersrc but it disappeared at some time. avconv also has some very strange code to handle format changes.) > At current rate your lavfi changes will never get in, which sucks. Which is why I would like to be authorized to ignore this kind of hiccups. Format change does not currently work, this particular case used to work only by chance. Can I break it and repair it later? Regards,
On 9/4/16, Nicolas George <george@nsup.org> wrote: > Which is why I would like to be authorized to ignore this kind of hiccups. > Format change does not currently work, this particular case used to work > only by chance. Can I break it and repair it later? I agree, format change in filtergraph, when works it works by pure luck.
On Sun, Sep 04, 2016 at 10:16:57PM +0200, Nicolas George wrote: > Le nonidi 19 fructidor, an CCXXIV, Paul B Mahol a écrit : > > And what would that cleaner implementation do? > > There is a rather simple implementation of format change in lavfi: have on > each input a boolean flag "can_deal_with_format_change". If a frame with a > different format arrives on a filter that does not have the flag, just > insert the scale/aresample filter on the link to force the format to be the > same. > > It is not entirely optimal, but it is quite simple and does the work. > > And it could probably be implemented independently of my changes. But I do > not want to spend time on it before finishing this, or it will never end. > > (Actually, I think we had something like that for buffersrc but it > disappeared at some time. avconv also has some very strange code to handle > format changes.) we have a few filters that should work fine with format changes i would have assumed that still works > > > At current rate your lavfi changes will never get in, which sucks. > > Which is why I would like to be authorized to ignore this kind of hiccups. > Format change does not currently work, this particular case used to work > only by chance. Can I break it and repair it later? I think there are 2 things First, filters simply supporting format changes without any magic or reinitialization, they just work if formats change. This should continue to be so but it also shouldnt really add any complication or am i missing something ? This was the type i was interrested in previously ... (until patch reviews made me loose interrest) Second, graph reinitialization, this is hard to get right and if its done right it still doesnt work for many usecases due to destroyed state. I dont think temporary worsening graph reinitialization is a problem but thats just my oppinion [...]
On 9/4/16, Michael Niedermayer <michael@niedermayer.cc> wrote: > On Sun, Sep 04, 2016 at 10:16:57PM +0200, Nicolas George wrote: >> Le nonidi 19 fructidor, an CCXXIV, Paul B Mahol a ecrit : >> > And what would that cleaner implementation do? >> >> There is a rather simple implementation of format change in lavfi: have >> on >> each input a boolean flag "can_deal_with_format_change". If a frame with >> a >> different format arrives on a filter that does not have the flag, just >> insert the scale/aresample filter on the link to force the format to be >> the >> same. >> >> It is not entirely optimal, but it is quite simple and does the work. >> >> And it could probably be implemented independently of my changes. But I >> do >> not want to spend time on it before finishing this, or it will never end. >> >> (Actually, I think we had something like that for buffersrc but it >> disappeared at some time. avconv also has some very strange code to >> handle >> format changes.) > > we have a few filters that should work fine with format changes > i would have assumed that still works > > >> >> > At current rate your lavfi changes will never get in, which sucks. >> >> Which is why I would like to be authorized to ignore this kind of >> hiccups. >> Format change does not currently work, this particular case used to work >> only by chance. Can I break it and repair it later? > > I think there are 2 things > First, filters simply supporting format changes without any magic or > reinitialization, > they just work if formats change. This should continue to be so > but it also shouldnt really add any complication or am i missing > something ? > This was the type i was interrested in previously ... (until patch > reviews made me loose interrest) > > Second, graph reinitialization, this is hard to get right and if its > done right it still doesnt work for many usecases due to destroyed > state. > I dont think temporary worsening graph reinitialization is a problem > but thats just my oppinion > So everybody agrees, we should proceed.
Le quartidi 24 fructidor, an CCXXIV, Paul B Mahol a écrit :
> So everybody agrees, we should proceed.
I am proceeding, but as you can see in the patch, there is still a fair
amount of work to be done. Still, people can help if they want to speed
things up, especially since a significant part of the work is design
decisions that I can not do alone and will need to be discussed.
What needs to be done (using this mail as a notepad, but including the tasks
where help is required):
- Finish documenting the scheduling and make sure the implementation matches
the documentation.
- Discuss if "private_fields.h" is acceptable or decide another solution.
- Clearly identify and isolate the parts of the scheduling that are needed
only for request_frame()/request_frame() compatibility.
- Decide exactly what parts of the scheduling are the responsibility of
filters (possibly in the compatibility activate function) and what parts
are handled by the framework.
- Think ahead about threading and use wrapper to access fields that will
require locking or synchronization.
- Think about features whose need I realized while trying to get it working:
distinguish productive / processing activation, synchronize several filter
graphs.
Please feel free to ask details about any of these points: not only would
getting interest help me stay motivated, but discussing implementation
details and explaining the design would help me having a clear idea of the
whole system.
Regards,
On 9/10/16, Nicolas George <george@nsup.org> wrote: > Le quartidi 24 fructidor, an CCXXIV, Paul B Mahol a ecrit : >> So everybody agrees, we should proceed. > > I am proceeding, but as you can see in the patch, there is still a fair > amount of work to be done. Still, people can help if they want to speed > things up, especially since a significant part of the work is design > decisions that I can not do alone and will need to be discussed. > > What needs to be done (using this mail as a notepad, but including the > tasks > where help is required): > > - Finish documenting the scheduling and make sure the implementation > matches > the documentation. > > - Discuss if "private_fields.h" is acceptable or decide another solution. > > - Clearly identify and isolate the parts of the scheduling that are needed > only for request_frame()/request_frame() compatibility. > > - Decide exactly what parts of the scheduling are the responsibility of > filters (possibly in the compatibility activate function) and what parts > are handled by the framework. > > - Think ahead about threading and use wrapper to access fields that will > require locking or synchronization. > > - Think about features whose need I realized while trying to get it > working: > distinguish productive / processing activation, synchronize several > filter > graphs. > > Please feel free to ask details about any of these points: not only would > getting interest help me stay motivated, but discussing implementation > details and explaining the design would help me having a clear idea of the > whole system. For start removal of recursiveness is mostly I'm interested in. What needs to be done for that, can I help somehow?
On 9/11/16, Paul B Mahol <onemda@gmail.com> wrote: > On 9/10/16, Nicolas George <george@nsup.org> wrote: >> Le quartidi 24 fructidor, an CCXXIV, Paul B Mahol a ecrit : >>> So everybody agrees, we should proceed. >> >> I am proceeding, but as you can see in the patch, there is still a fair >> amount of work to be done. Still, people can help if they want to speed >> things up, especially since a significant part of the work is design >> decisions that I can not do alone and will need to be discussed. >> >> What needs to be done (using this mail as a notepad, but including the >> tasks >> where help is required): >> >> - Finish documenting the scheduling and make sure the implementation >> matches >> the documentation. >> >> - Discuss if "private_fields.h" is acceptable or decide another solution. >> >> - Clearly identify and isolate the parts of the scheduling that are >> needed >> only for request_frame()/request_frame() compatibility. >> >> - Decide exactly what parts of the scheduling are the responsibility of >> filters (possibly in the compatibility activate function) and what >> parts >> are handled by the framework. >> >> - Think ahead about threading and use wrapper to access fields that will >> require locking or synchronization. >> >> - Think about features whose need I realized while trying to get it >> working: >> distinguish productive / processing activation, synchronize several >> filter >> graphs. >> >> Please feel free to ask details about any of these points: not only would >> getting interest help me stay motivated, but discussing implementation >> details and explaining the design would help me having a clear idea of >> the >> whole system. > > For start removal of recursiveness is mostly I'm interested in. > What needs to be done for that, can I help somehow? > Hi, So what's remain to be done to have frame threading in lavfi?
On 5/2/18, Paul B Mahol <onemda@gmail.com> wrote: > On 9/11/16, Paul B Mahol <onemda@gmail.com> wrote: >> On 9/10/16, Nicolas George <george@nsup.org> wrote: >>> Le quartidi 24 fructidor, an CCXXIV, Paul B Mahol a ecrit : >>>> So everybody agrees, we should proceed. >>> >>> I am proceeding, but as you can see in the patch, there is still a fair >>> amount of work to be done. Still, people can help if they want to speed >>> things up, especially since a significant part of the work is design >>> decisions that I can not do alone and will need to be discussed. >>> >>> What needs to be done (using this mail as a notepad, but including the >>> tasks >>> where help is required): >>> >>> - Finish documenting the scheduling and make sure the implementation >>> matches >>> the documentation. >>> >>> - Discuss if "private_fields.h" is acceptable or decide another >>> solution. >>> >>> - Clearly identify and isolate the parts of the scheduling that are >>> needed >>> only for request_frame()/request_frame() compatibility. >>> >>> - Decide exactly what parts of the scheduling are the responsibility of >>> filters (possibly in the compatibility activate function) and what >>> parts >>> are handled by the framework. >>> >>> - Think ahead about threading and use wrapper to access fields that will >>> require locking or synchronization. >>> >>> - Think about features whose need I realized while trying to get it >>> working: >>> distinguish productive / processing activation, synchronize several >>> filter >>> graphs. >>> >>> Please feel free to ask details about any of these points: not only >>> would >>> getting interest help me stay motivated, but discussing implementation >>> details and explaining the design would help me having a clear idea of >>> the >>> whole system. >> >> For start removal of recursiveness is mostly I'm interested in. >> What needs to be done for that, can I help somehow? >> > > Hi, > > So what's remain to be done to have frame threading in lavfi? > Ping
On 6/2/18, Paul B Mahol <onemda@gmail.com> wrote: > On 5/2/18, Paul B Mahol <onemda@gmail.com> wrote: >> On 9/11/16, Paul B Mahol <onemda@gmail.com> wrote: >>> On 9/10/16, Nicolas George <george@nsup.org> wrote: >>>> Le quartidi 24 fructidor, an CCXXIV, Paul B Mahol a ecrit : >>>>> So everybody agrees, we should proceed. >>>> >>>> I am proceeding, but as you can see in the patch, there is still a fair >>>> amount of work to be done. Still, people can help if they want to speed >>>> things up, especially since a significant part of the work is design >>>> decisions that I can not do alone and will need to be discussed. >>>> >>>> What needs to be done (using this mail as a notepad, but including the >>>> tasks >>>> where help is required): >>>> >>>> - Finish documenting the scheduling and make sure the implementation >>>> matches >>>> the documentation. >>>> >>>> - Discuss if "private_fields.h" is acceptable or decide another >>>> solution. >>>> >>>> - Clearly identify and isolate the parts of the scheduling that are >>>> needed >>>> only for request_frame()/request_frame() compatibility. >>>> >>>> - Decide exactly what parts of the scheduling are the responsibility of >>>> filters (possibly in the compatibility activate function) and what >>>> parts >>>> are handled by the framework. >>>> >>>> - Think ahead about threading and use wrapper to access fields that >>>> will >>>> require locking or synchronization. >>>> >>>> - Think about features whose need I realized while trying to get it >>>> working: >>>> distinguish productive / processing activation, synchronize several >>>> filter >>>> graphs. >>>> >>>> Please feel free to ask details about any of these points: not only >>>> would >>>> getting interest help me stay motivated, but discussing implementation >>>> details and explaining the design would help me having a clear idea of >>>> the >>>> whole system. >>> >>> For start removal of recursiveness is mostly I'm interested in. >>> What needs to be done for that, can I help somehow? >>> >> >> Hi, >> >> So what's remain to be done to have frame threading in lavfi? >> > > Ping > Ping
On Sun, Jun 03, 2018 at 07:43:24PM +0200, Paul B Mahol wrote: > On 6/2/18, Paul B Mahol <onemda@gmail.com> wrote: > > On 5/2/18, Paul B Mahol <onemda@gmail.com> wrote: > >> On 9/11/16, Paul B Mahol <onemda@gmail.com> wrote: > >>> On 9/10/16, Nicolas George <george@nsup.org> wrote: > >>>> Le quartidi 24 fructidor, an CCXXIV, Paul B Mahol a ecrit : > >>>>> So everybody agrees, we should proceed. > >>>> > >>>> I am proceeding, but as you can see in the patch, there is still a fair > >>>> amount of work to be done. Still, people can help if they want to speed > >>>> things up, especially since a significant part of the work is design > >>>> decisions that I can not do alone and will need to be discussed. > >>>> > >>>> What needs to be done (using this mail as a notepad, but including the > >>>> tasks > >>>> where help is required): > >>>> > >>>> - Finish documenting the scheduling and make sure the implementation > >>>> matches > >>>> the documentation. > >>>> > >>>> - Discuss if "private_fields.h" is acceptable or decide another > >>>> solution. > >>>> > >>>> - Clearly identify and isolate the parts of the scheduling that are > >>>> needed > >>>> only for request_frame()/request_frame() compatibility. > >>>> > >>>> - Decide exactly what parts of the scheduling are the responsibility of > >>>> filters (possibly in the compatibility activate function) and what > >>>> parts > >>>> are handled by the framework. > >>>> > >>>> - Think ahead about threading and use wrapper to access fields that > >>>> will > >>>> require locking or synchronization. > >>>> > >>>> - Think about features whose need I realized while trying to get it > >>>> working: > >>>> distinguish productive / processing activation, synchronize several > >>>> filter > >>>> graphs. > >>>> > >>>> Please feel free to ask details about any of these points: not only > >>>> would > >>>> getting interest help me stay motivated, but discussing implementation > >>>> details and explaining the design would help me having a clear idea of > >>>> the > >>>> whole system. > >>> > >>> For start removal of recursiveness is mostly I'm interested in. > >>> What needs to be done for that, can I help somehow? > >>> > >> > >> Hi, > >> > >> So what's remain to be done to have frame threading in lavfi? > >> > > > > Ping > > > > Ping If noone, who has time to reply knows the awnser then you probably have to find it out from the code and any unfinished patchsets sending nicolas a private mail may also be more vissible to him than the ML in case he is busy
Michael Niedermayer (2018-06-04): > If noone, who has time to reply knows the awnser then you probably have to > find it out from the code and any unfinished patchsets > > sending nicolas a private mail may also be more vissible to him than the ML > in case he is busy In terms of design, the big thing missing from lavfi is a clean API to run a filter graph. Right now, it relies on requests on outputs, with a fragile heuristic to find the "oldest" one. A clean API would allow to run it as a whole and react to frames on output or requests on inputs, possibly with callbacks. This must come before threading, because it is what allows to control threading: threading is efficient when the system can start several threads and let them run, doing their work. If it is constantly stopping and re-starting because the calling API makes too small steps, much time is wasted. But more than that, it requires somebody working on it. Speaking for myself, the toxic ambiance in the project since a few months has destroyed my motivation for doing anything ambitious on it. And to be completely forthright, I feel that Paul is partly responsible for that toxic ambiance; see his interventions on the thread about enforcing the code of conduct for example. Regards,
On 6/6/18, Nicolas George <george@nsup.org> wrote: > Michael Niedermayer (2018-06-04): >> If noone, who has time to reply knows the awnser then you probably have >> to >> find it out from the code and any unfinished patchsets >> >> sending nicolas a private mail may also be more vissible to him than the >> ML >> in case he is busy > > In terms of design, the big thing missing from lavfi is a clean API to > run a filter graph. Right now, it relies on requests on outputs, with a > fragile heuristic to find the "oldest" one. A clean API would allow to > run it as a whole and react to frames on output or requests on inputs, > possibly with callbacks. > > This must come before threading, because it is what allows to control > threading: threading is efficient when the system can start several > threads and let them run, doing their work. If it is constantly stopping > and re-starting because the calling API makes too small steps, much time > is wasted. > > But more than that, it requires somebody working on it. Speaking for > myself, the toxic ambiance in the project since a few months has > destroyed my motivation for doing anything ambitious on it. And to be > completely forthright, I feel that Paul is partly responsible for that > toxic ambiance; see his interventions on the thread about enforcing the > code of conduct for example. Your contributions will be missed. Good bye.
On Thu, Jun 07, 2018 at 11:23:40AM +0200, Paul B Mahol wrote: > On 6/6/18, Nicolas George <george@nsup.org> wrote: > > Michael Niedermayer (2018-06-04): > >> If noone, who has time to reply knows the awnser then you probably have > >> to > >> find it out from the code and any unfinished patchsets > >> > >> sending nicolas a private mail may also be more vissible to him than the > >> ML > >> in case he is busy > > > > In terms of design, the big thing missing from lavfi is a clean API to > > run a filter graph. Right now, it relies on requests on outputs, with a > > fragile heuristic to find the "oldest" one. A clean API would allow to > > run it as a whole and react to frames on output or requests on inputs, > > possibly with callbacks. > > > > This must come before threading, because it is what allows to control > > threading: threading is efficient when the system can start several > > threads and let them run, doing their work. If it is constantly stopping > > and re-starting because the calling API makes too small steps, much time > > is wasted. > > > > But more than that, it requires somebody working on it. Speaking for > > myself, the toxic ambiance in the project since a few months has > > destroyed my motivation for doing anything ambitious on it. And to be > > completely forthright, I feel that Paul is partly responsible for that > > toxic ambiance; see his interventions on the thread about enforcing the > > code of conduct for example. > > Your contributions will be missed. > > Good bye. Id like to see both you and nicolas work together on libavfilter. Thats a "WIN" for the community and FFmpeg. You two are the top 2 libavfilter develoepers currently. Its really _REALLY_ stupid if you two fight like this. Because no matter who "wins" this, everyone looses. [...]
From 7e936674b5d8c7e6f3606f094720fe4137ab5b0e Mon Sep 17 00:00:00 2001 From: Nicolas George <george@nsup.org> Date: Sun, 3 Jan 2016 15:44:42 +0100 Subject: [PATCH 4/4] WIP derecursive filter frame Signed-off-by: Nicolas George <george@nsup.org> --- ffmpeg.c | 1 + libavfilter/Makefile | 1 + libavfilter/avfilter.c | 467 ++++++++++++++++++++++++++++++++++------- libavfilter/avfilter.h | 14 +- libavfilter/avfiltergraph.c | 50 ++--- libavfilter/buffersink.c | 16 +- libavfilter/buffersrc.c | 1 + libavfilter/f_interleave.c | 5 +- libavfilter/framequeue.c | 49 ++++- libavfilter/framequeue.h | 40 +++- libavfilter/internal.h | 10 + libavfilter/private_fields.h | 43 ++++ libavfilter/split.c | 3 +- libavfilter/vf_extractplanes.c | 3 +- tests/ref/fate/source | 1 + 15 files changed, 571 insertions(+), 133 deletions(-) create mode 100644 libavfilter/private_fields.h diff --git a/ffmpeg.c b/ffmpeg.c index 3229823..ef07b4a 100644 --- a/ffmpeg.c +++ b/ffmpeg.c @@ -4087,6 +4087,7 @@ static int transcode_step(void) ost = choose_output(); if (!ost) { if (got_eagain()) { + av_log(0, 16, "no OST and EAGAIN, will sleep\n"); reset_eagain(); av_usleep(10000); return 0; diff --git a/libavfilter/Makefile b/libavfilter/Makefile index 81b40ac..568fdef 100644 --- a/libavfilter/Makefile +++ b/libavfilter/Makefile @@ -18,6 +18,7 @@ OBJS = allfilters.o \ fifo.o \ formats.o \ framepool.o \ + framequeue.o \ graphdump.o \ graphparser.o \ opencl_allkernels.o \ diff --git a/libavfilter/avfilter.c b/libavfilter/avfilter.c index ccbe4d9..9679994 100644 --- a/libavfilter/avfilter.c +++ b/libavfilter/avfilter.c @@ -34,6 +34,7 @@ #include "libavutil/rational.h" #include "libavutil/samplefmt.h" +#include "private_fields.h" #include "audio.h" #include "avfilter.h" #include "formats.h" @@ -135,6 +136,10 @@ int avfilter_link(AVFilterContext *src, unsigned srcpad, { AVFilterLink *link; + av_assert0(src->graph); + av_assert0(dst->graph); + av_assert0(src->graph == dst->graph); + if (src->nb_outputs <= srcpad || dst->nb_inputs <= dstpad || src->outputs[srcpad] || dst->inputs[dstpad]) return AVERROR(EINVAL); @@ -147,6 +152,9 @@ int avfilter_link(AVFilterContext *src, unsigned srcpad, return AVERROR(EINVAL); } +#ifndef AVFILTER_LINK_INTERNAL_FIELDS +# error AVFilterLink internal fields not defined +#endif link = av_mallocz(sizeof(*link)); if (!link) return AVERROR(ENOMEM); @@ -160,6 +168,7 @@ int avfilter_link(AVFilterContext *src, unsigned srcpad, link->type = src->output_pads[srcpad].type; av_assert0(AV_PIX_FMT_NONE == -1 && AV_SAMPLE_FMT_NONE == -1); link->format = -1; + ff_framequeue_init(&link->fifo, &src->graph->internal->frame_queues); return 0; } @@ -182,14 +191,25 @@ int avfilter_link_get_channels(AVFilterLink *link) void ff_avfilter_link_set_in_status(AVFilterLink *link, int status, int64_t pts) { - ff_avfilter_link_set_out_status(link, status, pts); + if (link->status_in == status) + return; + av_assert0(!link->status_in); + link->status_in = status; + link->status_in_pts = pts; + link->frame_wanted_out = 0; + link->frame_blocked_in = 0; // XXX CIG check + // XXX CIG must clear frame_blocked_in on outputs using the old API + ff_filter_schedule(link->dst, "in_status"); } void ff_avfilter_link_set_out_status(AVFilterLink *link, int status, int64_t pts) { - link->status = status; - link->frame_wanted_in = link->frame_wanted_out = 0; - ff_update_link_current_pts(link, pts); + av_assert0(!link->frame_wanted_out); + av_assert0(!link->status_out); + link->status_out = status; + if (pts != AV_NOPTS_VALUE) + ff_update_link_current_pts(link, pts); + ff_filter_schedule(link->src, "out_status"); } void avfilter_link_set_closed(AVFilterLink *link, int closed) @@ -370,10 +390,22 @@ int ff_request_frame(AVFilterLink *link) { FF_TPRINTF_START(NULL, request_frame); ff_tlog_link(NULL, link, 1); - if (link->status) - return link->status; - link->frame_wanted_in = 1; + if (link->status_out) + return link->status_out; + /* XXX only for old API */ + if (ff_framequeue_queued_frames(&link->fifo) > 0 && + ff_framequeue_queued_samples(&link->fifo) >= link->min_samples) { + ff_filter_schedule(link->dst, "request_frame_available"); + av_assert0(link->dst->ready); + return 0; + } + if (link->status_in) { + //av_assert0(!"request_frame must close"); // XXX CIG + link->status_out = link->status_in; + return link->status_out; + } link->frame_wanted_out = 1; + ff_filter_schedule(link->src, "request_frame"); return 0; } @@ -382,22 +414,16 @@ int ff_request_frame_to_filter(AVFilterLink *link) int ret = -1; FF_TPRINTF_START(NULL, request_frame_to_filter); ff_tlog_link(NULL, link, 1); - link->frame_wanted_in = 0; + link->frame_blocked_in = 1; if (link->srcpad->request_frame) ret = link->srcpad->request_frame(link); else if (link->src->inputs[0]) ret = ff_request_frame(link->src->inputs[0]); - if (ret == AVERROR_EOF && link->partial_buf) { - AVFrame *pbuf = link->partial_buf; - link->partial_buf = NULL; - ret = ff_filter_frame_framed(link, pbuf); - ff_avfilter_link_set_in_status(link, AVERROR_EOF, AV_NOPTS_VALUE); - link->frame_wanted_out = 0; - return ret; - } if (ret < 0) { - if (ret != AVERROR(EAGAIN) && ret != link->status) + if (ret != AVERROR(EAGAIN) && ret != link->status_in) ff_avfilter_link_set_in_status(link, ret, AV_NOPTS_VALUE); + if (ret == AVERROR_EOF) + ret = 0; } return ret; } @@ -1056,10 +1082,12 @@ static int ff_filter_frame_framed(AVFilterLink *link, AVFrame *frame) AVFilterCommand *cmd= link->dst->command_queue; int64_t pts; - if (link->status) { +#if 0 + if (link->status_in) { av_frame_free(&frame); - return link->status; + return link->status_in; } +#endif if (!(filter_frame = dst->filter_frame)) filter_frame = default_filter_frame; @@ -1142,52 +1170,9 @@ fail: return ret; } -static int ff_filter_frame_needs_framing(AVFilterLink *link, AVFrame *frame) -{ - int insamples = frame->nb_samples, inpos = 0, nb_samples; - AVFrame *pbuf = link->partial_buf; - int nb_channels = av_frame_get_channels(frame); - int ret = 0; - - /* Handle framing (min_samples, max_samples) */ - while (insamples) { - if (!pbuf) { - AVRational samples_tb = { 1, link->sample_rate }; - pbuf = ff_get_audio_buffer(link, link->partial_buf_size); - if (!pbuf) { - av_log(link->dst, AV_LOG_WARNING, - "Samples dropped due to memory allocation failure.\n"); - return 0; - } - av_frame_copy_props(pbuf, frame); - pbuf->pts = frame->pts; - if (pbuf->pts != AV_NOPTS_VALUE) - pbuf->pts += av_rescale_q(inpos, samples_tb, link->time_base); - pbuf->nb_samples = 0; - } - nb_samples = FFMIN(insamples, - link->partial_buf_size - pbuf->nb_samples); - av_samples_copy(pbuf->extended_data, frame->extended_data, - pbuf->nb_samples, inpos, - nb_samples, nb_channels, link->format); - inpos += nb_samples; - insamples -= nb_samples; - pbuf->nb_samples += nb_samples; - if (pbuf->nb_samples >= link->min_samples) { - ret = ff_filter_frame_framed(link, pbuf); - pbuf = NULL; - } else { - if (link->frame_wanted_out) - link->frame_wanted_in = 1; - } - } - av_frame_free(&frame); - link->partial_buf = pbuf; - return ret; -} - int ff_filter_frame(AVFilterLink *link, AVFrame *frame) { + int ret; FF_TPRINTF_START(NULL, filter_frame); ff_tlog_link(NULL, link, 1); ff_tlog(NULL, " "); ff_tlog_ref(NULL, frame, 1); /* Consistency checks */ @@ -1220,23 +1205,361 @@ int ff_filter_frame(AVFilterLink *link, AVFrame *frame) } } + link->frame_blocked_in = 0; link->frame_wanted_out = 0; link->frame_count_in++; - /* Go directly to actual filtering if possible */ - if (link->type == AVMEDIA_TYPE_AUDIO && - link->min_samples && - (link->partial_buf || - frame->nb_samples < link->min_samples || - frame->nb_samples > link->max_samples)) { - return ff_filter_frame_needs_framing(link, frame); - } else { - return ff_filter_frame_framed(link, frame); + ret = ff_framequeue_add(&link->fifo, frame); + if (ret < 0) { + av_frame_free(&frame); + return ret; } + ff_filter_schedule(link->dst, "filter_frame"); + return 0; + error: av_frame_free(&frame); return AVERROR_PATCHWELCOME; } +static int take_samples(AVFilterLink *link, unsigned min, unsigned max, + AVFrame **rframe) +{ + AVFrame *frame0, *frame, *buf; + unsigned nb_samples, nb_frames, i, p; + int ret; + + /* Note: this function relies on no format changes. */ + if (ff_framequeue_queued_samples(&link->fifo) < min) + return 0; + frame0 = frame = ff_framequeue_peek(&link->fifo, 0); + if (frame->nb_samples >= min && frame->nb_samples < max) { + *rframe = ff_framequeue_take(&link->fifo); + return 1; + } + nb_frames = 0; + nb_samples = 0; + while (1) { + if (nb_samples + frame->nb_samples > max) { + if (nb_samples < min) + nb_samples = max; + break; + } + nb_samples += frame->nb_samples; + nb_frames++; + if (nb_frames == ff_framequeue_queued_frames(&link->fifo)) + break; + frame = ff_framequeue_peek(&link->fifo, nb_frames); + } + + buf = ff_get_audio_buffer(link, nb_samples); + if (!buf) + return AVERROR(ENOMEM); + ret = av_frame_copy_props(buf, frame0); + if (ret < 0) { + av_frame_free(&buf); + return ret; + } + buf->pts = frame0->pts; + + p = 0; + for (i = 0; i < nb_frames; i++) { + frame = ff_framequeue_take(&link->fifo); + av_samples_copy(buf->extended_data, frame->extended_data, p, 0, + frame->nb_samples, link->channels, link->format); + p += frame->nb_samples; + } + if (p < nb_samples) { + unsigned n = nb_samples - p; + frame = ff_framequeue_peek(&link->fifo, 0); + av_samples_copy(buf->extended_data, frame->extended_data, p, 0, n, + link->channels, link->format); + frame->nb_samples -= n; + av_samples_copy(frame->extended_data, frame->extended_data, 0, n, + frame->nb_samples, link->channels, link->format); + if (frame->pts != AV_NOPTS_VALUE) + frame->pts += av_rescale_q(n, av_make_q(1, link->sample_rate), link->time_base); + ff_framequeue_update_peeked(&link->fifo, 0); + ff_framequeue_skip_samples(&link->fifo, n); + } + + *rframe = buf; + return 1; +} + +int ff_filter_frame_to_filter(AVFilterLink *link) +{ + AVFrame *frame; + AVFilterContext *dst = link->dst; + unsigned i; + int ret; + + av_assert1(ff_framequeue_queued_frames(&link->fifo)); + if (link->min_samples) { + int min = link->min_samples; + if (link->status_in) + min = FFMIN(min, ff_framequeue_queued_samples(&link->fifo)); + ret = take_samples(link, min, link->max_samples, &frame); + if (!ret) { + ret = ff_request_frame(link); + av_assert1(!ret); + } + if (ret <= 0) + return ret; + } else { + frame = ff_framequeue_take(&link->fifo); + } + /* The filter will soon have received a new frame, that may allow it to + produce one or more: unblock its outputs. */ + for (i = 0; i < dst->nb_outputs; i++) + dst->outputs[i]->frame_blocked_in = 0; + ret = ff_filter_frame_framed(link, frame); + if (ret < 0 && ret != link->status_out) { + ff_avfilter_link_set_out_status(link, ret, AV_NOPTS_VALUE); + } else { + ff_filter_schedule(dst, "filter_frame_to_filter"); + } + return ret; +} + +void ff_dump_graph_scheduling(AVFilterGraph *graph, const char *tag) +{ + unsigned i, j; + static unsigned round = 0; + + return; + av_log(0, 16, "Graph status (round_%d) for %s:\n", round++, tag); + for (i = 0; i < graph->nb_filters; i++) { + AVFilterContext *f = graph->filters[i]; + av_log(0, 16, " [R:%d] %s\n", f->ready, f->name); + for (j = 0; j < f->nb_inputs; j++) { + AVFilterLink *l = f->inputs[j]; + av_log(0, 16, " %s %s [%zd %ld/%d] %s <- %s\n", + l->frame_wanted_out ? "W" : "-", + l->status_in ? av_err2str(l->status_in) : ".", + ff_framequeue_queued_frames(&l->fifo), + ff_framequeue_queued_samples(&l->fifo), l->min_samples, + l->status_out ? av_err2str(l->status_out) : ".", + l->src->name); + } + for (j = 0; j < f->nb_outputs; j++) { + AVFilterLink *l = f->outputs[j]; + av_log(0, 16, " %s -> %s\n", + l->frame_blocked_in ? "B" : "-", + l->dst->name); + } + } +} + +void ff_filter_schedule(AVFilterContext *filter, const char *tag) +{ + unsigned ready = 0, i; + + ff_dump_graph_scheduling(filter->graph, "filter_schedule"); + for (i = 0; !ready && i < filter->nb_inputs; i++) + if (!filter->inputs[i]->frame_wanted_out && + ff_framequeue_queued_frames(&filter->inputs[i]->fifo)) + ready = 300; + for (i = 0; !ready && i < filter->nb_inputs; i++) + if (filter->inputs[i]->status_in != filter->inputs[i]->status_out) + ready = 200; + for (i = 0; !ready && i < filter->nb_outputs; i++) + if (filter->outputs[i]->frame_wanted_out && + !filter->outputs[i]->frame_blocked_in) + ready = 100; + filter->ready = ready; +} + +static int forward_status_change(AVFilterContext *filter, AVFilterLink *in) +{ + unsigned out = 0, progress = 0; + int ret; + + av_assert0(!in->status_out); + if (!filter->nb_outputs) { + /* not necessary with the current API and sinks */ + return 0; + } + while (!in->status_out) { + if (!filter->outputs[out]->status_in) { + progress++; + ret = ff_request_frame_to_filter(filter->outputs[out]); + filter->outputs[out]->frame_blocked_in = 0; // XXX CIG understand + if (ret < 0) + return ret; + } + if (++out == filter->nb_outputs) { + av_assert0(progress); + progress = 0; + out = 0; + } + } + ff_filter_schedule(filter, "forward_status_change"); + return 0; +} + +#define FFERROR_NOT_READY FFERRTAG('N','R','D','Y') + +static int ff_filter_activate_default(AVFilterContext *filter) +{ + unsigned i; + + for (i = 0; i < filter->nb_inputs; i++) { + if (!filter->inputs[i]->frame_wanted_out && + ff_framequeue_queued_frames(&filter->inputs[i]->fifo)) { + return ff_filter_frame_to_filter(filter->inputs[i]); + } + } + for (i = 0; i < filter->nb_outputs; i++) { + if (filter->outputs[i]->frame_wanted_out && + !filter->outputs[i]->frame_blocked_in) { + return ff_request_frame_to_filter(filter->outputs[i]); + } + } + for (i = 0; i < filter->nb_inputs; i++) { + if (filter->inputs[i]->status_in && !filter->inputs[i]->status_out) { + if (ff_framequeue_queued_frames(&filter->inputs[i]->fifo)) { + // XXX CIG probably impossible: frame_wanted_out should be + // 0, and therefore caught by the first case + av_assert0(!"TODO"); + } else { + return forward_status_change(filter, filter->inputs[i]); + } + } + } + return FFERROR_NOT_READY; +} + +/* + Filter scheduling and activation + + When a filter is activated, it must: + - if possible, output a frame; + - else, check outputs for wanted frames and forward the requests. + + The following AVFilterLink fields are used for activation: + + - frame_wanted_out: + + This field indicates if a frame is needed on this input of the + destination filter. A positive value indicates that a frame is needed + to process queued frames or internal data or to satisfy the + application; a zero value indicates that a frame is not especially + needed but could be processed anyway; a negative value indicates that a + frame would just be queued. + + It is set by filters using ff_request_frame() or ff_request_no_frame(), + when requested by the application through a specific API or when it is + set on one of the outputs. + + It is cleared when a frame is sent from the source using + ff_filter_frame(). + + It is also cleared when a status change is sent from the source using + ff_avfilter_link_set_in_status(). XXX to implement + + - frame_blocked_in: + + This field means that the source filter can not generate a frame as is. + + It is set by the framework on all outputs of a filter before activating it. + + It is automatically cleared by ff_filter_frame(). + + - fifo: + + Contains the frames queued on a filter input. If it contains frames and + frame_wanted_out is not set, then the filter can be activated. If that + result in the filter not able to use these frames, the filter must set + frame_wanted_out to ask for more frames. + + - status_in and status_in_pts: + + Status (EOF or error code) of the link and timestamp of the status + change (in link time base, same as frames) as seen from the input of + the link. The status change is considered happening after the frames + queued in fifo. + + It is set by the source filter using ff_avfilter_link_set_in_status(). + + If + XXX + + - status_out: + + Status of the link as seen from the output of the link. The status + change is considered having already happened. + + It is set by the destination filter using + ff_avfilter_link_set_out_status(). + + A filter is ready if any of these conditions is true on any input or + output, in descending order of priority: + + - in->fifo contains frames and in->frame_wanted_out is not set: + the filter must process frames or set in->frame_wanted_out. + XXX several inputs + + - in->status_in is set but not in->status_out: + the filter XXX + + - out->frame_blocked_in is cleared and out->frame_wanted_out is set + (the filter can produce a frame with its internal data). + + Exemples of scenarios to consider: + + - buffersrc: activate if frame_wanted_out to notify the application; + activate when the application adds a frame to push it immediately. + + - testsrc: activate only if frame_wanted_out to produce and push a frame. + + - concat (not at stitch points): can process a frame on any output. + Activate if frame_wanted_out on output to forward on the corresponding + input. Activate when a frame is present on input to process it + immediately. + + - framesync: needs at least one frame on each input; extra frames on the + wrong input will accumulate. When a frame is first added on one input, + set frame_wanted_out<0 on it to avoid getting more (would trigger + testsrc) and frame_wanted_out>0 on the other to allow processing it. + + Activation of old filters: + + In order to activate a filter implementing the legacy filter_frame() and + request_frame() methods, perform the first possible of the following + actions: + + - If an input has frames in fifo and frame_wanted_out == 0, dequeue a + frame and call filter_frame(). + + Exception: + XXX + + Ratinale: filter frames as soon as possible instead of leaving them + queued; frame_wanted_out < 0 is not possible since the old API does not + set it nor provides any similar feedback. + + - If an output has frame_wanted_out > 0 and not frame_blocked_in, call + request_frame(). + + Rationale: checking frame_blocked_in is necessary to avoid requesting + repeatedly on a blocked input if another is not blocked (example: + [buffersrc1][testsrc1][buffersrc2][testsrc2]concat=v=2). + + XXX needs_fifo + + */ + +int ff_filter_activate(AVFilterContext *filter) +{ + int ret; + + filter->ready = 0; + ret = ff_filter_activate_default(filter); + if (ret == FFERROR_NOT_READY) + ret = 0; + return ret; +} + const AVClass *avfilter_get_class(void) { return &avfilter_class; diff --git a/libavfilter/avfilter.h b/libavfilter/avfilter.h index d21b144..928b08d 100644 --- a/libavfilter/avfilter.h +++ b/libavfilter/avfilter.h @@ -368,6 +368,8 @@ struct AVFilterContext { * Overrides global number of threads set per filter graph. */ int nb_threads; + + unsigned ready; }; /** @@ -541,13 +543,6 @@ struct AVFilterLink { void *video_frame_pool; /** - * True if a frame is currently wanted on the input of this filter. - * Set when ff_request_frame() is called by the output, - * cleared when the request is handled or forwarded. - */ - int frame_wanted_in; - - /** * True if a frame is currently wanted on the output of this filter. * Set when ff_request_frame() is called by the output, * cleared when a frame is filtered. @@ -559,6 +554,11 @@ struct AVFilterLink { * AVHWFramesContext describing the frames. */ AVBufferRef *hw_frames_ctx; + +#ifdef AVFILTER_LINK_INTERNAL_FIELDS + AVFILTER_LINK_INTERNAL_FIELDS +#endif + }; /** diff --git a/libavfilter/avfiltergraph.c b/libavfilter/avfiltergraph.c index 3af698d..e87ce01 100644 --- a/libavfilter/avfiltergraph.c +++ b/libavfilter/avfiltergraph.c @@ -32,6 +32,7 @@ #include "libavutil/opt.h" #include "libavutil/pixdesc.h" +#include "private_fields.h" #include "avfilter.h" #include "formats.h" #include "internal.h" @@ -87,6 +88,7 @@ AVFilterGraph *avfilter_graph_alloc(void) ret->av_class = &filtergraph_class; av_opt_set_defaults(ret); + ff_framequeue_global_init(&ret->internal->frame_queues); return ret; } @@ -1377,7 +1379,6 @@ void ff_avfilter_graph_update_heap(AVFilterGraph *graph, AVFilterLink *link) heap_bubble_down(graph, link, link->age_index); } - int avfilter_graph_request_oldest(AVFilterGraph *graph) { AVFilterLink *oldest = graph->sink_links[0]; @@ -1400,7 +1401,7 @@ int avfilter_graph_request_oldest(AVFilterGraph *graph) if (!graph->sink_links_count) return AVERROR_EOF; av_assert1(oldest->age_index >= 0); - while (oldest->frame_wanted_out) { + while (oldest->frame_wanted_out || oldest->dst->ready) { r = ff_filter_graph_run_once(graph); if (r < 0) return r; @@ -1408,41 +1409,18 @@ int avfilter_graph_request_oldest(AVFilterGraph *graph) return 0; } -static AVFilterLink *graph_run_once_find_filter(AVFilterGraph *graph) -{ - unsigned i, j; - AVFilterContext *f; - - /* TODO: replace scanning the graph with a priority list */ - for (i = 0; i < graph->nb_filters; i++) { - f = graph->filters[i]; - for (j = 0; j < f->nb_outputs; j++) - if (f->outputs[j]->frame_wanted_in) - return f->outputs[j]; - } - for (i = 0; i < graph->nb_filters; i++) { - f = graph->filters[i]; - for (j = 0; j < f->nb_outputs; j++) - if (f->outputs[j]->frame_wanted_out) - return f->outputs[j]; - } - return NULL; -} - int ff_filter_graph_run_once(AVFilterGraph *graph) { - AVFilterLink *link; - int ret; - - link = graph_run_once_find_filter(graph); - if (!link) { - av_log(NULL, AV_LOG_WARNING, "Useless run of a filter graph\n"); + AVFilterContext *filter; + unsigned i; + + ff_dump_graph_scheduling(graph, "run_once"); + av_assert0(graph->nb_filters); + filter = graph->filters[0]; + for (i = 1; i < graph->nb_filters; i++) + if (graph->filters[i]->ready > filter->ready) + filter = graph->filters[i]; + if (!filter->ready) return AVERROR(EAGAIN); - } - ret = ff_request_frame_to_filter(link); - if (ret == AVERROR_EOF) - /* local EOF will be forwarded through request_frame() / - set_status() until it reaches the sink */ - ret = 0; - return ret < 0 ? ret : 1; + return ff_filter_activate(filter); } diff --git a/libavfilter/buffersink.c b/libavfilter/buffersink.c index 2feb56d..8dfb061 100644 --- a/libavfilter/buffersink.c +++ b/libavfilter/buffersink.c @@ -31,6 +31,7 @@ #include "libavutil/mathematics.h" #include "libavutil/opt.h" +#include "private_fields.h" #include "audio.h" #include "avfilter.h" #include "buffersink.h" @@ -134,13 +135,20 @@ int attribute_align_arg av_buffersink_get_frame_flags(AVFilterContext *ctx, AVFr /* no picref available, fetch it from the filterchain */ while (!av_fifo_size(buf->fifo)) { - if (inlink->status) - return inlink->status; - if (flags & AV_BUFFERSINK_FLAG_NO_REQUEST) + if (inlink->status_out) + return inlink->status_out; + if (flags & AV_BUFFERSINK_FLAG_NO_REQUEST) { + if (ff_framequeue_queued_frames(&inlink->fifo)) { + av_log(0, 16, "NO_REQUEST with queued frames %ld, %ld/%d\n", ff_framequeue_queued_frames(&inlink->fifo), ff_framequeue_queued_samples(&inlink->fifo), inlink->min_samples); + } + if (inlink->status_in) { + av_log(0, 16, "NO_REQUEST with status\n"); + } return AVERROR(EAGAIN); + } if ((ret = ff_request_frame(inlink)) < 0) return ret; - while (inlink->frame_wanted_out) { + while (inlink->frame_wanted_out || ctx->ready) { ret = ff_filter_graph_run_once(ctx->graph); if (ret < 0) return ret; diff --git a/libavfilter/buffersrc.c b/libavfilter/buffersrc.c index 9294811..7162336 100644 --- a/libavfilter/buffersrc.c +++ b/libavfilter/buffersrc.c @@ -184,6 +184,7 @@ static int av_buffersrc_add_frame_internal(AVFilterContext *ctx, if (!frame) { s->eof = 1; + ff_avfilter_link_set_in_status(ctx->outputs[0], AVERROR_EOF, AV_NOPTS_VALUE); return 0; } else if (s->eof) return AVERROR(EINVAL); diff --git a/libavfilter/f_interleave.c b/libavfilter/f_interleave.c index 422f2bf..44ee11b 100644 --- a/libavfilter/f_interleave.c +++ b/libavfilter/f_interleave.c @@ -26,6 +26,7 @@ #include "libavutil/avassert.h" #include "libavutil/avstring.h" #include "libavutil/opt.h" +#include "private_fields.h" #include "avfilter.h" #include "bufferqueue.h" #include "formats.h" @@ -59,7 +60,7 @@ inline static int push_frame(AVFilterContext *ctx) for (i = 0; i < ctx->nb_inputs; i++) { struct FFBufQueue *q = &s->queues[i]; - if (!q->available && !ctx->inputs[i]->status) + if (!q->available && !ctx->inputs[i]->status_out) return 0; if (q->available) { frame = ff_bufqueue_peek(q, 0); @@ -190,7 +191,7 @@ static int request_frame(AVFilterLink *outlink) int i, ret; for (i = 0; i < ctx->nb_inputs; i++) { - if (!s->queues[i].available && !ctx->inputs[i]->status) { + if (!s->queues[i].available && !ctx->inputs[i]->status_out) { ret = ff_request_frame(ctx->inputs[i]); if (ret != AVERROR_EOF) return ret; diff --git a/libavfilter/framequeue.c b/libavfilter/framequeue.c index ac226de..b028a21 100644 --- a/libavfilter/framequeue.c +++ b/libavfilter/framequeue.c @@ -19,6 +19,7 @@ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/avassert.h" #include "framequeue.h" static inline FFFrameBucket *bucket(FFFrameQueue *fq, size_t idx) @@ -26,7 +27,24 @@ static inline FFFrameBucket *bucket(FFFrameQueue *fq, size_t idx) return &fq->queue[(fq->tail + idx) & (fq->allocated - 1)]; } -void ff_framequeue_init(FFFrameQueue *fq, FFGlobalFrameQueue *gfq) +void ff_framequeue_global_init(FFFrameQueueGlobal *fqg) +{ +} + +static void check_consistency(FFFrameQueue *fq) +{ +#if ASSERT_LEVEL >= 2 + uint64_t nb_samples = 0; + size_t i; + + av_assert0(fq->queued == fq->total_frames_head - fq->total_frames_tail); + for (i = 0; i < fq->queued; i++) + nb_samples += bucket(fq, i)->frame->nb_samples; + av_assert0(nb_samples == fq->total_samples_head - fq->total_samples_tail); +#endif +} + +void ff_framequeue_init(FFFrameQueue *fq, FFFrameQueueGlobal *fqg) { fq->queue = &fq->first_bucket; fq->allocated = 1; @@ -36,6 +54,7 @@ int ff_framequeue_add(FFFrameQueue *fq, AVFrame *frame) { FFFrameBucket *b; + check_consistency(fq); if (fq->queued == fq->allocated) { if (fq->allocated == 1) { size_t na = 8; @@ -52,7 +71,7 @@ int ff_framequeue_add(FFFrameQueue *fq, AVFrame *frame) return AVERROR(ENOMEM); if (fq->tail + fq->queued > fq->allocated) memmove(nq + fq->allocated, nq, - (fq->fail + fq->queued - fq->allocated) * sizeof(*nq)); + (fq->tail + fq->queued - fq->allocated) * sizeof(*nq)); fq->queue = nq; fq->allocated = na; } @@ -60,11 +79,35 @@ int ff_framequeue_add(FFFrameQueue *fq, AVFrame *frame) b = bucket(fq, fq->queued); b->frame = frame; fq->queued++; + fq->total_frames_head++; + fq->total_samples_head += frame->nb_samples; + check_consistency(fq); return 0; } AVFrame *ff_framequeue_take(FFFrameQueue *fq) { + FFFrameBucket *b; + + check_consistency(fq); + av_assert1(fq->queued); + b = bucket(fq, 0); + fq->queued--; + fq->tail++; + fq->tail &= fq->allocated - 1; + fq->total_frames_tail++; + fq->total_samples_tail += b->frame->nb_samples; + check_consistency(fq); + return b->frame; } -#endif /* AVFILTER_FRAMEQUEUE_H */ +AVFrame *ff_framequeue_peek(FFFrameQueue *fq, size_t idx) +{ + FFFrameBucket *b; + + check_consistency(fq); + av_assert1(idx < fq->queued); + b = bucket(fq, idx); + check_consistency(fq); + return b->frame; +} diff --git a/libavfilter/framequeue.h b/libavfilter/framequeue.h index 68da1b7..4040504 100644 --- a/libavfilter/framequeue.h +++ b/libavfilter/framequeue.h @@ -29,28 +29,54 @@ * must be protected by a mutex or any synchronization mechanism. */ -#include "avfilter.h" -#include "libavutil/avassert.h" +#include "libavutil/frame.h" typedef struct FFFrameBucket { AVFrame *frame; } FFFrameBucket; -typedef FFGlobalFrameQueue { -} FFGlobalFrameQueue; +typedef struct FFFrameQueueGlobal { +} FFFrameQueueGlobal; -struct FFFrameQueue { +typedef struct FFFrameQueue { FFFrameBucket *queue; size_t allocated; size_t tail; size_t queued; FFFrameBucket first_bucket; -}; + uint64_t total_frames_head; + uint64_t total_frames_tail; + uint64_t total_samples_head; + uint64_t total_samples_tail; +} FFFrameQueue; -void ff_framequeue_init(FFFrameQueue *fq, FFGlobalFrameQueue *gfq); +void ff_framequeue_global_init(FFFrameQueueGlobal *fqg); + +void ff_framequeue_init(FFFrameQueue *fq, FFFrameQueueGlobal *fqg); int ff_framequeue_add(FFFrameQueue *fq, AVFrame *frame); AVFrame *ff_framequeue_take(FFFrameQueue *fq); +AVFrame *ff_framequeue_peek(FFFrameQueue *fq, size_t idx); + +static inline size_t ff_framequeue_queued_frames(const FFFrameQueue *fq) +{ + return fq->queued; +} + +static inline uint64_t ff_framequeue_queued_samples(const FFFrameQueue *fq) +{ + return fq->total_samples_head - fq->total_samples_tail; +} + +static inline void ff_framequeue_update_peeked(FFFrameQueue *fq, size_t idx) +{ +} + +static inline void ff_framequeue_skip_samples(FFFrameQueue *fq, size_t n) +{ + fq->total_samples_tail += n; +} + #endif /* AVFILTER_FRAMEQUEUE_H */ diff --git a/libavfilter/internal.h b/libavfilter/internal.h index 3856012..d7ad99b 100644 --- a/libavfilter/internal.h +++ b/libavfilter/internal.h @@ -29,6 +29,7 @@ #include "avfiltergraph.h" #include "formats.h" #include "framepool.h" +#include "framequeue.h" #include "thread.h" #include "version.h" #include "video.h" @@ -147,6 +148,7 @@ struct AVFilterPad { struct AVFilterGraphInternal { void *thread; avfilter_execute_func *thread_execute; + FFFrameQueueGlobal frame_queues; }; struct AVFilterInternal { @@ -336,6 +338,8 @@ int ff_request_frame(AVFilterLink *link); int ff_request_frame_to_filter(AVFilterLink *link); +int ff_filter_frame_to_filter(AVFilterLink *link); + #define AVFILTER_DEFINE_CLASS(fname) \ static const AVClass fname##_class = { \ .class_name = #fname, \ @@ -376,6 +380,10 @@ int ff_filter_frame(AVFilterLink *link, AVFrame *frame); */ AVFilterContext *ff_filter_alloc(const AVFilter *filter, const char *inst_name); +void ff_filter_schedule(AVFilterContext *filter, const char *tag); + +int ff_filter_activate(AVFilterContext *filter); + /** * Remove a filter from a graph; */ @@ -408,4 +416,6 @@ static inline int ff_norm_qscale(int qscale, int type) */ int ff_filter_get_nb_threads(AVFilterContext *ctx); +void ff_dump_graph_scheduling(AVFilterGraph *graph, const char *tag); + #endif /* AVFILTER_INTERNAL_H */ diff --git a/libavfilter/private_fields.h b/libavfilter/private_fields.h new file mode 100644 index 0000000..a5df3da --- /dev/null +++ b/libavfilter/private_fields.h @@ -0,0 +1,43 @@ +/* + * Generic frame queue + * Copyright (c) 2015 Nicolas George + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public License + * as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with FFmpeg; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "framequeue.h" + +#define AVFILTER_LINK_INTERNAL_FIELDS \ +\ + FFFrameQueue fifo; \ +\ + int frame_blocked_in; \ +\ + /** \ + * Link input status. \ + */ \ + int status_in; \ + int64_t status_in_pts; \ +\ + /** \ + * Link output status. \ + * If not zero, all attempts of request_frame will fail with the \ + * corresponding code. + */ \ + int status_out; \ + \ +/* define AVFILTER_LINK_INTERNAL_FIELDS */ diff --git a/libavfilter/split.c b/libavfilter/split.c index 6cf4ab1..89e3604 100644 --- a/libavfilter/split.c +++ b/libavfilter/split.c @@ -30,6 +30,7 @@ #include "libavutil/mem.h" #include "libavutil/opt.h" +#include "private_fields.h" #include "avfilter.h" #include "audio.h" #include "formats.h" @@ -78,7 +79,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) for (i = 0; i < ctx->nb_outputs; i++) { AVFrame *buf_out; - if (ctx->outputs[i]->status) + if (ctx->outputs[i]->status_in) continue; buf_out = av_frame_clone(frame); if (!buf_out) { diff --git a/libavfilter/vf_extractplanes.c b/libavfilter/vf_extractplanes.c index a23f838..c4b01e0 100644 --- a/libavfilter/vf_extractplanes.c +++ b/libavfilter/vf_extractplanes.c @@ -22,6 +22,7 @@ #include "libavutil/imgutils.h" #include "libavutil/opt.h" #include "libavutil/pixdesc.h" +#include "private_fields.h" #include "avfilter.h" #include "drawutils.h" #include "internal.h" @@ -245,7 +246,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) const int idx = s->map[i]; AVFrame *out; - if (outlink->status) + if (outlink->status_in) continue; out = ff_get_video_buffer(outlink, outlink->w, outlink->h); diff --git a/tests/ref/fate/source b/tests/ref/fate/source index 63ddd3f..7b35b03 100644 --- a/tests/ref/fate/source +++ b/tests/ref/fate/source @@ -27,3 +27,4 @@ compat/avisynth/windowsPorts/windows2linux.h compat/float/float.h compat/float/limits.h compat/nvenc/nvEncodeAPI.h +libavfilter/private_fields.h -- 2.9.3