Message ID | 20200808140202.586-4-andreas.rheinhardt@gmail.com |
---|---|
State | Accepted |
Headers | show |
Series | [FFmpeg-devel,1/6] avfilter/formats: Remove ff_make_formatu64_list() | expand |
Context | Check | Description |
---|---|---|
andriy/default | pending | |
andriy/make | success | Make finished |
andriy/make_fate | success | Make fate finished |
Andreas Rheinhardt (12020-08-08): > ff_merge_formats(), ff_merge_samplerates() and ff_merge_channel_layouts() > share common semantics: If merging succeeds, a non-NULL pointer is > returned and both input lists (of type AVFilterFormats resp. > AVFilterChannelLayouts) are to be treated as if they had been freed; > the owners of the input parameters (if any) become owners of the > returned list. If merging does not succeed, NULL is returned and both > input lists are supposed to be unchanged. > > The problem is that the functions did not abide by these semantics: > In case of reallocation failure, it is possible for these functions > to return NULL after having already freed one of the two input list. > This happens because sometimes the refs-array of the destined output > gets reallocated twice to its final size and if the second of these > reallocations fails, the first of the two inputs has already been freed > and its refs updated to point to the destined output which in this case > will be freed immediately so that all of the already updated pointers > are now dangling. This leads to use-after-frees and memory corruptions > lateron (when these owners get cleaned up, the lists they own get > unreferenced). Should the input lists don't have owners at all, the > caller (namely can_merge_formats() in avfiltergraph.c) thinks that both > the input lists are unchanged and need to be freed, leading to a double > free. > > The solution to this is simple: Don't reallocate twice; do it just once. > This also saves a reallocation. > > This commit fixes the issue behind Coverity issue #1452636. It might > also make Coverity realize that the issue has been fixed. > > Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> > --- Sorry for the delay, there was a detail that bugged me, I had to look at it carefully. > This is not the only thing wrong with the formats API. Will soon send more; > those who can't wait can already take a look at > https://github.com/mkver/FFmpeg/commits/avfilter > > The most important of these patches is > https://github.com/mkver/FFmpeg/commit/a151b88f395387c4bb85fbf14c042e2cd3f677ed > > libavfilter/formats.c | 41 +++++++++++++++++++++++++++++------------ > 1 file changed, 29 insertions(+), 12 deletions(-) > > diff --git a/libavfilter/formats.c b/libavfilter/formats.c > index 48b27d0c53..ae44006dfe 100644 > --- a/libavfilter/formats.c > +++ b/libavfilter/formats.c > @@ -31,19 +31,14 @@ > > #define KNOWN(l) (!FF_LAYOUT2COUNT(l)) /* for readability */ > > + > /** > * Add all refs from a to ret and destroy a. > + * ret->refs must have enough spare room left for this. > */ > -#define MERGE_REF(ret, a, fmts, type, fail) \ > +#define MERGE_REF_NO_ALLOC(ret, a, fmts) \ > do { \ > - type ***tmp; \ > int i; \ > - \ > - if (!(tmp = av_realloc_array(ret->refs, ret->refcount + a->refcount, \ > - sizeof(*tmp)))) \ > - goto fail; \ > - ret->refs = tmp; \ > - \ > for (i = 0; i < a->refcount; i ++) { \ > ret->refs[ret->refcount] = a->refs[i]; \ > *ret->refs[ret->refcount++] = ret; \ > @@ -54,6 +49,17 @@ do { \ > av_freep(&a); \ > } while (0) > > +#define MERGE_REF(ret, a, fmts, type, fail) \ > +do { \ > + type ***tmp; \ > + \ > + if (!(tmp = av_realloc_array(ret->refs, ret->refcount + a->refcount, \ > + sizeof(*tmp)))) \ > + goto fail; \ > + ret->refs = tmp; \ > + MERGE_REF_NO_ALLOC(ret, a, fmts); \ > +} while (0) > + > /** > * Add all formats common for a and b to ret, copy the refs and destroy > * a and b. > @@ -61,6 +67,7 @@ do { \ > #define MERGE_FORMATS(ret, a, b, fmts, nb, type, fail) \ > do { \ > int i, j, k = 0, count = FFMIN(a->nb, b->nb); \ > + type ***tmp; \ > \ > if (!(ret = av_mallocz(sizeof(*ret)))) \ > goto fail; \ > @@ -85,8 +92,13 @@ do { > if (!ret->nb) \ > goto fail; \ > \ > - MERGE_REF(ret, a, fmts, type, fail); \ > - MERGE_REF(ret, b, fmts, type, fail); \ > + tmp = av_realloc_array(NULL, a->refcount + b->refcount, sizeof(*tmp)); \ I was worried about the NULL here, because it means allocating a new array, even if the memory we already have fits. But it is no matter, because the first call to MERGE_REF is done with ret->refs being NULL. So patch ok, good catch. Possibly: add a comment: /* TODO see if we can reallocate a->refs, or the larger of a->refs and b->refs instead of allocating a brand new array. */ > + if (!tmp) \ > + goto fail; \ > + ret->refs = tmp; \ > + \ > + MERGE_REF_NO_ALLOC(ret, a, fmts); \ > + MERGE_REF_NO_ALLOC(ret, b, fmts); \ > } while (0) > > AVFilterFormats *ff_merge_formats(AVFilterFormats *a, AVFilterFormats *b, > @@ -238,8 +250,13 @@ AVFilterChannelLayouts *ff_merge_channel_layouts(AVFilterChannelLayouts *a, > ret->nb_channel_layouts = ret_nb; > if (!ret->nb_channel_layouts) > goto fail; > - MERGE_REF(ret, a, channel_layouts, AVFilterChannelLayouts, fail); > - MERGE_REF(ret, b, channel_layouts, AVFilterChannelLayouts, fail); > + > + ret->refs = av_realloc_array(NULL, a->refcount + b->refcount, > + sizeof(*ret->refs)); > + if (!ret->refs) > + goto fail; > + MERGE_REF_NO_ALLOC(ret, a, channel_layouts); > + MERGE_REF_NO_ALLOC(ret, b, channel_layouts); > return ret; > > fail: Regards,
Nicolas George: > Andreas Rheinhardt (12020-08-08): >> ff_merge_formats(), ff_merge_samplerates() and ff_merge_channel_layouts() >> share common semantics: If merging succeeds, a non-NULL pointer is >> returned and both input lists (of type AVFilterFormats resp. >> AVFilterChannelLayouts) are to be treated as if they had been freed; >> the owners of the input parameters (if any) become owners of the >> returned list. If merging does not succeed, NULL is returned and both >> input lists are supposed to be unchanged. >> >> The problem is that the functions did not abide by these semantics: >> In case of reallocation failure, it is possible for these functions >> to return NULL after having already freed one of the two input list. >> This happens because sometimes the refs-array of the destined output >> gets reallocated twice to its final size and if the second of these >> reallocations fails, the first of the two inputs has already been freed >> and its refs updated to point to the destined output which in this case >> will be freed immediately so that all of the already updated pointers >> are now dangling. This leads to use-after-frees and memory corruptions >> lateron (when these owners get cleaned up, the lists they own get >> unreferenced). Should the input lists don't have owners at all, the >> caller (namely can_merge_formats() in avfiltergraph.c) thinks that both >> the input lists are unchanged and need to be freed, leading to a double >> free. >> >> The solution to this is simple: Don't reallocate twice; do it just once. >> This also saves a reallocation. >> >> This commit fixes the issue behind Coverity issue #1452636. It might >> also make Coverity realize that the issue has been fixed. >> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> >> --- > > Sorry for the delay, there was a detail that bugged me, I had to look at > it carefully. > No problem, I never expected the reviewing process to be instantaneous. >> This is not the only thing wrong with the formats API. Will soon send more; >> those who can't wait can already take a look at >> https://github.com/mkver/FFmpeg/commits/avfilter >> >> The most important of these patches is >> https://github.com/mkver/FFmpeg/commit/a151b88f395387c4bb85fbf14c042e2cd3f677ed >> >> libavfilter/formats.c | 41 +++++++++++++++++++++++++++++------------ >> 1 file changed, 29 insertions(+), 12 deletions(-) >> >> diff --git a/libavfilter/formats.c b/libavfilter/formats.c >> index 48b27d0c53..ae44006dfe 100644 >> --- a/libavfilter/formats.c >> +++ b/libavfilter/formats.c >> @@ -31,19 +31,14 @@ >> >> #define KNOWN(l) (!FF_LAYOUT2COUNT(l)) /* for readability */ >> >> + >> /** >> * Add all refs from a to ret and destroy a. >> + * ret->refs must have enough spare room left for this. >> */ >> -#define MERGE_REF(ret, a, fmts, type, fail) \ >> +#define MERGE_REF_NO_ALLOC(ret, a, fmts) \ >> do { \ >> - type ***tmp; \ >> int i; \ >> - \ >> - if (!(tmp = av_realloc_array(ret->refs, ret->refcount + a->refcount, \ >> - sizeof(*tmp)))) \ >> - goto fail; \ >> - ret->refs = tmp; \ >> - \ >> for (i = 0; i < a->refcount; i ++) { \ >> ret->refs[ret->refcount] = a->refs[i]; \ >> *ret->refs[ret->refcount++] = ret; \ >> @@ -54,6 +49,17 @@ do { \ >> av_freep(&a); \ >> } while (0) >> >> +#define MERGE_REF(ret, a, fmts, type, fail) \ >> +do { \ >> + type ***tmp; \ >> + \ >> + if (!(tmp = av_realloc_array(ret->refs, ret->refcount + a->refcount, \ >> + sizeof(*tmp)))) \ >> + goto fail; \ >> + ret->refs = tmp; \ >> + MERGE_REF_NO_ALLOC(ret, a, fmts); \ >> +} while (0) >> + >> /** >> * Add all formats common for a and b to ret, copy the refs and destroy >> * a and b. >> @@ -61,6 +67,7 @@ do { \ >> #define MERGE_FORMATS(ret, a, b, fmts, nb, type, fail) \ >> do { \ >> int i, j, k = 0, count = FFMIN(a->nb, b->nb); \ >> + type ***tmp; \ >> \ >> if (!(ret = av_mallocz(sizeof(*ret)))) \ >> goto fail; \ >> @@ -85,8 +92,13 @@ do { >> if (!ret->nb) \ >> goto fail; \ >> \ >> - MERGE_REF(ret, a, fmts, type, fail); \ >> - MERGE_REF(ret, b, fmts, type, fail); \ > >> + tmp = av_realloc_array(NULL, a->refcount + b->refcount, sizeof(*tmp)); \ > > I was worried about the NULL here, because it means allocating a new > array, even if the memory we already have fits. But it is no matter, > because the first call to MERGE_REF is done with ret->refs being NULL. > > So patch ok, good catch. > > Possibly: add a comment: > > /* TODO see if we can reallocate a->refs, or the larger of a->refs and > b->refs instead of allocating a brand new array. > > */ Do you remember https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267549.html? If I were allowed to reorder a and b so that a->nb <= b->nb, one would not have to allocate a new list to return at all: One could reuse a and would only have to reallocate a->refs to also include b->refs and update the refs coming from b (the ones from a are already fine!). I have also pondered a second possible way of improvement: There are two types of callers (all in avfiltergraph.c) of the ff_merge_ functions: can_merge_formats() actually only wants to know whether two formats are mergeable and to do so it currently creates short-lived throwaway copies of the lists. Very wasteful. can_merge_formats() could be removed altogether if one would implement check-only functions (which should return an int, not an AVFilterFormats *). The other callers also don't really inspect the current return value at all except from checking whether it is NULL or not (that's because in this case the returned AVFilterFormats has owners, so one doesn't need to care about leaks). These could be satisfied by a function returning an int which also allows to actually distinguish between inability to merge and (re)allocation failures (< 0 would be errors, 0 inability to merge and 1 successfull merge). - Andreas
Andreas Rheinhardt (12020-08-11): > Do you remember > https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267549.html? If I > were allowed to reorder a and b so that a->nb <= b->nb, one would not > have to allocate a new list to return at all: One could reuse a and > would only have to reallocate a->refs to also include b->refs and update > the refs coming from b (the ones from a are already fine!). I remember, but we are not talking about the same thing. You are talking about the intersection algorithm, applied to the list of supported formats. For this, we know that the output is at most as large as the smaller list, and we could keep it here instead of allocating a new list. I will reply to this. But but here I was talking about the list of references, which are just concatenated: instead of creating an empty array and copying both lists into it, we could grow one of the list and copy the other. For references, there is no doubt that the order does not matter. Bot both are micro-optimizations, probably better left for a bit later when other things have been made more clean and robust. > I have also pondered a second possible way of improvement: There are two > types of callers (all in avfiltergraph.c) of the ff_merge_ functions: > can_merge_formats() actually only wants to know whether two formats are > mergeable and to do so it currently creates short-lived throwaway copies > of the lists. Very wasteful. can_merge_formats() could be removed > altogether if one would implement check-only functions (which should > return an int, not an AVFilterFormats *). > The other callers also don't really inspect the current return value at > all except from checking whether it is NULL or not (that's because in > this case the returned AVFilterFormats has owners, so one doesn't need > to care about leaks). These could be satisfied by a function returning > an int which also allows to actually distinguish between inability to > merge and (re)allocation failures (< 0 would be errors, 0 inability to > merge and 1 successfull merge). You are right, this part is not very satisfactory; it was added as an afterthought If we are to support subtitles or data frames, and partial graph reconfiguration, we will need to include all these afterthoughts into a clean whole. In particular, we probably need to negotiate the media type to avoid having n filters with the same function just for different media types. I suspect it would be better to have a system of types with callbacks rather than a labyrinth of macros. But my ideas about this are not yet mature. Regards,
Nicolas George: > Andreas Rheinhardt (12020-08-11): >> Do you remember >> https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267549.html? If I >> were allowed to reorder a and b so that a->nb <= b->nb, one would not >> have to allocate a new list to return at all: One could reuse a and >> would only have to reallocate a->refs to also include b->refs and update >> the refs coming from b (the ones from a are already fine!). > > I remember, but we are not talking about the same thing. > > You are talking about the intersection algorithm, applied to the list of > supported formats. For this, we know that the output is at most as large > as the smaller list, and we could keep it here instead of allocating a > new list. I will reply to this. > > But but here I was talking about the list of references, which are just > concatenated: instead of creating an empty array and copying both lists > into it, we could grow one of the list and copy the other. > My intention was to eventually reuse the whole AVFilterFormats with the smaller nb_formats; only the refs need to be reallocated and only the refs from the other AVFilterFormats will need to be added and updated (the ones from the AVFilterFormats that is reused are already fine after all). This will of course not always ensure that the bigger refs list is retained, but if one is willing to live with this (as I was when I wrote what you answered to), then reusing the list of supported formats and reusing a refs list goes nicely together. Therefore I thought of these two issues as linked and intended to address them together. If one always wants to reuse the larger refs array, one should also reuse the AVFilterFormats struct belonging to said refs array (so that these refs needn't be updated). This would involve a minor complication if one also wants to reuse the formats array. Anyway there is another complication for reusing the formats array: If reallocating the refs array fails lateron, one has already changed one formats array; and if one reallocates the refs array before looking through the formats arrays, it might be that said reallocating was in vain (namely if there are no common formats). The only solution to this conundrum is that reusing the formats list will have to wait until the can_merge part is factored out, because when one is able to return an int < 0 for failure, one can just document that in this case the formats lists might have been modified. > For references, there is no doubt that the order does not matter. > > Bot both are micro-optimizations, probably better left for a bit later > when other things have been made more clean and robust. > >> I have also pondered a second possible way of improvement: There are two >> types of callers (all in avfiltergraph.c) of the ff_merge_ functions: >> can_merge_formats() actually only wants to know whether two formats are >> mergeable and to do so it currently creates short-lived throwaway copies >> of the lists. Very wasteful. can_merge_formats() could be removed >> altogether if one would implement check-only functions (which should >> return an int, not an AVFilterFormats *). >> The other callers also don't really inspect the current return value at >> all except from checking whether it is NULL or not (that's because in >> this case the returned AVFilterFormats has owners, so one doesn't need >> to care about leaks). These could be satisfied by a function returning >> an int which also allows to actually distinguish between inability to >> merge and (re)allocation failures (< 0 would be errors, 0 inability to >> merge and 1 successfull merge). > > You are right, this part is not very satisfactory; it was added as an > afterthought > > If we are to support subtitles or data frames, and partial graph > reconfiguration, we will need to include all these afterthoughts into a > clean whole. In particular, we probably need to negotiate the media type > to avoid having n filters with the same function just for different > media types. > > I suspect it would be better to have a system of types with callbacks > rather than a labyrinth of macros. But my ideas about this are not yet > mature. > Then I'll prepare some patches for the low-hanging fruits here and proceed with other things not related to the formats API. - Andreas PS: I'll apply patches 1-6 this evening or so if there are no more objections.
Andreas Rheinhardt (12020-08-12): > My intention was to eventually reuse the whole AVFilterFormats with the > smaller nb_formats; only the refs need to be reallocated and only the > refs from the other AVFilterFormats will need to be added and updated > (the ones from the AVFilterFormats that is reused are already fine after > all). I had not thought of that optimization, good idea. > This will of course not always ensure that the bigger refs list is > retained, but if one is willing to live with this (as I was when I wrote > what you answered to), then reusing the list of supported formats and > reusing a refs list goes nicely together. Therefore I thought of these > two issues as linked and intended to address them together. You are right, they are somewhat linked. > If one always wants to reuse the larger refs array, one should also > reuse the AVFilterFormats struct belonging to said refs array (so that > these refs needn't be updated). This would involve a minor complication > if one also wants to reuse the formats array. Since the lists are about to be merged anyway, we can use any part we want: we can build the new list in a local variable, using the smallest formats array and the larger refs array, and in the end overwrite one of the lists and free the other. > Anyway there is another complication for reusing the formats array: If > reallocating the refs array fails lateron, one has already changed one > formats array; and if one reallocates the refs array before looking > through the formats arrays, it might be that said reallocating was in > vain (namely if there are no common formats). I do not think memory allocation failures are worth worrying about too much: if something that small fails, then everything else will fail. So we have to handle them gracefully, but we do not need to be able to recover from them. So I say: if memory allocation fails, let us free everything and cleanly return an error. But I have another idea worth considering, that could get us rid of the refs array entirely; I would like your advice on it before trying to implement it: Let us use a linked list of references instead of an array. That would look approximatively like this: struct AVFilterFormatsRef { AVFilterFormats *formats; AVFilterFormatsRef *next; }; And we put this structure directly into AVFilterLink. Then to merge two formats lists, we just walk the list and update all the pointers, and then connect the linked lists. There is the small drawback that the next pointer stays in the link structure forever, but if we worry about this, there is more gain to be obtained by putting everything related to the initialization in a separate structure. > Then I'll prepare some patches for the low-hanging fruits here and > proceed with other things not related to the formats API. > PS: I'll apply patches 1-6 this evening or so if there are no more > objections. Ok to both. Regards,
diff --git a/libavfilter/formats.c b/libavfilter/formats.c index 48b27d0c53..ae44006dfe 100644 --- a/libavfilter/formats.c +++ b/libavfilter/formats.c @@ -31,19 +31,14 @@ #define KNOWN(l) (!FF_LAYOUT2COUNT(l)) /* for readability */ + /** * Add all refs from a to ret and destroy a. + * ret->refs must have enough spare room left for this. */ -#define MERGE_REF(ret, a, fmts, type, fail) \ +#define MERGE_REF_NO_ALLOC(ret, a, fmts) \ do { \ - type ***tmp; \ int i; \ - \ - if (!(tmp = av_realloc_array(ret->refs, ret->refcount + a->refcount, \ - sizeof(*tmp)))) \ - goto fail; \ - ret->refs = tmp; \ - \ for (i = 0; i < a->refcount; i ++) { \ ret->refs[ret->refcount] = a->refs[i]; \ *ret->refs[ret->refcount++] = ret; \ @@ -54,6 +49,17 @@ do { \ av_freep(&a); \ } while (0) +#define MERGE_REF(ret, a, fmts, type, fail) \ +do { \ + type ***tmp; \ + \ + if (!(tmp = av_realloc_array(ret->refs, ret->refcount + a->refcount, \ + sizeof(*tmp)))) \ + goto fail; \ + ret->refs = tmp; \ + MERGE_REF_NO_ALLOC(ret, a, fmts); \ +} while (0) + /** * Add all formats common for a and b to ret, copy the refs and destroy * a and b. @@ -61,6 +67,7 @@ do { \ #define MERGE_FORMATS(ret, a, b, fmts, nb, type, fail) \ do { \ int i, j, k = 0, count = FFMIN(a->nb, b->nb); \ + type ***tmp; \ \ if (!(ret = av_mallocz(sizeof(*ret)))) \ goto fail; \ @@ -85,8 +92,13 @@ do { if (!ret->nb) \ goto fail; \ \ - MERGE_REF(ret, a, fmts, type, fail); \ - MERGE_REF(ret, b, fmts, type, fail); \ + tmp = av_realloc_array(NULL, a->refcount + b->refcount, sizeof(*tmp)); \ + if (!tmp) \ + goto fail; \ + ret->refs = tmp; \ + \ + MERGE_REF_NO_ALLOC(ret, a, fmts); \ + MERGE_REF_NO_ALLOC(ret, b, fmts); \ } while (0) AVFilterFormats *ff_merge_formats(AVFilterFormats *a, AVFilterFormats *b, @@ -238,8 +250,13 @@ AVFilterChannelLayouts *ff_merge_channel_layouts(AVFilterChannelLayouts *a, ret->nb_channel_layouts = ret_nb; if (!ret->nb_channel_layouts) goto fail; - MERGE_REF(ret, a, channel_layouts, AVFilterChannelLayouts, fail); - MERGE_REF(ret, b, channel_layouts, AVFilterChannelLayouts, fail); + + ret->refs = av_realloc_array(NULL, a->refcount + b->refcount, + sizeof(*ret->refs)); + if (!ret->refs) + goto fail; + MERGE_REF_NO_ALLOC(ret, a, channel_layouts); + MERGE_REF_NO_ALLOC(ret, b, channel_layouts); return ret; fail:
ff_merge_formats(), ff_merge_samplerates() and ff_merge_channel_layouts() share common semantics: If merging succeeds, a non-NULL pointer is returned and both input lists (of type AVFilterFormats resp. AVFilterChannelLayouts) are to be treated as if they had been freed; the owners of the input parameters (if any) become owners of the returned list. If merging does not succeed, NULL is returned and both input lists are supposed to be unchanged. The problem is that the functions did not abide by these semantics: In case of reallocation failure, it is possible for these functions to return NULL after having already freed one of the two input list. This happens because sometimes the refs-array of the destined output gets reallocated twice to its final size and if the second of these reallocations fails, the first of the two inputs has already been freed and its refs updated to point to the destined output which in this case will be freed immediately so that all of the already updated pointers are now dangling. This leads to use-after-frees and memory corruptions lateron (when these owners get cleaned up, the lists they own get unreferenced). Should the input lists don't have owners at all, the caller (namely can_merge_formats() in avfiltergraph.c) thinks that both the input lists are unchanged and need to be freed, leading to a double free. The solution to this is simple: Don't reallocate twice; do it just once. This also saves a reallocation. This commit fixes the issue behind Coverity issue #1452636. It might also make Coverity realize that the issue has been fixed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> --- This is not the only thing wrong with the formats API. Will soon send more; those who can't wait can already take a look at https://github.com/mkver/FFmpeg/commits/avfilter The most important of these patches is https://github.com/mkver/FFmpeg/commit/a151b88f395387c4bb85fbf14c042e2cd3f677ed libavfilter/formats.c | 41 +++++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 12 deletions(-)