diff mbox series

[FFmpeg-devel,1/7] avfilter/formats: Make check for buffer overflow redundant

Message ID 20200815054816.375-1-andreas.rheinhardt@gmail.com
State Accepted
Commit 55eb24a92fdb44da65364da61917f085286a48d2
Headers show
Series [FFmpeg-devel,1/7] avfilter/formats: Make check for buffer overflow redundant | expand

Checks

Context Check Description
andriy/default pending
andriy/make success Make finished
andriy/make_fate success Make fate finished

Commit Message

Andreas Rheinhardt Aug. 15, 2020, 5:48 a.m. UTC
and remove the redundant check.

This check for whether the allocated buffer is sufficient has been added
in commit 1cbf7fb4345a3e5b7791d483241bf4759bde4ece (merging commit
5775a1832c4165e6acc1d307004b38701bb463f4). It is not sufficient to
detect invalid input lists (namely lists with duplicates); its only use
is to avoid buffer overflows. And this can be achieved by simpler means:
Make sure that one allocates space for so many elements as the outer loop
ranges over and break out of the inner loop if a match has been found.
For valid input without duplicates, no further match will be found anyway.

This change will temporarily make the allocated formats array larger
than before and larger than necessary; this will be fixed in a later
commit that avoids the allocation altogether.

If a check for duplicates in the lists is deemed necessary, it should be
done properly somewhere else.

Finally, the error message that is removed in this commit used
__FUNCTION__, which is a GCC extension (C99 added __func__ for this).
So this commit removes a warning when compiling in -pedantic mode.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
---
 libavfilter/formats.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

Comments

Nicolas George Aug. 19, 2020, 4:54 p.m. UTC | #1
Andreas Rheinhardt (12020-08-15):
> and remove the redundant check.
> 
> This check for whether the allocated buffer is sufficient has been added
> in commit 1cbf7fb4345a3e5b7791d483241bf4759bde4ece (merging commit
> 5775a1832c4165e6acc1d307004b38701bb463f4). It is not sufficient to
> detect invalid input lists (namely lists with duplicates); its only use
> is to avoid buffer overflows. And this can be achieved by simpler means:
> Make sure that one allocates space for so many elements as the outer loop
> ranges over and break out of the inner loop if a match has been found.
> For valid input without duplicates, no further match will be found anyway.
> 
> This change will temporarily make the allocated formats array larger
> than before and larger than necessary; this will be fixed in a later
> commit that avoids the allocation altogether.
> 
> If a check for duplicates in the lists is deemed necessary, it should be
> done properly somewhere else.
> 
> Finally, the error message that is removed in this commit used
> __FUNCTION__, which is a GCC extension (C99 added __func__ for this).
> So this commit removes a warning when compiling in -pedantic mode.
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
>  libavfilter/formats.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)

LGTM.

Regards,
Nicolas George Aug. 19, 2020, 5:11 p.m. UTC | #2
Andreas Rheinhardt (12020-08-15):
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
>  libavfilter/formats.c | 4 ++++
>  1 file changed, 4 insertions(+)

The patch works, but I do not think it is necessary or even useful:
these arrays are short-lived anyway, they will be freed very soon
anyway.

(Also, we may want to make them power-of-two dynarrays at some places,
to avoid quadratic adds of formats.)

Regards,
Nicolas George Aug. 19, 2020, 5:16 p.m. UTC | #3
Andreas Rheinhardt (12020-08-15):
> by adapting the MERGE_FORMATS() so that only one instance of the
> MERGE_REF() macro needs to exist in ff_merge_samplerates().

Nit: the first line of the commit message should be a short summary.

> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
>  libavfilter/formats.c | 26 ++++++++++++++------------
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/libavfilter/formats.c b/libavfilter/formats.c
> index 4efbcbebfe..e8a43a434d 100644
> --- a/libavfilter/formats.c
> +++ b/libavfilter/formats.c
> @@ -56,12 +56,21 @@ do {                                                                       \
>  
>  /**
>   * Add all formats common to a and b to a, add b's refs to a and destroy b.
> + * If empty_allowed is set and one of a,b->nb is zero, the lists are
> + * merged; otherwise, it is treated as error.
>   */
> -#define MERGE_FORMATS(a, b, fmts, nb, type, fail_statement)                \
> +#define MERGE_FORMATS(a, b, fmts, nb, type, fail_statement, empty_allowed) \
>  do {                                                                            \
>      int i, j, k = 0;                                                       \
>      void *tmp;                                                             \
>                                                                                  \

> +    if (empty_allowed) {                                                   \
> +        if (!a->nb || !b->nb) {                                            \
> +            if (!a->nb)                                                    \
> +                FFSWAP(type *, a, b);                                      \
> +            goto merge_ref;                                                \
> +        }                                                                  \
> +    }                                                                      \

I think a big if () / else would be better than a goto.

>          for (i = 0; i < a->nb; i++)                                             \
>              for (j = 0; j < b->nb; j++)                                         \
>                  if (a->fmts[i] == b->fmts[j]) {                                 \
> @@ -77,6 +86,7 @@ do {
>      if (tmp)                                                               \
>          a->fmts = tmp;                                                     \
>                                                                                  \
> +merge_ref:                                                                 \
>      MERGE_REF(a, b, fmts, type, fail_statement);                           \
>  } while (0)
>  
> @@ -114,7 +124,7 @@ AVFilterFormats *ff_merge_formats(AVFilterFormats *a, AVFilterFormats *b,
>      if (alpha2 > alpha1 || chroma2 > chroma1)
>          return NULL;
>  
> -    MERGE_FORMATS(a, b, formats, nb_formats, AVFilterFormats, return NULL;);
> +    MERGE_FORMATS(a, b, formats, nb_formats, AVFilterFormats, return NULL;, 0);
>  
>      return a;
>  }
> @@ -124,16 +134,8 @@ AVFilterFormats *ff_merge_samplerates(AVFilterFormats *a,
>  {
>      if (a == b) return a;
>  
> -    if (a->nb_formats && b->nb_formats) {
> -        MERGE_FORMATS(a, b, formats, nb_formats, AVFilterFormats, return NULL;);
> -        return a;
> -    } else if (a->nb_formats) {
> -        MERGE_REF(a, b, formats, AVFilterFormats, return NULL;);
> -        return a;
> -    } else {
> -        MERGE_REF(b, a, formats, AVFilterFormats, return NULL;);
> -        return b;
> -    }
> +    MERGE_FORMATS(a, b, formats, nb_formats, AVFilterFormats, return NULL;, 1);
> +    return a;
>  }
>  
>  AVFilterChannelLayouts *ff_merge_channel_layouts(AVFilterChannelLayouts *a,

Overall, it does not look very beneficial, but I suppose it is ok.

Regards,
Nicolas George Aug. 19, 2020, 5:23 p.m. UTC | #4
Andreas Rheinhardt (12020-08-15):
> The callers of the ff_merge_*() functions fall into two categories with
> quite different needs:
> 
> One caller is can_merge_formats() which only wants to test for mergeability
> without it merging anything. In order to do so, it duplicates the lists
> it intends to test and resets their owners so that they are not modified
> by ff_merge_*(). It also means that it needs to receive the merged list
> (and not only an int containing whether the lists are mergeable) to
> properly free it.
> 
> The other callers want the lists to be actually merged. But given the
> fact that ff_merge_*() automatically updates the owners of the lists,
> they only want the information whether the merge succeeded or not; they
> don't want a link to the new list.
> 
> Therefore this commit splits these functions in two: ff_merge_*() for
> the latter callers and ff_can_merge_*() for the former.
> ff_merge_*() doesn't need to return a pointer to the combined list at all
> and hence these functions have been modified to return an int, which
> allows to distinguish between incompability and memory allocation failures.
> 
> ff_can_merge_*() meanwhile doesn't modify its arguments at all obviating
> the need for copies. This in turn implies that there is no reason to
> return a pointer to the new list, as nothing needs to be freed. These
> functions therefore return an int as well. This allowed to completely
> remove can_merge_formats() in avfiltergraph.c.
> 
> Notice that no ff_can_merge_channel_layouts() has been created, because
> there is currently no caller for this. It could be added if needed.
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>

LGTM, well done.

Regards,
Nicolas George Aug. 19, 2020, 5:31 p.m. UTC | #5
Andreas Rheinhardt (12020-08-15):
> This means that one only needs to update the shorter list of references.
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
> I doubt that this optimizations is worth the additional complexity. I
> have just added it for you to decide.

I must say, I do not understand the logic, especially the preserve_fmts
bit.

I think we can leave it for later.

Have you given thought to the linked list option I suggested?

Regards,
Paul B Mahol Aug. 19, 2020, 6:13 p.m. UTC | #6
On 8/15/20, Andreas Rheinhardt <andreas.rheinhardt@gmail.com> wrote:
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
>  libavfilter/formats.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>

LGTM
Andreas Rheinhardt Aug. 19, 2020, 9:47 p.m. UTC | #7
Nicolas George:
> Andreas Rheinhardt (12020-08-15):
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>> ---
>>  libavfilter/formats.c | 4 ++++
>>  1 file changed, 4 insertions(+)
> 
> The patch works, but I do not think it is necessary or even useful:
> these arrays are short-lived anyway, they will be freed very soon
> anyway.
> 
> (Also, we may want to make them power-of-two dynarrays at some places,
> to avoid quadratic adds of formats.)
> 
> Regards,
> 
I was unsure about this one and actually never liked it. I only did it
because the slight size increase is the only downside from patch two.
But you're right. It's dropped.

- Andreas
Andreas Rheinhardt Aug. 19, 2020, 10:22 p.m. UTC | #8
Nicolas George:
> Andreas Rheinhardt (12020-08-15):
>> This means that one only needs to update the shorter list of references.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>> ---
>> I doubt that this optimizations is worth the additional complexity. I
>> have just added it for you to decide.
> 
> I must say, I do not understand the logic, especially the preserve_fmts
> bit.
> 
The MERGE_REF() macro is asymmetric by design: The refs of the second
argument are added to the refs of the first argument and the second
argument is freed. The list of the first argument meanwhile is/ought to
be preserved. When swapping, one therefore has to swap the
non-ref-fields, too. But the MERGE_REF at the end of
ff_merge_channel_layouts() is special. The list will be freed and
replaced by a different list after the macro anyway, so one doesn't need
to preserve the list at all. preserve_fmts exists so that the compiler
can optimize this part of MERGE_REF() away for that invocation of the macro.

> I think we can leave it for later.
> 

Or it can be dropped altogether. Which I would actually prefer.

> Have you given thought to the linked list option I suggested?
> 
It is certainly interesting (e.g. if I am not mistaken, it would mean
that ff_merge_formats/samplerates() can't fail at all any more). I see
two obvious downsides: Traversing the list for freeing is slower (but it
should only be a constant factor) and it increases the size of the
AVFilterLink even after merging. But I don't consider these obstacles to
have much weight.

Speaking of AVFilterLink: All ff_merge_*() as well as ff_can_merge_*()
always use the in- and out-fields of one and the same AVFilterLink. I
wonder whether these function should not just take the AVFilterLink * as
parameter. It would also mean that ff_merge_formats() would have the
same signature as the other functions, which simplifies macros.

- Andreas
Nicolas George Aug. 20, 2020, 3:12 p.m. UTC | #9
Andreas Rheinhardt (12020-08-20):
> The MERGE_REF() macro is asymmetric by design: The refs of the second
> argument are added to the refs of the first argument and the second
> argument is freed. The list of the first argument meanwhile is/ought to
> be preserved. When swapping, one therefore has to swap the
> non-ref-fields, too. But the MERGE_REF at the end of
> ff_merge_channel_layouts() is special. The list will be freed and
> replaced by a different list after the macro anyway, so one doesn't need
> to preserve the list at all. preserve_fmts exists so that the compiler
> can optimize this part of MERGE_REF() away for that invocation of the macro.

Thanks for the explanation. I had not realized that these are precisely
the functions that make the lists the same.

> It is certainly interesting (e.g. if I am not mistaken, it would mean
> that ff_merge_formats/samplerates() can't fail at all any more). I see

This is one of the benefits I expect, yes.

> two obvious downsides: Traversing the list for freeing is slower (but it
> should only be a constant factor) and it increases the size of the
> AVFilterLink even after merging. But I don't consider these obstacles to
> have much weight.

If we become worried about the size of the fields AVFilterLink used only
at initialization, we can move them all together in a separate structure
that gets freed once the filter is configured.

> Speaking of AVFilterLink: All ff_merge_*() as well as ff_can_merge_*()
> always use the in- and out-fields of one and the same AVFilterLink. I
> wonder whether these function should not just take the AVFilterLink * as
> parameter. It would also mean that ff_merge_formats() would have the
> same signature as the other functions, which simplifies macros.

Now that you mention it, it is probably something that can be done,
indeed.

Regards,
diff mbox series

Patch

diff --git a/libavfilter/formats.c b/libavfilter/formats.c
index 04b8c7cc39..1df0c347f9 100644
--- a/libavfilter/formats.c
+++ b/libavfilter/formats.c
@@ -65,7 +65,7 @@  do {                                                                       \
  */
 #define MERGE_FORMATS(ret, a, b, fmts, nb, type, fail)                          \
 do {                                                                            \
-    int i, j, k = 0, count = FFMIN(a->nb, b->nb);                               \
+    int i, j, k = 0, count = a->nb;                                             \
     type ***tmp;                                                                \
                                                                                 \
     if (!(ret = av_mallocz(sizeof(*ret))))                                      \
@@ -77,13 +77,8 @@  do {
         for (i = 0; i < a->nb; i++)                                             \
             for (j = 0; j < b->nb; j++)                                         \
                 if (a->fmts[i] == b->fmts[j]) {                                 \
-                    if(k >= FFMIN(a->nb, b->nb)){                               \
-                        av_log(NULL, AV_LOG_ERROR, "Duplicate formats in %s detected\n", __FUNCTION__); \
-                        av_free(ret->fmts);                                     \
-                        av_free(ret);                                           \
-                        return NULL;                                            \
-                    }                                                           \
                     ret->fmts[k++] = a->fmts[i];                                \
+                    break;                                                      \
                 }                                                               \
     }                                                                           \
     ret->nb = k;                                                                \