Message ID | 20170308124012.27793-2-nfxjfg@googlemail.com |
---|---|
State | New |
Headers | show |
On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > It looks like this could lead to security issues, as side data readers anything could, but side data wasnt excluded from fuzzing, in fact ive seen fuzzers find and trigger the split code also a demuxer and user app could always set side data to something bad. All uses of side data must assume the data to be untrusted, because it is even without all the split code. > will for example rely on side data allocation sizes to be as large as > needed for correct operation. Is this really so ? we allocate at least AV_INPUT_BUFFER_PADDING_SIZE, accesses beyond that require a check on the size. also size checks are needed for the case where a lib gets replaced by one with newer version that has a bigger struct. > If such files exist at all, they also > should be brought out of circulation, so fully reject them. Under normal > circumstances, nothing creates such files. > > To avoid problems for now, we let the concat and hls demuxers do this > (they merely return previously-demuxed packets, whose side data might > have been merged by libavformat itself after demuxing). The > special-cases will be removed in the next commit. > --- > libavformat/utils.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/libavformat/utils.c b/libavformat/utils.c > index 37d7024465..68f3c977d6 100644 > --- a/libavformat/utils.c > +++ b/libavformat/utils.c > @@ -840,6 +840,15 @@ int ff_read_packet(AVFormatContext *s, AVPacket *pkt) > *pkt = tmp; > } > > + if (strcmp(s->iformat->name, "concat") && strcmp(s->iformat->name, "hls,applehttp") can strcmp() be avoided here ? it would be run per packet which is rather ugly > + && av_packet_split_side_data(pkt) == 1) { this could fail with ENOMEM which complicates things if we do block such packets, its prbobably better to add a new static inline function to a header that checks if a packet has splitable side data and use that in av_packet_split_side_data() and here Iam suggesting "static inline" here to make backporting easier so no new symbol is added to libavcodec that is needed by libavformat also it may be interresting to disable this check for fuzzing so side data can be fuzzed in a wider range of cases and any past testcases that happen to use this can still be used for regression testing [...]
On Wed, 8 Mar 2017 15:36:25 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > It looks like this could lead to security issues, as side data readers > > anything could, but side data wasnt excluded from fuzzing, in fact > ive seen fuzzers find and trigger the split code The split code runs on _any_ input data, so this is not surprising. My intention is to remove merging/splitting completely (the patches don't go that far yet). > also a demuxer and user app could always set side data to something > bad. > All uses of side data must assume the data to be untrusted, because > it is even without all the split code. This depends. For example, AV_PKT_DATA_NEW_EXTRADATA is of course 100% defined by untrusted data (except that the size fits into memory). On the other hand, I expect something like AV_PKT_DATA_DISPLAYMATRIX to have at least the correct size. FFmpeg code also does this sometimes, for example dump_sidedata() doesn't check the size of this particular side data type. In my own FFmpeg API using code I assume that side data has the correct minimum size for a given type, and I will blame any potentially security-relevant deviations on FFmpeg. I'm sure most other API users also will, and they're probably not even aware that this us supposedly possible. Besides, fewer ways for untrusted input data to leak into complex/obscure data structures is an improvement for security. > > > > will for example rely on side data allocation sizes to be as large as > > needed for correct operation. > > Is this really so ? > we allocate at least AV_INPUT_BUFFER_PADDING_SIZE, accesses beyond > that require a check on the size. > Except av_frame_new_side_data() doesn't do that, and often side data is copied from packets to frames without additional checks (see ff_init_buffer_info() for e.g. the DISPLAYMATRIX case). > also size checks are needed for the case where a lib gets replaced by > one with newer version that has a bigger struct. Oh really? We never do this. Normal API structs are also considered appendable, so compiling against a newer API and then linking against an older version doesn't work. This is exactly the same case. > > > If such files exist at all, they also > > should be brought out of circulation, so fully reject them. Under normal > > circumstances, nothing creates such files. > > > > To avoid problems for now, we let the concat and hls demuxers do this > > (they merely return previously-demuxed packets, whose side data might > > have been merged by libavformat itself after demuxing). The > > special-cases will be removed in the next commit. > > --- > > libavformat/utils.c | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/libavformat/utils.c b/libavformat/utils.c > > index 37d7024465..68f3c977d6 100644 > > --- a/libavformat/utils.c > > +++ b/libavformat/utils.c > > @@ -840,6 +840,15 @@ int ff_read_packet(AVFormatContext *s, AVPacket *pkt) > > *pkt = tmp; > > } > > > > + if (strcmp(s->iformat->name, "concat") && strcmp(s->iformat->name, "hls,applehttp") > > can strcmp() be avoided here ? > it would be run per packet which is rather ugly We could add a flag, but considering the next commit removes these strcmp()s again, and that strcmp() on very short strings is very cheap, I don't think it's worth doing this. > > > + && av_packet_split_side_data(pkt) == 1) { > > this could fail with ENOMEM which complicates things I can add a check for ENOMEM too. This should be the only thing that looks like a failure, but could work in a separate call (like the one on the libavcodec side). > > if we do block such packets, its prbobably better to add a new > static inline function to a header that checks if a packet has > splitable side data and use that in av_packet_split_side_data() and > here Not sure what you mean here. > Iam suggesting "static inline" here to make backporting easier so no > new symbol is added to libavcodec that is needed by libavformat What's the purpose? > also it may be interresting to disable this check for fuzzing so > side data can be fuzzed in a wider range of cases and any past > testcases that happen to use this can still be used for regression > testing I think what you want is fault injection for memory errors, seems out of scope here.
On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 15:36:25 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > It looks like this could lead to security issues, as side data readers [...] > > also size checks are needed for the case where a lib gets replaced by > > one with newer version that has a bigger struct. > > Oh really? We never do this. Normal API structs are also considered > appendable, so compiling against a newer API and then linking against > an older version doesn't work. This is exactly the same case. no its not what you call normal structs are allocated by an allocator that is part of the lib that defines it, the struct, and lib dependancies ensure that its new. Any allocated struct as a result is large enough though possibly not every field is set with side data the code using it sets the size explicitly that makes the size generally hardcoded in the lib using the code theres no longer a common allocator (some exceptions exist). The size a lib allocates that way is the compile time sizeof() which may differ from another lib and side data can be passed in both directions between libs not just in the direction of their dependancy so you can end up with a smaller side data and that means you have to do checks. [...] > > > > > + && av_packet_split_side_data(pkt) == 1) { > > > > this could fail with ENOMEM which complicates things > > I can add a check for ENOMEM too. This should be the only thing that > looks like a failure, but could work in a separate call (like the one > on the libavcodec side). > > > > > if we do block such packets, its prbobably better to add a new > > static inline function to a header that checks if a packet has > > splitable side data and use that in av_packet_split_side_data() and > > here > > Not sure what you mean here. > > > Iam suggesting "static inline" here to make backporting easier so no > > new symbol is added to libavcodec that is needed by libavformat > > What's the purpose? Its simpler, cleaner and faster > > > also it may be interresting to disable this check for fuzzing so > > side data can be fuzzed in a wider range of cases and any past > > testcases that happen to use this can still be used for regression > > testing > > I think what you want is fault injection for memory errors, seems out > of scope here. no, i want fuzzing to continue to fuzz side data, it did so in the past and it should continue to do so. side data is in libav maybe native structs, in FFmpeg it was bytestream like packet data[], thats at least how i viewed it. [...]
On Wed, 8 Mar 2017 17:11:12 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 15:36:25 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > It looks like this could lead to security issues, as side data readers > [...] > > > also size checks are needed for the case where a lib gets replaced by > > > one with newer version that has a bigger struct. > > > > Oh really? We never do this. Normal API structs are also considered > > appendable, so compiling against a newer API and then linking against > > an older version doesn't work. This is exactly the same case. > > no its not > > what you call normal structs are allocated by an allocator that is > part of the lib that defines it, the struct, and lib dependancies > ensure that its new. Any allocated struct as a result is large > enough though possibly not every field is set > > with side data the code using it sets the size explicitly that makes > the size generally hardcoded in the lib using the code theres no > longer a common allocator (some exceptions exist). > The size a lib allocates that way is the > compile time sizeof() which may differ from another lib > and side data can be passed in both directions between libs not just > in the direction of their dependancy > so you can end up with a smaller side data and that means you have to > do checks. This is wrong. side data which has structs have corresponding functions to get their allocation size. Of course that's all very error prone and hard to use correctly and some were added only recently because the API had holes, but that's how the libav* APIs are for now. Besides, manually checking struct sizes/allocations makes for an even worse ABI compatibility concept than FFmpeg currently follows. (Worse as in ease of use and accidental incompatibilities and UB triggered as consequence.) > > > [...] > > > > > > > > + && av_packet_split_side_data(pkt) == 1) { > > > > > > this could fail with ENOMEM which complicates things > > > > I can add a check for ENOMEM too. This should be the only thing that > > looks like a failure, but could work in a separate call (like the one > > on the libavcodec side). > > > > > > > > > if we do block such packets, its prbobably better to add a new > > > static inline function to a header that checks if a packet has > > > splitable side data and use that in av_packet_split_side_data() and > > > here > > > > Not sure what you mean here. > > > > > Iam suggesting "static inline" here to make backporting easier so no > > > new symbol is added to libavcodec that is needed by libavformat > > > > What's the purpose? > > Its simpler, cleaner and faster I mean of that function, or why we should care about symbols present. > > > > > > also it may be interresting to disable this check for fuzzing so > > > side data can be fuzzed in a wider range of cases and any past > > > testcases that happen to use this can still be used for regression > > > testing > > > > I think what you want is fault injection for memory errors, seems out > > of scope here. > > no, i want fuzzing to continue to fuzz side data, it did so in the > past and it should continue to do so. You can fuzz side data as much as you can fuzz AVFrame or AVCodecContext. I believe randomly changing in-memory data structures is referred to as fault injection, not fuzzing. > side data is in libav maybe native structs, in FFmpeg it was bytestream No. FFmpeg has at least 1 struct-defined side data that Libav doesn't have at all (mastering display data). > like packet data[], thats at least how i viewed it. You may have viewed it this way, but that's not how it actually is. Like I said in my another reply, even FFmpeg code occasionally uses side data without checking the side data size. Also, in Libav side data contents cannot come directly from raw files, and thus has potentially fewer security issues than FFmpeg.
On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 17:11:12 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > It looks like this could lead to security issues, as side data readers > > [...] > > > > also size checks are needed for the case where a lib gets replaced by > > > > one with newer version that has a bigger struct. > > > > > > Oh really? We never do this. Normal API structs are also considered > > > appendable, so compiling against a newer API and then linking against > > > an older version doesn't work. This is exactly the same case. > > > > no its not > > > > what you call normal structs are allocated by an allocator that is > > part of the lib that defines it, the struct, and lib dependancies > > ensure that its new. Any allocated struct as a result is large > > enough though possibly not every field is set > > > > with side data the code using it sets the size explicitly that makes > > the size generally hardcoded in the lib using the code theres no > > longer a common allocator (some exceptions exist). > > The size a lib allocates that way is the > > compile time sizeof() which may differ from another lib > > and side data can be passed in both directions between libs not just > > in the direction of their dependancy > > so you can end up with a smaller side data and that means you have to > > do checks. > > This is wrong. > side data which has structs have corresponding functions > to get their allocation size. Of course that's all very error prone and > hard to use correctly and some were added only recently because the > API had holes, but that's how the libav* APIs are for now. you talk about something else here. fact is the allocated side data uses hardcoded size values often anyone can look at git grep -A3 new_side_data theres is sizeof() use and there are litteral numbers also if these ever change, checks on the size become needed which was the original thing i meant in this sub argumentation. checks are needed, or something else in their place is needed in that case > > Besides, manually checking struct sizes/allocations makes for an even > worse ABI compatibility concept than FFmpeg currently follows. (Worse > as in ease of use and accidental incompatibilities and UB triggered as > consequence.) > > > > > > > [...] > > > > > > > > > > > + && av_packet_split_side_data(pkt) == 1) { > > > > > > > > this could fail with ENOMEM which complicates things > > > > > > I can add a check for ENOMEM too. This should be the only thing that > > > looks like a failure, but could work in a separate call (like the one > > > on the libavcodec side). > > > > > > > > > > > > > if we do block such packets, its prbobably better to add a new > > > > static inline function to a header that checks if a packet has > > > > splitable side data and use that in av_packet_split_side_data() and > > > > here > > > > > > Not sure what you mean here. > > > > > > > Iam suggesting "static inline" here to make backporting easier so no > > > > new symbol is added to libavcodec that is needed by libavformat > > > > > > What's the purpose? > > > > Its simpler, cleaner and faster > > I mean of that function, or why we should care about symbols present. i think i explained this but ill try again all more verbosly using a functiom to check for splitable sidedata instead of spliting the side data is cleaner as we dont want to split it we just want to check if theres any. Its simpler as we dont have to deal with errors from the split code and also dont need to deal with the case that split happened. its faster as a static inline function as it would be inlined its easier to backport as theres no new symbol added to another lib on which libavformat depends. (that is if its static inline vs non static) If we add a new symbol we should bump minor but we cannot do that in many past releases due to master using the next minor.Its easier thus with no new symbol. [...]
On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 17:11:12 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: [...] > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > side data can be fuzzed in a wider range of cases and any past > > > > testcases that happen to use this can still be used for regression > > > > testing > > > > > > I think what you want is fault injection for memory errors, seems out > > > of scope here. > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > past and it should continue to do so. > > You can fuzz side data as much as you can fuzz AVFrame or > AVCodecContext. I believe randomly changing in-memory data structures > is referred to as fault injection, not fuzzing. it doesnt really matter what you call it, but it was done and the patch breaks it if theres no option to disable it or something else [...]
On Wed, 8 Mar 2017 19:03:21 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 17:11:12 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > > It looks like this could lead to security issues, as side data readers > > > [...] > > > > > also size checks are needed for the case where a lib gets replaced by > > > > > one with newer version that has a bigger struct. > > > > > > > > Oh really? We never do this. Normal API structs are also considered > > > > appendable, so compiling against a newer API and then linking against > > > > an older version doesn't work. This is exactly the same case. > > > > > > no its not > > > > > > what you call normal structs are allocated by an allocator that is > > > part of the lib that defines it, the struct, and lib dependancies > > > ensure that its new. Any allocated struct as a result is large > > > enough though possibly not every field is set > > > > > > with side data the code using it sets the size explicitly that makes > > > the size generally hardcoded in the lib using the code theres no > > > longer a common allocator (some exceptions exist). > > > The size a lib allocates that way is the > > > compile time sizeof() which may differ from another lib > > > and side data can be passed in both directions between libs not just > > > in the direction of their dependancy > > > so you can end up with a smaller side data and that means you have to > > > do checks. > > > > This is wrong. > > > side data which has structs have corresponding functions > > to get their allocation size. Of course that's all very error prone and > > hard to use correctly and some were added only recently because the > > API had holes, but that's how the libav* APIs are for now. > > you talk about something else here. > > fact is the allocated side data uses hardcoded size values often > anyone can look at > git grep -A3 new_side_data > > theres is sizeof() use and there are litteral numbers also You have to use whatever is correct in each specific case. Using a number or sizeof() argument for new_side_data is simply an API violation in some cases, similar to e.g. using av_malloc(sizeof(AVFrame)). There are a few. I don't know why you want to "check" these uses with FATE. As I've said in the other thread, that's like letting FATE check sizeof(AVFrame). The right way is to check it in new_side_data, or have an API that is not so hard to use incorrectly. This has been discussed before, when we added new functions to add display mastering data or something similar in an ABI-safe way. > if these ever change, checks on the size become needed > which was the original thing i meant in this sub argumentation. > checks are needed, or something else in their place > is needed in that case And FATE-checking the sizes is the wrong thing to do. At least for the side data types whose byte layouts are defined by the C ABI like MASTERING_DISPLAY_METADATA, not by something wire-like as for example the SKIP_SAMPLES ones. > > > > > > Besides, manually checking struct sizes/allocations makes for an even > > worse ABI compatibility concept than FFmpeg currently follows. (Worse > > as in ease of use and accidental incompatibilities and UB triggered as > > consequence.) > > > > > > > > > > > > [...] > > > > > > > > > > > > > > + && av_packet_split_side_data(pkt) == 1) { > > > > > > > > > > this could fail with ENOMEM which complicates things > > > > > > > > I can add a check for ENOMEM too. This should be the only thing that > > > > looks like a failure, but could work in a separate call (like the one > > > > on the libavcodec side). > > > > > > > > > > > > > > > > > if we do block such packets, its prbobably better to add a new > > > > > static inline function to a header that checks if a packet has > > > > > splitable side data and use that in av_packet_split_side_data() and > > > > > here > > > > > > > > Not sure what you mean here. > > > > > > > > > Iam suggesting "static inline" here to make backporting easier so no > > > > > new symbol is added to libavcodec that is needed by libavformat > > > > > > > > What's the purpose? > > > > > > Its simpler, cleaner and faster > > > > I mean of that function, or why we should care about symbols present. > > i think i explained this but ill try again all more verbosly > > using a functiom to check for splitable sidedata instead of > spliting the side data is cleaner as we dont want to split it we just > want to check if theres any. > > Its simpler as we dont have to deal with errors from the split code > and also dont need to deal with the case that split happened. I don't see much simplicity in code duplication, putting code into public headers (which also means you have to make sure this compiles with C++), reducing compile times, and potentially exposing implementation details. In my opinion, this is a non-issue, since the split check call should go away as soon as libavcodec doesn't try to split side data anymore. > its faster as a static inline function as it would be inlined This is in a code path whose performance is bound by raw I/O and maybe things like memory allocation. The speed argument is not going to work. > its easier to backport as theres no new symbol added to another lib > on which libavformat depends. (that is if its static inline vs non > static) > If we add a new symbol we should bump minor but we cannot do that in > many past releases due to master using the next minor.Its easier thus > with no new symbol. Or we just stay with the split call, which in the normal case will run and exit quickly.
On Wed, 8 Mar 2017 19:20:15 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 17:11:12 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > [...] > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > testcases that happen to use this can still be used for regression > > > > > testing > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > of scope here. > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > past and it should continue to do so. > > > > You can fuzz side data as much as you can fuzz AVFrame or > > AVCodecContext. I believe randomly changing in-memory data structures > > is referred to as fault injection, not fuzzing. > > it doesnt really matter what you call it, but it was done and the > patch breaks it if theres no option to disable it or something else Well, you can't do it anymore. Why are you so afraid that a potential error source is being eliminated? Of course you're still free to randomly change random memory locations and experience new bugs, but I'm not sure why this would matter for anything.
On Wed, 8 Mar 2017 19:20:15 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 17:11:12 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > [...] > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > testcases that happen to use this can still be used for regression > > > > > testing > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > of scope here. > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > past and it should continue to do so. > > > > You can fuzz side data as much as you can fuzz AVFrame or > > AVCodecContext. I believe randomly changing in-memory data structures > > is referred to as fault injection, not fuzzing. > > it doesnt really matter what you call it, but it was done and the > patch breaks it if theres no option to disable it or something else PS: honestly, I think you're trolling with this. The possibility that random packets could be interpreted as side-data doesn't look like something that was accounted for, and I highly doubt this plays a role for fuzzing (or ever occurred in a fuzzing case). Why are you so hell-bent in finally preventing this arguably dangerous corner case? It makes no sense at all. Explain yourself.
On 3/8/2017 3:35 PM, wm4 wrote: > On Wed, 8 Mar 2017 19:20:15 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > >> On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: >>> On Wed, 8 Mar 2017 17:11:12 +0100 >>> Michael Niedermayer <michael@niedermayer.cc> wrote: >>> >>>> On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: >>>>> On Wed, 8 Mar 2017 15:36:25 +0100 >>>>> Michael Niedermayer <michael@niedermayer.cc> wrote: >>>>> >>>>>> On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: >> [...] >>>> >>>>> >>>>>> also it may be interresting to disable this check for fuzzing so >>>>>> side data can be fuzzed in a wider range of cases and any past >>>>>> testcases that happen to use this can still be used for regression >>>>>> testing >>>>> >>>>> I think what you want is fault injection for memory errors, seems out >>>>> of scope here. >>>> >>>> no, i want fuzzing to continue to fuzz side data, it did so in the >>>> past and it should continue to do so. >>> >>> You can fuzz side data as much as you can fuzz AVFrame or >>> AVCodecContext. I believe randomly changing in-memory data structures >>> is referred to as fault injection, not fuzzing. >> >> it doesnt really matter what you call it, but it was done and the >> patch breaks it if theres no option to disable it or something else > > PS: honestly, I think you're trolling with this. Can the two of you stop accusing the other of trolling? If there's some miscommunication then try to work around it instead of assuming malice from the other side. The discussion so far has been fairly civil and technical, so please don't escalate it for no gain.
On Wed, Mar 08, 2017 at 07:31:27PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 19:03:21 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > > > It looks like this could lead to security issues, as side data readers > > > > [...] > > > > > > also size checks are needed for the case where a lib gets replaced by > > > > > > one with newer version that has a bigger struct. > > > > > > > > > > Oh really? We never do this. Normal API structs are also considered > > > > > appendable, so compiling against a newer API and then linking against > > > > > an older version doesn't work. This is exactly the same case. > > > > > > > > no its not > > > > > > > > what you call normal structs are allocated by an allocator that is > > > > part of the lib that defines it, the struct, and lib dependancies > > > > ensure that its new. Any allocated struct as a result is large > > > > enough though possibly not every field is set > > > > > > > > with side data the code using it sets the size explicitly that makes > > > > the size generally hardcoded in the lib using the code theres no > > > > longer a common allocator (some exceptions exist). > > > > The size a lib allocates that way is the > > > > compile time sizeof() which may differ from another lib > > > > and side data can be passed in both directions between libs not just > > > > in the direction of their dependancy > > > > so you can end up with a smaller side data and that means you have to > > > > do checks. > > > > > > This is wrong. > > > > > side data which has structs have corresponding functions > > > to get their allocation size. Of course that's all very error prone and > > > hard to use correctly and some were added only recently because the > > > API had holes, but that's how the libav* APIs are for now. > > > > you talk about something else here. > > > > fact is the allocated side data uses hardcoded size values often > > anyone can look at > > git grep -A3 new_side_data > > > > theres is sizeof() use and there are litteral numbers also > > You have to use whatever is correct in each specific case. Using a > number or sizeof() argument for new_side_data is simply an API > violation in some cases, similar to e.g. using > av_malloc(sizeof(AVFrame)). There are a few. > > I don't know why you want to "check" these uses with FATE. As I've said > in the other thread, that's like letting FATE check sizeof(AVFrame). > > The right way is to check it in new_side_data, or have an API that is > not so hard to use incorrectly. This has been discussed before, when > we added new functions to add display mastering data or something > similar in an ABI-safe way. > > > if these ever change, checks on the size become needed > > which was the original thing i meant in this sub argumentation. > > checks are needed, or something else in their place > > is needed in that case > > And FATE-checking the sizes is the wrong thing to do. At least for the > side data types whose byte layouts are defined by the C ABI like > MASTERING_DISPLAY_METADATA, not by something wire-like as for > example the SKIP_SAMPLES ones. wrong thread or you totally misunderstand me what you originally said: > It looks like this could lead to security issues, as side data readers > will for example rely on side data allocation sizes to be as large as > needed for correct operation. And what i replied: [...] also size checks are needed for the case where a lib gets replaced by one with newer version that has a bigger struct. Now fact is that for cases where you link to a lib with a differently sized struct or more generally any side data (which is what was meant) if its not using an allocator it needs a check in the code or something else, thats what i meant and thats from where this bizare sub discussion started where you seem to keep disagreeing about things i did not say Iam not talking about fate here, thats a seperate thing > > > > > > > > > > > Besides, manually checking struct sizes/allocations makes for an even > > > worse ABI compatibility concept than FFmpeg currently follows. (Worse > > > as in ease of use and accidental incompatibilities and UB triggered as > > > consequence.) > > > > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > > > > > + && av_packet_split_side_data(pkt) == 1) { > > > > > > > > > > > > this could fail with ENOMEM which complicates things > > > > > > > > > > I can add a check for ENOMEM too. This should be the only thing that > > > > > looks like a failure, but could work in a separate call (like the one > > > > > on the libavcodec side). > > > > > > > > > > > > > > > > > > > > > if we do block such packets, its prbobably better to add a new > > > > > > static inline function to a header that checks if a packet has > > > > > > splitable side data and use that in av_packet_split_side_data() and > > > > > > here > > > > > > > > > > Not sure what you mean here. > > > > > > > > > > > Iam suggesting "static inline" here to make backporting easier so no > > > > > > new symbol is added to libavcodec that is needed by libavformat > > > > > > > > > > What's the purpose? > > > > > > > > Its simpler, cleaner and faster > > > > > > I mean of that function, or why we should care about symbols present. > > > > i think i explained this but ill try again all more verbosly > > > > using a functiom to check for splitable sidedata instead of > > spliting the side data is cleaner as we dont want to split it we just > > want to check if theres any. > > > > Its simpler as we dont have to deal with errors from the split code > > and also dont need to deal with the case that split happened. > > I don't see much simplicity in code duplication, putting code into > public headers (which also means you have to make sure this > compiles with C++), reducing compile times, and potentially exposing > implementation details. It can be in a private header, it doesnt need to be public [...]
On Wed, Mar 08, 2017 at 07:35:50PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 19:20:15 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > [...] > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > testcases that happen to use this can still be used for regression > > > > > > testing > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > of scope here. > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > past and it should continue to do so. > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > AVCodecContext. I believe randomly changing in-memory data structures > > > is referred to as fault injection, not fuzzing. > > > > it doesnt really matter what you call it, but it was done and the > > patch breaks it if theres no option to disable it or something else > > PS: honestly, I think you're trolling with this. The possibility that > random packets could be interpreted as side-data doesn't look like > something that was accounted for, and I highly doubt this plays a role > for fuzzing (or ever occurred in a fuzzing case). Why are you so > hell-bent in finally preventing this arguably dangerous corner case? It > makes no sense at all. Explain yourself. It plays a role, for example the timeout in 731 goes away if i change FF_MERGE_MARKER and the value prior to teh change is in the sample according to my hex editor so the fuzzer seems capable to create data to get past the check iam not trolling and this does play a role, ive seen other issues that used the side data split how the fuzzer does this i dont know but i can imagine a few ways like looking at the individual bytes of the comparission and evolving towards breaking it or maybe some input file was generated by libavformat leaking the marker into it somehow. [...]
On Wed, 8 Mar 2017 19:56:42 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 07:31:27PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 19:03:21 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > > > > It looks like this could lead to security issues, as side data readers > > > > > [...] > > > > > > > also size checks are needed for the case where a lib gets replaced by > > > > > > > one with newer version that has a bigger struct. > > > > > > > > > > > > Oh really? We never do this. Normal API structs are also considered > > > > > > appendable, so compiling against a newer API and then linking against > > > > > > an older version doesn't work. This is exactly the same case. > > > > > > > > > > no its not > > > > > > > > > > what you call normal structs are allocated by an allocator that is > > > > > part of the lib that defines it, the struct, and lib dependancies > > > > > ensure that its new. Any allocated struct as a result is large > > > > > enough though possibly not every field is set > > > > > > > > > > with side data the code using it sets the size explicitly that makes > > > > > the size generally hardcoded in the lib using the code theres no > > > > > longer a common allocator (some exceptions exist). > > > > > The size a lib allocates that way is the > > > > > compile time sizeof() which may differ from another lib > > > > > and side data can be passed in both directions between libs not just > > > > > in the direction of their dependancy > > > > > so you can end up with a smaller side data and that means you have to > > > > > do checks. > > > > > > > > This is wrong. > > > > > > > side data which has structs have corresponding functions > > > > to get their allocation size. Of course that's all very error prone and > > > > hard to use correctly and some were added only recently because the > > > > API had holes, but that's how the libav* APIs are for now. > > > > > > you talk about something else here. > > > > > > fact is the allocated side data uses hardcoded size values often > > > anyone can look at > > > git grep -A3 new_side_data > > > > > > theres is sizeof() use and there are litteral numbers also > > > > You have to use whatever is correct in each specific case. Using a > > number or sizeof() argument for new_side_data is simply an API > > violation in some cases, similar to e.g. using > > av_malloc(sizeof(AVFrame)). There are a few. > > > > > I don't know why you want to "check" these uses with FATE. As I've said > > in the other thread, that's like letting FATE check sizeof(AVFrame). > > > > The right way is to check it in new_side_data, or have an API that is > > not so hard to use incorrectly. This has been discussed before, when > > we added new functions to add display mastering data or something > > similar in an ABI-safe way. > > > > > if these ever change, checks on the size become needed > > > which was the original thing i meant in this sub argumentation. > > > checks are needed, or something else in their place > > > is needed in that case > > > > And FATE-checking the sizes is the wrong thing to do. At least for the > > side data types whose byte layouts are defined by the C ABI like > > MASTERING_DISPLAY_METADATA, not by something wire-like as for > > example the SKIP_SAMPLES ones. > > wrong thread or you totally misunderstand me > > what you originally said: > > It looks like this could lead to security issues, as side data readers > > will for example rely on side data allocation sizes to be as large as > > needed for correct operation. > > And what i replied: > [...] > > also size checks are needed for the case where a lib gets replaced by > one with newer version that has a bigger struct. Yes, and that's why I said somewhere that this is the same as other "appendable" structs in the libav* API. For example, AVFrame is supposed to be extendable, which is why everything outside of libavutil (the lib that defines the struct) must use av_frame_alloc() to allocate AVFrame. sizeof(AVFrame) is not part of the ABI. There is _no_ intention that an API user compiled against, say, libavutil version 1.2.100 can link to libavutil version 1.1.100, because 1.2.100 might have extended some structs (like AVFrame). So where does your argument for size checks come from? We don't do anything like this. There is no av_get_avframe_size(), and there's no code that would do "if(av_get_avframe_size() < sizeof(AVFRame))". But for side data we're suddenly supposed to do this? For side data we have e.g. av_mastering_display_metadata_alloc() and av_mastering_display_metadata_create_side_data(). Nowhere are you supposed to know or check the side data size. (I thought some side data types have an API function that returns the size of the corresponding struct, but I didn't find one. My mistake.) And finally, my claim that it's a security problem: if some code for example accesses mastering display side data, and the side data is too small, that code will obviously read out of bounds. There isn't even a way to check this, short of using sizeof(AVMastering...), which is not part of the ABI. This is why arbitrary side data read from files should be prevented. Some side-data types are defined like a file format - these generally use explicit size (as in, numbers in the code), and you accessed them with AV_WB32/AV_RB32 etc. - these are different. > Now fact is that for cases where you link to a lib with a differently > sized struct or more generally any side data (which is what was meant) > if its not using an allocator it needs a check in the code or something > else, thats what i meant and thats from where this bizare sub > discussion started where you seem to keep disagreeing about things i > did not say > > Iam not talking about fate here, thats a seperate thing OK, then I don't know what you're arguing about. Isn't the ABI situation pretty clear? > > > > > > > > > > > > > > > > > Besides, manually checking struct sizes/allocations makes for an even > > > > worse ABI compatibility concept than FFmpeg currently follows. (Worse > > > > as in ease of use and accidental incompatibilities and UB triggered as > > > > consequence.) > > > > > > > > > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > > > > > > > > + && av_packet_split_side_data(pkt) == 1) { > > > > > > > > > > > > > > this could fail with ENOMEM which complicates things > > > > > > > > > > > > I can add a check for ENOMEM too. This should be the only thing that > > > > > > looks like a failure, but could work in a separate call (like the one > > > > > > on the libavcodec side). > > > > > > > > > > > > > > > > > > > > > > > > > if we do block such packets, its prbobably better to add a new > > > > > > > static inline function to a header that checks if a packet has > > > > > > > splitable side data and use that in av_packet_split_side_data() and > > > > > > > here > > > > > > > > > > > > Not sure what you mean here. > > > > > > > > > > > > > Iam suggesting "static inline" here to make backporting easier so no > > > > > > > new symbol is added to libavcodec that is needed by libavformat > > > > > > > > > > > > What's the purpose? > > > > > > > > > > Its simpler, cleaner and faster > > > > > > > > I mean of that function, or why we should care about symbols present. > > > > > > i think i explained this but ill try again all more verbosly > > > > > > using a functiom to check for splitable sidedata instead of > > > spliting the side data is cleaner as we dont want to split it we just > > > want to check if theres any. > > > > > > Its simpler as we dont have to deal with errors from the split code > > > and also dont need to deal with the case that split happened. > > > > I don't see much simplicity in code duplication, putting code into > > public headers (which also means you have to make sure this > > compiles with C++), reducing compile times, and potentially exposing > > implementation details. > > It can be in a private header, it doesnt need to be public OK, that could probably be reduced to the magic footer.
On Wed, 8 Mar 2017 20:15:24 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 07:35:50PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 19:20:15 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > [...] > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > testing > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > of scope here. > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > past and it should continue to do so. > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > is referred to as fault injection, not fuzzing. > > > > > > it doesnt really matter what you call it, but it was done and the > > > patch breaks it if theres no option to disable it or something else > > > > PS: honestly, I think you're trolling with this. The possibility that > > random packets could be interpreted as side-data doesn't look like > > something that was accounted for, and I highly doubt this plays a role > > for fuzzing (or ever occurred in a fuzzing case). Why are you so > > hell-bent in finally preventing this arguably dangerous corner case? It > > makes no sense at all. Explain yourself. > > It plays a role, for example the timeout in 731 goes away if > i change FF_MERGE_MARKER and the value prior to teh change is in the > sample according to my hex editor > > so the fuzzer seems capable to create data to get past the check > iam not trolling and this does play a role, ive seen other issues > that used the side data split > > how the fuzzer does this i dont know but i can imagine a few ways > like looking at the individual bytes of the comparission and evolving > towards breaking it or maybe some input file was generated by > libavformat leaking the marker into it somehow. If you argue this way, the fuzzer will expose issues that can happen with the normal data flow too. (For example, if a demuxer sets a field in a side data struct to a value read from the file - obviously the fuzzer will be capable of exposing bugs related in readers of this field.) Normally you would be glad that an issue the fuzzer previously exposed goes away completely.
On Wed, Mar 08, 2017 at 08:23:26PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 20:15:24 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 07:35:50PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 19:20:15 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > [...] > > > > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > > testing > > > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > > of scope here. > > > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > > past and it should continue to do so. > > > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > > is referred to as fault injection, not fuzzing. > > > > > > > > it doesnt really matter what you call it, but it was done and the > > > > patch breaks it if theres no option to disable it or something else > > > > > > PS: honestly, I think you're trolling with this. The possibility that > > > random packets could be interpreted as side-data doesn't look like > > > something that was accounted for, and I highly doubt this plays a role > > > for fuzzing (or ever occurred in a fuzzing case). Why are you so > > > hell-bent in finally preventing this arguably dangerous corner case? It > > > makes no sense at all. Explain yourself. > > > > It plays a role, for example the timeout in 731 goes away if > > i change FF_MERGE_MARKER and the value prior to teh change is in the > > sample according to my hex editor > > > > so the fuzzer seems capable to create data to get past the check > > iam not trolling and this does play a role, ive seen other issues > > that used the side data split > > > > how the fuzzer does this i dont know but i can imagine a few ways > > like looking at the individual bytes of the comparission and evolving > > towards breaking it or maybe some input file was generated by > > libavformat leaking the marker into it somehow. > > If you argue this way, the fuzzer will expose issues that can happen > with the normal data flow too. (For example, if a demuxer sets a field > in a side data struct to a value read from the file - obviously the > fuzzer will be capable of exposing bugs related in readers of this > field.) > > Normally you would be glad that an issue the fuzzer previously exposed > goes away completely. It doesnt go completely away, it just closes an easy way to introduce side data but yes, iam in fact happy if the path marker->libavformat->libavcodec goes away its the details iam not completely happy with [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB If you fake or manipulate statistics in a paper in physics you will never get a job again. If you fake or manipulate statistics in a paper in medicin you will get a job for life at the pharma industry.
On Wed, Mar 08, 2017 at 07:32:26PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 19:20:15 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > [...] > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > testcases that happen to use this can still be used for regression > > > > > > testing > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > of scope here. > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > past and it should continue to do so. > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > AVCodecContext. I believe randomly changing in-memory data structures > > > is referred to as fault injection, not fuzzing. > > > > it doesnt really matter what you call it, but it was done and the > > patch breaks it if theres no option to disable it or something else > > Well, you can't do it anymore. Why are you so afraid that a potential > error source is being eliminated? well its rather hidden, harder to test for not eliminated. [...]
On Wed, 8 Mar 2017 20:42:17 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 08:23:26PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 20:15:24 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 07:35:50PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 19:20:15 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > [...] > > > > > > > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > > > testing > > > > > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > > > of scope here. > > > > > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > > > past and it should continue to do so. > > > > > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > > > is referred to as fault injection, not fuzzing. > > > > > > > > > > it doesnt really matter what you call it, but it was done and the > > > > > patch breaks it if theres no option to disable it or something else > > > > > > > > PS: honestly, I think you're trolling with this. The possibility that > > > > random packets could be interpreted as side-data doesn't look like > > > > something that was accounted for, and I highly doubt this plays a role > > > > for fuzzing (or ever occurred in a fuzzing case). Why are you so > > > > hell-bent in finally preventing this arguably dangerous corner case? It > > > > makes no sense at all. Explain yourself. > > > > > > It plays a role, for example the timeout in 731 goes away if > > > i change FF_MERGE_MARKER and the value prior to teh change is in the > > > sample according to my hex editor > > > > > > so the fuzzer seems capable to create data to get past the check > > > iam not trolling and this does play a role, ive seen other issues > > > that used the side data split > > > > > > how the fuzzer does this i dont know but i can imagine a few ways > > > like looking at the individual bytes of the comparission and evolving > > > towards breaking it or maybe some input file was generated by > > > libavformat leaking the marker into it somehow. > > > > If you argue this way, the fuzzer will expose issues that can happen > > with the normal data flow too. (For example, if a demuxer sets a field > > in a side data struct to a value read from the file - obviously the > > fuzzer will be capable of exposing bugs related in readers of this > > field.) > > > > > Normally you would be glad that an issue the fuzzer previously exposed > > goes away completely. > > It doesnt go completely away, it just closes an easy way to introduce > side data Well, in the first place it prevents accidental interpretation of input data as merged side data. You seem to be sad about the fact that the fuzzer won't be able to test such accidentally interpreted side data, which makes no sense to me. > but yes, iam in fact happy if the path marker->libavformat->libavcodec > goes away Like what? > its the details iam not completely happy with Which is? Other than the ENOMEM issue (will fix), strcmp (you didn't reply to my remarks), and the idea to introduce a special function to check for merged side data (I'm hesitant, maybe will do).
On Wed, 8 Mar 2017 20:54:43 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 07:32:26PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 19:20:15 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > [...] > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > testing > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > of scope here. > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > past and it should continue to do so. > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > is referred to as fault injection, not fuzzing. > > > > > > it doesnt really matter what you call it, but it was done and the > > > patch breaks it if theres no option to disable it or something else > > > > Well, you can't do it anymore. Why are you so afraid that a potential > > error source is being eliminated? > > well its rather hidden, harder to test for not eliminated. How is it not eliminated?
On Wed, Mar 08, 2017 at 09:01:32PM +0100, wm4 wrote: > On Wed, 8 Mar 2017 20:54:43 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 07:32:26PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 19:20:15 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > [...] > > > > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > > testing > > > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > > of scope here. > > > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > > past and it should continue to do so. > > > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > > is referred to as fault injection, not fuzzing. > > > > > > > > it doesnt really matter what you call it, but it was done and the > > > > patch breaks it if theres no option to disable it or something else > > > > > > Well, you can't do it anymore. Why are you so afraid that a potential > > > error source is being eliminated? > > > > well its rather hidden, harder to test for not eliminated. > > How is it not eliminated? every side data we support can be generated by something otherwise we wouldnt have that side data type [...]
On Wed, 8 Mar 2017 23:34:39 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Mar 08, 2017 at 09:01:32PM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 20:54:43 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 07:32:26PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 19:20:15 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > [...] > > > > > > > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > > > testing > > > > > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > > > of scope here. > > > > > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > > > past and it should continue to do so. > > > > > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > > > is referred to as fault injection, not fuzzing. > > > > > > > > > > it doesnt really matter what you call it, but it was done and the > > > > > patch breaks it if theres no option to disable it or something else > > > > > > > > Well, you can't do it anymore. Why are you so afraid that a potential > > > > error source is being eliminated? > > > > > > well its rather hidden, harder to test for not eliminated. > > > > How is it not eliminated? > > every side data we support can be generated by something otherwise Like what? You're not answering, just spouting confused stuff. What kind of reply is this? > we wouldnt have that side data type ????????????????
On Thu, Mar 09, 2017 at 07:50:14AM +0100, wm4 wrote: > On Wed, 8 Mar 2017 23:34:39 +0100 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > On Wed, Mar 08, 2017 at 09:01:32PM +0100, wm4 wrote: > > > On Wed, 8 Mar 2017 20:54:43 +0100 > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > On Wed, Mar 08, 2017 at 07:32:26PM +0100, wm4 wrote: > > > > > On Wed, 8 Mar 2017 19:20:15 +0100 > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > > [...] > > > > > > > > > > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > > > > testing > > > > > > > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > > > > of scope here. > > > > > > > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > > > > past and it should continue to do so. > > > > > > > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > > > > is referred to as fault injection, not fuzzing. > > > > > > > > > > > > it doesnt really matter what you call it, but it was done and the > > > > > > patch breaks it if theres no option to disable it or something else > > > > > > > > > > Well, you can't do it anymore. Why are you so afraid that a potential > > > > > error source is being eliminated? > > > > > > > > well its rather hidden, harder to test for not eliminated. > > > > > > How is it not eliminated? > > > > every side data we support can be generated by something otherwise > > Like what? You're not answering, just spouting confused stuff. What > kind of reply is this? > > > we wouldnt have that side data type > > ???????????????? This is very basic really but lets elaborate for each side data type T possiblity A nothing uses side data type T possiblity B something uses side data type T Its the same with a codec, either a codec is used in some case or its used in no case. If something is used in no case then it has been eliminated as you describe. If somehing is still used in a case it has not been eliminated If as you describe side data has been eliminated then you could remove side data as a whole from the source code. If you cannot remove side data or a specific side data type from the source code then it has not been eliminated your change removes one way for an attacker to set side data but by the fact that you dont remove any of the side data types its clear you are aware of that every is still in use in some code path. a attacker may need to use a specific container format to set a specific side data type or may depend on a specific demuxer lib or application that allows him to set a side data type. now if you remove every way to set side data for an attacker then you can remove that side data type as a whole from the code. Of course that removes whatever the side data is for. Let me provide a specific example If a container suports changing extradata mid stream it will either be support or not. if any demuxer supports it then you have not eliminated the possiblity for an attacker I hope writing a elaborate reply will not lead to this discussion to shift onto some unrelated detail [...]
Le nonidi 19 ventôse, an CCXXV, Michael Niedermayer a écrit : > This is very basic really but lets elaborate > for each side data type T > possiblity A > nothing uses side data type T > > possiblity B > something uses side data type T > > Its the same with a codec, either a codec is used in some case or > its used in no case. > > If something is used in no case then it has been eliminated as you > describe. > If somehing is still used in a case it has not been eliminated > > If as you describe side data has been eliminated then you could > remove side data as a whole from the source code. > > If you cannot remove side data or a specific side data type from > the source code then it has not been eliminated > > your change removes one way for an attacker to set side data but > by the fact that you dont remove any of the side data types its > clear you are aware of that every is still in use in some code path. > > a attacker may need to use a specific container format to set a > specific side data type or may depend on a specific demuxer lib or > application that allows him to set a side data type. > > now if you remove every way to set side data for an attacker then > you can remove that side data type as a whole from the code. > Of course that removes whatever the side data is for. > > Let me provide a specific example > If a container suports changing extradata mid stream it will either > be support or not. > if any demuxer supports it then you have not eliminated the possiblity > for an attacker > > I hope writing a elaborate reply will not lead to this discussion > to shift onto some unrelated detail You are rehashing a lot of obvious facts, but you do not address the important questions. Side data is useful. It is a badly designed API, because it adds a lot of complexity for a hypothetical benefit but fails to reap that benefit. As such, it should be kept but enhanced, either by removing the complexity or by actually reaping the benefit. But it is not the topic of this discussion. MERGED side data is a completely brainded design that should never have been written. I am purposefully not looking at the archive to find out who is responsible for this mess. Now is not the time to point fingers but to fix the code. Now, please answer this very specific question: If someone were to REMOVE ALL AND EVERY use of av_packet_merge_side_data() and av_packet_split_side_data(), what would be the actual bad consequences? But before you start with fuzzing or anything similar, let me stop you: fuzzing exposes bugs that can be triggered by crafted inputs. If fuzzing can not trigger it, that means the bug does not exist, period. Regards,
On Thu, 9 Mar 2017 12:00:38 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote: > On Thu, Mar 09, 2017 at 07:50:14AM +0100, wm4 wrote: > > On Wed, 8 Mar 2017 23:34:39 +0100 > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > On Wed, Mar 08, 2017 at 09:01:32PM +0100, wm4 wrote: > > > > On Wed, 8 Mar 2017 20:54:43 +0100 > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > On Wed, Mar 08, 2017 at 07:32:26PM +0100, wm4 wrote: > > > > > > On Wed, 8 Mar 2017 19:20:15 +0100 > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 05:26:57PM +0100, wm4 wrote: > > > > > > > > On Wed, 8 Mar 2017 17:11:12 +0100 > > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 04:06:20PM +0100, wm4 wrote: > > > > > > > > > > On Wed, 8 Mar 2017 15:36:25 +0100 > > > > > > > > > > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 08, 2017 at 01:40:11PM +0100, wm4 wrote: > > > > > > > [...] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > also it may be interresting to disable this check for fuzzing so > > > > > > > > > > > side data can be fuzzed in a wider range of cases and any past > > > > > > > > > > > testcases that happen to use this can still be used for regression > > > > > > > > > > > testing > > > > > > > > > > > > > > > > > > > > I think what you want is fault injection for memory errors, seems out > > > > > > > > > > of scope here. > > > > > > > > > > > > > > > > > > no, i want fuzzing to continue to fuzz side data, it did so in the > > > > > > > > > past and it should continue to do so. > > > > > > > > > > > > > > > > You can fuzz side data as much as you can fuzz AVFrame or > > > > > > > > AVCodecContext. I believe randomly changing in-memory data structures > > > > > > > > is referred to as fault injection, not fuzzing. > > > > > > > > > > > > > > it doesnt really matter what you call it, but it was done and the > > > > > > > patch breaks it if theres no option to disable it or something else > > > > > > > > > > > > Well, you can't do it anymore. Why are you so afraid that a potential > > > > > > error source is being eliminated? > > > > > > > > > > well its rather hidden, harder to test for not eliminated. > > > > > > > > How is it not eliminated? > > > > > > every side data we support can be generated by something otherwise > > > > Like what? You're not answering, just spouting confused stuff. What > > kind of reply is this? > > > > > we wouldnt have that side data type > > > > ???????????????? > > This is very basic really but lets elaborate > for each side data type T > possiblity A > nothing uses side data type T > > possiblity B > something uses side data type T > > > Its the same with a codec, either a codec is used in some case or > its used in no case. > > If something is used in no case then it has been eliminated as you > describe. > If somehing is still used in a case it has not been eliminated > > If as you describe side data has been eliminated then you could > remove side data as a whole from the source code. > > If you cannot remove side data or a specific side data type from > the source code then it has not been eliminated > > your change removes one way for an attacker to set side data but > by the fact that you dont remove any of the side data types its > clear you are aware of that every is still in use in some code path. > > a attacker may need to use a specific container format to set a > specific side data type or may depend on a specific demuxer lib or > application that allows him to set a side data type. > > now if you remove every way to set side data for an attacker then > you can remove that side data type as a whole from the code. > Of course that removes whatever the side data is for. > > Let me provide a specific example > If a container suports changing extradata mid stream it will either > be support or not. > if any demuxer supports it then you have not eliminated the possiblity > for an attacker > > I hope writing a elaborate reply will not lead to this discussion > to shift onto some unrelated detail OK, but that has absolutely nothing to do with the issue at hand. Of course side data can still be used after this patch! Now if you could get back on topic...
On Thu, 9 Mar 2017 12:16:09 +0100 Nicolas George <george@nsup.org> wrote: > Now, please answer this very specific question: > > If someone were to REMOVE ALL AND EVERY use of > av_packet_merge_side_data() and av_packet_split_side_data(), what would > be the actual bad consequences? Simply that API users, which pass only the packet data itself from libavformat to libavcodec, will break in cases where the side data is "required". The argument is that such API users exist because they cannot pass the AVPacket reference along, but have to go through their own layers, which can transport only the raw packet data itself and maybe timestamps. > But before you start with fuzzing or anything similar, let me stop you: > fuzzing exposes bugs that can be triggered by crafted inputs. If fuzzing > can not trigger it, that means the bug does not exist, period. I think the argument is that it's easier to fuzz side-data related things if side data can be accidentally read from raw packets that look like they have merged side data.
On Thu, Mar 09, 2017 at 12:16:09PM +0100, Nicolas George wrote: > Le nonidi 19 ventôse, an CCXXV, Michael Niedermayer a écrit : > > This is very basic really but lets elaborate > > for each side data type T > > possiblity A > > nothing uses side data type T > > > > possiblity B > > something uses side data type T > > > > Its the same with a codec, either a codec is used in some case or > > its used in no case. > > > > If something is used in no case then it has been eliminated as you > > describe. > > If somehing is still used in a case it has not been eliminated > > > > If as you describe side data has been eliminated then you could > > remove side data as a whole from the source code. > > > > If you cannot remove side data or a specific side data type from > > the source code then it has not been eliminated > > > > your change removes one way for an attacker to set side data but > > by the fact that you dont remove any of the side data types its > > clear you are aware of that every is still in use in some code path. > > > > a attacker may need to use a specific container format to set a > > specific side data type or may depend on a specific demuxer lib or > > application that allows him to set a side data type. > > > > now if you remove every way to set side data for an attacker then > > you can remove that side data type as a whole from the code. > > Of course that removes whatever the side data is for. > > > > Let me provide a specific example > > If a container suports changing extradata mid stream it will either > > be support or not. > > if any demuxer supports it then you have not eliminated the possiblity > > for an attacker > > > > I hope writing a elaborate reply will not lead to this discussion > > to shift onto some unrelated detail > > You are rehashing a lot of obvious facts, but you do not address the > important questions. yes, i was trying to clarify a reply that was apparently unclear and not understood. its like statement -> point out disagreement -> do not understand -> clarify and clarify -> "You are rehashing a lot of obvious facts" Sorry if that felt off topic, it probably was [...]
Le nonidi 19 ventôse, an CCXXV, Michael Niedermayer a écrit : > yes, i was trying to clarify a reply that was apparently unclear and > not understood. > its like > statement -> point out disagreement -> do not understand -> clarify > and > clarify -> "You are rehashing a lot of obvious facts" > > Sorry if that felt off topic, it probably was Ok, but please answer the actual question in my mail. Regards,
On Thu, Mar 09, 2017 at 12:16:09PM +0100, Nicolas George wrote: > Le nonidi 19 ventôse, an CCXXV, Michael Niedermayer a écrit : > > This is very basic really but lets elaborate > > for each side data type T > > possiblity A > > nothing uses side data type T > > > > possiblity B > > something uses side data type T > > > > Its the same with a codec, either a codec is used in some case or > > its used in no case. > > > > If something is used in no case then it has been eliminated as you > > describe. > > If somehing is still used in a case it has not been eliminated > > > > If as you describe side data has been eliminated then you could > > remove side data as a whole from the source code. > > > > If you cannot remove side data or a specific side data type from > > the source code then it has not been eliminated > > > > your change removes one way for an attacker to set side data but > > by the fact that you dont remove any of the side data types its > > clear you are aware of that every is still in use in some code path. > > > > a attacker may need to use a specific container format to set a > > specific side data type or may depend on a specific demuxer lib or > > application that allows him to set a side data type. > > > > now if you remove every way to set side data for an attacker then > > you can remove that side data type as a whole from the code. > > Of course that removes whatever the side data is for. > > > > Let me provide a specific example > > If a container suports changing extradata mid stream it will either > > be support or not. > > if any demuxer supports it then you have not eliminated the possiblity > > for an attacker > > > > I hope writing a elaborate reply will not lead to this discussion > > to shift onto some unrelated detail > > You are rehashing a lot of obvious facts, but you do not address the > important questions. > > Side data is useful. It is a badly designed API, because it adds a lot > of complexity for a hypothetical benefit but fails to reap that benefit. > As such, it should be kept but enhanced, either by removing the > complexity or by actually reaping the benefit. > > But it is not the topic of this discussion. > > MERGED side data is a completely brainded design that should never have > been written. Iam not sure its on topic, but it gave FFmpeg an advantage over our competition at the time when this code was added. > > I am purposefully not looking at the archive to find out who is > responsible for this mess. Now is not the time to point fingers but to > fix the code. > > Now, please answer this very specific question: > > If someone were to REMOVE ALL AND EVERY use of > av_packet_merge_side_data() and av_packet_split_side_data(), what would > be the actual bad consequences? * All applications using libavformat and libavcodec would need to preserve the side data between the layers, they currently do not have to. Thats also a API/ABI change requiring a major bump * Any testcases that used this to inject side data will no longer function as regression tests * It reduces (but does not eliminate) the ways by which a fuzzer can inject side data. > > But before you start with fuzzing or anything similar, let me stop you: > fuzzing exposes bugs that can be triggered by crafted inputs. > If fuzzing > can not trigger it, that means the bug does not exist, period. This may be formally correct if you have infinite time but we dont. A specific example would be a raw h264 file used as input to the fuzzer with the split code the fuzzer can inject side data by adding a 8 byte value, fuzzers actually manage to do this without split code the fuzzer can still inject side data but it needs to build the right container around the h264 frames, like a matroska file that supports the specific type of side data. This is much harder for a fuzzer to do So the fuzzer then could only do that with cases that are mkv files to begin with which reduces the attack surface probed by the fuzzer [...]
diff --git a/libavformat/utils.c b/libavformat/utils.c index 37d7024465..68f3c977d6 100644 --- a/libavformat/utils.c +++ b/libavformat/utils.c @@ -840,6 +840,15 @@ int ff_read_packet(AVFormatContext *s, AVPacket *pkt) *pkt = tmp; } + if (strcmp(s->iformat->name, "concat") && strcmp(s->iformat->name, "hls,applehttp") + && av_packet_split_side_data(pkt) == 1) { + av_log(s, AV_LOG_ERROR, + "FFmpeg-style merged side data found in raw packet. " + "The packet is rejected for security reasons.\n"); + av_packet_unref(pkt); + return AVERROR_INVALIDDATA; + } + if ((s->flags & AVFMT_FLAG_DISCARD_CORRUPT) && (pkt->flags & AV_PKT_FLAG_CORRUPT)) { av_log(s, AV_LOG_WARNING,