diff mbox

[FFmpeg-devel] lavf/swfdec: Reduce score when auto-detecting swf files

Message ID CAB0OVGquzQmNau5nOfw+3ZxndW=Rv_0_FSmsC5_bUo-mxd5jsw@mail.gmail.com
State Accepted
Headers show

Commit Message

Carl Eugen Hoyos Jan. 18, 2018, 8:28 p.m. UTC
Hi!

The probe score for swf files (with uncompressed headers) is currently
very high after testing a little more than 24bit, attached patch
reduces the score.

Please comment, Carl Eugen

Comments

Tomas Härdin Jan. 19, 2018, 3:21 p.m. UTC | #1
On 2018-01-18 21:28, Carl Eugen Hoyos wrote:
> Hi!
>
> The probe score for swf files (with uncompressed headers) is currently
> very high after testing a little more than 24bit, attached patch
> reduces the score.
>
> Please comment, Carl Eugen

Would be consistent with the discussion about .c2 files. Could we 
document this in avformat.h? Something like:

     // Score should be no more than AVPROBE_SCORE_MAX * identifying_bits/64
     // if the number of identifying "magic" bits are less than 64

/Tomas
Michael Niedermayer Jan. 19, 2018, 5:33 p.m. UTC | #2
On Fri, Jan 19, 2018 at 04:21:54PM +0100, Tomas Härdin wrote:
> On 2018-01-18 21:28, Carl Eugen Hoyos wrote:
> >Hi!
> >
> >The probe score for swf files (with uncompressed headers) is currently
> >very high after testing a little more than 24bit, attached patch
> >reduces the score.
> >
> >Please comment, Carl Eugen
> 
> Would be consistent with the discussion about .c2 files. Could we document
> this in avformat.h? Something like:
> 

>     // Score should be no more than AVPROBE_SCORE_MAX * identifying_bits/64
>     // if the number of identifying "magic" bits are less than 64

this is a bit oversimplifying it i think
its more a question of "entropy" than bits
24bits tested for at one location is much stronger than 24 bits found at any
location in 20mb. (later is in fact close to telling us nothing)

If we want to define a limit then one way we might define it somehow based
on how often a large random input would cause probe to trigger with that or a 
higher score.
(this is also testable)

Care has to be taken that this works with existing probe functions.
It would be bad if a rule is added that when followed breaks existing
probe code or doesnt interoperate well with it


[...]
Michael Niedermayer Jan. 19, 2018, 5:51 p.m. UTC | #3
On Thu, Jan 18, 2018 at 09:28:40PM +0100, Carl Eugen Hoyos wrote:
> Hi!
> 
> The probe score for swf files (with uncompressed headers) is currently
> very high after testing a little more than 24bit, attached patch
> reduces the score.

hmm
the first 24 bits are tested and all but 2 values are rejected

thats 23 bits that must match

then 4 values are tested which can be from 1 to 31 bits each
the tests are each either == 0 or != 0 so they are a bit weak
but its at least adding 4 bits that must match and also len
itself has to be smallish so we could argue that this gets us to
about 28 bits
and then buf[3] is checked to be < 20
at this point we need about 32bits to be matching, still not huge but
id think this is stronger than what file extensions prove which are
MAX/2
The test is weakened by using 0 / not 0 / ascii as values though

Have you seen an actual probe failure ?

[...]
Carl Eugen Hoyos Jan. 19, 2018, 6:22 p.m. UTC | #4
2018-01-19 18:33 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> On Fri, Jan 19, 2018 at 04:21:54PM +0100, Tomas Härdin wrote:
>> On 2018-01-18 21:28, Carl Eugen Hoyos wrote:
>> >Hi!
>> >
>> >The probe score for swf files (with uncompressed headers) is currently
>> >very high after testing a little more than 24bit, attached patch
>> >reduces the score.
>> >
>> >Please comment, Carl Eugen
>>
>> Would be consistent with the discussion about .c2 files. Could we document
>> this in avformat.h? Something like:
>>
>
>>     // Score should be no more than AVPROBE_SCORE_MAX * identifying_bits/64
>>     // if the number of identifying "magic" bits are less than 64
>
> this is a bit oversimplifying it i think

I don't disagree.

> its more a question of "entropy" than bits
> 24bits tested for at one location is much stronger than 24 bits found at any
> location in 20mb. (later is in fact close to telling us nothing)

Of course.

I believe that for ~64 initial bits, we should return MAX - 1.

But for many formats, "initial bits" is of no relevance.

Carl Eugen
Carl Eugen Hoyos Jan. 19, 2018, 6:25 p.m. UTC | #5
2018-01-19 18:51 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> On Thu, Jan 18, 2018 at 09:28:40PM +0100, Carl Eugen Hoyos wrote:
>> Hi!
>>
>> The probe score for swf files (with uncompressed headers) is currently
>> very high after testing a little more than 24bit, attached patch
>> reduces the score.
>
> hmm
> the first 24 bits are tested and all but 2 values are rejected
>
> thats 23 bits that must match

Ok.

> then 4 values are tested which can be from 1 to 31 bits each
> the tests are each either == 0 or != 0 so they are a bit weak
> but its at least adding 4 bits that must match and also len

I was sure this is not equivalent to four bits on a specific
position (but for the patch I assumed a "best case scenario"
where this actually were the case).

> itself has to be smallish so we could argue that this gets us to
> about 28 bits

> and then buf[3] is checked to be < 20
> at this point we need about 32bits to be matching, still not huge but
> id think this is stronger than what file extensions prove which are
> MAX/2

We return MAX/2 for many "initial 32 bits" tests and I believe
we made good experience (and for MAX/2 the extension is
ignored or do I misremember?)

> The test is weakened by using 0 / not 0 / ascii as values though
>
> Have you seen an actual probe failure ?

No, I was looking at another issue.

Carl Eugen
Michael Niedermayer Jan. 19, 2018, 10:42 p.m. UTC | #6
On Fri, Jan 19, 2018 at 07:25:43PM +0100, Carl Eugen Hoyos wrote:
> 2018-01-19 18:51 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> > On Thu, Jan 18, 2018 at 09:28:40PM +0100, Carl Eugen Hoyos wrote:
> >> Hi!
> >>
> >> The probe score for swf files (with uncompressed headers) is currently
> >> very high after testing a little more than 24bit, attached patch
> >> reduces the score.
> >
> > hmm
> > the first 24 bits are tested and all but 2 values are rejected
> >
> > thats 23 bits that must match
> 
> Ok.
> 
> > then 4 values are tested which can be from 1 to 31 bits each
> > the tests are each either == 0 or != 0 so they are a bit weak
> > but its at least adding 4 bits that must match and also len
> 
> I was sure this is not equivalent to four bits on a specific
> position (but for the patch I assumed a "best case scenario"
> where this actually were the case).
> 
> > itself has to be smallish so we could argue that this gets us to
> > about 28 bits
> 
> > and then buf[3] is checked to be < 20
> > at this point we need about 32bits to be matching, still not huge but
> > id think this is stronger than what file extensions prove which are
> > MAX/2
> 
> We return MAX/2 for many "initial 32 bits" tests and I believe
> we made good experience (and for MAX/2 the extension is
> ignored or do I misremember?)
> 

> > The test is weakened by using 0 / not 0 / ascii as values though
> >
> > Have you seen an actual probe failure ?
> 
> No, I was looking at another issue.

we have many files with wrong extensions, its not uncommon

so if we have no example of this failing then it is clearly
a stronger check than file extension checking which is at score 50
so the score should be higher

[...]
Carl Eugen Hoyos Jan. 20, 2018, 1:36 a.m. UTC | #7
2018-01-19 23:42 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> On Fri, Jan 19, 2018 at 07:25:43PM +0100, Carl Eugen Hoyos wrote:
>> 2018-01-19 18:51 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
>> > On Thu, Jan 18, 2018 at 09:28:40PM +0100, Carl Eugen Hoyos wrote:
>> >> Hi!
>> >>
>> >> The probe score for swf files (with uncompressed headers) is currently
>> >> very high after testing a little more than 24bit, attached patch
>> >> reduces the score.
>> >
>> > hmm
>> > the first 24 bits are tested and all but 2 values are rejected
>> >
>> > thats 23 bits that must match
>>
>> Ok.
>>
>> > then 4 values are tested which can be from 1 to 31 bits each
>> > the tests are each either == 0 or != 0 so they are a bit weak
>> > but its at least adding 4 bits that must match and also len
>>
>> I was sure this is not equivalent to four bits on a specific
>> position (but for the patch I assumed a "best case scenario"
>> where this actually were the case).
>>
>> > itself has to be smallish so we could argue that this gets us to
>> > about 28 bits
>>
>> > and then buf[3] is checked to be < 20
>> > at this point we need about 32bits to be matching, still not huge but
>> > id think this is stronger than what file extensions prove which are
>> > MAX/2
>>
>> We return MAX/2 for many "initial 32 bits" tests and I believe
>> we made good experience (and for MAX/2 the extension is
>> ignored or do I misremember?)
>>
>
>> > The test is weakened by using 0 / not 0 / ascii as values though
>> >
>> > Have you seen an actual probe failure ?
>>
>> No, I was looking at another issue.
>
> we have many files with wrong extensions, its not uncommon

Yes, I misremembered how extensions are rated, I actually
wanted to set the return value to "AVPROBE_SCORE_EXTENSION + 1".
Would that be ok?

Carl Eugen
Michael Niedermayer Jan. 20, 2018, 2:14 a.m. UTC | #8
On Sat, Jan 20, 2018 at 02:36:08AM +0100, Carl Eugen Hoyos wrote:
> 2018-01-19 23:42 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> > On Fri, Jan 19, 2018 at 07:25:43PM +0100, Carl Eugen Hoyos wrote:
> >> 2018-01-19 18:51 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> >> > On Thu, Jan 18, 2018 at 09:28:40PM +0100, Carl Eugen Hoyos wrote:
> >> >> Hi!
> >> >>
> >> >> The probe score for swf files (with uncompressed headers) is currently
> >> >> very high after testing a little more than 24bit, attached patch
> >> >> reduces the score.
> >> >
> >> > hmm
> >> > the first 24 bits are tested and all but 2 values are rejected
> >> >
> >> > thats 23 bits that must match
> >>
> >> Ok.
> >>
> >> > then 4 values are tested which can be from 1 to 31 bits each
> >> > the tests are each either == 0 or != 0 so they are a bit weak
> >> > but its at least adding 4 bits that must match and also len
> >>
> >> I was sure this is not equivalent to four bits on a specific
> >> position (but for the patch I assumed a "best case scenario"
> >> where this actually were the case).
> >>
> >> > itself has to be smallish so we could argue that this gets us to
> >> > about 28 bits
> >>
> >> > and then buf[3] is checked to be < 20
> >> > at this point we need about 32bits to be matching, still not huge but
> >> > id think this is stronger than what file extensions prove which are
> >> > MAX/2
> >>
> >> We return MAX/2 for many "initial 32 bits" tests and I believe
> >> we made good experience (and for MAX/2 the extension is
> >> ignored or do I misremember?)
> >>
> >
> >> > The test is weakened by using 0 / not 0 / ascii as values though
> >> >
> >> > Have you seen an actual probe failure ?
> >>
> >> No, I was looking at another issue.
> >
> > we have many files with wrong extensions, its not uncommon
> 
> Yes, I misremembered how extensions are rated, I actually
> wanted to set the return value to "AVPROBE_SCORE_EXTENSION + 1".
> Would that be ok?

yes

thx

[...]
Carl Eugen Hoyos Jan. 20, 2018, 2:42 p.m. UTC | #9
2018-01-20 3:14 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
> On Sat, Jan 20, 2018 at 02:36:08AM +0100, Carl Eugen Hoyos wrote:
>> 2018-01-19 23:42 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
>> > On Fri, Jan 19, 2018 at 07:25:43PM +0100, Carl Eugen Hoyos wrote:
>> >> 2018-01-19 18:51 GMT+01:00 Michael Niedermayer <michael@niedermayer.cc>:
>> >> > On Thu, Jan 18, 2018 at 09:28:40PM +0100, Carl Eugen Hoyos wrote:
>> >> >> Hi!
>> >> >>
>> >> >> The probe score for swf files (with uncompressed headers) is currently
>> >> >> very high after testing a little more than 24bit, attached patch
>> >> >> reduces the score.
>> >> >
>> >> > hmm
>> >> > the first 24 bits are tested and all but 2 values are rejected
>> >> >
>> >> > thats 23 bits that must match
>> >>
>> >> Ok.
>> >>
>> >> > then 4 values are tested which can be from 1 to 31 bits each
>> >> > the tests are each either == 0 or != 0 so they are a bit weak
>> >> > but its at least adding 4 bits that must match and also len
>> >>
>> >> I was sure this is not equivalent to four bits on a specific
>> >> position (but for the patch I assumed a "best case scenario"
>> >> where this actually were the case).
>> >>
>> >> > itself has to be smallish so we could argue that this gets us to
>> >> > about 28 bits
>> >>
>> >> > and then buf[3] is checked to be < 20
>> >> > at this point we need about 32bits to be matching, still not huge but
>> >> > id think this is stronger than what file extensions prove which are
>> >> > MAX/2
>> >>
>> >> We return MAX/2 for many "initial 32 bits" tests and I believe
>> >> we made good experience (and for MAX/2 the extension is
>> >> ignored or do I misremember?)
>> >>
>> >
>> >> > The test is weakened by using 0 / not 0 / ascii as values though
>> >> >
>> >> > Have you seen an actual probe failure ?
>> >>
>> >> No, I was looking at another issue.
>> >
>> > we have many files with wrong extensions, its not uncommon
>>
>> Yes, I misremembered how extensions are rated, I actually
>> wanted to set the return value to "AVPROBE_SCORE_EXTENSION + 1".
>> Would that be ok?
>
> yes

Pushed with this new value.

Thank you, Carl Eugen
diff mbox

Patch

From 561cb5cea0ead726c747edea7d1c3e8c768eac81 Mon Sep 17 00:00:00 2001
From: Carl Eugen Hoyos <ceffmpeg@gmail.com>
Date: Thu, 18 Jan 2018 21:25:49 +0100
Subject: [PATCH] lavf/swfdec: Reduce score when auto-detecting swf files.

Only a little more than 24bit are tested.
---
 libavformat/swfdec.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/swfdec.c b/libavformat/swfdec.c
index 57b619f..da58755 100644
--- a/libavformat/swfdec.c
+++ b/libavformat/swfdec.c
@@ -95,7 +95,7 @@  static int swf_probe(AVProbeData *p)
     if (p->buf[3] >= 20 || xmax < 16 || ymax < 16)
         return AVPROBE_SCORE_MAX / 4;
 
-    return AVPROBE_SCORE_MAX;
+    return AVPROBE_SCORE_MAX / 2;
 }
 
 #if CONFIG_ZLIB
-- 
1.7.10.4