diff mbox series

[FFmpeg-devel] avformat/hls: fail on probing non hls/m3u8 file extensions

Message ID 20230503123038.13030-1-michael@niedermayer.cc
State New
Headers show
Series [FFmpeg-devel] avformat/hls: fail on probing non hls/m3u8 file extensions | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Michael Niedermayer May 3, 2023, 12:30 p.m. UTC
Its unexpected that a .avi or other "standard" file turns into a playlist.
The goal of this patch is to avoid this unexpected behavior and possible
privacy or security differences.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 libavformat/hls.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Steven Liu May 3, 2023, 11:23 p.m. UTC | #1
Michael Niedermayer <michael@niedermayer.cc> 于2023年5月3日周三 20:30写道:
>
> Its unexpected that a .avi or other "standard" file turns into a playlist.
> The goal of this patch is to avoid this unexpected behavior and possible
> privacy or security differences.
>
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavformat/hls.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/hls.c b/libavformat/hls.c
> index 8a96a37ff9..826135016d 100644
> --- a/libavformat/hls.c
> +++ b/libavformat/hls.c
> @@ -2530,10 +2530,18 @@ static int hls_probe(const AVProbeData *p)
>      if (strncmp(p->buf, "#EXTM3U", 7))
>          return 0;
>
> +
Empty line,
>      if (strstr(p->buf, "#EXT-X-STREAM-INF:")     ||
>          strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
> -        strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:"))
> +        strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
> +
> +        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
> +            av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
> +            return 0;
> +        }
> +
>          return AVPROBE_SCORE_MAX;
> +    }
>      return 0;
>  }
>
> --
> 2.17.1
>
> _______________________________________________

LGTM

Thanks
Steven
Michael Niedermayer May 5, 2023, 8:40 p.m. UTC | #2
On Thu, May 04, 2023 at 07:23:02AM +0800, Steven Liu wrote:
> Michael Niedermayer <michael@niedermayer.cc> 于2023年5月3日周三 20:30写道:
> >
> > Its unexpected that a .avi or other "standard" file turns into a playlist.
> > The goal of this patch is to avoid this unexpected behavior and possible
> > privacy or security differences.
> >
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> >  libavformat/hls.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/libavformat/hls.c b/libavformat/hls.c
> > index 8a96a37ff9..826135016d 100644
> > --- a/libavformat/hls.c
> > +++ b/libavformat/hls.c
> > @@ -2530,10 +2530,18 @@ static int hls_probe(const AVProbeData *p)
> >      if (strncmp(p->buf, "#EXTM3U", 7))
> >          return 0;
> >
> > +
> Empty line,

removed


> >      if (strstr(p->buf, "#EXT-X-STREAM-INF:")     ||
> >          strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
> > -        strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:"))
> > +        strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
> > +
> > +        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
> > +            av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
> > +            return 0;
> > +        }
> > +
> >          return AVPROBE_SCORE_MAX;
> > +    }
> >      return 0;
> >  }
> >
> > --
> > 2.17.1
> >
> > _______________________________________________
> 
> LGTM

will apply

thx

[...]
Anton Khirnov May 7, 2023, 8:42 p.m. UTC | #3
Quoting Michael Niedermayer (2023-05-03 14:30:38)
> Its unexpected that a .avi or other "standard" file turns into a playlist.
> The goal of this patch is to avoid this unexpected behavior and possible
> privacy or security differences.
> 

I very much dislike this approach.
Michael Niedermayer May 8, 2023, 11:25 p.m. UTC | #4
On Sun, May 07, 2023 at 10:42:56PM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2023-05-03 14:30:38)
> > Its unexpected that a .avi or other "standard" file turns into a playlist.
> > The goal of this patch is to avoid this unexpected behavior and possible
> > privacy or security differences.
> > 
> 
> I very much dislike this approach.

What else do you suggest ?

We could have a configuration option that specifies one
or more trusted directories/files/urls. And everything outside would be
limited to selfcontained files
The average user can put * as trusted url if thats what they want
While someone who cares about their privacy and security could restrict 
that with little effort to the place where they store their music and
videos which they know is ok. While OTOH still throwing random URLs
at ffmpeg without expecting overly unexpected results (not considering bugs)
Thats similar to how some office software can handle macros.

Or do you have some other suggestion ?

thx

[...]
Anton Khirnov May 9, 2023, 6:15 a.m. UTC | #5
Quoting Michael Niedermayer (2023-05-09 01:25:00)
> On Sun, May 07, 2023 at 10:42:56PM +0200, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2023-05-03 14:30:38)
> > > Its unexpected that a .avi or other "standard" file turns into a playlist.
> > > The goal of this patch is to avoid this unexpected behavior and possible
> > > privacy or security differences.
> > > 
> > 
> > I very much dislike this approach.
> 
> What else do you suggest ?
> 
> We could have a configuration option that specifies one
> or more trusted directories/files/urls. And everything outside would be
> limited to selfcontained files
> The average user can put * as trusted url if thats what they want
> While someone who cares about their privacy and security could restrict 
> that with little effort to the place where they store their music and
> videos which they know is ok. While OTOH still throwing random URLs
> at ffmpeg without expecting overly unexpected results (not considering bugs)
> Thats similar to how some office software can handle macros.
> 
> Or do you have some other suggestion ?
> 

I don't see what actual problem is this supposed to address. The commit
message mentions some vague "possible issues", but
* this seems like the wrong layer for this kind of policy decisions
* I think there needs to be a clearly defined thread model before we can
  discuss what the right solution is
Michael Niedermayer May 9, 2023, 3:41 p.m. UTC | #6
On Tue, May 09, 2023 at 08:15:34AM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2023-05-09 01:25:00)
> > On Sun, May 07, 2023 at 10:42:56PM +0200, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2023-05-03 14:30:38)
> > > > Its unexpected that a .avi or other "standard" file turns into a playlist.
> > > > The goal of this patch is to avoid this unexpected behavior and possible
> > > > privacy or security differences.
> > > > 
> > > 
> > > I very much dislike this approach.
> > 
> > What else do you suggest ?
> > 
> > We could have a configuration option that specifies one
> > or more trusted directories/files/urls. And everything outside would be
> > limited to selfcontained files
> > The average user can put * as trusted url if thats what they want
> > While someone who cares about their privacy and security could restrict 
> > that with little effort to the place where they store their music and
> > videos which they know is ok. While OTOH still throwing random URLs
> > at ffmpeg without expecting overly unexpected results (not considering bugs)
> > Thats similar to how some office software can handle macros.
> > 
> > Or do you have some other suggestion ?
> > 
> 
> I don't see what actual problem is this supposed to address. The commit
> message mentions some vague "possible issues", but
> * this seems like the wrong layer for this kind of policy decisions
> * I think there needs to be a clearly defined thread model before we can
>   discuss what the right solution is

Its not one threat but many. And i already mentioned some but we can pick
one hypothetical example for sake of discussion and detail it more.
(being more detailed this is sadly a longer mail)

Lets assume an attacker wants to infiltrate a specific company
part of that will require gaining access to the companies internal 
network. So the attacker starts chating with some employee, the
medium of communication is irrelevant here.
The attacker cannot just ask the employee to run his network scanner
the employee would realize whats going on and report it (or should at least)
Maybe the attacker asks the emplyee to look at that really funny weblog
of his, but the emplyee might be trained not to open random links in
his webbrowser or the webbrowser might be so locked down that it
cant access the companies internal net.
So our attacker sends him a link to 
companyX_boss_and_secretary_sex_hd.avi
So he tries watching that because he now isnt sure if the guy meant that
was about his boss or someone else, after all he knows its safe because he watched
so much porn on so many shady sites with his video player and never got
hacked from it.
in reality that avi file is not a avi, its maybe some playlist allowing
arbitrary URLs.
While the employee watches some boss and secretary the bits any pieces betweem
the scenes scan the companies internal network and the external http 
accesses which go back to the attacker, allow him to get all the information of
it.

The problem is that playing some files allows things like scaning
the network and passing this information on to an attacker.
I dont have a real full testcase so maybe theres something iam missing
why this wouldnt work ...

compounding that our probing code is really good and detects such attacks
and makes them work even if the file is named incorrectly and has the wrong
mime type.
This makes such an attack easier to do and harder to detect even for a smart
user

ideal would be if the whole attack wouldnt work at all and network realms
would be clearly seperated and nothing touching the companies private net
or localnet could touch the public internet. 
This is not compatible with a playlist randomly mixing local, priavte and
public content.
If its just my own machiene, i have no such playlist and see no use in one
for me. But some people want that i think. 
They maybe arent the employee who as part of his job has to protect the
company and not play random playlists. 
Also they maybe arent the privacy or security concious users.

Having default same origin policy and a list of files/directories/urls that are
excempt from it would fix this in the example above.

The companies employees would not override default security settings to
watch a odd file. Or they could loose their job
A user caring about security / privacy could setup the settings
carefully so safe playlists work and other things are secure
and A user not caring could just disable a same origin check, and
this may be ok for her

thx

[...]
Leo Izen May 10, 2023, 4:24 p.m. UTC | #7
On 5/9/23 11:41, Michael Niedermayer wrote:
> On Tue, May 09, 2023 at 08:15:34AM +0200, Anton Khirnov wrote:
>> Quoting Michael Niedermayer (2023-05-09 01:25:00)
>>> On Sun, May 07, 2023 at 10:42:56PM +0200, Anton Khirnov wrote:
>>>> Quoting Michael Niedermayer (2023-05-03 14:30:38)
>>>>> Its unexpected that a .avi or other "standard" file turns into a playlist.
>>>>> The goal of this patch is to avoid this unexpected behavior and possible
>>>>> privacy or security differences.
>>>>>
>>>>
>>>> I very much dislike this approach.
>>>
>>> What else do you suggest ?
>>>
>>> We could have a configuration option that specifies one
>>> or more trusted directories/files/urls. And everything outside would be
>>> limited to selfcontained files
>>> The average user can put * as trusted url if thats what they want
>>> While someone who cares about their privacy and security could restrict
>>> that with little effort to the place where they store their music and
>>> videos which they know is ok. While OTOH still throwing random URLs
>>> at ffmpeg without expecting overly unexpected results (not considering bugs)
>>> Thats similar to how some office software can handle macros.
>>>
>>> Or do you have some other suggestion ?
>>>
>>
>> I don't see what actual problem is this supposed to address. The commit
>> message mentions some vague "possible issues", but
>> * this seems like the wrong layer for this kind of policy decisions
>> * I think there needs to be a clearly defined thread model before we can
>>    discuss what the right solution is
> 
> Its not one threat but many. And i already mentioned some but we can pick
> one hypothetical example for sake of discussion and detail it more.
> (being more detailed this is sadly a longer mail)
> 
> Lets assume an attacker wants to infiltrate a specific company
> part of that will require gaining access to the companies internal
> network. So the attacker starts chating with some employee, the
> medium of communication is irrelevant here.
> The attacker cannot just ask the employee to run his network scanner
> the employee would realize whats going on and report it (or should at least)
> Maybe the attacker asks the emplyee to look at that really funny weblog
> of his, but the emplyee might be trained not to open random links in
> his webbrowser or the webbrowser might be so locked down that it
> cant access the companies internal net.
> So our attacker sends him a link to
> companyX_boss_and_secretary_sex_hd.avi
> So he tries watching that because he now isnt sure if the guy meant that
> was about his boss or someone else, after all he knows its safe because he watched
> so much porn on so many shady sites with his video player and never got
> hacked from it.
> in reality that avi file is not a avi, its maybe some playlist allowing
> arbitrary URLs.
> While the employee watches some boss and secretary the bits any pieces betweem
> the scenes scan the companies internal network and the external http
> accesses which go back to the attacker, allow him to get all the information of
> it.
> 
> The problem is that playing some files allows things like scaning
> the network and passing this information on to an attacker.
> I dont have a real full testcase so maybe theres something iam missing
> why this wouldnt work ...
> 
> compounding that our probing code is really good and detects such attacks
> and makes them work even if the file is named incorrectly and has the wrong
> mime type.
> This makes such an attack easier to do and harder to detect even for a smart
> user
> 
> ideal would be if the whole attack wouldnt work at all and network realms
> would be clearly seperated and nothing touching the companies private net
> or localnet could touch the public internet.
> This is not compatible with a playlist randomly mixing local, priavte and
> public content.
> If its just my own machiene, i have no such playlist and see no use in one
> for me. But some people want that i think.
> They maybe arent the employee who as part of his job has to protect the
> company and not play random playlists.
> Also they maybe arent the privacy or security concious users.
> 
> Having default same origin policy and a list of files/directories/urls that are
> excempt from it would fix this in the example above.
> 
> The companies employees would not override default security settings to
> watch a odd file. Or they could loose their job
> A user caring about security / privacy could setup the settings
> carefully so safe playlists work and other things are secure
> and A user not caring could just disable a same origin check, and
> this may be ok for her
> 

One potential issue is that if an open TLS HTTP/2 socket already exists 
with the server, then the URL isn't being probed, but rather the 
response from a specific GET request.

This issue arose in mpv, because mpv opens an HTTP socket with the 
server and keeps it alive when running its own probe mechanics, before 
falling back to lavf's probe. In the event of an already open socket the 
URL isn't a valid filename, so mpv sends the empty string. The empty 
string doesn't have the .m3u8 extension, so it fails the validation check.

In every other instance of lavf probes, it bases it on content as higher 
priority than filename. Rename a .avi to .mp4 and it will still properly 
detect it as .avi, so this seems to be inconsistent with the existing 
behavior. IMO we should check file contents and MIME type sent by the 
server rather than just the file extension. If it starts with #EXTM3U 
and the other headers etc. and has an appropriate Content-type header, 
that should be sufficient.

In either case, it's unclear to me that, if we keep the check in, we 
should permit empty string as an exception to the check because of the 
HTTP socket issue, or if we should require clients to pass "foo.m3u8" or 
something to work around the problem.

- Leo Izen (Traneptora / thebombzen)
diff mbox series

Patch

diff --git a/libavformat/hls.c b/libavformat/hls.c
index 8a96a37ff9..826135016d 100644
--- a/libavformat/hls.c
+++ b/libavformat/hls.c
@@ -2530,10 +2530,18 @@  static int hls_probe(const AVProbeData *p)
     if (strncmp(p->buf, "#EXTM3U", 7))
         return 0;
 
+
     if (strstr(p->buf, "#EXT-X-STREAM-INF:")     ||
         strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
-        strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:"))
+        strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
+
+        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
+            av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
+            return 0;
+        }
+
         return AVPROBE_SCORE_MAX;
+    }
     return 0;
 }