diff mbox series

[FFmpeg-devel] avformat/hls: look for trailing GET headers with m3u8 extension check

Message ID 20230512202622.29531-1-leo.izen@gmail.com
State New
Headers show
Series [FFmpeg-devel] avformat/hls: look for trailing GET headers with m3u8 extension check | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished

Commit Message

Leo Izen May 12, 2023, 8:26 p.m. UTC
After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
file extension check. This commit strips the ?foo=bar at the end before
checking the file extension.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
---
 libavformat/hls.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Michael Niedermayer May 13, 2023, 2:54 p.m. UTC | #1
On Fri, May 12, 2023 at 04:26:22PM -0400, Leo Izen wrote:
> After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
> URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
> file extension check. This commit strips the ?foo=bar at the end before
> checking the file extension.
> 
> Signed-off-by: Leo Izen <leo.izen@gmail.com>
> ---
>  libavformat/hls.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/libavformat/hls.c b/libavformat/hls.c
> index 11e345b280..6a97cced17 100644
> --- a/libavformat/hls.c
> +++ b/libavformat/hls.c
> @@ -2534,7 +2534,16 @@ static int hls_probe(const AVProbeData *p)
>          strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
>          strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
>  
> -        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
> +        char *request_qmark = strchr(p->filename, '?');
> +        int match_ext;
> +
> +        if (request_qmark)
> +            *request_qmark = '\0';
> +        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
> +        if (request_qmark)
> +            *request_qmark = '?';
> +
> +        if (!match_ext) {
>              av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
>              return 0;
>          }

the av_match_ext here matches the probe code
all should be fixed. Also differences between local files and urls should
be considered in extension extraction

thx

[...]
Leo Izen May 13, 2023, 5:06 p.m. UTC | #2
On 5/13/23 10:54, Michael Niedermayer wrote:
> On Fri, May 12, 2023 at 04:26:22PM -0400, Leo Izen wrote:
>> After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
>> URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
>> file extension check. This commit strips the ?foo=bar at the end before
>> checking the file extension.
>>
>> Signed-off-by: Leo Izen <leo.izen@gmail.com>
>> ---
>>   libavformat/hls.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavformat/hls.c b/libavformat/hls.c
>> index 11e345b280..6a97cced17 100644
>> --- a/libavformat/hls.c
>> +++ b/libavformat/hls.c
>> @@ -2534,7 +2534,16 @@ static int hls_probe(const AVProbeData *p)
>>           strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
>>           strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
>>   
>> -        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
>> +        char *request_qmark = strchr(p->filename, '?');
>> +        int match_ext;
>> +
>> +        if (request_qmark)
>> +            *request_qmark = '\0';
>> +        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
>> +        if (request_qmark)
>> +            *request_qmark = '?';
>> +
>> +        if (!match_ext) {
>>               av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
>>               return 0;
>>           }
> 
> the av_match_ext here matches the probe code
> all should be fixed. Also differences between local files and urls should
> be considered in extension extraction

If you're requiring that we check that a file is local before stripping 
tailing request headers, how would you check if a file is local? having 
a scheme:// is not sufficient to make that check, as file:// is a valid 
scheme. You could check for https?:// I suppose, but the spec doesn't 
actually require that HTTP be used (section 2):

    Data SHOULD be carried over HTTP [RFC7230], but,
    in general, a URI can specify any protocol that can reliably transfer
    the specified resource on demand.

Do note that your original patch is not spec-compliant. RFC 8216 section 
4 says the following:

    Each Playlist file MUST be identifiable either by the path component
    of its URI or by HTTP Content-Type.  In the first case, the path MUST
    end with either .m3u8 or .m3u.  In the second, the HTTP Content-Type
    MUST be "application/vnd.apple.mpegurl" or "audio/mpegurl".  Clients
    SHOULD refuse to parse Playlists that are not so identified.


This implies that (1) .hls is not a valid extension if that is being 
used, and (2) a valid HLS mimetype in a content-type header is 
sufficient to mark a file as HLS regardless of the extension used.

So the extension check should only be done for local files or files that 
don't serve an appropriate content-type header, should you wish to do 
the extension check.

However, HTTP streams are always safe (mpv marks them as such, for 
example) so the original patch is still a bit silly.

- Leo Izen (Traneptora / thebombzen)
Andreas Rheinhardt May 14, 2023, 9:31 a.m. UTC | #3
Leo Izen:
> After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
> URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
> file extension check. This commit strips the ?foo=bar at the end before
> checking the file extension.
> 
> Signed-off-by: Leo Izen <leo.izen@gmail.com>
> ---
>  libavformat/hls.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/libavformat/hls.c b/libavformat/hls.c
> index 11e345b280..6a97cced17 100644
> --- a/libavformat/hls.c
> +++ b/libavformat/hls.c
> @@ -2534,7 +2534,16 @@ static int hls_probe(const AVProbeData *p)
>          strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
>          strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
>  
> -        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
> +        char *request_qmark = strchr(p->filename, '?');
> +        int match_ext;
> +
> +        if (request_qmark)
> +            *request_qmark = '\0';
> +        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
> +        if (request_qmark)
> +            *request_qmark = '?';
> +
> +        if (!match_ext) {
>              av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
>              return 0;
>          }

This temporarily modifies p->filename which is a const char* (you let
strchr cast the const away); it is provided by the user and may point to
read-only memory, i.e. restoring the string is not safe. Furthermore, it
may lead to data races, because the string might be used somewhere else
concurrently (hypothetically, we could even run the probe functions in a
multi-threaded way).

- Andreas
Leo Izen May 14, 2023, 12:03 p.m. UTC | #4
On 5/14/23 05:31, Andreas Rheinhardt wrote:
> Leo Izen:
>> After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
>> URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
>> file extension check. This commit strips the ?foo=bar at the end before
>> checking the file extension.
>>
>> Signed-off-by: Leo Izen <leo.izen@gmail.com>
>> ---
>>   libavformat/hls.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavformat/hls.c b/libavformat/hls.c
>> index 11e345b280..6a97cced17 100644
>> --- a/libavformat/hls.c
>> +++ b/libavformat/hls.c
>> @@ -2534,7 +2534,16 @@ static int hls_probe(const AVProbeData *p)
>>           strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
>>           strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
>>   
>> -        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
>> +        char *request_qmark = strchr(p->filename, '?');
>> +        int match_ext;
>> +
>> +        if (request_qmark)
>> +            *request_qmark = '\0';
>> +        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
>> +        if (request_qmark)
>> +            *request_qmark = '?';
>> +
>> +        if (!match_ext) {
>>               av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
>>               return 0;
>>           }
> 
> This temporarily modifies p->filename which is a const char* (you let
> strchr cast the const away); it is provided by the user and may point to
> read-only memory, i.e. restoring the string is not safe. Furthermore, it
> may lead to data races, because the string might be used somewhere else
> concurrently (hypothetically, we could even run the probe functions in a
> multi-threaded way).
> 
> - Andreas

Would you recommend I strdup instead? I considered that but wanted to 
avoid the heap allocation.

- Leo Izen
Michael Niedermayer May 14, 2023, 8:43 p.m. UTC | #5
On Sat, May 13, 2023 at 01:06:25PM -0400, Leo Izen wrote:
> 
> 
> On 5/13/23 10:54, Michael Niedermayer wrote:
> > On Fri, May 12, 2023 at 04:26:22PM -0400, Leo Izen wrote:
> > > After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
> > > URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
> > > file extension check. This commit strips the ?foo=bar at the end before
> > > checking the file extension.
> > > 
> > > Signed-off-by: Leo Izen <leo.izen@gmail.com>
> > > ---
> > >   libavformat/hls.c | 11 ++++++++++-
> > >   1 file changed, 10 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/libavformat/hls.c b/libavformat/hls.c
> > > index 11e345b280..6a97cced17 100644
> > > --- a/libavformat/hls.c
> > > +++ b/libavformat/hls.c
> > > @@ -2534,7 +2534,16 @@ static int hls_probe(const AVProbeData *p)
> > >           strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
> > >           strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
> > > -        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
> > > +        char *request_qmark = strchr(p->filename, '?');
> > > +        int match_ext;
> > > +
> > > +        if (request_qmark)
> > > +            *request_qmark = '\0';
> > > +        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
> > > +        if (request_qmark)
> > > +            *request_qmark = '?';
> > > +
> > > +        if (!match_ext) {
> > >               av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
> > >               return 0;
> > >           }
> > 
> > the av_match_ext here matches the probe code
> > all should be fixed. Also differences between local files and urls should
> > be considered in extension extraction
> 
> If you're requiring 

That way you word this is a little odd


> that we check that a file is local before stripping
> tailing request headers, how would you check if a file is local? having a
> scheme:// is not sufficient to make that check, as file:// is a valid
> scheme. You could check for https?:// I suppose, but the spec doesn't
> actually require that HTTP be used (section 2):
> 
>    Data SHOULD be carried over HTTP [RFC7230], but,
>    in general, a URI can specify any protocol that can reliably transfer
>    the specified resource on demand.

ATM the extension handling across the codebase treats everything like filenames
not like URIs, "?" has no special meaning.
You add unconditional special meaning to "?" in one function ignoring
everything else. I dont think thats improving the overall extension handling

But lets consider:
file:///home/myname/myfile.m3u8?file.avi
/home/myname/myfile.m3u8?file.avi
http:/server/myfile.m3u8?file.avi

The first is odd, iam not sure what "?file.avi" is and i wonder if we
could simply reject this at file protocol level.
If its accepted, I think it would map to /home/myname/myfile.m3u8 on disk
not "/home/myname/myfile.m3u8?file.avi"
Thats also how my web browser seems to interpret a file:///... the ?foobar part seems
stripped

OTOH /home/myname/myfile.m3u8?file.avi is a avi file with avi extension
its oddly named but its valid

the 3rd is a m3u8 file/script or whatever with file.avi as a parameter


> 
> Do note that your original patch is not spec-compliant. RFC 8216 section 4
> says the following:
> 
>    Each Playlist file MUST be identifiable either by the path component
>    of its URI or by HTTP Content-Type.  In the first case, the path MUST
>    end with either .m3u8 or .m3u.  In the second, the HTTP Content-Type
>    MUST be "application/vnd.apple.mpegurl" or "audio/mpegurl".  Clients
>    SHOULD refuse to parse Playlists that are not so identified.

The MUST statements sound like a muxer/server side requirement.
the SHOULD would affect us and tells us to reject not to accept


> 
> 
> This implies that (1) .hls is not a valid extension if that is being used,
> and 

do you suggest we should not accept .hls files ?


> (2) a valid HLS mimetype in a content-type header is sufficient to mark
> a file as HLS regardless of the extension used.

There are at least 4 cases here
A extension is m3u8/m3u
B extension is a well known non hls type (txt,avi,mkv,...), mime type is *hls* , 
C extension is something else,                              mime type is *hls* , 
D extension is not m3u8/m3u,                                mime type is not *hls*

In case of A and C we should detect hls by default, thats needed so our code
works without annoying the user
In D we should not detect hls, this is the SHOULD in the RFC

The B case is a oddball, does this case exist in non malicious cases ?


This matter is touching quite a few seperate areas so its very possible iam
missing something

thx

[...]
Leo Izen May 14, 2023, 9:03 p.m. UTC | #6
On 5/14/23 16:43, Michael Niedermayer wrote:
> On Sat, May 13, 2023 at 01:06:25PM -0400, Leo Izen wrote:
>>
>>
>> On 5/13/23 10:54, Michael Niedermayer wrote:
>>> On Fri, May 12, 2023 at 04:26:22PM -0400, Leo Izen wrote:
>>>> After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
>>>> URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
>>>> file extension check. This commit strips the ?foo=bar at the end before
>>>> checking the file extension.
>>>>
>>>> Signed-off-by: Leo Izen <leo.izen@gmail.com>
>>>> ---
>>>>    libavformat/hls.c | 11 ++++++++++-
>>>>    1 file changed, 10 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/libavformat/hls.c b/libavformat/hls.c
>>>> index 11e345b280..6a97cced17 100644
>>>> --- a/libavformat/hls.c
>>>> +++ b/libavformat/hls.c
>>>> @@ -2534,7 +2534,16 @@ static int hls_probe(const AVProbeData *p)
>>>>            strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
>>>>            strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
>>>> -        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
>>>> +        char *request_qmark = strchr(p->filename, '?');
>>>> +        int match_ext;
>>>> +
>>>> +        if (request_qmark)
>>>> +            *request_qmark = '\0';
>>>> +        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
>>>> +        if (request_qmark)
>>>> +            *request_qmark = '?';
>>>> +
>>>> +        if (!match_ext) {
>>>>                av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
>>>>                return 0;
>>>>            }
>>>
>>> the av_match_ext here matches the probe code
>>> all should be fixed. Also differences between local files and urls should
>>> be considered in extension extraction
>>
>> If you're requiring
> 
> That way you word this is a little odd
> 
> 
>> that we check that a file is local before stripping
>> tailing request headers, how would you check if a file is local? having a
>> scheme:// is not sufficient to make that check, as file:// is a valid
>> scheme. You could check for https?:// I suppose, but the spec doesn't
>> actually require that HTTP be used (section 2):
>>
>>     Data SHOULD be carried over HTTP [RFC7230], but,
>>     in general, a URI can specify any protocol that can reliably transfer
>>     the specified resource on demand.
> 
> ATM the extension handling across the codebase treats everything like filenames
> not like URIs, "?" has no special meaning.
> You add unconditional special meaning to "?" in one function ignoring
> everything else. I dont think thats improving the overall extension handling

Your ridiculous "security" check rejects files that have a .m3u8 
extension because of query strings. This is a bug, and it needs to be fixed.

> 
> But lets consider:
> file:///home/myname/myfile.m3u8?file.avi
> /home/myname/myfile.m3u8?file.avi
> http:/server/myfile.m3u8?file.avi
> 
> The first is odd, iam not sure what "?file.avi" is and i wonder if we
> could simply reject this at file protocol level.

> If its accepted, I think it would map to /home/myname/myfile.m3u8 on disk
> not "/home/myname/myfile.m3u8?file.avi"


This is incorrect. Try it by naming a file "foo.m3u8?bar.txt" and run 
xdg-open 'file:///home/leo/foo.m3u8?bar.txt' and you will find that it 
opens it.

> Thats also how my web browser seems to interpret a file:///... the ?foobar part seems
> stripped >
> OTOH /home/myname/myfile.m3u8?file.avi is a avi file with avi extension
> its oddly named but its valid
> 
> the 3rd is a m3u8 file/script or whatever with file.avi as a parameter
> 
> 
>>
>> Do note that your original patch is not spec-compliant. RFC 8216 section 4
>> says the following:
>>
>>     Each Playlist file MUST be identifiable either by the path component
>>     of its URI or by HTTP Content-Type.  In the first case, the path MUST
>>     end with either .m3u8 or .m3u.  In the second, the HTTP Content-Type
>>     MUST be "application/vnd.apple.mpegurl" or "audio/mpegurl".  Clients
>>     SHOULD refuse to parse Playlists that are not so identified.
> 
> The MUST statements sound like a muxer/server side requirement.
> the SHOULD would affect us and tells us to reject not to accept
> 
> 
>>
>>
>> This implies that (1) .hls is not a valid extension if that is being used,
>> and
> 
> do you suggest we should not accept .hls files ?
> 

I actually suggest we *not reject by file extension*

> 
>> (2) a valid HLS mimetype in a content-type header is sufficient to mark
>> a file as HLS regardless of the extension used.
> 
> There are at least 4 cases here
> A extension is m3u8/m3u
> B extension is a well known non hls type (txt,avi,mkv,...), mime type is *hls* ,
> C extension is something else,                              mime type is *hls* ,
> D extension is not m3u8/m3u,                                mime type is not *hls*
> 
> In case of A and C we should detect hls by default, thats needed so our code
> works without annoying the user
> In D we should not detect hls, this is the SHOULD in the RFC
> 
> The B case is a oddball, does this case exist in non malicious cases ?
> 
> 
> This matter is touching quite a few seperate areas so its very possible iam
> missing something
> 

Yes, you're missing that if the *contents* contain *HLS* contents then 
we shouldn't refuse to probe the file based on the filename. That's not 
how *any of the other probe options* work.

Using filename to determine file type instead of contents is not 
security. It's actually the opposite of security.
Michael Niedermayer May 14, 2023, 9:38 p.m. UTC | #7
On Sun, May 14, 2023 at 05:03:57PM -0400, Leo Izen wrote:
> 
> 
> On 5/14/23 16:43, Michael Niedermayer wrote:
> > On Sat, May 13, 2023 at 01:06:25PM -0400, Leo Izen wrote:
> > > 
> > > 
> > > On 5/13/23 10:54, Michael Niedermayer wrote:
> > > > On Fri, May 12, 2023 at 04:26:22PM -0400, Leo Izen wrote:
> > > > > After commit 6b1f68ccb04d791f0250e05687c346a99ff47ea1 we refuse to use
> > > > > URLs of the form https://foo.bar/baz.m3u8?foo=bar because it fails the
> > > > > file extension check. This commit strips the ?foo=bar at the end before
> > > > > checking the file extension.
> > > > > 
> > > > > Signed-off-by: Leo Izen <leo.izen@gmail.com>
> > > > > ---
> > > > >    libavformat/hls.c | 11 ++++++++++-
> > > > >    1 file changed, 10 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/libavformat/hls.c b/libavformat/hls.c
> > > > > index 11e345b280..6a97cced17 100644
> > > > > --- a/libavformat/hls.c
> > > > > +++ b/libavformat/hls.c
> > > > > @@ -2534,7 +2534,16 @@ static int hls_probe(const AVProbeData *p)
> > > > >            strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
> > > > >            strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
> > > > > -        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
> > > > > +        char *request_qmark = strchr(p->filename, '?');
> > > > > +        int match_ext;
> > > > > +
> > > > > +        if (request_qmark)
> > > > > +            *request_qmark = '\0';
> > > > > +        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
> > > > > +        if (request_qmark)
> > > > > +            *request_qmark = '?';
> > > > > +
> > > > > +        if (!match_ext) {
> > > > >                av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
> > > > >                return 0;
> > > > >            }
> > > > 
> > > > the av_match_ext here matches the probe code
> > > > all should be fixed. Also differences between local files and urls should
> > > > be considered in extension extraction
> > > 
> > > If you're requiring
> > 
> > That way you word this is a little odd
> > 
> > 
> > > that we check that a file is local before stripping
> > > tailing request headers, how would you check if a file is local? having a
> > > scheme:// is not sufficient to make that check, as file:// is a valid
> > > scheme. You could check for https?:// I suppose, but the spec doesn't
> > > actually require that HTTP be used (section 2):
> > > 
> > >     Data SHOULD be carried over HTTP [RFC7230], but,
> > >     in general, a URI can specify any protocol that can reliably transfer
> > >     the specified resource on demand.
> > 
> > ATM the extension handling across the codebase treats everything like filenames
> > not like URIs, "?" has no special meaning.
> > You add unconditional special meaning to "?" in one function ignoring
> > everything else. I dont think thats improving the overall extension handling
> 
> Your ridiculous "security" check rejects files that have a .m3u8 extension
> because of query strings. This is a bug, and it needs to be fixed.

yes theres a bug and we should fix that bug


> 
> > 
> > But lets consider:
> > file:///home/myname/myfile.m3u8?file.avi
> > /home/myname/myfile.m3u8?file.avi
> > http:/server/myfile.m3u8?file.avi
> > 
> > The first is odd, iam not sure what "?file.avi" is and i wonder if we
> > could simply reject this at file protocol level.
> 
> > If its accepted, I think it would map to /home/myname/myfile.m3u8 on disk
> > not "/home/myname/myfile.m3u8?file.avi"
> 
> 
> This is incorrect. Try it by naming a file "foo.m3u8?bar.txt" and run
> xdg-open 'file:///home/leo/foo.m3u8?bar.txt' and you will find that it opens
> it.

What is incorrect ?
we have some tools that will interpret "file:///home/leo/foo.m3u8?bar.txt" as
/home/leo/foo.m3u8?bar.txt and some /home/leo/foo.m3u8 on disk

I think that makes that sort of file URLs ambigous, dont you agree ?

thx

[...]
Rémi Denis-Courmont May 16, 2023, 12:41 p.m. UTC | #8
Le 15 mai 2023 05:38:22 GMT+08:00, Michael Niedermayer <michael@niedermayer.cc> a écrit :
>> > 
>> > But lets consider:
>> > file:///home/myname/myfile.m3u8?file.avi
>> > /home/myname/myfile.m3u8?file.avi
>> > http:/server/myfile.m3u8?file.avi
>> > 
>> > The first is odd, iam not sure what "?file.avi" is and i wonder if we
>> > could simply reject this at file protocol level.
>> 
>> > If its accepted, I think it would map to /home/myname/myfile.m3u8 on disk
>> > not "/home/myname/myfile.m3u8?file.avi"

Yes. You would escape the question mark if you wanted it in the local file path.

>> This is incorrect.

No.

>> try it by naming a file "foo.m3u8?bar.txt" and run
>> xdg-open 'file:///home/leo/foo.m3u8?bar.txt' and you will find that it opens
>> it.

It should stop at the question mark and drop everything from there. FWIW, URL syntax is specified by IETF, not XDG.

>What is incorrect ?
>we have some tools that will interpret "file:///home/leo/foo.m3u8?bar.txt" as
>/home/leo/foo.m3u8?bar.txt and some /home/leo/foo.m3u8 on disk
>
>I think that makes that sort of file URLs ambigous, dont you agree ?

Absolute URLs aren't ambiguous, and if the file scheme is specified, then the URL is absolute by definition. Situations whence relative locations are accepted, are ambiguous because some use URL syntax, some use local file path syntax (which is subtly incompatibly different) and some use a messed up mix of both relying on the tolerance of web browsers.
diff mbox series

Patch

diff --git a/libavformat/hls.c b/libavformat/hls.c
index 11e345b280..6a97cced17 100644
--- a/libavformat/hls.c
+++ b/libavformat/hls.c
@@ -2534,7 +2534,16 @@  static int hls_probe(const AVProbeData *p)
         strstr(p->buf, "#EXT-X-TARGETDURATION:") ||
         strstr(p->buf, "#EXT-X-MEDIA-SEQUENCE:")) {
 
-        if (!av_match_ext(p->filename, "m3u8,hls,m3u")) {
+        char *request_qmark = strchr(p->filename, '?');
+        int match_ext;
+
+        if (request_qmark)
+            *request_qmark = '\0';
+        match_ext = av_match_ext(p->filename, "m3u8,hls,m3u");
+        if (request_qmark)
+            *request_qmark = '?';
+
+        if (!match_ext) {
             av_log(NULL, AV_LOG_ERROR, "Not detecting m3u8/hls with non standard extension\n");
             return 0;
         }