Message ID | 20181129181818.13295-1-andrey.semashev@gmail.com |
---|---|
State | New |
Headers | show |
Andrey Semashev (2018-11-29): > Previously, URIs with authority field were incorrectly interpreted as if > the authority was part of the path. The "file:" prefix does not indicate a file:// URI but a path for the file: protocol of FFmpeg. You can check by yourself that they are not URIs by trying to get FFmpeg to open file:///dev/nul%6c for example. Regards,
On 11/29/18 9:24 PM, Nicolas George wrote: > Andrey Semashev (2018-11-29): >> Previously, URIs with authority field were incorrectly interpreted as if >> the authority was part of the path. > > The "file:" prefix does not indicate a file:// URI but a path for the > file: protocol of FFmpeg. It does. avformat_open_input, avio_open and ffurl_open[_whitelist] docs all say it's an URL and they don't perform any conversion. So the file backend should be prepared to receive a URL, with a scheme and authority. > You can check by yourself that they are not URIs by trying to get FFmpeg > to open file:///dev/nul%6c for example. It will probably currently fail because of the escape sequence. But even if it weren't for this reason, it would still be interpreting the URI the wrong way because of the authority part, which is what this patch fixes. Escape sequences, if needed, can be fixed separately.
Andrey Semashev (2018-11-29): > It does. avformat_open_input, avio_open and ffurl_open[_whitelist] docs all > say it's an URL and they don't perform any conversion. So the file backend > should be prepared to receive a URL, with a scheme and authority. So either the docs are slightly wrong or the code is. Do you have an argument to decide it is one rather than the other? I do: > It will probably currently fail because of the escape sequence. Exactly. Since escaping, a very basic feature of URIs, is not handled at all, it is a clear indication that the paths are NOT meant to be considered URIs. The documentation was added much later, and made the same mistake you are doing now; same goes for a few private function names. > Escape sequences, if needed, can be fixed separately. That would break a lot of working applications and is therefore not a good idea. Regards,
On 11/29/18 9:47 PM, Nicolas George wrote: > Andrey Semashev (2018-11-29): >> It does. avformat_open_input, avio_open and ffurl_open[_whitelist] docs all >> say it's an URL and they don't perform any conversion. So the file backend >> should be prepared to receive a URL, with a scheme and authority. > > So either the docs are slightly wrong or the code is. Do you have an > argument to decide it is one rather than the other? > > I do: > >> It will probably currently fail because of the escape sequence. > > Exactly. Since escaping, a very basic feature of URIs, is not handled at > all, it is a clear indication that the paths are NOT meant to be > considered URIs. The documentation was added much later, and made the > same mistake you are doing now; same goes for a few private function > names. I condider the lack of support for escape sequences a bug, which is probably a rudiment of the past, when ffmpeg was primarily targeted for working with local files. The fact that all these functions also accept raw filesystem paths instead of URIs is also there for the same reason, only with additional benefit of convenience. Nowdays, there is one common interface for interacting with ffmpeg, and this interface is URIs (or raw local paths). There is no third pseudo-URI option, AFAICT. So, in my humble opinion the docs are correct, it is the implementation that needs to catch up. >> Escape sequences, if needed, can be fixed separately. > > That would break a lot of working applications and is therefore not a > good idea. If an application passes a URI and expects that it is not interpreted as such is already broken. I could make a patch adding support for escape sequences as well, but it seems you would not accept it. Am I correct?
Andrey Semashev (2018-11-29): > Nowdays, there is one common interface > for interacting with ffmpeg, and this interface is URIs (or raw local > paths). There is no third pseudo-URI option, AFAICT. So, in my humble > opinion the docs are correct, it is the implementation that needs to catch > up. You are wrong. There is one common interface: that is pseudi-URI. URI is not an option. > If an application passes a URI and expects that it is not interpreted as > such is already broken. And it always was. Breaking something that works is worse than having something that never worked still not work. > I could make a patch adding support for escape > sequences as well, but it seems you would not accept it. Am I correct? As is, "fixing" the file: protocol paths to be treated as URIs would be an API break, it is not acceptable. You can propose patches to make FFmpeg support real URIs instead / in addition to its old pseudo-URI syntax, but you would need to design with API compatibility in mind. Regards,
On 11/29/18 10:16 PM, Nicolas George wrote: > Andrey Semashev (2018-11-29): >> Nowdays, there is one common interface >> for interacting with ffmpeg, and this interface is URIs (or raw local >> paths). There is no third pseudo-URI option, AFAICT. So, in my humble >> opinion the docs are correct, it is the implementation that needs to catch >> up. > > You are wrong. There is one common interface: that is pseudi-URI. URI is > not an option. > >> If an application passes a URI and expects that it is not interpreted as >> such is already broken. > > And it always was. Breaking something that works is worse than having > something that never worked still not work. > >> I could make a patch adding support for escape >> sequences as well, but it seems you would not accept it. Am I correct? > > As is, "fixing" the file: protocol paths to be treated as URIs would be > an API break, it is not acceptable. > > You can propose patches to make FFmpeg support real URIs instead / in > addition to its old pseudo-URI syntax, but you would need to design with > API compatibility in mind. Ok, thanks for your comments.
diff --git a/libavformat/file.c b/libavformat/file.c index 1d321c4205..040197d50d 100644 --- a/libavformat/file.c +++ b/libavformat/file.c @@ -19,6 +19,8 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/log.h" +#include "libavutil/error.h" #include "libavutil/avstring.h" #include "libavutil/internal.h" #include "libavutil/opt.h" @@ -104,6 +106,31 @@ static const AVClass pipe_class = { .version = LIBAVUTIL_VERSION_INT, }; +static int file_get_path(const char* filename, const char** ppath) +{ + const char* path = filename; + // Check if the filename is a file URI. https://tools.ietf.org/html/rfc8089#section-2 + if (av_strncasecmp(path, "file:", sizeof("file:") - 1) == 0) { + path += sizeof("file:") - 1; + if (path[0] == '/' && path[1] == '/') { + // The URI may have an authority part. Check that the authority does not contain + // a remote host name. We cannot access filesystem on a different host. + path += 2; + if (path[0] != '/') { + if (strncmp(path, "localhost/", sizeof("localhost/") - 1) == 0) { + path += sizeof("localhost") - 1; + } else { + av_log(NULL, AV_LOG_ERROR, "File URIs referencing a remote host are not supported: %s\n", filename); + return AVERROR(EINVAL); + } + } + } + } + + *ppath = path; + return 0; +} + static int file_read(URLContext *h, unsigned char *buf, int size) { FileContext *c = h->priv_data; @@ -136,7 +163,9 @@ static int file_check(URLContext *h, int mask) { int ret = 0; const char *filename = h->filename; - av_strstart(filename, "file:", &filename); + ret = file_get_path(filename, &filename); + if (ret < 0) + return ret; { #if HAVE_ACCESS && defined(R_OK) @@ -167,10 +196,12 @@ static int file_check(URLContext *h, int mask) static int file_delete(URLContext *h) { -#if HAVE_UNISTD_H +#if HAVE_UNISTD_H || defined(_WIN32) int ret; const char *filename = h->filename; - av_strstart(filename, "file:", &filename); + ret = file_get_path(filename, &filename); + if (ret < 0) + return ret; ret = rmdir(filename); if (ret < 0 && errno == ENOTDIR) @@ -188,8 +219,12 @@ static int file_move(URLContext *h_src, URLContext *h_dst) { const char *filename_src = h_src->filename; const char *filename_dst = h_dst->filename; - av_strstart(filename_src, "file:", &filename_src); - av_strstart(filename_dst, "file:", &filename_dst); + int ret = file_get_path(filename_src, &filename_src); + if (ret < 0) + return ret; + ret = file_get_path(filename_dst, &filename_dst); + if (ret < 0) + return ret; if (rename(filename_src, filename_dst) < 0) return AVERROR(errno); @@ -206,7 +241,9 @@ static int file_open(URLContext *h, const char *filename, int flags) int fd; struct stat st; - av_strstart(filename, "file:", &filename); + int ret = file_get_path(filename, &filename); + if (ret < 0) + return ret; if (flags & AVIO_FLAG_WRITE && flags & AVIO_FLAG_READ) { access = O_CREAT | O_RDWR; @@ -264,8 +301,12 @@ static int file_open_dir(URLContext *h) { #if HAVE_LSTAT FileContext *c = h->priv_data; + const char* dirname = h->filename; + int ret = file_get_path(dirname, &dirname); + if (ret < 0) + return ret; - c->dir = opendir(h->filename); + c->dir = opendir(dirname); if (!c->dir) return AVERROR(errno);