
[FFmpeg-devel] avformat/file: Fix handling of file URIs

Message ID f8e7e935-4b11-c596-d55b-2f1836b28144@gentoo.org
State New

Checks

Context Check Description
andriy/configure_x86 warning Failed to apply patch
yinshiyou/configure_loongarch64 warning Failed to apply patch

Commit Message

Nick Sarnie Nov. 24, 2022, 5:59 p.m. UTC
The current URI handling only removes the file: prefix, but we also need 
to consider the case of percent encoding.
Percent encoding can happen with non-ASCII characters in the path.

I hit this through mpv with vobsub subtitles when dragging subtitles 
onto the player using the mouse.
Vobsub uses two files: an .idx file and a .sub file. mpv decodes the 
file path for direct inputs to mpv (the .idx file here), so that's why 
mpv works for most things today.
When passed the .idx file from a mouse-drag, mpv calls the FFmpeg mpeg 
demuxer which then tries to find the corresponding .sub file in 
vobsub_read_header.
However, mpv does not decode the path for the vobsub track name, which 
FFmpeg uses to find the .sub file.
As a result, open() received the path with "file:" removed but with the 
percent encoding still in place, so it failed.
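
To illustrate the failure described above: a percent-encoded path such as
/tmp/caf%C3%A9.sub has to be turned back into raw bytes before it is handed
to open(). The following is only a minimal sketch of that decoding step.
percent_decode() and hex_value() are hypothetical helpers written for this
note, not FFmpeg's ff_urldecode(), and the choice to copy malformed escapes
through unchanged is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: return a newly malloc()ed copy of `in` with every
 * valid %XX sequence replaced by the byte it encodes. Malformed sequences
 * (for example a trailing '%') are copied unchanged; that is an assumption
 * of this sketch. Returns NULL on allocation failure. The caller must
 * free() the result.
 */
static char *percent_decode(const char *in)
{
    char *out, *dst;

    assert(in != NULL);
    /* The decoded string is never longer than the input. */
    out = malloc(strlen(in) + 1);
    if (!out)
        return NULL;

    dst = out;
    while (*in) {
        int hi, lo;
        /* Short-circuit evaluation keeps us from reading past the NUL. */
        if (in[0] == '%' &&
            (hi = hex_value((unsigned char)in[1])) >= 0 &&
            (lo = hex_value((unsigned char)in[2])) >= 0) {
            *dst++ = (char)(hi * 16 + lo);
            in += 3;
        } else {
            *dst++ = *in++;
        }
    }
    *dst = '\0';
    return out;
}
```

Because the result is a fresh allocation, any caller of a helper like this
has to keep the returned pointer and free it once the path is no longer
needed.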

I have a similar patch for mpv that makes it decode the vobsub .sub file 
name before passing it to FFmpeg as well, but I have received feedback 
from multiple mpv developers that this is also an FFmpeg issue, so 
hopefully we can fix it in both places.

Signed-off-by: Nick Sarnie <sarnex@gentoo.org>
---
  libavformat/file.c | 18 ++++++++++++++++--
  1 file changed, 16 insertions(+), 2 deletions(-)

      const char *filename = h->filename;
      av_strstart(filename, "file:", &filename);
-
+    filename = ff_urldecode(filename, 0);
+    if(!filename)
+      return AVERROR(ENOMEM);
      {
  #if HAVE_ACCESS && defined(R_OK)
      if (access(filename, F_OK) < 0)
@@ -174,7 +177,9 @@ static int file_delete(URLContext *h)
      int ret;
      const char *filename = h->filename;
      av_strstart(filename, "file:", &filename);
-
+    filename = ff_urldecode(filename, 0);
+    if(!filename)
+      return AVERROR(ENOMEM);
      ret = rmdir(filename);
      if (ret < 0 && (errno == ENOTDIR
  #   ifdef _WIN32
@@ -196,7 +201,13 @@ static int file_move(URLContext *h_src, URLContext *h_dst)
      const char *filename_src = h_src->filename;
      const char *filename_dst = h_dst->filename;
      av_strstart(filename_src, "file:", &filename_src);
+    filename_src = ff_urldecode(filename_src, 0);
+    if(!filename_src)
+      return AVERROR(ENOMEM);
      av_strstart(filename_dst, "file:", &filename_dst);
+    filename_dst = ff_urldecode(filename_dst, 0);
+    if(!filename_dst)
+      return AVERROR(ENOMEM);
       if (rename(filename_src, filename_dst) < 0)
          return AVERROR(errno);
@@ -212,6 +223,9 @@ static int file_open(URLContext *h, const char *filename, int flags)
      struct stat st;
       av_strstart(filename, "file:", &filename);
+    filename = ff_urldecode(filename, 0);
+    if(!filename)
+      return AVERROR(ENOMEM);
       if (flags & AVIO_FLAG_WRITE && flags & AVIO_FLAG_READ) {
          access = O_CREAT | O_RDWR;

Comments

Nicolas George Nov. 24, 2022, 6:12 p.m. UTC | #1
Nick Sarnie (12022-11-24):
> The current URI handling only removes the file: prefix, but we also need to
> consider the case of percent encoding.
> Percent encoding can happen with non-ASCII characters in the path.

NAK, this is a huge compatibility break.

But this fix is necessary, you are right.

What needs to happen to handle this correctly:

- Introduce a “fs:” protocol as a synonym to file:.

- Change “file:” (but not “fs:”!) to detect if the rest of the path
  looks like a real file URL or a raw file path.

  - If it looks like a real file URL, de-percent-escape it.

  - If it looks like a raw file path, treat it the legacy way after
    printing a warning.

- In a few years, remove the heuristics and always handle “file:” in the
  standards-compliant way.

I believe we can ignore incomplete / relative file:// URLs in the
transition period. The triple / would be a very reliable way of
identification.

Regards,
Nick Sarnie Nov. 24, 2022, 6:40 p.m. UTC | #2
Okay, thanks. I asked on IRC whether this patch would be accepted before 
posting it, to avoid this kind of situation, but nobody responded.

I don't know enough about the codebase to implement the suggested 
solution, so I'll let you all decide if you want to do it.
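
For readers following the thread, the transition scheme proposed in comment
#1 could be sketched roughly as below. This is an illustration under stated
assumptions, not code from FFmpeg: classify_path() and the path_kind enum
are hypothetical names, the "fs:" protocol does not exist in the patch under
review, and the triple slash is used as the only test for "looks like a real
file URL", as the comment suggests.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of a name passed to the file protocol. */
enum path_kind {
    PATH_RAW,       /* use the bytes as-is, never percent-decode        */
    PATH_FILE_URL,  /* real file URL: percent-decode before opening     */
    PATH_LEGACY     /* "file:" + raw path: legacy handling plus warning */
};

/*
 * Hypothetical helper: decide how `name` should be treated and store in
 * *rest the part that would be handed to the filesystem layer.
 *
 *   "fs:<path>"      -> PATH_RAW      (proposed synonym, no heuristics)
 *   "file:///<path>" -> PATH_FILE_URL (triple slash marks a real URL)
 *   "file:<path>"    -> PATH_LEGACY   (caller would print a warning)
 *   anything else    -> PATH_RAW
 */
static enum path_kind classify_path(const char *name, const char **rest)
{
    assert(name != NULL && rest != NULL);

    if (strncmp(name, "fs:", 3) == 0) {
        *rest = name + 3;
        return PATH_RAW;
    }
    if (strncmp(name, "file:///", 8) == 0) {
        /* Skip "file://" but keep the leading '/' of the absolute path. */
        *rest = name + 7;
        return PATH_FILE_URL;
    }
    if (strncmp(name, "file:", 5) == 0) {
        *rest = name + 5;
        return PATH_LEGACY;
    }
    *rest = name;
    return PATH_RAW;
}
```

In this sketch only PATH_FILE_URL results would be percent-decoded, so an
existing invocation such as "file:/tmp/a%20b" keeps opening a file whose
name literally contains "%20" during the transition period.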

Patch

diff --git a/libavformat/file.c b/libavformat/file.c
index 6103c37b34..416e7a0b6b 100644
--- a/libavformat/file.c
+++ b/libavformat/file.c
@@ -21,6 +21,7 @@ 
   #include "config_components.h"
+#include "libavformat/urldecode.h"
  #include "libavutil/avstring.h"
  #include "libavutil/file_open.h"
  #include "libavutil/internal.h"
@@ -142,7 +143,9 @@  static int file_check(URLContext *h, int mask)
      int ret = 0;