Message ID | 20220216150859.16844-4-nil-admirari@mailo.com |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel,v3,1/5] libavutil/wchar_filename.h: Add wchartoansi and utf8toansi | expand |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
andriy/make_ppc | success | Make finished |
andriy/make_fate_ppc | success | Make fate finished |
On Wed, 16 Feb 2022, nihil-admirari wrote: > --- > fftools/cmdutils.c | 31 +++++++++++++++++++++++++------ > 1 file changed, 25 insertions(+), 6 deletions(-) > > diff --git a/fftools/cmdutils.c b/fftools/cmdutils.c > index 4b50e15..ea78897 100644 > --- a/fftools/cmdutils.c > +++ b/fftools/cmdutils.c > @@ -62,6 +62,7 @@ > #endif > #ifdef _WIN32 > #include <windows.h> > +#include "compat/w32dlfcn.h" > #endif > > static int init_report(const char *env); > @@ -2065,6 +2066,9 @@ FILE *get_preset_file(char *filename, size_t filename_size, > { > FILE *f = NULL; > int i; > +#if HAVE_GETMODULEHANDLE && defined(_WIN32) > + char *datadir = NULL; > +#endif > const char *base[3] = { getenv("FFMPEG_DATADIR"), > getenv("HOME"), > FFMPEG_DATADIR, }; > @@ -2074,19 +2078,31 @@ FILE *get_preset_file(char *filename, size_t filename_size, > f = fopen(filename, "r"); > } else { > #if HAVE_GETMODULEHANDLE && defined(_WIN32) > - char datadir[MAX_PATH], *ls; > + wchar_t *datadir_w = get_module_filename(NULL); > base[2] = NULL; > > - if (GetModuleFileNameA(GetModuleHandleA(NULL), datadir, sizeof(datadir) - 1)) > + if (wchartoansi(datadir_w, &datadir)) > + datadir = NULL; Why would you use ansi here? Aren't all internal char based paths supposed to be UTF-8, unless passing them to an external API that expects e.g. ACP paths? // Martin
Previously there was GetModuleFileNameA. wchartoansi is used to match old behaviour. I can replace it with wchartoutf8 if you wish. > - if (GetModuleFileNameA(GetModuleHandleA(NULL), datadir, sizeof(datadir) - 1)) > + if (wchartoansi(datadir_w, &datadir)) > + datadir = NULL; From: Martin Storsjö <martin@martin.st> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Subject: Re: [FFmpeg-devel] [PATCH v3 4/5] fftools/cmdutils.c: Replace MAX_PATH-sized buffers with dynamically sized ones Date: 16/02/2022 17:08:12 Europe/Moscow Why would you use ansi here? Aren't all internal char based paths supposed to be UTF-8, unless passing them to an external API that expects e.g. ACP paths? // Martin
> On Feb 16, 2022, at 18:32, nil-admirari@mailo.com wrote: > > Previously there was GetModuleFileNameA. wchartoansi is used to match old behaviour. I can replace it with wchartoutf8 if you wish. Oh, right. Well yes - if the path later is going to end up in a codepath that expects it to be UTF8 (please do check!), then we should go that way instead, then the existing code had a latent bug wrt that. (Also, don’t top post!) // Martin
> if the path later is going to end up in a codepath that expects it to be UTF8 (please do check!), then we should go that way instead I checked. datadir ends up in (cmdutils.c:2104) base[2] = datadir; and base[*] are later used in (cmdutils.c:2112 or 2116) snprintf(filename, filename_size, "%s%s/%s.ffpreset", base[i], i != 1 ? "" : "/.ffmpeg", preset_name); f = fopen(filename, "r"); On Windows fopen expects ANSI encoded strings, so we cannot change the encoding to UTF-8 without rewriting the rest of the function. I've also overlooked something from your previous comment. > unless passing them to an external API Previous version of the manifest (https://ffmpeg.org/pipermail/ffmpeg-devel/2022-February/293109.html) included <ws2019:activeCodePage>UTF-8</ws2019:activeCodePage> which makes CP_ACP the same as CP_UTF8. Unfortunately external APIs must also opt in for such a change, otherwise they won't be able to decode the strings FFmpeg sent them. A new version of the manifest without code page changes is at https://ffmpeg.org/pipermail/ffmpeg-devel/2022-February/293168.html.
On Thu, 17 Feb 2022, nil-admirari@mailo.com wrote: >> if the path later is going to end up in a codepath that expects it to be UTF8 (please do check!), then we should go that way instead > > I checked. datadir ends up in (cmdutils.c:2104) > > base[2] = datadir; > > and base[*] are later used in (cmdutils.c:2112 or 2116) > > snprintf(filename, filename_size, "%s%s/%s.ffpreset", base[i], > i != 1 ? "" : "/.ffmpeg", preset_name); > f = fopen(filename, "r"); > > On Windows fopen expects ANSI encoded strings, so we cannot change the > encoding to UTF-8 without rewriting the rest of the function. Ok, well maybe we should change that too, to use wchar paths (or utf8->whcar routines?). But maybe those are libavutil internals that cmdutils shouldn't use? If we stick with this form, with ansi paths, it would be good to leave a comment at the call to wchartoansi(), to explain why that doesn't use UTF-8 like everything else. // Martin
> Ok, well maybe we should change that too, to use wchar paths (or utf8->whcar routines?). Changed to wchartoutf8, and replaced fopen with av_fopen_utf8 throughout the file. See https://ffmpeg.org/pipermail/ffmpeg-devel/2022-February/293229.html. > doesn't use UTF-8 like everything else. This is not the only place where plain fopen is used. grep -RP '\bfopen\(' | wc -l gives 66 occurrences.
diff --git a/fftools/cmdutils.c b/fftools/cmdutils.c index 4b50e15..ea78897 100644 --- a/fftools/cmdutils.c +++ b/fftools/cmdutils.c @@ -62,6 +62,7 @@ #endif #ifdef _WIN32 #include <windows.h> +#include "compat/w32dlfcn.h" #endif static int init_report(const char *env); @@ -2065,6 +2066,9 @@ FILE *get_preset_file(char *filename, size_t filename_size, { FILE *f = NULL; int i; +#if HAVE_GETMODULEHANDLE && defined(_WIN32) + char *datadir = NULL; +#endif const char *base[3] = { getenv("FFMPEG_DATADIR"), getenv("HOME"), FFMPEG_DATADIR, }; @@ -2074,19 +2078,31 @@ FILE *get_preset_file(char *filename, size_t filename_size, f = fopen(filename, "r"); } else { #if HAVE_GETMODULEHANDLE && defined(_WIN32) - char datadir[MAX_PATH], *ls; + wchar_t *datadir_w = get_module_filename(NULL); base[2] = NULL; - if (GetModuleFileNameA(GetModuleHandleA(NULL), datadir, sizeof(datadir) - 1)) + if (wchartoansi(datadir_w, &datadir)) + datadir = NULL; + av_free(datadir_w); + + if (datadir) { - for (ls = datadir; ls < datadir + strlen(datadir); ls++) + char *ls; + for (ls = datadir; *ls; ls++) if (*ls == '\\') *ls = '/'; if (ls = strrchr(datadir, '/')) { - *ls = 0; - strncat(datadir, "/ffpresets", sizeof(datadir) - 1 - strlen(datadir)); - base[2] = datadir; + const int datadir_len = ls - datadir; + const int desired_size = datadir_len + strlen("/ffpresets") + 1; + char *new_datadir = av_realloc_array( + datadir, desired_size, sizeof *datadir); + if (new_datadir) { + datadir = new_datadir; + datadir[datadir_len] = 0; + strncat(datadir, "/ffpresets", desired_size - 1 - datadir_len); + base[2] = datadir; + } } } #endif @@ -2106,6 +2122,9 @@ FILE *get_preset_file(char *filename, size_t filename_size, } } +#if HAVE_GETMODULEHANDLE && defined(_WIN32) + av_free(datadir); +#endif return f; }