From patchwork Mon Jun 20 10:29:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nil Admirari
X-Patchwork-Id: 36349
From: Nil Admirari
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 20 Jun 2022
 13:29:57 +0300
Message-Id: <20220620103001.15035-1-nil-admirari@mailo.com>
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH v20 1/5] libavutil: Add wchartoutf8(),
 wchartoansi(), utf8toansi(), getenv_utf8(), freeenv_utf8() and getenv_dup()
List-Id: FFmpeg development discussions and patches

wchartoutf8() converts strings returned by WinAPI into UTF-8,
which is FFmpeg's preferred encoding.

Some external dependencies, such as AviSynth, are still
not Unicode-enabled. utf8toansi() converts UTF-8 strings
into ANSI in two steps: UTF-8 -> wchar_t -> ANSI.
wchartoansi() is responsible for the second step of the conversion.
Conversion in just one step is not supported by WinAPI.

Since these character-conversion functions allocate a buffer
of the necessary size, they also facilitate the removal of the MAX_PATH
limit in places where fixed-size ANSI/WCHAR strings were used
as filename buffers.

On Windows, getenv_utf8() wraps _wgetenv(), converting its input from UTF-8
and its output to UTF-8. Strings returned by getenv_utf8()
must be freed by freeenv_utf8().

On all other platforms getenv_utf8() is a wrapper around getenv(),
and freeenv_utf8() is a no-op.

The value returned by plain getenv() cannot be modified;
av_strdup() is usually used when modifications are required.
However, on Windows, av_strdup() after getenv_utf8() leads to
an unnecessary allocation. getenv_dup() is introduced to avoid
such an allocation. The value returned by getenv_dup() must be freed
by av_free().
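
To illustrate the ownership rule described above, here is a minimal portable
sketch of what the non-Windows getenv_dup() branch amounts to. It is an
illustration under stated assumptions, not code from this patch: the names
sketch_getenv_dup() and sketch_getenv_upper() are hypothetical, and
malloc()/free() stand in for av_strdup()/av_free() so that the sketch builds
without FFmpeg headers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the non-Windows getenv_dup() branch.
 * Assumption: malloc()/free() replace av_strdup()/av_free(). */
static char *sketch_getenv_dup(const char *varname)
{
    const char *var = getenv(varname);
    size_t len;
    char *copy;

    if (!var)
        return NULL;                 /* variable is not set */
    len  = strlen(var) + 1;          /* include the terminating NUL */
    copy = malloc(len);
    if (!copy)
        return NULL;                 /* allocation failed */
    memcpy(copy, var, len);
    return copy;                     /* caller owns the copy and must free() it */
}

/* Hypothetical caller: upper-cases ASCII letters in a private copy,
 * leaving the string owned by the environment untouched.
 * Returns NULL if the variable is unset or allocation fails. */
static char *sketch_getenv_upper(const char *varname)
{
    char *val = sketch_getenv_dup(varname);

    if (!val)
        return NULL;
    for (char *p = val; *p; p++)
        if (*p >= 'a' && *p <= 'z')
            *p -= 'a' - 'A';
    return val;
}
```

In this sketch the caller releases the result with free(), which corresponds
to av_free() for the real getenv_dup(). A read-only lookup would not need the
copy at all and could use getenv_utf8()/freeenv_utf8() instead.
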

Because of cleanup complexities, in places that only test the existence of
an environment variable or compare its value with a string consisting entirely
of ASCII characters, the use of plain getenv() is still preferred.
(libavutil/log.c check_color_terminal() is an example of such a place.)

Plain getenv() is also preferred in UNIX-only code,
such as bktr.c, fbdev_common.c, oss.c in libavdevice or af_ladspa.c in libavfilter.
---
 configure                  |  1 +
 libavutil/getenv_utf8.h    | 86 ++++++++++++++++++++++++++++++++++++++
 libavutil/wchar_filename.h | 53 +++++++++++++++++++++++
 3 files changed, 140 insertions(+)
 create mode 100644 libavutil/getenv_utf8.h

diff --git a/configure b/configure
index 7ffbb85e21..3a97610209 100755
--- a/configure
+++ b/configure
@@ -2272,6 +2272,7 @@ SYSTEM_FUNCS="
     fcntl
     getaddrinfo
     getauxval
+    getenv
     gethrtime
     getopt
     GetModuleHandle
diff --git a/libavutil/getenv_utf8.h b/libavutil/getenv_utf8.h
new file mode 100644
index 0000000000..c10291adfc
--- /dev/null
+++ b/libavutil/getenv_utf8.h
@@ -0,0 +1,86 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_GETENV_UTF8_H
+#define AVUTIL_GETENV_UTF8_H
+
+#include <stdlib.h>
+
+#include "config.h"
+#include "mem.h"
+
+#if HAVE_GETENV && defined(_WIN32)
+
+#include "libavutil/wchar_filename.h"
+
+static inline char *getenv_utf8(const char *varname)
+{
+    wchar_t *varname_w, *var_w;
+    char *var;
+
+    if (utf8towchar(varname, &varname_w))
+        return NULL;
+    if (!varname_w)
+        return NULL;
+
+    var_w = _wgetenv(varname_w);
+    av_free(varname_w);
+
+    if (!var_w)
+        return NULL;
+    if (wchartoutf8(var_w, &var))
+        return NULL;
+
+    return var;
+
+    // No CP_ACP fallback compared to other *_utf8() functions:
+    // non UTF-8 strings must not be returned.
+}
+
+static inline void freeenv_utf8(char *var)
+{
+    av_free(var);
+}
+
+static inline char *getenv_dup(const char *varname)
+{
+    return getenv_utf8(varname);
+}
+
+#else
+
+static inline char *getenv_utf8(const char *varname)
+{
+    return getenv(varname);
+}
+
+static inline void freeenv_utf8(char *var)
+{
+}
+
+static inline char *getenv_dup(const char *varname)
+{
+    char *var = getenv(varname);
+    if (!var)
+        return NULL;
+    return av_strdup(var);
+}
+
+#endif // HAVE_GETENV && defined(_WIN32)
+
+#endif // AVUTIL_GETENV_UTF8_H
diff --git a/libavutil/wchar_filename.h b/libavutil/wchar_filename.h
index f36d9dfea3..08de073ed7 100644
--- a/libavutil/wchar_filename.h
+++ b/libavutil/wchar_filename.h
@@ -20,6 +20,8 @@
 #define AVUTIL_WCHAR_FILENAME_H
 
 #ifdef _WIN32
+
+#define WIN32_LEAN_AND_MEAN
 #include <windows.h>
 #include "mem.h"
 
@@ -41,6 +43,57 @@ static inline int utf8towchar(const char *filename_utf8, wchar_t **filename_w)
     return 0;
 }
 
+av_warn_unused_result
+static inline int wchartocp(unsigned int code_page, const wchar_t *filename_w,
+                            char **filename)
+{
+    DWORD flags = code_page == CP_UTF8 ?
+                  WC_ERR_INVALID_CHARS : 0;
+    int num_chars = WideCharToMultiByte(code_page, flags, filename_w, -1,
+                                        NULL, 0, NULL, NULL);
+    if (num_chars <= 0) {
+        *filename = NULL;
+        return 0;
+    }
+    *filename = av_malloc_array(num_chars, sizeof **filename);
+    if (!*filename) {
+        errno = ENOMEM;
+        return -1;
+    }
+    WideCharToMultiByte(code_page, flags, filename_w, -1,
+                        *filename, num_chars, NULL, NULL);
+    return 0;
+}
+
+av_warn_unused_result
+static inline int wchartoutf8(const wchar_t *filename_w, char **filename)
+{
+    return wchartocp(CP_UTF8, filename_w, filename);
+}
+
+av_warn_unused_result
+static inline int wchartoansi(const wchar_t *filename_w, char **filename)
+{
+    return wchartocp(CP_ACP, filename_w, filename);
+}
+
+av_warn_unused_result
+static inline int utf8toansi(const char *filename_utf8, char **filename)
+{
+    wchar_t *filename_w = NULL;
+    int ret = -1;
+    if (utf8towchar(filename_utf8, &filename_w))
+        return -1;
+
+    if (!filename_w) {
+        *filename = NULL;
+        return 0;
+    }
+
+    ret = wchartoansi(filename_w, filename);
+    av_free(filename_w);
+    return ret;
+}
+
 /**
  * Checks for extended path prefixes for which normalization needs to be skipped.
  * see .NET6: PathInternal.IsExtended()