From patchwork Mon Jun 13 16:26:22 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nil Admirari
X-Patchwork-Id: 36196
Delivered-To: ffmpegpatchwork2@gmail.com
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
From: Nil Admirari
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 13 Jun 2022 19:26:22 +0300
Message-Id: <20220613162626.11541-1-nil-admirari@mailo.com>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH v14 1/5] libavutil: Add wchartoutf8(), wchartoansi(), utf8toansi() and getenv_utf8()
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: tvTlysW13JF6

wchartoutf8() converts strings returned by WinAPI into UTF-8,
which is FFmpeg's preferred encoding.

Some external dependencies, such as AviSynth, are still
not Unicode-enabled. utf8toansi() converts UTF-8 strings
into ANSI in two steps: UTF-8 -> wchar_t -> ANSI.
wchartoansi() is responsible for the second step of the conversion.
Conversion in just one step is not supported by WinAPI.

Since these character-converting functions allocate a buffer
of the necessary size, they also facilitate the removal of the MAX_PATH
limit in places where fixed-size ANSI/WCHAR strings were used
as filename buffers.

getenv_utf8() wraps _wgetenv(), converting the variable name from UTF-8
and the returned value to UTF-8. Unlike plain getenv(), getenv_utf8()
returns an allocated string that the caller must release with av_free().

Because of that, in places that only test the existence of an
environment variable or compare its value with a string
consisting entirely of ASCII characters, the use of plain getenv()
is still preferred. (libavutil/log.c check_color_terminal()
is an example of such a place.)

Plain getenv() is also preferred in UNIX-only code,
such as bktr.c, fbdev_common.c, oss.c in libavdevice
or af_ladspa.c in libavfilter.
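
Illustration (not part of the patch): the cleanup rule described above
can be sketched in portable C. This is a minimal sketch under stated
assumptions. getenv_copy(), env_is_set() and env_equals() are
hypothetical names, and malloc()/free() stand in for av_strdup() and
av_free(), so it mirrors only the non-Windows branch of getenv_utf8().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the non-Windows getenv_utf8() path: returns a
 * heap-allocated copy of the value, or NULL if the variable is unset or the
 * allocation fails. The patch itself uses av_strdup() and av_free(). */
static char *getenv_copy(const char *varname)
{
    const char *value = getenv(varname);
    size_t size;
    char *copy;

    if (!value)
        return NULL;
    size = strlen(value) + 1;
    copy = malloc(size);
    if (copy)
        memcpy(copy, value, size);
    return copy;
}

/* Existence test: plain getenv() is enough and nothing has to be freed. */
static int env_is_set(const char *varname)
{
    return getenv(varname) != NULL;
}

/* The value may contain non-ASCII text: take an owned copy, use it,
 * then release it on every path. */
static int env_equals(const char *varname, const char *expected)
{
    char *value = getenv_copy(varname);
    int equal = value && !strcmp(value, expected);

    free(value); /* free(NULL) is a no-op */
    return equal;
}
```

env_is_set() corresponds to the cases where the commit message still
prefers plain getenv(); env_equals() shows the extra free that a
getenv_utf8()-style helper imposes on its callers.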
---
 libavutil/getenv_utf8.h    | 63 ++++++++++++++++++++++++++++++++++++++
 libavutil/wchar_filename.h | 51 ++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)
 create mode 100644 libavutil/getenv_utf8.h

diff --git a/libavutil/getenv_utf8.h b/libavutil/getenv_utf8.h
new file mode 100644
index 0000000000..2c48a36355
--- /dev/null
+++ b/libavutil/getenv_utf8.h
@@ -0,0 +1,63 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_GETENV_UTF8_H
+#define AVUTIL_GETENV_UTF8_H
+
+#include <stdlib.h>
+
+#include "mem.h"
+
+#ifdef _WIN32
+
+#include "libavutil/wchar_filename.h"
+
+static inline char *getenv_utf8(const char *varname)
+{
+    wchar_t *varname_w, *var_w;
+    char *var;
+
+    if (utf8towchar(varname, &varname_w))
+        return NULL;
+    if (!varname_w)
+        return NULL;
+
+    var_w = _wgetenv(varname_w);
+    av_free(varname_w);
+
+    if (!var_w)
+        return NULL;
+    if (wchartoutf8(var_w, &var))
+        return NULL;
+
+    return var;
+
+    // No CP_ACP fallback compared to other *_utf8() functions:
+    // non UTF-8 strings must not be returned.
+}
+
+#else
+
+static inline char *getenv_utf8(const char *varname)
+{
+    return av_strdup(getenv(varname));
+}
+
+#endif // _WIN32
+
+#endif // AVUTIL_GETENV_UTF8_H
diff --git a/libavutil/wchar_filename.h b/libavutil/wchar_filename.h
index f36d9dfea3..a6d71e52e5 100644
--- a/libavutil/wchar_filename.h
+++ b/libavutil/wchar_filename.h
@@ -41,6 +41,57 @@ static inline int utf8towchar(const char *filename_utf8, wchar_t **filename_w)
     return 0;
 }
 
+av_warn_unused_result
+static inline int wchartocp(unsigned int code_page, const wchar_t *filename_w,
+                            char **filename)
+{
+    DWORD flags = code_page == CP_UTF8 ? WC_ERR_INVALID_CHARS : 0;
+    int num_chars = WideCharToMultiByte(code_page, flags, filename_w, -1,
+                                        NULL, 0, NULL, NULL);
+    if (num_chars <= 0) {
+        *filename = NULL;
+        return 0;
+    }
+    *filename = av_malloc_array(num_chars, sizeof *filename);
+    if (!*filename) {
+        errno = ENOMEM;
+        return -1;
+    }
+    WideCharToMultiByte(code_page, flags, filename_w, -1,
+                        *filename, num_chars, NULL, NULL);
+    return 0;
+}
+
+av_warn_unused_result
+static inline int wchartoutf8(const wchar_t *filename_w, char **filename)
+{
+    return wchartocp(CP_UTF8, filename_w, filename);
+}
+
+av_warn_unused_result
+static inline int wchartoansi(const wchar_t *filename_w, char **filename)
+{
+    return wchartocp(CP_ACP, filename_w, filename);
+}
+
+av_warn_unused_result
+static inline int utf8toansi(const char *filename_utf8, char **filename)
+{
+    wchar_t *filename_w = NULL;
+    int ret = -1;
+    if (utf8towchar(filename_utf8, &filename_w))
+        return -1;
+
+    if (!filename_w) {
+        *filename = NULL;
+        return 0;
+    }
+
+    ret = wchartoansi(filename_w, filename);
+    av_free(filename_w);
+    return ret;
+}
+
 /**
  * Checks for extended path prefixes for which normalization needs to be skipped.
  * see .NET6: PathInternal.IsExtended()