sg16

Re: [isocpp-lib-ext] std::environment

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 2 Jan 2023 14:09:24 -0500
On 1/1/23 6:28 PM, Ville Voutilainen wrote:
> On Mon, 2 Jan 2023 at 01:20, Ville Voutilainen
> <ville.voutilainen_at_[hidden]> wrote:
>
>>> POLL: std::environment should be immutable
>> This poll defeats the purpose of the facility. If I'll get a portable
>> replacement for getenv(), I shouldn't need to use putenv() either any
>> more. We should expose the capabilities of the system-specific
>> facilities we wrap into a portable API, not hide them behind
>> idealistic
>> APIs that don't allow something like setting an environment variable.
>> The environment is already global mutable state, and it's not
>> the job of std::environment to pretend otherwise, if you ask me. This
>> poll is also suggesting that we leave room for a language underneath
>> C++ here, and we shouldn't do that, in principle.
> I should perhaps add that there is existing practice, here:
> https://doc.qt.io/qt-6/qtglobal.html#qputenv
>
> I'm fully aware that qgetenv() and qEnvironmentVariable() have
> different data-loss behaviors on different
> platforms, so that part is a bit quirky to take as existing practice
> to consider for standardization. :)

Thank you for those links. Reading the documentation for them reinforced
my belief that programmers need two kinds of access to environment
variable values. The first is access to the raw value (as in qgetenv()
<https://doc.qt.io/qt-6/qtglobal.html#qgetenv>), but without requiring
conversion to char or byte-based storage (e.g., raw wchar_t access on
Windows). The second is access to the value as text (as in
qEnvironmentVariable()
<https://doc.qt.io/qt-6/qtglobal.html#qEnvironmentVariable>) via
conversion to the associated encodings of char, wchar_t, char8_t,
char16_t, and char32_t, with the understanding that such conversion will
be lossy in some cases. The design used for std::filesystem::path will
suffice for both purposes with a minor tweak: we should provide separate
interfaces for access to the raw data vs. access as text so that the
latter can provide valid encoding guarantees.

For example, consider an environment variable FOO with the value
"a\xFF\xFFz" (four bytes long, containing the values 'a', 0xFF, 0xFF,
'z') on a POSIX system using UTF-8 for the execution encoding and UTF-32
for the wide execution encoding. Access of the value via the following
member functions would yield results with the indicated type and value.
Encoding conversion is from UTF-8 (the execution encoding) and follows
Unicode PR-121 <http://unicode.org/review/pr-121.html> policy 1 for
substitution of ill-formed code unit sequences; U+FFFD is the Unicode
replacement character.

  * raw() -> std::span<char> (std::span<wchar_t> on Windows) where the
    spanned range is "a\xFF\xFFz".
  * string() -> std::string containing "a\uFFFDz" (UTF-8).
  * wstring() -> std::wstring containing L"a\uFFFDz" (UTF-32).
  * u8string() -> std::u8string containing u8"a\uFFFDz" (UTF-8).
  * u16string() -> std::u16string containing u"a\uFFFDz" (UTF-16).
  * u32string() -> std::u32string containing U"a\uFFFDz" (UTF-32).
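
To make the substitution behavior concrete, here is a rough sketch of the
conversion that a u32string() accessor might perform. The function name
to_u32_policy1 is hypothetical and not part of any proposal; the sketch
assumes the raw value is char-based and nominally UTF-8, and it collapses
each whole run of ill-formed code units into a single U+FFFD, which is
how I read PR-121 policy 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (illustration only): decode nominally UTF-8 bytes into
// UTF-32, replacing each entire run of ill-formed code units with a single
// U+FFFD, following Unicode PR-121 policy 1.
std::u32string to_u32_policy1(std::string_view raw) {
    std::u32string out;
    bool in_bad_run = false;  // true while inside a run of ill-formed units
    std::size_t i = 0;
    while (i < raw.size()) {
        const unsigned char b0 = static_cast<unsigned char>(raw[i]);
        std::size_t len = 0;  // 0 means b0 cannot start a sequence
        char32_t cp = 0;
        if (b0 < 0x80)                     { len = 1; cp = b0; }
        else if (b0 >= 0xC2 && b0 <= 0xDF) { len = 2; cp = b0 & 0x1F; }
        else if (b0 >= 0xE0 && b0 <= 0xEF) { len = 3; cp = b0 & 0x0F; }
        else if (b0 >= 0xF0 && b0 <= 0xF4) { len = 4; cp = b0 & 0x07; }

        bool ok = len != 0 && i + len <= raw.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(raw[i + k]);
            ok = (b & 0xC0) == 0x80;  // must be a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, surrogates, and values above U+10FFFF.
        if (ok && len == 3 && cp < 0x800) ok = false;
        if (ok && len == 4 && cp < 0x10000) ok = false;
        if (ok && cp >= 0xD800 && cp <= 0xDFFF) ok = false;
        if (ok && cp > 0x10FFFF) ok = false;

        if (ok) {
            out.push_back(cp);
            in_bad_run = false;
            i += len;
        } else {
            if (!in_bad_run) out.push_back(U'\uFFFD');  // one per run
            in_bad_run = true;
            i += 1;  // skip one code unit and try to resynchronize
        }
    }
    return out;
}

// The FOO example from this message: both 0xFF bytes form one ill-formed
// run, so the text view contains a single replacement character.
void check_foo_example() {
    assert(to_u32_policy1("a\xFF\xFFz") == U"a\uFFFDz");
}
```

A raw() accessor, by contrast, would hand back the original four bytes
untouched; only the text accessors would go through a conversion like
this one.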

This differs from (the intent of and existing practice for)
std::filesystem::path as indicated by the notes in [fs.path.type.cvt]p2
<http://eel.is/c++draft/fs.path.cvt#fs.path.type.cvt-2> that state that
no conversion is performed when the access is for a type that matches
the implementation-defined value_type.
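
For contrast, the existing std::filesystem::path behavior can be observed
with a small sketch like the one below. The function names are mine, and
the check only applies where path::value_type is char (typical on POSIX
systems); on other platforms a conversion may take place and the result
can differ.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <type_traits>

// Illustration only: round-trip a char-based value through
// std::filesystem::path and return what string() hands back.
std::string path_round_trip(const std::string& raw) {
    const std::filesystem::path p(raw);
    return p.string();
}

// Where value_type is char, no conversion is performed, so the ill-formed
// bytes are expected to come back unchanged rather than as U+FFFD.
void check_path_example() {
    if constexpr (std::is_same_v<std::filesystem::path::value_type, char>) {
        assert(path_round_trip("a\xFF\xFFz") == "a\xFF\xFFz");
    }
}
```

This is the behavior that the separate text accessors described above
would not share, since they would guarantee well-formed output.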

Tom.

Received on 2023-01-02 19:09:29