Re: Follow up on SG16 review of P2996R2 (Reflection for C++26)

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 29 Apr 2024 15:39:06 -0400
On 4/29/24 1:57 PM, Jens Maurer wrote:
>
> On 28/04/2024 17.49, Tom Honermann via SG16 wrote:
>> During the meeting, we briefly discussed use of an opaque type for the return type of name_of() and friends. I would like to see further exploration of this idea prior to our next review. I'm envisioning a type something like the following (This particular formulation follows existing precedent established by the std::filesystem::path native format observers, [fs.path.native.obs] <http://eel.is/c++draft/fs.path.native.obs>).
>>
>> class name {
>>     std::string_view /internal-representation/; // exposition only.
>>     name(/* unspecified */);
>> public:
>>     constexpr std::string string() const;       // ordinary literal encoding.
>>     constexpr std::wstring wstring() const;     // wide literal encoding.
>>     constexpr std::u8string u8string() const;   // UTF-8.
>>     constexpr std::u16string u16string() const; // UTF-16.
>>     constexpr std::u32string u32string() const; // UTF-32.
>> };
> Are all encodings created equal?
No, some are more capable than others. But I think the question you are
really asking is, are all encodings equally important?
> It seems to me that UTF-8 is more important than UTF-16 and UTF-32.
UTF-16 is very important on Windows and will remain so. However, UTF-16
via wchar_t is much more important than UTF-16 via char16_t there. I
think char16_t is only really important for ICU.
> (I thought Windows has stated they're moving towards UTF-8 for their
> OS interfaces.)

The people I've talked to at Microsoft have indicated that there is no
intent at this point to ever change the default Windows encoding
selection. If there is an announcement that I've missed, I'd love to be
informed of it. It is true that Microsoft has added support for UTF-8,
but it remains a beta option buried deep in the Language and region
settings. My understanding is that there are backward compatibility
concerns that restrict their options; older versions of some Microsoft
libraries, I think including the standard library, were unable to
accommodate encodings that require more than two bytes to encode a
character and those libraries have been statically linked into many
executables that remain in use according to their internal testing.

> Going forward, maybe we want to establish guidance that we support
> only basic transcoding involving UTF-16 and UTF-32, but do not
> provide all library functionality for those (e.g. std::format,
> std::to_chars etc.).

I think that is reasonable. However, given that we have a precedent, I
think the motivation for a change should be argued in a paper and should be
stronger than, "we don't think we need it", particularly since, once
support for UTF-8/char8_t is in place, it is almost trivial to add
support for char16_t and char32_t. The existing standard libraries
already have conversion routines.
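
As a rough illustration of why the char16_t and char32_t observers are
cheap once a UTF-8 representation exists, here is a minimal sketch. The
names name_sketch, utf8_to_utf32, and utf32_to_utf16 are hypothetical and
are not taken from P2996 or any standard library. The sketch assumes the
stored name is well-formed UTF-8 held in char, does no validation, and
omits constexpr for brevity. A real implementation would presumably reuse
the library's existing conversion routines rather than hand-written loops.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: decode UTF-8 into UTF-32 code points.
// Assumes well-formed input; a truncated trailing sequence is dropped.
inline std::u32string utf8_to_utf32(std::string_view in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        // Sequence length is determined by the lead byte.
        const int len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        if (i + static_cast<std::size_t>(len) > in.size()) break;
        // Keep only the payload bits of the lead byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        // Each continuation byte contributes six payload bits.
        for (int k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        out.push_back(cp);
        i += static_cast<std::size_t>(len);
    }
    return out;
}

// Hypothetical helper: encode UTF-32 code points as UTF-16 code units.
inline std::u16string utf32_to_utf16(const std::u32string& in) {
    std::u16string out;
    for (char32_t cp : in) {
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical stand-in for the opaque name type discussed above,
// reduced to the two observers this sketch is about.
class name_sketch {
    std::string_view utf8_; // assumed internal representation: UTF-8 bytes
public:
    explicit name_sketch(std::string_view utf8) : utf8_(utf8) {}
    std::u32string u32string() const { return utf8_to_utf32(utf8_); }
    std::u16string u16string() const { return utf32_to_utf16(u32string()); }
};

// Small self-check exercising a two-byte and a four-byte sequence.
inline void name_sketch_demo() {
    assert(name_sketch("caf\xC3\xA9").u32string() == U"caf\u00E9");
    assert(name_sketch("\xF0\x9F\x98\x80").u16string() == u"\U0001F600");
}
```

The UTF-16 observer is layered on the UTF-32 one here only to keep the
sketch short; that layering is an illustrative choice, not a proposal.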

Tom.

>
> Jens

Received on 2024-04-29 19:39:10