ISOCPP sg16 List: Re: [isocpp-sg16] Agenda for the 2026-01-28 SG16 meeting

From: Jan Schultke <janschultke_at_[hidden]>
Date: Wed, 28 Jan 2026 07:03:27 +0100

For P3876R0, I think the overarching question that needs to be answered (and
which was contested last time) is whether we want to support these
additional character types in the first place. A few key points motivate
supporting them:

   1. Historically, we've always signaled encoding using different
   character types, such as using wchar_t-based facilities or u8string in
   <filesystem>. If we concede that it's desirable to have a
   guaranteed-UTF-8 variant of <charconv>, it seems questionable not to
   design it like everything else for the last 25 years.
   2. We don't want to introduce any overhead in the likely scenario that
   the ordinary literal encoding is UTF-8 anyway. In that case, if the user
   holds a std::u8string_view that they want to convert to e.g. a float,
   the implementation could just reinterpret_cast and use the char
   overloads internally. It could even "cheat" and somehow do this
   during constant evaluation, which the user is not able to do.
   3. Say the user wants to implement a function with the signature float
   parse_float(std::u8string_view). Given that there is no upper bound on
   the length of the input (consider 0.000...0001e+10000), the general case
   requires dynamic allocation if the user first has to transcode into a
   separate char[]. This is entirely unnecessary overhead and raises
   questions about freestanding support, allocators, exception handling,
   etc.
   4. If we don't provide wrappers around from_chars, users are just going
   to implement such parse_float functions themselves. If you're working
   with std::u8string_view extensively in your code base, having to call
   the char overloads of std::from_chars takes considerable effort.

Considering these points, I don't see any justification for not supporting
char8_t (and char16_t, char32_t) strings, unless you think Unicode
character types are pointless in the first place and that no one should use
them. Otherwise, it seems inevitable that one would want to parse their
contents sooner or later, and this is not trivial to do if you only have
char support.

Received on 2026-01-28 06:03:42