Date: Wed, 28 Jan 2026 07:03:27 +0100
For P3876R0, I think the overarching question that needs to be answered
(and which was contested last time) is whether we want to support these
additional character types in the first place. There are a few key points
that motivate this:
1. Historically, we've always signaled encoding using different
character types, such as using wchar_t-based facilities or u8string in
<filesystem>. If we concede that it's desirable to have a
guaranteed-UTF-8 variant of <charconv>, it seems questionable not to
design it like everything else for the last 25 years.
2. We don't want to introduce any overhead in the likely scenario that
the ordinary literal encoding is UTF-8 anyway. In that case, if the user
holds a std::u8string_view that they want to convert to e.g. a float,
the implementation could just reinterpret_cast and use the char
overloads internally. It could even "cheat" and do the equivalent
during constant evaluation, where reinterpret_cast is not permitted and
the user therefore has no such shortcut.
3. Say the user wants to implement a function with the signature float
parse_float(std::u8string_view). Given that there is no upper bound on
the length of strings (consider 0.000...0001e+10000), the general case
requires dynamic allocations to implement this function if the user first
has to transcode into a separate char[]. This is entirely unnecessary
overhead and raises questions about freestanding support, allocators,
exception handling, etc.
4. If we don't provide wrappers around from_chars, users are just going
to implement such parse_float functions themselves. If you're working
with std::u8string_view extensively in your code base, every call to
the char overloads of std::from_chars first needs a cast or a
transcoding step, which adds up to considerable effort.
Considering these points, I don't see any justification for not supporting
char8_t (and char16_t, char32_t) strings, unless you think Unicode
character types are pointless in the first place and that no one should use
them. Otherwise, it seems inevitable that one would want to parse their
contents sooner or later, and this is not trivial to do if you only have
char support.
Received on 2026-01-28 06:03:42
