Holding values in their "wire" endianness is common practice in every codebase I've ever worked on.
You say:
"Why would I be holding a vector of big-endian uint16_t data in the first place? There is nothing I can do with this vector except transform its endianness so it becomes useful, or shoot myself in the foot by forgetting about the endianness of the data inside."
However, this overlooks two things:
- There's an additional useful thing you can do with the vector, which is to write it back out in the same endianness in which it was sourced.
- Obtaining a wire-endian vector requires no eager work (since the data was read that way), unlike eagerly byte-swapping every element.
"I should either be holding a byte vector where the date [sic] is encoded in big-endian, or a vector of uint16_t or char16_t with native endianness."
I think focusing exclusively on vector makes your argument seem stronger than it is when applied to the general case. If instead of a vector I have, for example, some structure which models a message from a wire protocol and which contains an embedded array (this is not uncommon), I think it becomes much clearer why someone would have a wire-endian range.
Moreover, I don't see a compelling argument for modeling a vector of big-endian 16-bit integers as a vector of bytes. I understand the argument that 16-bit integers with non-native endianness aren't exactly std::uint16_t, but they're not just bytes either (a vector of bytes can have a length which isn't divisible by sizeof(std::uint16_t), for example).
--Robert
From: SG16 <sg16-bounces@lists.isocpp.org> on behalf of Jan Schultke via SG16 <sg16@lists.isocpp.org>
Sent: Monday, June 22, 2026 00:47
To: sg16@lists.isocpp.org <sg16@lists.isocpp.org>
Cc: Jan Schultke <janschultke@googlemail.com>; SG9 ranges <sg9@lists.isocpp.org>; Tom Honermann <tom@honermann.net>
Subject: Re: [isocpp-sg16] Thoughts on P4030R0: Endian Views
Let's add mandates for CHAR_BIT == 8 everywhere UTF related. UTF is simply not defined in other scenarios.
So if you're on a platform without 8-bit bytes we should also disable support like std::to_chars with char8_t, std::format with char8_t format strings, etc.? That seems a bit too far. And if you're saying it's totally fine for char8_t to not be 8-bit (its underlying type is unsigned char by the way) but not for char to be non-8-bit if someone wants UTF support, I don't see any coherence to the design. It seems like we would basically need to yeet all Unicode support across the standard library out the window on such platforms.
I'm also not a fan of supporting weird char sizes, but if they are allowed in the language, we probably shouldn't make the Unicode support pay for it. I don't see much of a problem with having a few unused upper bits in a char8_t or char. Just because it has 64 bits doesn't mean you can't store UTF-8 code unit values inside.
Anyway, my greater issue with endian views is that they do the endianness conversion in the wrong place. You get a mathematically meaningless uint32_t value when performing a byteswap on, say, a Unicode code point; the only purpose is to dump the bytes of that uint32_t to memory. The byteswap should be taking place in a serialization view that produces a range of bytes and can either encode to little endian or big endian.
The little dance we force users to go through for, say, serializing std::float32_t is just not ergonomic: bit-cast the range to uint32_t, use a second view to transform the endianness, use a third view to generate the byte array, and a fourth view to join the bytes into a single range again. This sounds pretty terrible to me.
The examples in the paper are contrived, in order to obtain a nicer-looking before/after comparison table.
constexpr vector<uint32_t> utf16be_to_utf32be(
const vector<uint16_t>& utf16be_data)
Why would I be holding a vector of big-endian uint16_t data in the first place? There is nothing I can do with this vector except transform its endianness so it becomes useful, or shoot myself in the foot by forgetting about the endianness of the data inside.
I should either be holding a byte vector where the date is encoded in big-endian, or a vector of uint16_t or char16_t with native endianness.
- If I started with a byte vector, it would be obvious in the comparison table that the paper's proposed feature is doing little to help the user.
- If I started with native endian data, I wouldn't need the paper's feature at all.