So... in other words, yes, you're taking the "deal with one byte at a
time" approach. *Maybe* that *interface* is a candidate for
standardization, *iff* it is able to operate on values in-place. The
reason I say that is because it will (ahem: "conditionally", for all the
pedants that replied) byte-swap under the hood, as any other
implementation is almost certainly going to perform much more poorly.
By in-place, you mean *modifying* an integral *lvalue*? That sounds highly bug-prone; what if you lose track of whether a particular variable has been converted from format to host endian?
(FWIW, I've found the safest technique is to have (library) types whose object representation is (potentially) byte-swapped, and convert on load and store; in C++, that means conversion operator and converting constructor. But while it'd be nice to have that in the library, it's more important to get the lower level facility first.)
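To make the "convert on load and store" idea concrete, here is a minimal sketch of such a type. The name `be_uint32` and the `check_be_uint32` helper are made up for illustration, not part of any proposal. It keeps its bytes in big-endian order and only exposes host values through the converting constructor and the conversion operator.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical wrapper: the object representation is always big-endian,
// whatever the host byte order is.
class be_uint32 {
    unsigned char bytes_[4];

public:
    be_uint32() = default;

    // Converting constructor: host value -> big-endian storage.
    be_uint32(std::uint32_t v) noexcept
        : bytes_{static_cast<unsigned char>(v >> 24),
                 static_cast<unsigned char>(v >> 16),
                 static_cast<unsigned char>(v >> 8),
                 static_cast<unsigned char>(v)} {}

    // Conversion operator: big-endian storage -> host value.
    operator std::uint32_t() const noexcept {
        return (std::uint32_t{bytes_[0]} << 24) |
               (std::uint32_t{bytes_[1]} << 16) |
               (std::uint32_t{bytes_[2]} << 8) |
                std::uint32_t{bytes_[3]};
    }
};

static_assert(sizeof(be_uint32) == 4, "wrapper must not add padding");

// Small self-check: the stored bytes are big-endian and the value round-trips.
inline void check_be_uint32() {
    be_uint32 x = 0x11223344u;
    unsigned char raw[4];
    std::memcpy(raw, &x, sizeof raw);
    assert(raw[0] == 0x11 && raw[1] == 0x22 && raw[2] == 0x33 && raw[3] == 0x44);
    assert(static_cast<std::uint32_t>(x) == 0x11223344u);
}
```

Because the type system tracks which variables are in wire format, there is no way to lose track of whether a value has already been converted.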
Also, a pure function has better performance, not worse, because it leaves its result in a prvalue. Note that bswap operates on registers, so an in-place operation would be 3 instructions (load, bswap, store) while a pure function is at most 2 instructions (mov, bswap) and no memory access.
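For reference, the two interface shapes being compared look roughly like this. Both names are placeholders, and the in-place form is written in terms of the pure one, so this is only a sketch of the API difference, not a claim about what any particular compiler emits.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Pure function: takes a value, returns the swapped value, touches no memory.
constexpr std::uint32_t byteswap(std::uint32_t v) noexcept {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// In-place form: modifies an lvalue. The caller must remember which
// representation the variable currently holds.
inline void byteswap_in_place(std::uint32_t& v) noexcept {
    v = byteswap(v);
}

static_assert(byteswap(0x11223344u) == 0x44332211u, "byte order is reversed");

}  // namespace sketch
```

With the pure form, the format-endian and host-endian values can live in differently named (or differently typed) variables, which is what makes the bookkeeping less error-prone.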
You're quibbling over semantics. Your implementation does still
byte-swap, in that it takes input bytes and (effectively) copies them in
a different order. Your approach also appears to currently have the
limitation of requiring a second copy of the data, which is potentially
inefficient if the data is a mix of endian-dependent and endian-agnostic
data, though I expect that could be relaxed.
A second copy of the data in a register, which is practically free, and can be elided if you don't access the format-endian data after converting it to host-endian.
Also, if you aren't using
actual byte-swap intrinsics under the hood, you are most likely leaving
performance on the table. (Granted, your approach is more portable, but
a vendor implementation would be expected to use intrinsics.)
gcc's byteswap intrinsics are exposed as pure functions. Also, what compiler is incapable of recognizing a hand-rolled swap and optimizing it to the bswap instruction?
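As an illustration, here is a sketch of a pure 64-bit swap that uses the GCC/Clang builtin `__builtin_bswap64` where available and a hand-rolled loop otherwise. The function names are placeholders. Whether a given compiler turns the loop into a single bswap depends on the compiler and optimization level, so treat that as something to check rather than assume.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hand-rolled swap: peel bytes off the low end of v and push them
// onto the low end of r, which reverses their order.
inline std::uint64_t swap64_portable(std::uint64_t v) noexcept {
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xFFu);
        v >>= 8;
    }
    return r;
}

// Same interface, but backed by the builtin when the compiler has one.
inline std::uint64_t swap64(std::uint64_t v) noexcept {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap64(v);  // itself a pure function
#else
    return swap64_portable(v);
#endif
}

}  // namespace sketch
```

Either way the caller sees a pure function, so the choice between intrinsic and portable code stays an implementation detail.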
Note that I mention "byte swapping" because that's what the operation
actually *does* (except when it's a no-op), not because I'm suggesting
an API that *unconditionally* swaps. The original proposal was to
standardize the ntohX / htonX functions, and that is clearly a better
approach, although "best" would be more like hto{le,be} and {le,be}toh
(with no 'X' needed because this is C++ and we have overloads).
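A rough sketch of what such an overload set could look like follows. The names `htobe`, `betoh`, `htole`, and `letoh` come from the suggestion above; the `sketch` namespace and the `detail` helpers are assumptions made for this example. The helpers work one byte at a time, so they do not need to know the host byte order; a vendor implementation would presumably use intrinsics instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch {
namespace detail {

// Host value -> integer whose object representation is in the given order.
template <class T>
T to_order(T v, bool big) noexcept {
    unsigned char b[sizeof(T)];
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const std::size_t shift = 8 * (big ? sizeof(T) - 1 - i : i);
        b[i] = static_cast<unsigned char>(v >> shift);
    }
    T r;
    std::memcpy(&r, b, sizeof(T));
    return r;
}

// Integer whose object representation is in the given order -> host value.
template <class T>
T from_order(T v, bool big) noexcept {
    unsigned char b[sizeof(T)];
    std::memcpy(b, &v, sizeof(T));
    T r = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const std::size_t idx = big ? i : sizeof(T) - 1 - i;
        r = static_cast<T>((r << 8) | b[idx]);
    }
    return r;
}

}  // namespace detail

// One name per direction; the width is selected by overload resolution.
inline std::uint16_t htobe(std::uint16_t v) noexcept { return detail::to_order(v, true); }
inline std::uint32_t htobe(std::uint32_t v) noexcept { return detail::to_order(v, true); }
inline std::uint64_t htobe(std::uint64_t v) noexcept { return detail::to_order(v, true); }
inline std::uint16_t betoh(std::uint16_t v) noexcept { return detail::from_order(v, true); }
inline std::uint32_t betoh(std::uint32_t v) noexcept { return detail::from_order(v, true); }
inline std::uint64_t betoh(std::uint64_t v) noexcept { return detail::from_order(v, true); }
inline std::uint16_t htole(std::uint16_t v) noexcept { return detail::to_order(v, false); }
inline std::uint32_t htole(std::uint32_t v) noexcept { return detail::to_order(v, false); }
inline std::uint64_t htole(std::uint64_t v) noexcept { return detail::to_order(v, false); }
inline std::uint16_t letoh(std::uint16_t v) noexcept { return detail::from_order(v, false); }
inline std::uint32_t letoh(std::uint32_t v) noexcept { return detail::from_order(v, false); }
inline std::uint64_t letoh(std::uint64_t v) noexcept { return detail::from_order(v, false); }

// Small self-check: wire bytes are big-endian and the value round-trips.
inline void check_roundtrip() {
    const std::uint32_t host = 0x11223344u;
    const std::uint32_t wire = htobe(host);
    unsigned char raw[4];
    std::memcpy(raw, &wire, sizeof raw);
    assert(raw[0] == 0x11 && raw[3] == 0x44);
    assert(betoh(wire) == host);
}

}  // namespace sketch
```

One wrinkle with plain overloads: a call such as `htobe(5)` is ambiguous because `int` converts equally well to all three widths, so callers have to pass a correctly typed argument.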
ntohX / htonX are pure functions.
Overloads taking `span` might be nice, also.
How can you convert a span of bytes without knowing the layout of the fields within it? Say you have a range of 18 bytes that is {int64, uint32, 4x int8, int16}? And what if there are IEEE floats in there?
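To illustrate why the layout matters, here is a sketch that decodes exactly that field sequence (8 + 4 + 4 + 2, which adds up to 18 bytes) from a big-endian buffer. The `record` struct, `load_be`, and `parse_record` are hypothetical names, and big-endian wire order is an assumption of the example. Each field needs its own width and offset, which a bare span of bytes cannot supply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Host-side view of the record: {int64, uint32, 4x int8, int16}.
struct record {
    std::int64_t  a;
    std::uint32_t b;
    std::int8_t   c[4];
    std::int16_t  d;
};

// Reads sizeof(U) big-endian bytes starting at p, one byte at a time.
template <class U>
U load_be(const unsigned char* p) noexcept {
    U r = 0;
    for (std::size_t i = 0; i < sizeof(U); ++i) {
        r = static_cast<U>((r << 8) | p[i]);
    }
    return r;
}

// Decodes an 18-byte buffer. The offsets encode the layout knowledge.
// The unsigned-to-signed casts assume the usual two's complement wraparound.
inline record parse_record(const unsigned char* p) noexcept {
    record r;
    r.a = static_cast<std::int64_t>(load_be<std::uint64_t>(p));        // bytes 0..7
    r.b = load_be<std::uint32_t>(p + 8);                               // bytes 8..11
    for (std::size_t i = 0; i < 4; ++i) {
        r.c[i] = static_cast<std::int8_t>(p[12 + i]);                  // bytes 12..15, no swap
    }
    r.d = static_cast<std::int16_t>(load_be<std::uint16_t>(p + 16));   // bytes 16..17
    return r;
}

// Small self-check on a hand-built buffer.
inline void check_parse_record() {
    const unsigned char buf[18] = {0, 0, 0, 0, 0, 0, 0, 1,
                                   0, 0, 0, 2,
                                   3, 4, 5, 6,
                                   0xFF, 0xFE};
    const record r = parse_record(buf);
    assert(r.a == 1 && r.b == 2u && r.c[0] == 3 && r.c[3] == 6 && r.d == -2);
}

}  // namespace sketch
```

A `span` overload could only make sense for a homogeneous range, for example a span of `uint32_t`, where every element has the same width.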