On 10/28/19 11:47 AM, Arthur O'Dwyer via Std-Proposals wrote:
On Thu, Oct 24, 2019 at 7:51 PM Lyberta via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
Arthur O'Dwyer via Std-Proposals:
> Lyberta, did your survey turn up any C++ implementations where CHAR_BIT !=
> 8?  If so, what version of C++ were they — C++03, 11, 14, 17?

Clang has recently been ported to a 16-bit-byte architecture:


Yes, but notice the final paragraph of that article:

One missing piece is a target to act as a test for this new behavior. At Embecosm, we have been working on AAP for just this purpose. At the moment AAP has 8-bit byte addressed memory, however, the purpose of the architecture is to work as a test case for interesting features, so in order to support non 8-bit characters we are creating a version of the architecture which is 16-bit word addressed.

That is: They did all this work to support 16-bit bytes in the abstract, and then the only thing that was left to do was find an actual machine with 16-bit bytes. They had no real machine in mind for LLVM to target, so they had to invent one. Except that the one they invented currently also has 8-bit bytes. They are in the process of modifying their contrived machine, which they invented to serve as the only model of their contrived abstraction.
(For "are", read "were"; the article is from April 2017.)

I don't think that is an accurate representation of their motivation.  My understanding, from having followed an LLVM or Clang email thread on the subject (I didn't go back to check, so don't just take my word for it), is that they have local code for a backend targeting a real machine architecture.  However, they are currently unable to contribute that backend to LLVM, and even if they did, the LLVM build and test infrastructure wouldn't be able to test it for lack of such machines.  So, in order to satisfy requests that the changes supporting non-8-bit bytes be actively tested, they are considering doing the work to support a virtual machine that could fit into the existing test infrastructure.

So it's possible that 16-bit-byte machines exist, but the Embecosm article is not an example of any such machine.

They do exist.  That isn't a matter of debate.


I also wonder what such a machine would do with `char8_t`, or character processing in general.  (I mean, it could just waste half of the space in each 16-bit word; but it seems like it would be easier to just store two C++ "bytes" in each 16-bit word.)