[SG16] Making wide-character literals containing multiple c-char ill-formed

From: Corentin <corentin.jabot_at_[hidden]>
Date: Mon, 1 Jun 2020 00:44:26 +0200

L'ab' currently has an implementation-defined value.
GCC, MSVC and Clang treat that value as equivalent to L'a' and emit a

However, consider

L'é' which after phase one is represented as L'e\u00B4' (LATIN SMALL LETTER

The author of the code probably intends the character to be a single c-char.
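To illustrate the difference, here is a minimal sketch that counts code points in the two spellings of the same visible character. It assumes the decomposed form is 'e' followed by U+0301 COMBINING ACUTE ACCENT and that one code point corresponds to one c-char; the helper names are hypothetical and only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: number of code points in a UTF-32 string.
// Each char32_t element holds exactly one code point.
constexpr std::size_t code_points(std::u32string_view s) { return s.size(); }

// Hypothetical helper: true if the text would amount to a single c-char
// (assumption: one code point corresponds to one c-char).
constexpr bool is_single_c_char(std::u32string_view s) {
    return code_points(s) == 1;
}

// Precomposed form: U+00E9 LATIN SMALL LETTER E WITH ACUTE.
static_assert(is_single_c_char(U"\u00E9"));
// Decomposed form (assumed): 'e' followed by U+0301 COMBINING ACUTE ACCENT.
static_assert(!is_single_c_char(U"e\u0301"));

void run_examples() {
    assert(code_points(U"\u00E9") == 1);
    assert(code_points(U"e\u0301") == 2);
}
```

Both spellings look identical in an editor, which is why the second one ending up as a two c-char literal is easy to miss.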

Therefore, I think this should be made ill-formed.

Note that this is less of an issue for multi-character literals, as no
combining character has a representation in any single-byte encoding (that
I know of).
(And multi-character literals are, to my dismay, used in production code.)

However, we should probably require that each individual c-char in a
multi-character literal has a representation in the execution encoding or is
a member of the Basic Latin block.
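As a rough sketch of the second half of that rule, the check below accepts a sequence of code points only if every one of them is in the Basic Latin block (U+0000 to U+007F). The helper names are hypothetical, and the "has a representation in the execution encoding" half is deliberately left out because it depends on the implementation's encoding.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: true if the code point lies in the
// Basic Latin block, U+0000..U+007F.
constexpr bool is_basic_latin(char32_t c) { return c <= 0x7F; }

// Hypothetical helper: true if every code point of the literal's
// contents is in the Basic Latin block.
constexpr bool all_basic_latin(std::u32string_view s) {
    for (char32_t c : s) {
        if (!is_basic_latin(c)) return false;
    }
    return true;
}

void run_basic_latin_examples() {
    assert(all_basic_latin(U"ab"));         // contents of 'ab'
    assert(!all_basic_latin(U"e\u0301"));   // 'e' + combining acute accent
    assert(!all_basic_latin(U"\u00E9"));    // precomposed é is outside the block
}
```

A real implementation would additionally accept c-chars outside this block when the execution encoding can represent them.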

What do you think?

Received on 2020-05-31 17:47:43