On Fri, Nov 5, 2021 at 10:05 PM Hubert Tong <hubert.reinterpretcast@gmail.com> wrote:

The current R2 draft has this:
A multicharacter literal shall not have an encoding prefix. Each character represented by a basic-c-char or a universal-character-name in a multicharacter literal shall be encodable as a single code unit in the narrow literal encoding.

The above places no restriction on conditional-escape-sequences and numeric-escape-sequences in multicharacter literals. We presumably only want to allow those that are valid as the sole c-char in a character-literal with no encoding prefix. Indeed, that general description may be sufficient for all forms of c-char.
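To make the suggested rule concrete, here is a minimal sketch of "every c-char must be valid as the sole c-char of an unprefixed character-literal". It is illustration only, not proposed wording. It assumes a UTF-8 narrow literal encoding and an 8-bit char, and it leaves out conditional-escape-sequences because their support is implementation-defined. All names (CCharKind, CChar, valid_as_sole_c_char, code_unit_for, valid_multichar_literal) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one c-char inside a character literal.
enum class CCharKind {
    Character,     // basic-c-char, simple escape, or universal-character-name
    NumericEscape  // octal or hexadecimal escape sequence
};

struct CChar {
    CCharKind kind;
    std::uint32_t value;  // code point, or the numeric value of the escape
};

// Would this c-char be acceptable as the sole c-char of an unprefixed
// character-literal?
// Assumptions: the narrow literal encoding is UTF-8 and char has 8 bits.
inline bool valid_as_sole_c_char(CChar c) {
    switch (c.kind) {
    case CCharKind::Character:
        // Only code points up to U+007F take a single UTF-8 code unit.
        return c.value <= 0x7F;
    case CCharKind::NumericEscape:
        // The value must fit in unsigned char.
        return c.value <= 0xFF;
    }
    return false;
}

// Code unit a valid c-char contributes (precondition: it is valid).
inline unsigned char code_unit_for(CChar c) {
    assert(valid_as_sole_c_char(c));
    return static_cast<unsigned char>(c.value);
}

// The general rule: a multicharacter literal is accepted only if each of
// its c-chars would be valid on its own.
inline bool valid_multichar_literal(const std::vector<CChar>& c_chars) {
    if (c_chars.size() < 2) {
        return false;  // not a multicharacter literal at all
    }
    for (CChar c : c_chars) {
        if (!valid_as_sole_c_char(c)) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, a literal such as 'a\x80' passes the check, while one containing U+20AC does not, since that character needs three UTF-8 code units.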

It would make sense to fix the description of non-encodable character literals as well. For literals with no encoding prefix, the "cannot be encoded as a single code unit" condition differs from the C wording, which speaks of a "single-byte execution character". The C++ wording needs "in the initial shift state" added.

Also, the title of the paper does not make clear what it proposes. I think something like "Support only straightforward multicharacter literals and encodable string literals" would be better.

-- HT