sg16: [SG16] Is it an error to encounter a character without a valid UCN?

From: Alisdair Meredith <alisdairm_at_[hidden]>
Date: Tue, 2 Jun 2020 12:34:27 +0100

Translation phase 1 maps each source character to either a member
of the basic source character set, or a UCN (universal-character-name)
corresponding to that character.
What if there is no such UCN? Is that undefined behavior, or is
the program ill-formed? I can find nothing on this in [lex.phases]
where we describe processing the source through an
implementation-defined character mapping.

When we get to [lex.charset] we can see it is clearly ill-formed if
the produced UCN is invalid - is that supposed to be the resolution
here? Source must always map to a UCN, but the UCN need not
be valid, so we get an error parsing the (implied) UCN in a later
phase?
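
To make that reading concrete, here is a minimal C++ sketch of the
two steps, not a description of what any compiler does. The names
is_basic_source_char, phase1_spelling and ucn_value_is_valid are
hypothetical, the input is assumed to have already been given a code
point value by the implementation-defined mapping, and the upper
bound of 0x10FFFF is my assumption rather than wording quoted from
[lex.charset].

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: true for the 96 basic source characters
// (5 whitespace characters, 52 letters, 10 digits, 29 punctuators).
bool is_basic_source_char(char32_t c) {
    static const std::u32string basic =
        U" \t\v\f\n"
        U"abcdefghijklmnopqrstuvwxyz"
        U"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        U"0123456789"
        U"_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";
    return basic.find(c) != std::u32string::npos;
}

// Sketch of the phase 1 result: a basic source character is kept as
// is, anything else is spelled as a UCN. No validity check happens
// here, which is the point of the question.
std::string phase1_spelling(char32_t c) {
    if (is_basic_source_char(c)) {
        return std::string(1, static_cast<char>(c));
    }
    char buf[11];  // "\U" plus 8 hex digits plus the terminator
    if (c <= 0xFFFF) {
        std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(c));
    } else {
        std::snprintf(buf, sizeof buf, "\\U%08X", static_cast<unsigned>(c));
    }
    return buf;
}

// Sketch of the later check: a UCN naming a surrogate code point is
// rejected. The out-of-range test is an assumption of this sketch.
bool ucn_value_is_valid(char32_t c) {
    const bool surrogate = c >= 0xD800 && c <= 0xDFFF;
    const bool out_of_range = c > 0x10FFFF;
    return !surrogate && !out_of_range;
}
```

Under this model a value such as 0xD800 gets a spelling from
phase1_spelling and is only rejected when ucn_value_is_valid runs
later. A source character with no code point at all never reaches
either function, and that is the case the wording seems not to cover.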

AlisdairM

Received on 2020-06-02 06:37:35