On Aug 24, 2020, at 16:23, Jens Maurer <Jens.Maurer@gmx.net> wrote:

On 24/08/2020 21.44, Alisdair Meredith via SG16 wrote:
Got another good corner case for you!

In the template form of user-defined literals, the template parameter pack
is instantiated with characters corresponding to the source text, currently
mapping non-basic characters to UCNs, so that the template parser can
assume all characters are members of the basic source character set.

See [lex.ext] 5.13.8p3/4
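
For readers less familiar with the template form, here is a minimal sketch
of how the spelling of a numeric literal arrives as a pack of char template
arguments. The suffixes _count and _bin are hypothetical names made up for
this illustration, and the sketch assumes C++14 or later; it uses only
basic source characters, so it does not exercise the UCN mapping itself.

```cpp
#include <cstddef>

// Hypothetical suffix _count: reports how many characters of source
// spelling the literal operator template was instantiated with.
template <char... Cs>
constexpr std::size_t operator""_count() {
    return sizeof...(Cs);
}

// Hypothetical suffix _bin: reinterprets the decimal digits '0' and '1'
// as binary digits. Any character other than '1' is treated as 0 here;
// a real parser would reject unexpected characters instead.
template <char... Cs>
constexpr unsigned long long operator""_bin() {
    const char spelling[] = {Cs..., '\0'};
    unsigned long long value = 0;
    for (std::size_t i = 0; i < sizeof...(Cs); ++i) {
        value = value * 2 + (spelling[i] == '1' ? 1u : 0u);
    }
    return value;
}

// 1.5e3_count instantiates operator""_count<'1', '.', '5', 'e', '3'>().
static_assert(1.5e3_count == 5, "five characters of spelling");
// 101_bin instantiates operator""_bin<'1', '0', '1'>().
static_assert(101_bin == 5, "binary 101 is 5");
```

A parser written this way sees exactly the characters of the literal's
spelling, which is why any change to what those characters can be is
visible to it.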

By no longer mapping to UCNs, we break any UDL parsers that work with
UCNs today.  I don’t know how many there are in production, possibly zero,
but it is a risk we should address, with an entry in compatibility Annex C.

UCNs may only be introduced for characters not in the basic source
character set.  Could you please point out which of the characters allowed
in a user-defined-integer-literal or user-defined-floating-point-literal
are not in the basic source character set?


I can’t find the part of the spec that restricts the contents of the token
passed to a numeric literal operator to the subset of characters that are
meaningful to the existing parsers built into the language. I find only
that the eventual result must be of an appropriate integral or
floating-point type.

While I have no examples of users doing this in the wild, I see nothing
in the current spec that forbids such things. For example, base36 literals
would meaningfully parse all 26 letters in addition to the 10 digits. Why
can this not be extended (other than for reasons of common sense) to use
extended characters that map to UCNs in phase 1?

AlisdairM