sg16

Pattern Syntax and Whitespace

From: Steve Downey <sdowney_at_[hidden]>
Date: Wed, 14 Sep 2022 19:10:55 -0400
The new Unicode 15.0 version of UAX #31 clarifies, in
https://unicode.org/reports/tr31/#Pattern_Syntax, that its definitions of
whitespace and of the characters used for syntax are intended to apply to
programming languages. C++ does not use them, of course, and I am not
suggesting we make this change now.
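
To make the gap concrete, here is a minimal sketch comparing the two whitespace sets. The function names are hypothetical. The Pattern_White_Space list is the set given in UAX #31. The C++ set is an assumption: it takes lexer whitespace to be space, horizontal tab, vertical tab, form feed, and new-line, and leaves out comments.

```cpp
// Pattern_White_Space as listed in UAX #31: a small, closed set of code points.
constexpr bool is_pattern_white_space(char32_t c) {
    return (c >= 0x0009 && c <= 0x000D)   // TAB, LF, VT, FF, CR
        || c == 0x0020                    // SPACE
        || c == 0x0085                    // NEXT LINE
        || c == 0x200E || c == 0x200F     // LEFT-TO-RIGHT MARK, RIGHT-TO-LEFT MARK
        || c == 0x2028 || c == 0x2029;    // LINE SEPARATOR, PARAGRAPH SEPARATOR
}

// Assumption: whitespace characters recognized by the C++ lexer
// (space, horizontal tab, new-line, vertical tab, form feed), comments excluded.
constexpr bool is_cpp_whitespace(char32_t c) {
    return c == 0x0020 || c == 0x0009 || c == 0x000A
        || c == 0x000B || c == 0x000C;
}

// Every character in the assumed C++ set is also Pattern_White_Space,
// but not the other way around.
static_assert(is_pattern_white_space(0x000B) && is_cpp_whitespace(0x000B));
static_assert(is_pattern_white_space(0x2028) && !is_cpp_whitespace(0x2028));
```

Under these assumptions the C++ set is a strict subset of Pattern_White_Space, which is the kind of difference a profile in the conformance annex would have to spell out.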

For purposes of the conformance annex, Annex E,
http://eel.is/c++draft/uaxid#pattern, I am thinking of extracting the
profile we do use from our lexer definitions and referring back to those
sections as being normative. Does that seem reasonable?

They've also clarified a few places where the intent has always been to
choose one of the alternatives. For example, R1a (Restricted Format
Characters) is ruled out if you're following R1, because the characters are
excluded from XID_Continue; see https://unicode.org/reports/tr31/#R1 and
https://unicode.org/reports/tr31/#R1a .

My plan is to have a draft early next week. The item is currently on our NB
comment list, with a note that we need changes to refer to.

I'll also add the updates from Unicode 14 to 15 as comments. Our comments are
due to INCITS on the 28th, so internal discussion is leaning towards getting
in anything we think should be considered; better two comments than none.

Received on 2022-09-14 23:11:08