Change in 5.4 [lex.pptoken] paragraphs 1-2:
- The Unicode Consortium. Unicode Standard Annex, UAX #44, Unicode Character Database [online]. Edited by TODO author. Revision XX; issued for Unicode 12.0.0. 2019-02-15 [viewed 2020-02-23]. Available at http://www.unicode.org/reports/tr44/tr44-XX.html TODO XXX
preprocessing-token:
    header-name
    import-keyword
    module-keyword
    export-keyword
    pp-identifier (replaces identifier)
    pp-number
    character-literal
    user-defined-character-literal
    string-literal
    user-defined-string-literal
    preprocessing-op-or-punc
    each non-white-space character that cannot be one of the above
Each preprocessing token that is converted to a token (5.6) shall have the lexical form of a keyword, an identifier, a literal, or an operator or punctuator.
A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6. The categories of preprocessing token are: header names, placeholder tokens produced by preprocessing import and module directives (import-keyword, module-keyword, and export-keyword), preprocessing identifiers, preprocessing numbers, character literals (including user-defined character literals), string literals (including user-defined string literals), preprocessing operators and punctuators, and single non-white-space characters that do not lexically match the other preprocessing token categories. ...
Add a new section after 5.9:
5.10new Preprocessing identifiers [lex.ppident]
pp-identifier:
    identifier-nondigit
    pp-identifier digit
    pp-identifier identifier-nondigit
identifier-nondigit:
    nondigit
    universal-character-name
nondigit: one of
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _
digit: one of
    0 1 2 3 4 5 6 7 8 9
Preprocessing identifier tokens lexically include all identifiers (5.10 [lex.name]) and keywords (5.11 [lex.key]).
Remove the grammar from 5.10 [lex.name]; it was moved to 5.10new [lex.ppident].
Remove tables [tab:lex.name.allowed] and [tab:lex.name.disallowed].
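
As an informal illustration of the pp-identifier grammar in 5.10new [lex.ppident], the following sketch checks whether a character sequence has the lexical form of a pp-identifier. The function names (`is_pp_identifier`, `ucn_length`, `check_examples`) are hypothetical and not part of the proposed wording. The sketch assumes a universal-character-name is spelled `\uXXXX` or `\UXXXXXXXX`, and it deliberately checks neither the XID_Start/XID_Continue classes nor NFC normalization, since the wording places those constraints on identifier (5.10 [lex.name]) rather than on pp-identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helpers; the names are illustrative only.
constexpr bool is_nondigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_hex_digit(char c) {
    return is_digit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

// Length of a universal-character-name (\uXXXX or \UXXXXXXXX) that starts
// at position i of s, or 0 if none starts there.
constexpr std::size_t ucn_length(std::string_view s, std::size_t i) {
    if (i + 1 >= s.size() || s[i] != '\\') return 0;
    std::size_t hex_digits = 0;
    if (s[i + 1] == 'u') hex_digits = 4;
    else if (s[i + 1] == 'U') hex_digits = 8;
    else return 0;
    if (i + 2 + hex_digits > s.size()) return 0;
    for (std::size_t k = 0; k < hex_digits; ++k)
        if (!is_hex_digit(s[i + 2 + k])) return 0;
    return 2 + hex_digits;
}

// True if the whole of s matches the pp-identifier grammar:
// an identifier-nondigit followed by any mix of identifier-nondigits and digits.
constexpr bool is_pp_identifier(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        if (is_nondigit(s[i])) { ++i; continue; }
        if (std::size_t n = ucn_length(s, i)) { i += n; continue; }
        if (i > 0 && is_digit(s[i])) { ++i; continue; }  // digits may not lead
        return false;
    }
    return i > 0;  // the empty sequence is not a pp-identifier
}

// Spot checks; call from a test or from main().
inline void check_examples() {
    assert(is_pp_identifier("foo_1"));
    assert(is_pp_identifier("caf\\u00E9"));  // contains a universal-character-name
    assert(!is_pp_identifier("1abc"));       // starts with a digit
    assert(!is_pp_identifier("a-b"));        // '-' is neither nondigit nor digit
    // U+0301 is a combining mark (XID_Continue, not XID_Start): the sequence is
    // lexically a pp-identifier even though the [lex.name] rule would reject it.
    assert(is_pp_identifier("\\u0301x"));
}
```

The last check reflects the split the wording makes: the grammar alone decides what is a pp-identifier, while the character-class and normalization requirements are applied separately to identifiers.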
Change in 5.10 [lex.name] paragraph 1:
An identifier is an arbitrarily long sequence of letters and digits. Each universal-character-name in an identifier shall designate a character whose encoding in ISO/IEC 10646 falls into one of the ranges specified in Table 2. The initial element shall not be a universal-character-name designating a character whose encoding falls into one of the ranges specified in Table 3. Upper- and lower-case letters are different. All characters are significant.
identifier:
    pp-identifier
A universal-character-name at the start of an identifier shall designate a character of class XID_Start; any other universal-character-name in an identifier shall designate a character of class XID_Continue (see UAX #44 for the definition of the classes). [ Footnote: On systems in which linkers cannot accept extended characters, an encoding of the universal-character-name may be used in forming valid external identifiers. For example, some otherwise unused character or sequence of characters may be used to encode the \u in a universal-character-name. Extended characters may produce a long external identifier, but C++ does not place a translation limit on significant characters for external identifiers. In C++, upper- and lower-case letters are considered different for all identifiers, including external identifiers. ] An identifier shall conform to the NFC normalization specified in ISO/IEC 10646.
[ Note: Upper- and lower-case letters are considered different for all identifiers. -- end note ]
[ Note: In translation phase 4, identifier also includes those preprocessing-tokens (5.4 [lex.pptoken]) differentiated as keywords (5.11 [lex.key]) in the later translation phase 7 (5.6 [lex.token]). -- end note ]
Change in 5.11 [lex.key] paragraph 1:
keyword:
    any pp-identifier (replaces identifier) listed in Table [tab:lex.key]
    import-keyword
    module-keyword
    export-keyword
The pp-identifiers (replacing identifiers) shown in Table [tab:lex.key] are reserved for use as keywords (that is, they are unconditionally treated as keywords in phase 7) except in an attribute-token (9.12.1). ...
Change in 15.6 [cpp.replace] paragraph 10:
A preprocessing directive of the form
    # define identifier replacement-list new-line
defines an object-like macro that causes each subsequent instance of the macro name [ Footnote: ... ] to be replaced by the replacement list of preprocessing tokens that constitute the remainder of the directive. [ Footnote: ... ] [ Note: A pp-identifier that is not in NFC normalization form is not an identifier and thus is never the instance of a macro name. ] The replacement list is then rescanned for more macro names as specified below.
Change in 15.6 [cpp.replace] paragraph 12:
... Each subsequent instance of the function-like macro name followed by a ( as the next preprocessing token introduces the sequence of preprocessing tokens that is replaced by the replacement list in the definition (an invocation of the macro). The replaced sequence of preprocessing tokens is terminated by the matching ) preprocessing token, skipping intervening matched pairs of left and right parenthesis preprocessing tokens. Within the sequence of preprocessing tokens making up an invocation of a function-like macro, new-line is considered a normal white-space character. [ Note: A pp-identifier that is not in NFC normalization form is not an identifier and thus is never the instance of a macro name. ]
Add after 15.6.1 [cpp.subst] paragraph 2:
An identifier __VA_ARGS__ that occurs in the replacement list shall be treated as if it were a parameter, and the variable arguments shall form the preprocessing tokens used to replace it.
[ Note: A pp-identifier that is not in NFC normalization form is not an identifier and thus never names a parameter. ]
Add to the bibliography:
The Unicode Consortium. Unicode Standard Annex, UAX #31, TODO title [online]. Edited by TODO author. Revision xx; issued for Unicode 12.0.0. 2019-02-15 [viewed 2020-02-23]. Available at http://www.unicode.org/reports/tr31/tr31-35.html
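
For readers less familiar with the macro-replacement paragraphs quoted from 15.6 [cpp.replace] and 15.6.1 [cpp.subst], the following sketch exercises the three behaviors they describe: rescanning of an object-like macro's replacement list, a function-like macro invocation that spans lines and contains matched parentheses, and substitution of `__VA_ARGS__`. All macro, function, and variable names are hypothetical. The sketch uses only ASCII identifiers, so it does not exercise the proposed NFC notes.

```cpp
#include <cassert>

#define BASE 40
#define ANSWER (BASE + 2)              // object-like; BASE is replaced on rescan
#define TWICE(x) ((x) + (x))           // function-like, one parameter
#define SUM(...) sum_all(__VA_ARGS__)  // __VA_ARGS__ acts as a parameter

// Adds all arguments; an empty argument list yields 0.
template <class... Ts>
constexpr int sum_all(Ts... xs) { return (0 + ... + xs); }

// ANSWER -> (BASE + 2) -> (40 + 2)
constexpr int answer_value = ANSWER;

// The comma inside sum_all(2, 3) sits within matched parentheses, so TWICE
// receives a single argument; the new-line inside the invocation is treated
// as ordinary white space.
constexpr int twice_value = TWICE(sum_all(2,
                                          3));

// The variable arguments 1, 2, 3 replace __VA_ARGS__.
constexpr int sum_value = SUM(1, 2, 3);
```

Here `answer_value` is 42, `twice_value` is 10, and `sum_value` is 6.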