Re: [SG16] [isocpp-ext] P1949R4 - C++ Identifier Syntax using Unicode Standard Annex 31

From: Matthew Woehlke <mwoehlke.floss_at_[hidden]>
Date: Thu, 18 Jun 2020 11:08:14 -0400
On 18/06/2020 10.46, JF Bastien wrote:
> On Thu, Jun 18, 2020 at 7:44 AM Tom Honermann wrote:
>> On 6/18/20 10:33 AM, Matthew Woehlke via Ext wrote:
>>> Okay, maybe not, but then I suppose my point is that if we're going to fix
>>> it, I would like to *fix* it, not just make it less broken.
>> What particular form of "*fix*" do you have in mind?

I believe I already explained that. To repeat, make identifiers conform
to '[_[:alpha:]][_[:alnum:]]*'.

> I'd like to understand what is "broken" first :-)
> Escaping characters?
> Or something about tools which try to naively process C++ code? i.e. are we
> trying to make naive tools easier?

That depends on your definition of "easier". The goal isn't so much to
make it easier to write a tool correctly as to make *existing* tools¹
correct w.r.t. the standard.

Note that "tools" here includes humans. At least for me, the above
definition is muscle memory (and also very, very easy to type; usually
as '\w+', ignoring that this will catch stuff like '9to5' since such
false positives are rare).
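
For readers who want to see the two patterns side by side, here is a
minimal sketch in Python. It rests on some assumptions: Python's `re`
module has no POSIX classes, so `[^\W\d]\w*` is used as an approximate
spelling of '[_[:alpha:]][_[:alnum:]]*', and the helper names
`is_identifier` and `is_word` are made up for illustration.

```python
import re

# Approximate Python spelling of '[_[:alpha:]][_[:alnum:]]*'.
# [^\W\d] means "a word character that is not a decimal digit",
# which stands in for [_[:alpha:]] (an approximation, not an exact match).
IDENTIFIER = re.compile(r"[^\W\d]\w*")

# The quick shorthand, which does not exclude a leading digit.
WORD = re.compile(r"\w+")


def is_identifier(text: str) -> bool:
    """True if the whole string matches the stricter identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None


def is_word(text: str) -> bool:
    """True if the whole string matches the '\\w+' shorthand."""
    return WORD.fullmatch(text) is not None


if __name__ == "__main__":
    # "na\u00efve" is "naive" spelled with U+00EF (i with diaeresis).
    for candidate in ["foo_bar1", "_tmp", "na\u00efve", "9to5", "a-b"]:
        print(candidate, is_identifier(candidate), is_word(candidate))
```

The only input in that list on which the two helpers disagree is '9to5',
which is the false positive mentioned above.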

The alternative is to convince every text editor, text tool² and text
processing library in existence that '\w' is '\p{XID_Continue}' and not
'[_[:alnum:]]' as it is currently defined (by, AFAIK, *everyone*).
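
To make the gap between the two definitions concrete, here is a small
sketch, again in Python. It relies on an assumption worth stating:
`str.isidentifier()` is used only as a convenient stand-in for an
XID_Start/XID_Continue based rule (it also constrains the first
character, so it is not literally '\p{XID_Continue}+'). The helper names
are hypothetical.

```python
import re

WORD = re.compile(r"\w+")


def matches_word(text: str) -> bool:
    """True if the whole string matches '\\w+' as Python's re defines it."""
    return WORD.fullmatch(text) is not None


def matches_xid(text: str) -> bool:
    """Stand-in for an XID-based check (assumption: Python's own rule)."""
    return text.isidentifier()


# U+00B7 MIDDLE DOT is punctuation, not alphanumeric, but it carries the
# Other_ID_Continue property and is therefore part of XID_Continue.
MIDDLE_DOT = "\u00b7"
sample = "a" + MIDDLE_DOT + "b"

if __name__ == "__main__":
    print(repr(sample), matches_word(sample), matches_xid(sample))
```

A tool built on '\w' would split `sample` into two words, while an
XID-based rule treats it as a single identifier.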

I would challenge anyone to show me an existing tool³ which uses the
proposed definition of identifiers. I can name a good half dozen, just
off the top of my head, that use *my* proposed definition.

(¹ I'll assume use of a Unicode-correct definition of '[[:alnum:]]'. For
tools that get that wrong, I'm happy to label the tool "broken".)

(² *cough*grep*cough*)

(³ Given the paper, it would seem like even compilers probably don't use
the proposal, but anyway, name some non-compiler tools...)


Received on 2020-06-18 10:11:26