Re: [SG16] [isocpp-ext] P1949R4 - C++ Identifier Syntax using Unicode Standard Annex 31

From: Tom Honermann <tom_at_[hidden]>
Date: Thu, 18 Jun 2020 11:21:40 -0400

On 6/18/20 11:08 AM, Matthew Woehlke via Ext wrote:
> On 18/06/2020 10.46, JF Bastien wrote:
>> On Thu, Jun 18, 2020 at 7:44 AM Tom Honermann wrote:
>>> On 6/18/20 10:33 AM, Matthew Woehlke via Ext wrote:
>>>> Okay, maybe not, but then I suppose my point is that if we're going
>>>> to fix it, I would like to *fix* it, not just make it less broken.
>>>
>>> What particular form of "*fix*" do you have in mind?
>
> I believe I already explained that. To repeat, make identifiers
> conform to '[_[:alpha:]][_[:alnum:]]*'.
Ok, so your suggestion is to remove a feature that has been in the
language since C++98 and C99. I'll let you write that paper :)
>
>> I'd like to understand what is "broken" first :-)
>> Escaping characters?
>> Or something about tools which try to naively process C++ code? i.e.
>> are we trying to make naive tools easier?
>
> That depends on your definition of "easier". The goal isn't so much to
> make it easier to write a tool correctly, but to make it so that
> *existing* tools¹ are correct w.r.t. the standard.
>
> Note that "tools" here includes humans. At least for me, the above
> definition is muscle memory (and also very, very easy to type; usually
> as '\w+', ignoring that this will catch stuff like '9to5' since such
> false positives are rare).
>
> The alternative is to convince every text editor, text tool² and text
> processing library in existence that '\w' is '\p{XID_Continue}' and
> not '[_[:alnum:]]' as it is currently defined (by, AFAIK, *everyone*).
>
> I would challenge anyone to show me an existing tool³ which uses the
> proposed definition of identifiers. I can name a good half dozen, just
> off the top of my head, that use *my* proposed definition.
>
> (¹ I'll assume use of a Unicode-correct definition of '[[:alnum:]]'.
> For tools that get that wrong, I'm happy to label the tool "broken".)
>
> (² *cough*grep*cough*)
>
> (³ Given the paper, it would seem like even compilers probably don't
> use the proposal, but anyway, name some non-compiler tools...)
>
I wonder if you have misunderstood the proposal. The proposal reduces
the set of allowed identifiers relative to the status quo. The
reduction is principled and adheres to modern Unicode guidelines.

Tom.
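
For readers following the thread, the two patterns quoted above can be compared with a small sketch. It is not taken from the paper or from any tool named in the thread: the helper names are hypothetical, and it assumes std::regex over plain char in the default "C" locale, so it only exercises ASCII input and says nothing about the Unicode-aware '[[:alnum:]]' that footnote 1 assumes, nor about XID_Start/XID_Continue.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole string matches the pattern quoted
// in the thread, '[_[:alpha:]][_[:alnum:]]*'.
// Assumption: default "C" locale, so [:alpha:] and [:alnum:] are ASCII only.
inline bool is_simple_identifier(const std::string& s) {
    static const std::regex pattern("[_[:alpha:]][_[:alnum:]]*");
    return std::regex_match(s, pattern);
}

// Hypothetical helper: true if the whole string matches the '\w+' shorthand
// mentioned in the thread, which does not constrain the first character.
inline bool matches_word_shorthand(const std::string& s) {
    static const std::regex pattern("\\w+");
    return std::regex_match(s, pattern);
}
```

With these helpers, "foo_bar1" is accepted by both, while "9to5" is accepted by matches_word_shorthand and rejected by is_simple_identifier, which is the false positive described in the quoted text.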

Received on 2020-06-18 10:24:52