liaison: Re: [wg14/wg21 liaison] (SC22WG14.19238) grammar incompatibilities with lambdas

From: Aaron Ballman <aaron_at_[hidden]>
Date: Mon, 12 Apr 2021 08:04:46 -0400

On Mon, Apr 12, 2021 at 4:33 AM Jens Gustedt <jens.gustedt_at_[hidden]> wrote:
>
> Javier,
>
> on Sun, 11 Apr 2021 21:45:06 +0200 (CEST) you ("Javier Múgica"
> <javier_at_[hidden]>) wrote:
>
> > I certainly prefer a single token for [[. Pretending it is two
> > tokens... well, you already explained the problems.
>
> Thanks for the feedback!
>
> > On the other hand, ]] can appear in a valid program. Could it be
> > defined that a sequence of two consecutive ']' is a single token if
> > it matches a preceding [[ token?
>
> No, I don't think we should. That would just complicate the
> grammar.

+1

> The main difficulty of C++'s grammar with `< >` template
> brackets is actually that `>>` is already the right-shift token.
> There is no need to artificially introduce the same problem here.
>
> But I should have said that explicitly:
>
> - introduce a `[[` for attribute opening
> - leave the token pair `]` `]` alone, since this occurs in
> valid code
>
> It is a bit unfortunate that this is not symmetric, but that is more
> an aesthetic question than anything else.

We had discussed this particular topic at several meetings and one of
the big concerns about converting [[ into a single token was that it
would introduce complexity for lexing in compilers that support
derivative languages, like Objective-C, where [[ shows up with quite a
bit of frequency in their syntax. So [[ suffers from exactly the same
problem as ]] in practice for some implementations.
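
To make the lexing concern concrete, here is a rough sketch in C. It is
not taken from any real compiler: `count_tokens` and its
`fuse_open_brackets` flag are hypothetical names, and the lexer only knows
identifiers, numbers, and single-character punctuators. With the flag set
it applies maximal munch and forms one `[[` token; `]` `]` is always left
as two tokens, as proposed above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy lexer: returns the number of tokens in src.
 * Identifiers/numbers are one token each; every other non-space
 * character is a one-character punctuator, except that "[[" becomes
 * a single token when fuse_open_brackets is nonzero. */
size_t count_tokens(const char *src, int fuse_open_brackets)
{
    assert(src != NULL);
    size_t n = 0;
    while (*src != '\0') {
        unsigned char c = (unsigned char)*src;
        if (isspace(c)) {            /* whitespace separates tokens */
            src++;
            continue;
        }
        if (isalnum(c) || c == '_') {
            /* consume a whole identifier or number */
            while (isalnum((unsigned char)*src) || *src == '_')
                src++;
        } else if (fuse_open_brackets && src[0] == '[' && src[1] == '[') {
            src += 2;                /* maximal munch: one `[[` token */
        } else {
            src++;                   /* single-character punctuator */
        }
        n++;
    }
    return n;
}
```

In this sketch `a[b[0]]` is 7 tokens either way, because the two `[` are
not adjacent and `]]` is never fused. The attribute `[[deprecated]]` goes
from 5 tokens to 4 when `[[` is fused. The Objective-C style nested
message send `[[obj foo] bar]` goes from 7 tokens to 6, so a parser for
that language would have to treat a fused `[[` as two separate opening
brackets again.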

> > (For those of you who work / have worked on
> > tokenizing C code). It seems a parser need not even keep a count of
> > unmatched [[ tokens, just look ahead for the closing ]], since a ]]
> > sequence cannot appear within an [[ attribute ]], can it?
> >
> > > The only impact for users of C23 would be that when they want to
> > > use a lambda in an array bound (which is a new feature) they'd have
> > > to put spaces between the `[` `[`.
> >
> > I'd certainly rather cope with this than have [[ not be a single
> > token. Just as with '>' '>' for templates in C++.
>
> Yes that would be my hope, too.
>
> BTW, at the time we voted this I had a more complete solution for C
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2175.pdf
>
> This has encountered full resistance from the Unicode-illiterate and I
> don't think that it has been really considered seriously.
>
> It still would be a much cleaner solution.

I don't consider myself to be Unicode illiterate and I still resist
the idea in this narrow case. It's not part of the basic source
character set, there's not a key for it on my keyboard, it brings in
many questions about encodings, etc. If we wanted to go this route,
I'd argue we should be considering alternative tokens (or some similar
mechanism) for the whole class of punctuators. There's nothing special
about attributes in this particular case -- users could get just as
much use (or confusion) out of ≤ ≥ ≠ ∧ ∨ ⊻ and so on.

~Aaron

>
> Thanks
> Jens
>
> --
> :: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
> :: ::::::::::::::: office Strasbourg : +33 368854536 ::
> :: :::::::::::::::::::::: gsm France : +33 651400183 ::
> :: ::::::::::::::: gsm international : +49 15737185122 ::
> :: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

Received on 2021-04-12 07:05:04