liaison: Re: [wg14/wg21 liaison] (SC22WG14.19301) adding punctuator tokens

From: Jens Gustedt <jens.gustedt_at_[hidden]>
Date: Sat, 17 Apr 2021 09:04:16 +0200

Steve,

on Fri, 16 Apr 2021 17:11:12 -0400 you (Steve Downey
<sdowney_at_[hidden]>) wrote:

> On Fri, Apr 16, 2021 at 2:47 PM Jens Gustedt <jens.gustedt_at_[hidden]>
> wrote:
>
> > Steve,
> >
> > on Fri, 16 Apr 2021 13:57:00 -0400 you (Steve Downey
> > <sdowney_at_[hidden]>) wrote:
> >
> > > Yes I have a todo to bring it at least to the liaison group.
> >
> > great!
> >
> > > I don't think the technical grammar changes would carry over, but
> > > the design ought to.
> >
> > C's technical specification here is in fact quite simple. This is
> > just done with ranges of codepoints that are permitted for
> > identifiers in a normative annex. We should just watch that we keep
> > these ranges in sync as much as that is possible.
> >
>
> C++ is planning to outsource that to the Unicode standard, because
> they're maintaining a stable list, and really that list doesn't get
> new items at any regularity.

This looks like a smart move to me. We should probably do the same for
C.

IIRC, when referring to another norm like this, we may either refer
to a specific version of it, fixing a certain state, or refer to the
norm as such and automatically take whatever we get when that norm is
updated.

Here I would go for the first option and bump the specific version of
Unicode every time we update the standard.

> The big change was not white-listing all the unassigned characters,
> since that allowed all sorts of problems, in addition to allowing
> the RTL stuff.

That also sounds good to me.

> There were some grammar changes just to fix some problems with
> pp-tokens, making sure that UCNs always were, and dealing with
> pp-numbers that don't turn out to be numbers.

That worries me a bit more. It would be good if that converged on a
common treatment in C and C++. I don't think that pp-numbers in C
ever caused serious problems. They just survive for later translation
phases and if they are not appropriate, the problem is handled there.
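
To illustrate the point about pp-numbers surviving, here is a minimal
sketch in C. The macro names STR and CAT and the function
check_pp_numbers are hypothetical, chosen only for this example. It
relies on the pp-number grammar being deliberately loose, so that
spellings such as 1.2.3 or 12abc are single preprocessing tokens even
though they would be rejected if they had to be converted to
constants.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper macros, for illustration only. */
#define STR(x) #x        /* stringify the spelling of the argument */
#define CAT(a, b) a##b   /* paste two preprocessing tokens into one */

/* Each argument is a single pp-number, although none of them is a
   valid constant. They are stringified during preprocessing and are
   never converted to tokens, so no diagnostic is required. */
static const char *const spellings[] = {
    STR(1.2.3),
    STR(12abc),
    STR(0x1e+1),
};

/* 0x alone is a pp-number but not a constant. Pasted with 1F it
   becomes 0x1F, which is a valid integer constant when pp-tokens
   are finally converted to tokens. */
static const int pasted = CAT(0x, 1F);

void check_pp_numbers(void) {
    assert(strcmp(spellings[0], "1.2.3") == 0);
    assert(strcmp(spellings[1], "12abc") == 0);
    assert(strcmp(spellings[2], "0x1e+1") == 0);
    assert(pasted == 31);
    for (size_t i = 0; i < sizeof spellings / sizeof spellings[0]; i++)
        printf("pp-number spelling: %s\n", spellings[i]);
    printf("pasted value: %d\n", pasted);
}
```

Calling check_pp_numbers() from main should pass all assertions. If
one of these spellings were used directly as an operand, for example
int x = 12abc;, the error would only be raised once the pp-number has
to become a real token.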

> The most controversial change is this cleans out the emoji that have
> crept in, and it's mostly controversial because people think they are
> all allowed, whereas there are swaths of them blocked out now,
> including some of the ones used as modifiers.

I agree that it is definitely good to have these handled
uniformly. We probably have the same problem in C, too.

All of this has me thinking that it would perhaps be better to have a
common TS (norm, whatever suits) that describes lexing and
preprocessing, and to which the C and C++ standards would just refer.

Thanks
Jens

-- 
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

Received on 2021-04-17 02:04:40