Date: Sat, 17 Apr 2021 20:28:35 +0200
Jens,
on Sat, 17 Apr 2021 18:57:48 +0200 you (Jens Maurer
<Jens.Maurer_at_[hidden]>) wrote:
> On 17/04/2021 11.36, Jens Gustedt wrote:
> > Jens,
> >
> > on Sat, 17 Apr 2021 10:02:15 +0200 you (Jens Maurer
> > <Jens.Maurer_at_[hidden]>) wrote:
>
> >> The paper is here:
> >>
> >> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1949r6.html
> >>
> >> The pp-number change in particular is just applying the new
> >> UAX#31-based lexing grammar non-terminals to these. Nothing to
> >> see, I believe.
> >
> > ah, ok. So I guess the intent of this change is to allow such
> > letters in application defined number and string suffixes? That
> > would be reasonable also to have for C, I think.
>
> The current lexing of pp-number is the same between C and C++;
> modulo digit separators in C++. See C2x 6.4.8 and C++20 5.9.
>
> Both languages say that 0abcdz\u4000xx is a pp-number.
> Since a universal-character-name may appear there, we should
> say that only identifier-allowed ones can appear there.
> (This decision primarily affects lexing of the next
> token, which might be an "identifier". Essentially, this says
> that you need to put whitespace in between.)
Right, I wasn't even aware that universal-character-names were allowed
in pp-numbers. They are not mentioned in 6.4.3, but the syntax
production indeed allows this.
(This could be arranged editorially, I think.)
It seems C already makes the restriction to the "identifier-allowed"
ranges, D.1, so yes, an update of all of this to UAX#31 would
certainly be good to have in C.
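To make the lexing point concrete, here is a minimal sketch of a scanner for the pp-number production as it is currently written, which accepts any universal-character-name. The names pp_number_length and ucn_length are made up for this illustration, digit separators are left out, and isalnum is assumed to run in the "C" locale. It is not taken from any compiler.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of a universal-character-name (\uXXXX or \UXXXXXXXX)
   starting at s, or 0 if there is none. Hypothetical helper. */
static size_t ucn_length(const char *s) {
    size_t digits;
    if (s[0] != '\\') return 0;
    if (s[1] == 'u') digits = 4;
    else if (s[1] == 'U') digits = 8;
    else return 0;
    for (size_t i = 0; i < digits; i++) {
        /* A terminating null byte is not a hex digit, so we stop in time. */
        if (!isxdigit((unsigned char)s[2 + i])) return 0;
    }
    return 2 + digits;
}

/* Length of the longest prefix of s that matches pp-number, or 0.
   Sketch only: no digit separators, no restriction on which
   universal-character-names are accepted. */
size_t pp_number_length(const char *s) {
    size_t n;
    if (isdigit((unsigned char)s[0])) n = 1;
    else if (s[0] == '.' && isdigit((unsigned char)s[1])) n = 2;
    else return 0;
    for (;;) {
        unsigned char c = (unsigned char)s[n];
        /* pp-number followed by e, E, p or P and a sign */
        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P') &&
            (s[n + 1] == '+' || s[n + 1] == '-')) {
            n += 2;
            continue;
        }
        /* pp-number followed by a digit, a nondigit or a period */
        if (isalnum(c) || c == '_' || c == '.') {
            n += 1;
            continue;
        }
        /* pp-number followed by a universal-character-name */
        size_t u = ucn_length(s + n);
        if (u != 0) {
            n += u;
            continue;
        }
        return n;
    }
}
```

With this sketch, the whole of 0abcdz\u4000xx is consumed as one pp-number, so a following identifier only starts after whitespace or some other character that cannot continue a pp-number.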
Thanks
Jens
--
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536 ::
:: :::::::::::::::::::::: gsm France : +33 651400183 ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::
Received on 2021-04-17 13:28:57