Re: [SG16-Unicode] [isocpp-core] Fwd: New Core Issue: [lex.name]/3.2 under-specifies "uppercase letter"

From: Corentin <corentin.jabot_at_[hidden]>
Date: Mon, 28 Oct 2019 20:07:47 +0100
On Mon, Oct 28, 2019, 19:59 Billy O'Neal (VC LIBS) <bion_at_[hidden]>
wrote:

> It’s also different in different locales, the classic example being “i”
> which becomes “İ” in some locales. See
> https://en.wikipedia.org/wiki/Dotted_and_dotless_I
>
>
>

That should not be relevant. Transformations (in this case, uppercasing) can
be affected by locales; codepoints are not (graphemes are).

Anyway, I also think the change as proposed is reasonable.
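
As an illustration of the rule as proposed (any double underscore, or a
leading underscore followed by A-Z), here is a minimal sketch of a check that
needs no Unicode data tables. It is not part of the proposed wording: the
names `is_ascii_upper` and `is_reserved_identifier` are hypothetical, the
range comparison assumes an ASCII-compatible execution character set, and the
code assumes C++17 for constexpr `std::string_view`.

```cpp
#include <string_view>

// Hypothetical helper: true only for the 26 letters A-Z.
// Assumes an ASCII-compatible execution character set (A-Z contiguous).
constexpr bool is_ascii_upper(char c) {
    return c >= 'A' && c <= 'Z';
}

// Hypothetical check for the rule under discussion: an identifier is
// reserved if it contains "__" anywhere, or begins with '_' followed by A-Z.
constexpr bool is_reserved_identifier(std::string_view id) {
    if (id.find("__") != std::string_view::npos) {
        return true;
    }
    return id.size() >= 2 && id[0] == '_' && is_ascii_upper(id[1]);
}

static_assert(is_reserved_identifier("_Foo"));
static_assert(is_reserved_identifier("a__b"));
static_assert(!is_reserved_identifier("_foo"));
// '_' followed by UTF-8 encoded U+00C9 (uppercase E with acute): not A-Z,
// so not reserved by this rule, whatever Unicode says about its case.
static_assert(!is_reserved_identifier("_\xC3\x89t"));
```

The result depends only on the code units of the identifier, not on the
locale or the Unicode version in use.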

> I think limiting it to A-Z is reasonable (as this PR does).
>
>
>
> Billy3
>
>
>
> *From: *Corentin via Core <core_at_[hidden]>
> *Sent: *Monday, October 28, 2019 10:56 AM
> *To: *JF Bastien <cxx_at_[hidden]>
> *Cc: *Corentin <corentin.jabot_at_[hidden]>; wmm_at_[hidden]; SG16
> <unicode_at_[hidden]>; Mathias Stearn <redbeard0531+isocpp_at_[hidden]>; C++
> Core Language Working Group <core_at_[hidden]>
> *Subject: *Re: [isocpp-core] [SG16-Unicode] Fwd: New Core Issue: [lex.name]/3.2
> under-specifies "uppercase letter"
>
>
>
> I would like to point out that afaik, although a rare event, the uppercase
> property of codepoints is not guaranteed to be stable and can change in
> either way from one Unicode version to the next.
>
>
>
> On Mon, Oct 28, 2019, 18:32 JF Bastien <cxx_at_[hidden]> wrote:
>
> I’d like to have a stronger motivation than this. Do we ever intend to use
> non-ASCII as reserved names? If so, we should wait to resolve TR31 and not
> make any change, because doing what you propose closes a door. If not (i.e.
> we’ll only ever use A-Z to start reserved names), then your change is
> exactly what we’ll want.
>
>
>
> On Mon, Oct 28, 2019 at 9:39 AM Mathias Stearn <
> redbeard0531+isocpp_at_[hidden]> wrote:
>
> Is it just uppercase letters in the basic source character set, or
> anything considered an uppercase letter in the universal character set
> after phase 1 transcoding and universal-character-name resolution? Or is
> there some other definition of uppercase?
>
>
>
> I have a slight preference for restricting to just A-Z so that it doesn't
> require humans or tools to consult the Unicode data tables to decide if an
> identifier is safe to use.
>
>
>
> Proposed resolution:
>
>
>
> Replace [lex.name]/3.2 with:
>
>
>
> Each identifier that contains a double underscore __ or begins with an
> underscore followed by an uppercase <del>letter</del><ins>*nondigit*</ins>
> is reserved to the implementation for any use.
>
>
>
>
>
> Alternatively we could either create a new grammar production for
> uppercase *nondigit*s, or just say something like "one of the universal
> characters in the range 0041-005A (A-Z)"
>
>
>
>
>
> _______________________________________________
> SG16 Unicode mailing list
> Unicode_at_[hidden]
> http://www.open-std.org/mailman/listinfo/unicode
>
>
>

Received on 2019-10-28 20:08:01