sg16: Re: [SG16] Emojis in identifiers

From: Steve Downey <sdowney_at_[hidden]>
Date: Fri, 19 Jun 2020 11:07:34 -0400

While we don't exclude scripts generally, by not doing script analysis, the
lack of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) makes some
words in Indic scripts problematic. The examples in
https://unicode.org/reports/tr31/#Layout_and_Format_Control_Characters are
relevant. ZWJ and ZWNJ are used in Farsi, Malayalam, and Sinhala.
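
To make the ZWNJ issue concrete, here is a rough Python sketch (not taken
from the paper or from any compiler). The helper names
join_control_positions and strip_join_controls are made up for
illustration, and the sample is a Farsi word spelled with a ZWNJ (U+200C),
written with escapes so the invisible character is visible in the source.
It only shows that dropping the join control produces a different code
point sequence, i.e. a different spelling of the identifier.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def join_control_positions(identifier: str) -> list:
    """Return the indices of any ZWJ/ZWNJ code points in a candidate identifier."""
    return [i for i, ch in enumerate(identifier) if ch in (ZWNJ, ZWJ)]


def strip_join_controls(identifier: str) -> str:
    """Return the identifier with ZWJ/ZWNJ removed (hypothetical helper)."""
    return identifier.replace(ZWNJ, "").replace(ZWJ, "")


# Farsi word spelled with a ZWNJ after the fourth letter (assumed sample).
with_zwnj = "\u0646\u0627\u0645\u0647\u200c\u0627\u06cc"
# The same letters with the ZWNJ dropped: a different code point sequence.
without_zwnj = "\u0646\u0627\u0645\u0647\u0627\u06cc"

positions = join_control_positions(with_zwnj)
stripped = strip_join_controls(with_zwnj)
```

A lexer that rejects U+200C and U+200D outright would refuse with_zwnj,
and the author would have to fall back to the without_zwnj spelling.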

Wikipedia https://en.wikipedia.org/wiki/Zero-width_joiner#Examples mentions
Devanagari and Kannada, although it appears that recent editions of Unicode
may have added explicit characters in Devanagari to alleviate the problem.

Script recognition would also be necessary to identify the "emoji" script
in order to allow emoji sequences. We would also need to expand the
repertoire of allowed characters to include the emoji that are currently
explicitly disallowed, namely the ones that were known at the time the
allowed character ranges in C++ were put together.

On Fri, Jun 19, 2020 at 1:26 AM Jens Maurer via SG16 <sg16_at_[hidden]>
wrote:

> On 19/06/2020 00.38, Ville Voutilainen via SG16 wrote:
> > I'm confused to the hilt by this:
> >
> > "So, it seems we would increase consensus in EWG if we
> > added emojis to the valid identifier characters."
> >
> > The paper I read didn't seem to go into that direction. That quoted
> > bit (which I copy-pasted, it's not a drunken
> > transformation) seems like it's a completely new direction.
>
> Yesterday's EWG session had a poll at the end about forwarding P1949
> to CWG (tentatively ready), and there were three "against" votes.
> Asked about their reasons, the two points raised were:
>
> - Are we excluding any (possibly fringe) scripts?
> (The paper should simply say "no, we don't", despite UAX #31
> confusingly containing a table "Excluded Scripts", but that's
> just for the opt-in "implementations may want to exclude them
> from identifiers" provision.)
>
> - We should be as inclusive as possible, so we should include
> emoji. (Slides may use them; some people may want to express
> themselves by using them.)
>
> Whether adding the latter would turn some "yes" votes into
> "no" votes in EWG is unknown. Let's ask.
>
> Jens
>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>

Received on 2020-06-19 10:10:59