Re: [SG16] Emojis in identifiers

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Thu, 18 Jun 2020 23:30:16 +0200

On Thu, 18 Jun 2020 at 23:19, Steve Downey via SG16 <sg16_at_[hidden]> wrote:

> I'll see if I can put together a list that makes sense of what characters
> are being removed by UAX 31 and the current Unicode database against the
> current list.
>
> For emoji, I think it's also probably not clear to people who don't handle
> text just how complicated they are. Simply allowing class Emoji would be
> utterly insufficient. The regex for checking if something _might_ be a
> valid emoji, per the Unicode standard:
>
> \p{RI} \p{RI}
> | \p{Emoji}
>   ( \p{EMod}
>   | \x{FE0F} \x{20E3}?
>   | [\x{E0020}-\x{E007E}]+ \x{E007F} )?
>   (\x{200D} \p{Emoji}
>     ( \p{EMod}
>     | \x{FE0F} \x{20E3}?
>     | [\x{E0020}-\x{E007E}]+ \x{E007F} )?
>   )*
>
>
> http://www.unicode.org/reports/tr51/#Emoji_Sequences
> I believe cutting off all of the extension mechanisms for emoji, such as
> for gender or skin tone, to be unacceptable. However, the implementation
> cost in the lexer would be quite high.
>
>
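To make the shape of the quoted regex concrete, here is a minimal C++ sketch of a hand-written matcher over UTF-32 code points. It is an illustration under stated assumptions, not a proposed lexer: the function names are hypothetical, and `is_emoji_approx` is a crude stand-in for `\p{Emoji}` (a real implementation would use a table generated from the Unicode emoji data files). Only the Regional_Indicator and Emoji_Modifier ranges are exact.

```cpp
#include <cstddef>
#include <string_view>

// Regional_Indicator: U+1F1E6..U+1F1FF (pairs of these form flags).
inline bool is_ri(char32_t c) { return c >= 0x1F1E6 && c <= 0x1F1FF; }

// Emoji_Modifier (skin tones): U+1F3FB..U+1F3FF.
inline bool is_emod(char32_t c) { return c >= 0x1F3FB && c <= 0x1F3FF; }

// ASSUMPTION: rough approximation of \p{Emoji}; it both over- and
// under-matches compared with the real property.
inline bool is_emoji_approx(char32_t c) {
    return (c >= 0x1F300 && c <= 0x1FAFF)    // main pictograph blocks
        || (c >= 0x2600 && c <= 0x27BF)      // misc symbols, dingbats
        || is_ri(c)
        || c == U'#' || c == U'*' || (c >= U'0' && c <= U'9');  // keycap bases
}

// Matches the optional suffix
//   ( EMod | FE0F 20E3? | [E0020-E007E]+ E007F )?
// starting at index i; returns the index just past it (i if absent).
inline std::size_t skip_suffix(std::u32string_view s, std::size_t i) {
    if (i >= s.size()) return i;
    if (is_emod(s[i])) return i + 1;
    if (s[i] == 0xFE0F) {                    // variation selector-16
        ++i;
        if (i < s.size() && s[i] == 0x20E3) ++i;  // enclosing keycap
        return i;
    }
    std::size_t j = i;                       // tag characters, then cancel tag
    while (j < s.size() && s[j] >= 0xE0020 && s[j] <= 0xE007E) ++j;
    if (j > i && j < s.size() && s[j] == 0xE007F) return j + 1;
    return i;
}

// Returns the length (in code points) of a possible emoji sequence at the
// start of s, or 0 if there is none.
inline std::size_t match_possible_emoji(std::u32string_view s) {
    if (s.size() >= 2 && is_ri(s[0]) && is_ri(s[1])) return 2;
    if (s.empty() || !is_emoji_approx(s[0])) return 0;
    std::size_t i = skip_suffix(s, 1);
    // ( ZWJ Emoji suffix? )*
    while (i + 1 < s.size() && s[i] == 0x200D && is_emoji_approx(s[i + 1]))
        i = skip_suffix(s, i + 2);
    return i;
}
```

Even this sketch only answers whether something might be an emoji sequence; it accepts many sequences that no font renders as a single glyph, which is part of the cost being discussed.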
Can we agree that we should support Unicode fully or not at all (exactly
because of gender and skin tones, etc.)?

Parsing emojis is probably not sufficient, and it is non-trivial. A simple
solution is to stick to the list of emojis supported for interchange, which
is finite (about 3000 or so elements, I think).
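As a rough sketch of what the finite-list approach could look like, the following C++ does a longest-prefix lookup against a fixed table of sequences. The table here is a tiny hand-picked sample and the names `sample_emoji_table` and `match_listed_emoji` are hypothetical; a real table would presumably be generated from the Unicode emoji sequence data files and stored in something faster than a linear list.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// ASSUMPTION: a four-entry sample standing in for the full interchange list.
inline const std::vector<std::u32string>& sample_emoji_table() {
    static const std::vector<std::u32string> table = {
        U"\U0001F600",                  // grinning face
        U"\U0001F44D",                  // thumbs up
        U"\U0001F44D\U0001F3FD",        // thumbs up, medium skin tone
        U"\U0001F469\u200D\U0001F4BB",  // woman technologist (ZWJ sequence)
    };
    return table;
}

// Returns the length (in code points) of the longest table entry that is a
// prefix of s, or 0 if no entry matches.
inline std::size_t match_listed_emoji(std::u32string_view s) {
    std::size_t best = 0;
    for (const std::u32string& entry : sample_emoji_table()) {
        // substr clamps to the end of s, so short inputs simply fail to match.
        if (entry.size() > best && s.substr(0, entry.size()) == entry)
            best = entry.size();
    }
    return best;
}
```

With a closed list, modifiers and ZWJ are only accepted as part of a listed sequence, so the lexer needs no separate rules for them; the trade-off is that the table has to be regenerated for each Unicode release.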


> On Thu, Jun 18, 2020 at 4:36 PM Tom Honermann via SG16 <
> sg16_at_[hidden]> wrote:
>
>> On 6/18/20 3:14 PM, Alisdair Meredith via SG16 wrote:
>>
>> It is not clear we would increase consensus,
>> as we got feedback only from those who were
>> concerned at the lack of emoji support. We
>> don't know how many others might switch
>> away from their support if emoji support were
>> added.
>>
>> I would probably switch from in favor to
>> against for this, as I find emoji unclear and
>> often misleading in communicating meaning,
>> although perhaps some smaller subset of the
>> emoji space might be clearer?
>>
>> Note that I’m not saying to NOT do the work
>> to clarify the cost/benefit of supporting emoji,
>> just that it is not clear whether it will increase,
>> reduce, or simply change consensus. More
>> information in a paper is usually helpful though.
>>
>> Agreed with all of the above.
>>
>> There were quite a few abstentions. My guess is that a number of people
>> felt undecided for other reasons. Perhaps ambivalence due to a perception
>> that extended characters are not used in practice, or perhaps difficulty
>> with appreciating the impact of the change.
>>
>> It is challenging to get an intuitive sense of what identifiers are in or
>> out by comparing the list of code points in [lex.name]p1
>> <http://eel.is/c++draft/lex.name#1> vs the list of code points with
>> XID_Start/XID_Continue properties listed in the paper
>> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1949r4.html#appendix-a---xid_start-code-points>.
>> Perhaps we can better compare and present how these lists differ? Perhaps
>> with a table illustrating included and excluded identifiers?
>>
>> I think it might help increase confidence as well if we can collect more
>> data regarding how extended characters are used in practice.
>>
>> Tom.
>>
>> AlisdairM
>>
>>
>> On Jun 18, 2020, at 19:55, Jens Maurer via SG16 <sg16_at_[hidden]> wrote:
>>
>> So, it seems we would increase consensus in EWG if we
>> added emojis to the valid identifier characters.
>>
>> That also gets us zero-width joiners (ZWJ):
>> https://www.unicode.org/reports/tr51/#gender-neutral
>>
>> but maybe we can limit the fall-out by allowing ZWJ
>> only inside of sequences of emojis, although I hate
>> to burden compilers with even more special rules around
>> the source code text (beyond NFC).
>>
>> Jens
>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
>>

Received on 2020-06-18 16:33:38