On Jun 23, 2020, at 11:46 AM, Corentin Jabot via SG16 <sg16@lists.isocpp.org> wrote:

On Tue, 23 Jun 2020 at 17:24, Marcos Bento <marcosbento@gmail.com> wrote:

On Tue, Jun 23, 2020 at 4:27 PM Corentin Jabot <corentinjabot@gmail.com> wrote:

Emojis are not just code points; they follow a complex grammar that cannot simply be "allowed". Supporting them requires non-trivial work. In fact, the only sane way to support them would be to allow only the "recommended for general interchange" emojis from a specific list.
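
To illustrate the point that an emoji is often more than one code point, here is a minimal Python sketch. It is not from the thread: the helper name describe and the choice of the "family" sequence are assumptions made for illustration only.

```python
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER

# One visible "family" glyph on most platforms, but five code points:
# MAN, ZWJ, WOMAN, ZWJ, GIRL.
family = "\U0001F468" + ZWJ + "\U0001F469" + ZWJ + "\U0001F467"


def describe(text):
    """Return a (code point, Unicode name) pair for every code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


if __name__ == "__main__":
    for code_point, name in describe(family):
        print(code_point, name)
```

A lexer that wanted to accept such a sequence as part of an identifier would have to recognize the whole sequence, including the joiners, rather than test code points one at a time.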

I understand that there are technical issues that make emoji "undesirable" -- namely, as you point out, the unnecessary(?) complexity that identifying emojis would introduce in tools.
My suggestion would be to add to the paper an explanation of (or a reference to) the rationale behind the decision.

But again, it should be driven by an analysis of use cases for emojis in identifiers and the impact on future evolution (emojis are symbols). For example, Swift found itself in a situation where some emojis are considered identifiers and others custom operators.

The same goes for other scripts: individual letters are allowed, but ZWNJ is not.
UAX#31 lists these scenarios. A quick survey seems to show that there is no demand for Farsi, for example, because neither tools nor people like to deal with mixed-direction text. Is that a chicken-and-egg problem? Maybe.

Other partially supported scripts are those that use a virama (https://en.wikipedia.org/wiki/Virama), which is not semantically meaningful.

My understanding is that it is not customary for Brahmic scripts to be used in programming languages, whether because of poor IDE or input support or for cultural reasons.
I would definitely love to see a proposal for this, but it should ultimately be driven by people familiar with these scripts and who understand the demand for them.

My point is that it would be interesting to have a clear indication in the paper of what we are effectively dropping.

Maybe we can see it as a matter of educating the reader. That way, anyone who is currently writing C++ in a Brahmic script, for example, is warned.

No script is dropped.
We just don't allow ZW(N)J, which Brahmic scripts use. How much that affects the ability to use these scripts is
probably hard for someone not very familiar with them to answer.
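
As a rough illustration of what "not allowing ZW(N)J" means for a tool, here is a minimal Python sketch that only locates the two joiner characters in a candidate identifier. It is not the rule set of the paper: the function name find_joiners and the Devanagari sample strings are assumptions chosen for illustration, and a real lexer would apply the full identifier rules rather than this single check.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def find_joiners(identifier):
    """Return (index, code point) for each ZWNJ or ZWJ found in identifier."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(identifier)
        if ch in (ZWNJ, ZWJ)
    ]


# Hypothetical samples: KA + VIRAMA + SSA, without and with a ZWJ after the virama.
plain = "\u0915\u094D\u0937"
with_zwj = "\u0915\u094D\u200D\u0937"

if __name__ == "__main__":
    print(find_joiners(plain))     # no joiners present
    print(find_joiners(with_zwj))  # one ZWJ at index 2
```

The two samples differ only by the joiner, so the letters themselves would be accepted while the second spelling would be rejected under a rule that excludes ZWJ and ZWNJ.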

The goal of what was suggested wasn’t to have a discussion here. It was to encourage an update to the paper to reflect such analysis. If the paper just explained that Brahmic scripts are impacted in ways we don’t fully understand, that is OK; that is still useful information for everyone who will vote on the proposal.

And again, we need to adopt the proposal as presented so that we start from a clean slate on which we can build iteratively if and when demand arises.

