On Fri, Aug 30, 2019, 12:45 AM JeanHeyd Meneide <phdofthehouse@gmail.com> wrote:
Well, the directives of the SG16 Direction Paper say to avoid excessive inventiveness...

I would be wary of any inventiveness that departs from the Unicode spec. Let's follow the recommendations of the Unicode Consortium; they know better.

Because the Unicode Character Database properties have pervasive uses in Unicode (starting with Normalization, and branching out from there), it would be incredibly hard to properly give all of the C++ algorithms a chance to override the database used. You'd have to make sure every single algorithm can receive the database; or, define a "weak symbol" from the implementation which you then plug with your information. I don't know if the Timezone Database (tzdb) in chrono/date can help shed some light on how this might be done in a Modern C++ API without breaking implementers' backs; looking there might be a good start.
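
To make the first option concrete, here is a minimal C++ sketch of what "every single algorithm can receive the database" might look like. All of the names (builtin_database, user_database, count_alphabetic) are hypothetical and do not come from any proposal, and the built-in table is a toy that only knows ASCII letters, not the real Unicode Character Database.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Toy stand-in for the implementation's property database (assumption:
// it only knows ASCII letters, unlike the real Unicode Character Database).
struct builtin_database {
    constexpr bool is_alphabetic(char32_t c) const {
        return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z');
    }
};

// Hypothetical user override: treats U+E000 (a private-use code point)
// as alphabetic and defers to the built-in database for everything else.
struct user_database {
    builtin_database fallback{};
    constexpr bool is_alphabetic(char32_t c) const {
        return c == char32_t{0xE000} || fallback.is_alphabetic(c);
    }
};

// A property-dependent algorithm that accepts the database as a parameter.
// Every such algorithm would need the same extra parameter.
template <typename Database = builtin_database>
constexpr std::size_t count_alphabetic(std::u32string_view text,
                                       const Database& db = Database{}) {
    std::size_t count = 0;
    for (char32_t c : text) {
        if (db.is_alphabetic(c)) {
            ++count;
        }
    }
    return count;
}

// The same text gives different answers depending on the database.
inline void count_example() {
    assert(count_alphabetic(U"ab1") == 2);
    assert(count_alphabetic(U"a\uE000") == 1);
    assert(count_alphabetic(U"a\uE000", user_database{}) == 2);
}
```

Even in this toy form, the extra parameter has to be threaded through every algorithm that touches a property, which is the cost described above.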

Time zones are different in that they change a lot, but then again they are not something users should be able to control, and users can't. As I write this mail it is 8 AM my time, and as much as I would like to, I don't get to change that.


I'm still unsure if it's necessary but if it's going to be customizable we'd better make sure it's done right...

I argue it isn't desirable.
Whether 'a' is a letter is not a user decision.
Whether a private-use character has some properties can be resolved with a user-provided database and algorithms.
Note that tautologically, these properties would not be Unicode properties.
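
As a rough sketch of what I mean by a user-provided database, something like the following can live entirely in application code. The names (pua_properties, pua_database) and the two example properties are made up for illustration; only the private-use ranges come from the Unicode Standard.

```cpp
#include <cassert>
#include <map>
#include <optional>

// Application-defined properties for private-use characters
// (hypothetical; by construction these are not Unicode properties).
struct pua_properties {
    bool is_letter;
    bool is_uppercase;
};

// True for the three private-use ranges defined by the Unicode Standard:
// U+E000..U+F8FF, U+F0000..U+FFFFD and U+100000..U+10FFFD.
constexpr bool is_private_use(char32_t c) {
    return (c >= 0xE000 && c <= 0xF8FF) ||
           (c >= 0xF0000 && c <= 0xFFFFD) ||
           (c >= 0x100000 && c <= 0x10FFFD);
}

// User-side table, kept separate from any standard Unicode facility.
class pua_database {
public:
    // Registers properties for a code point; rejects anything that is
    // not private use, so standard characters cannot be redefined.
    bool define(char32_t c, pua_properties props) {
        if (!is_private_use(c)) {
            return false;
        }
        table_.insert_or_assign(c, props);
        return true;
    }

    // Returns the user-defined properties, or nothing if none were registered.
    std::optional<pua_properties> lookup(char32_t c) const {
        auto it = table_.find(c);
        if (it == table_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::map<char32_t, pua_properties> table_;
};

inline void pua_example() {
    pua_database db;
    assert(db.define(0xE000, pua_properties{true, false}));
    assert(!db.define(U'a', pua_properties{false, false}));
    assert(db.lookup(0xE000)->is_letter);
    assert(!db.lookup(0xE001).has_value());
}
```

Nothing here needs to hook into the standard algorithms; the application consults its own table for its own characters.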

You can understand this message because we share a common understanding of what each group of black pixels means. The PUA has its uses, but it doesn't call for standard tools because by definition it is not standard. If you start to have properties on user-provided characters, then suddenly none of the algorithms follow the spec.



Best Wishes,
JeanHeyd