sg16: [SG16-Unicode] Draft questions for Swift and WebKit representatives

From: Tom Honermann <tom_at_[hidden]>
Date: Sun, 22 Jul 2018 22:48:48 -0400

The following is a draft list of questions for our discussion with Swift
and WebKit representatives (tentatively) scheduled for this week.
Please suggest refinements or additions. These questions are intended
more to facilitate discussion than to solicit precise answers. I'll
forward these Monday (tomorrow) evening.

1. The Swift string manifesto is about 1 1/2 years old. What have you
    learned since writing it? What would you change? What have you
    changed?
2. Swift strings are extended grapheme cluster (EGC) based. What have
    been the best and worst results of this choice?
3. Swift strings do not enforce storage in any particular Unicode
    normalization form. Was consideration given to forcing storage in a
    particular form such as FCC or NFC?
4. Swift strings support comparison via normalization. Has use of
    canonical string equality been a performance issue or a source of
    surprise to programmers?
5. Swift strings are not locale-sensitive. Was any consideration given
    to creation of a distinct locale-sensitive string type?
6. Swift strings provide a count property as required to satisfy the
    Collection protocol. How often do programmers use count (the number
    of EGCs in the string) inappropriately?
7. Swift strings support several memory-unsafe initializers and
    methods. How frequently are these used incorrectly?
8. The Swift manifesto discussed three approaches to handling
    substrings and Swift 4 changed from "same type, shared storage" to
    "different type, shared storage". Any regrets?
9. How often do you find programmers doing work at the EGC level that
    would be better performed at the code unit or code point level?
10. Likewise, how often do you find programmers working with
    unicodeScalars, utf8, or utf16 views to do work better performed at
    the EGC level? For what reasons does this occur? Perhaps to work
    around differences in EGC boundaries across Unicode versions or the
    version of ICU being used?
11. Has consideration been given to exposing Unicode Character Database
    properties? CharacterSet exposes some of these properties, but have
    more been requested?
12. How firmly is the Swift string implementation tied to ICU? If the
    C++ standard library were to add suitable Unicode support, what
    would motivate reimplementing Swift strings on top of it?
13. Do Swift programmers tend to prefer string interpolation or string
    formatting functions?
14. What enhancements would you like to see in C++ to improve Unicode
    support?
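
For reference, a minimal Swift sketch of the behavior that questions 2,
4, 6, and 10 take for granted: EGC-based counting, canonical equality,
and the lower-level views. The sample strings and variable names are
illustrative assumptions, not taken from the manifesto.

```swift
// Two spellings of "café": precomposed U+00E9 versus "e" + U+0301.
let precomposed = "caf\u{00E9}"
let decomposed = "cafe\u{0301}"

// Question 4: == compares by canonical equivalence, not by code units.
let equal = (precomposed == decomposed)

// Questions 2 and 6: count is the number of extended grapheme clusters.
let egcCount = decomposed.count

// Question 10: the views expose code points and code units instead.
let scalarCount = decomposed.unicodeScalars.count
let utf8Count = decomposed.utf8.count
let utf16Count = decomposed.utf16.count

print(equal, egcCount, scalarCount, utf8Count, utf16Count)
// prints "true 4 5 6 5"
```

The two strings compare equal and have the same count even though
their storage differs, which is the kind of mismatch between views
that questions 9 and 10 are asking about.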

Tom.

Received on 2018-07-23 04:56:47