On Tuesday, 26 August 2025 14:48:21 Pacific Daylight Time David Brown via Std-
Proposals wrote:
> The consensus of the modern world is UTF-8 for everything, except for
> legacy APIs that are difficult to change.
And that is the big issue: all the legacy APIs. We're not talking about a
green field scenario. In the real world, UTF-16 has a place and is in use for
in-memory representation more frequently than UTF-8 or UTF-32. UTF-8 is used
for external representation (network protocols and files).
> AFAIUI, all these languages, libraries and OSes that had UCS-2 character
> encodings also now support UTF-8, and generally encourage UTF-8 as the
> main choice of character type.
Internally they still operate in UTF-16 and will need to perform conversion
to/from it to operate on UTF-8. And that includes *the* library for Unicode
support, ICU. If the Standard proposed an API for performing collation in
Unicode, chances are it would be implemented using ucol_strcoll[1].
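
To make the conversion step concrete: ucol_strcoll takes its inputs as UTF-16
(UChar) buffers, so a caller holding UTF-8 would have to transcode first. Below
is a rough sketch of that transcoding in plain C++. The name utf8_to_utf16 is
made up for illustration and is not an ICU or Standard API; the validation is
deliberately minimal, and a real implementation would more likely call ICU's
own converters.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of ICU or the Standard): UTF-8 -> UTF-16.
// Invalid lead or continuation bytes and truncated sequences become U+FFFD.
// Simplification: overlong forms, encoded surrogates and values above
// U+10FFFF are NOT rejected here.
std::u16string utf8_to_utf16(std::string_view in)
{
    std::u16string out;
    out.reserve(in.size()); // UTF-16 never needs more units than UTF-8 bytes

    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t len = 0;

        // Decode the lead byte: sequence length and initial payload bits.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else { out.push_back(u'\uFFFD'); ++i; continue; }

        // Truncated sequence at the end of the input.
        if (i + len > in.size()) { out.push_back(u'\uFFFD'); break; }

        // Fold in the continuation bytes, 6 payload bits each.
        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(u'\uFFFD'); ++i; continue; }
        i += len;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Outside the BMP: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Each UTF-8 operand would go through something like this, with an allocation
and a full pass over the data, before the UTF-16 comparison could even start.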