Alisdair, 14651 sounds like it contains normalization and collation.  Is that right?  Also, any idea where a draft might be found?  I'd hate to pay just to have a peek at something I don't actually have a use for.


On Wed, May 16, 2018 at 8:29 AM, Alisdair Meredith wrote:
As far as I can tell, ISO 14651 covers the algorithms, or at least a start on them:

For C++20 I believe ISO 10646 is all that is needed, as the standard uses Unicode directly
only for definitions of character sets.  Once SG-16 creates a proposal for deeper support, it will
probably want these additional references though.


On May 16, 2018, at 08:32, Martinho Fernandes wrote:

ISO 30112 doesn't seem to be enough in the long run either. Correct me if I'm wrong (I don't have access to the document), but from the abstract it sounds like this just specifies description formats; no algorithms and no data, just ways to specify them.

It doesn't cover the ground in,,,,, ... (roughly in order of importance). I don't know if there are ISO standards specifying the same aspects and staying in sync. I don't think there are any; the Unicode FAQ doesn't mention any ISO standard other than ISO 10646 ( If there are, let's use them; if there aren't, I think it'd be preferable to have one single reference to the Unicode specification rather than several references to standards that may or may not get updated in lockstep and may or may not reflect the current state of the Unicode Standard.

FWIW I only mentioned annexes because they're easier to link to than the core specification, even though there are some algorithms formally defined within it that are also not covered in ISO 10646 nor ISO 30112. Also note that a reference to a specific Unicode version encompasses "an edition of the core specification, The Unicode Standard, together with the Code Charts, Unicode Standard Annexes and the Unicode Character Database" (from

On 16.05.18 14:10, wrote:
If you want more than just character sets, you should refer to ISO 30112,
which Unicode has tried to copy.

30112 is much more shaped to the POSIX/C/C++ model - not just UCS.

Best regards

On Fri, May 04, 2018 at 11:59:58PM +0200, R. Martinho Fernandes wrote:
Can you explain why? For now the ISO reference is enough, but in the future we will need the Unicode Standard reference because ISO 10646 is only the character set.

On May 4, 2018 11:57:08 PM GMT+02:00, wrote:
I would like us to use the reference to ISO 10646 instead of the
Unicode Inc. reference.
I have advocated that for quite a long time now.

Best regards

On Fri, May 04, 2018 at 09:43:22PM +0000, Steve Downey wrote:
I've been told that some people believe there's a policy that ISO
must cite other ISO Standards where those are available, which is why
the ISO copies of Unicode and ECMAScript are cited. I can't find a
policy on this, though.
I'm willing to put in the preferred reference, with a
to the ISO reference. My only fear is that too many choices will lead

On Fri, May 4, 2018 at 4:44 PM JF Bastien wrote:

The Unicode standard has guidance on how to cite it:

It would be useful to link to this guidance (and follow it).

On Fri, May 4, 2018 at 1:10 PM, Steve Downey wrote:


There are some formatting issues I will clean up, in particular
changing the links so they are not raw links, and moving the links down to a

Also adding a title at the top.

Unicode mailing list
