Re: [SG16-Unicode] Draft of paper to update the unicode standard reference

From: Steve Downey <sdowney_at_[hidden]>
Date: Wed, 16 May 2018 22:30:46 -0400
The ISO documents all seem, at this point, to be somewhat lagging derived
documents from the master doc produced by the consortium as the Unicode
Standard.
There's probably some bureaucratic requirement driving that, where
unicode.org isn't accredited.
The C++ standard is citing IETF RFCs, so we ought to be able to cite
unicode.org. If we also need to cite ISO, then we should pull those cites
from the Unicode Standard.
We should never be in a position of having to choose between following the
standard and doing the right thing, if we can help it.



On Wed, May 16, 2018, 12:47 Zach Laine <whatwasthataddress_at_[hidden]> wrote:

> Ah, thanks! The other link looked paywalled (and was anyways in French).
>
> Zach
>
>
> On Wed, May 16, 2018 at 11:13 AM, Martinho Fernandes <rmf_at_[hidden]> wrote:
>
>> Why settle for a draft when it's publicly available for free? It's here:
>> http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html.
>>
>> Note that it does not define normalization, and instead it refers to
>> UAX#15.
>>
>> On 16.05.18 17:55, Zach Laine wrote:
>>
>> Alisdair, 14651 sounds like it contains normalization and collation. Is
>> that right? Also, any idea where a draft might be found? I'd hate to pay
>> just to have a peek at something I don't actually have a use for.
>>
>> Zach
>>
>> On Wed, May 16, 2018 at 8:29 AM, Alisdair Meredith <alisdairm_at_[hidden]>
>> wrote:
>>
>>> As far as I can tell, ISO 14651 covers the algorithms, or at least a
>>> start on them:
>>> https://www.iso.org/standard/68309.html
>>>
>>>
>>> For C++20 I believe ISO 10646 is all that is needed, as the standard
>>> uses Unicode directly only for definitions of character sets. Once SG16
>>> creates a proposal for deeper support, it will probably want these
>>> additional references though.
>>>
>>> AlisdairM
>>>
>>>
>>> On May 16, 2018, at 08:32, Martinho Fernandes <rmf_at_[hidden]> wrote:
>>>
>>> ISO 30112 doesn't seem to be enough in the long run either. Correct me
>>> if I'm wrong (I don't have access to the document), but from the abstract
>>> it sounds like this just specifies description formats; no algorithms and
>>> no data, just ways to specify them.
>>>
>>> It doesn't cover the ground in https://unicode.org/reports/tr15/,
>>> https://unicode.org/reports/tr29/, https://unicode.org/reports/tr14/,
>>> https://unicode.org/reports/tr9/, https://unicode.org/reports/tr50/,
>>> ... (roughly in order of importance). I don't know if there are ISO
>>> standards specifying the same aspects and staying in sync. I don't think
>>> there are any; the Unicode FAQ doesn't mention any ISO standard other than
>>> ISO 10646 (https://www.unicode.org/faq/unicode_iso.html). If there are,
>>> let's use them; if there aren't, I think it'd be preferable to just have
>>> one single reference to the Unicode specification than to have several
>>> references to standards that may or may not get updated in lockstep and may
>>> or may not reflect the current state of the Unicode Standard.
>>>
>>>
>>> FWIW I only mentioned annexes because they're easier to link to than the
>>> core specification, even though there are some algorithms formally defined
>>> within it that are also not covered in ISO 10646 nor ISO 30112. Also note
>>> that a reference to a specific Unicode version encompasses "an edition of
>>> the core specification, *The Unicode Standard*, together with the Code
>>> Charts, Unicode Standard Annexes and the Unicode Character Database" (from
>>> https://www.unicode.org/standard/standard.html).
>>>
>>> On 16.05.18 14:10, keld_at_[hidden] wrote:
>>>
>>> If you want more than just character sets, you should refer to ISO 30112,
>>> which Unicode has tried to copy.
>>>
>>> 30112 is much more shaped to the POSIX/C/C++ model - not just UCS.
>>>
>>> Best regards
>>> keld
>>>
>>>
>>> On Fri, May 04, 2018 at 11:59:58PM +0200, R. Martinho Fernandes wrote:
>>>
>>> Can you explain why? For now the ISO reference is enough, but in the future we will need the Unicode Standard reference because ISO 10646 is only the character set.
>>>
>>> On May 4, 2018 11:57:08 PM GMT+02:00, keld_at_[hidden] wrote:
>>>
>>> I would like that we use the reference to ISO 10646 instead of the
>>> Unicode Inc. reference.
>>> I have advocated that for quite a long time now.
>>>
>>> Best regards
>>> keld
>>>
>>> On Fri, May 04, 2018 at 09:43:22PM +0000, Steve Downey wrote:
>>>
>>> I've been told that some people believe there's a policy that ISO
>>> Standards must cite other ISO Standards where those are available, which
>>> is why we're citing the ISO copies of Unicode and ECMAScript. I can't
>>> find an actual policy on this, though.
>>>
>>> I'm willing to put in the Unicode.org preferred reference, with a
>>> fallback to the ISO reference. My only fear is that too many choices will
>>> lead to paralysis.
>>>
>>> On Fri, May 4, 2018 at 4:44 PM JF Bastien <cxx_at_[hidden]> wrote:
>>>
>>>
>>> The Unicode standard has guidance on how to cite it:
>>> http://www.unicode.org/versions/index.html#Citations
>>>
>>> It would be useful to link to this guidance (and follow it).
>>>
>>> On Fri, May 4, 2018 at 1:10 PM, Steve Downey <sdowney_at_[hidden]> wrote:
>>>
>>> https://github.com/steve-downey/sg16/blob/d10250/papers/D1025R0.md
>>>
>>> There are some formatting issues I will clean up, in particular changing
>>> the links to not raw links, and moving the links down to a bibliography
>>> section.
>>>
>>> Also adding a title at the top.
>>>
>>> _______________________________________________
>>> Unicode mailing list
>>> Unicode_at_[hidden]
>>> http://www.open-std.org/mailman/listinfo/unicode
>>>
>>>
>>>
>>> --
>>> Martinho
>>>
>>>
>>>
>>
>> --
>> Martinho
>>
>>
>

Received on 2018-05-17 04:30:59