SG16

Subject: Re: [SG16-Unicode] [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing Meaning to Code Points" blog post
From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2019-11-13 18:19:05


On Wed, Nov 13, 2019 at 1:28 PM Billy O'Neal (VC LIBS) <bion_at_[hidden]>
wrote:

> >Will you be hesitant to update the reference to the grapheme breaking
> algorithm if it changes in future Unicode standards as well?
>
> Yes. There's a reason why, for example, Java doesn't follow Unicode's
> rules in its regex implementation, because it would be a breaking change to
> do that.
>

IMO, this is the wrong way to think about stability w.r.t. Unicode. The
changes that happen to Unicode are bug fixes. If they change the results
users get when they use a certain API, it's a fix, not a regression.
Adding an 8-width (or whatever it turns out to be) entry in the table for
U+FDFD in a later standard falls into that category.
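
To make that concrete, here is a minimal sketch of what I mean, not
anything taken from the proposal's wording. The names WidthRange,
table_old, table_new and estimated_width are made up for illustration,
the table rows are a tiny illustrative subset, and the width of 8 for
U+FDFD is only the placeholder number from above.

```cpp
#include <cstddef>
#include <cstdint>

// One row of a hypothetical width table: code points in [first, last]
// are estimated to occupy `width` display columns.
struct WidthRange {
    std::uint32_t first;
    std::uint32_t last;
    int width;
};

// Table as an earlier standard might ship it (illustrative rows only).
constexpr WidthRange table_old[] = {
    {0x1100, 0x115F, 2},  // Hangul Jamo initial consonants
    {0x4E00, 0x9FFF, 2},  // CJK Unified Ideographs block
};

// The same table after a later revision adds a row for U+FDFD.
// The value 8 is an assumption, not a settled number.
constexpr WidthRange table_new[] = {
    {0x1100, 0x115F, 2},
    {0x4E00, 0x9FFF, 2},
    {0xFDFD, 0xFDFD, 8},
};

// Linear lookup; anything not listed is estimated at width 1.
template <std::size_t N>
constexpr int estimated_width(const WidthRange (&table)[N], std::uint32_t cp) {
    for (std::size_t i = 0; i < N; ++i) {
        if (cp >= table[i].first && cp <= table[i].last) {
            return table[i].width;
        }
    }
    return 1;
}

// Only the result for U+FDFD changes between the two revisions.
static_assert(estimated_width(table_old, 0xFDFD) == 1, "old default width");
static_assert(estimated_width(table_new, 0xFDFD) == 8, "new table entry");
static_assert(estimated_width(table_old, 0x4E2D) ==
              estimated_width(table_new, 0x4E2D), "other entries unchanged");
```

A caller that got 1 for U+FDFD before and gets 8 afterwards is getting a
better estimate from the same API; nothing else in the table moved.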

> >It is important to remember that width estimation is orthogonal to
> memory safety; format_to_n() is there to give you the memory safety part,
> and that will never be impacted by the width estimation piece.
>
> I agree, but the same is true of sprintf vs. snprintf.
>
> Billy3
>

That sounds right to me, but I don't get the implication. Why did you
bring it up?

Zach



SG16 list run by herb.sutter at gmail.com