sg16: Re: [SG16-Unicode] [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing Meaning to Code Points" blog post

From: Billy O'Neal (VC LIBS) <bion_at_[hidden]>
Date: Thu, 14 Nov 2019 01:38:18 +0000

>IMO, this is the wrong way to think about stability w.r.t Unicode. The changes that happen to Unicode are bug fixes. If they change the results users get when they use a certain API, it's a fix, not a regression.

I agree that it isn't a regression. But whether it is a fix has nothing to do with whether it is a breaking change: a change is breaking if anyone relies on the behavior being changed. We have customers who are angry with us because we fixed printf to print doubles correctly, and we had to ship a mode for those customers that makes printf broken again.

>>>It is important to remember that width estimation is orthogonal to memory safety; format_to_n() is there to give you the memory safety part, and that will never be impacted by the width estimation piece.
>>I agree, but the same is true of sprintf vs. snprintf.
>That sounds right to me, but I don't get the implication. Why did you bring it up?

The implication is that if the customers of format/format_to/format_to_n are anything like the customers of sprintf/snprintf, there will be users who call the unsized interface expecting the format string to make it safe.

Billy3
________________________________
From: Zach Laine <whatwasthataddress_at_[hidden]>
Sent: Wednesday, November 13, 2019 04:19 PM
To: Billy O'Neal (VC LIBS) <bion_at_[hidden]>
Cc: Library Working Group <lib_at_[hidden]>; Kirk Shoop <kirkshoop_at_[hidden]>; lib-ext_at_[hidden] <lib-ext_at_[hidden]>; Titus Winters <titus_at_[hidden]>; Victor Zverovich <victor.zverovich_at_[hidden]>; Corentin <corentin.jabot_at_[hidden]>; Tom Honermann <tom_at_[hidden]>; SG16 <unicode_at_[hidden]>
Subject: Re: [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing Meaning to Code Points" blog post

On Wed, Nov 13, 2019 at 1:28 PM Billy O'Neal (VC LIBS) <bion_at_[hidden]> wrote:
>Will you be hesitant to update the reference to the grapheme breaking algorithm if it changes in future Unicode standards as well?

Yes. There's a reason that, for example, Java doesn't follow Unicode's rules in its regex implementation: doing so would be a breaking change.

IMO, this is the wrong way to think about stability w.r.t Unicode. The changes that happen to Unicode are bug fixes. If they change the results users get when they use a certain API, it's a fix, not a regression. Adding an 8-width (or whatever it turns out to be) entry in the table for U+FDFD in a later standard falls into that category.

>It is important to remember that width estimation is orthogonal to memory safety; format_to_n() is there to give you the memory safety part, and that will never be impacted by the width estimation piece.

I agree, but the same is true of sprintf vs. snprintf.

Billy3

That sounds right to me, but I don't get the implication. Why did you bring it up?

Zach

Received on 2019-11-14 02:38:21