sg16: Re: [SG16-Unicode] [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing Meaning to Code Points" blog post

From: Victor Zverovich <victor.zverovich_at_[hidden]>
Date: Thu, 14 Nov 2019 01:45:35 +0000

format and format_to_n will be safe, and format_to can be used unsafely,
regardless of changes to width estimation or anything else. To quote
Titus, "every change is a breaking change", and I think we should
explicitly reserve the right to change width estimation and update the
Unicode database.

- Victor

On Wed, Nov 13, 2019 at 5:38 PM Billy O'Neal (VC LIBS) <bion_at_[hidden]>
wrote:

> >IMO, this is the wrong way to think about stability w.r.t Unicode. The
> changes that happen to Unicode are bug fixes. If they change the results
> users get when they use a certain API, it's a fix, not a regression.
>
> I agree that it isn't a regression. Whether it is a fix or not has nothing
> to do with whether it is a breaking change; it's breaking if anyone relies
> on the behavior that is broken. We have customers that are angry with us
> because we fixed printf to print doubles correctly. And we had to ship a
> mode for those customers to make printf be broken again.
>
> >>>It is important to remember that width estimation is orthogonal to
> memory safety; format_to_n() is there to give you the memory safety part,
> and that will never be impacted by the width estimation piece.
> >>I agree, but the same is true of sprintf vs. snprintf.
> >That sounds right to me, but I don't get the implication. Why did you
> bring it up?
>
> The implication is that if the customers of format/format_to/format_to_n
> are anything like the customers of sprintf/snprintf, there will be users
> who call the unsized interface expecting the format string to make it
> safe.
>
> Billy3
> ------------------------------
> *From:* Zach Laine <whatwasthataddress_at_[hidden]>
> *Sent:* Wednesday, November 13, 2019 04:19 PM
> *To:* Billy O'Neal (VC LIBS) <bion_at_[hidden]>
> *Cc:* Library Working Group <lib_at_[hidden]>; Kirk Shoop <
> kirkshoop_at_[hidden]>; lib-ext_at_[hidden] <lib-ext_at_[hidden]>;
> Titus Winters <titus_at_[hidden]>; Victor Zverovich <
> victor.zverovich_at_[hidden]>; Corentin <corentin.jabot_at_[hidden]>; Tom
> Honermann <tom_at_[hidden]>; SG16 <unicode_at_[hidden]>
> *Subject:* Re: [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing
> Meaning to Code Points" blog post
>
> On Wed, Nov 13, 2019 at 1:28 PM Billy O'Neal (VC LIBS) <bion_at_[hidden]>
> wrote:
>
> >Will you be hesitant to update the reference to the grapheme breaking
> algorithm if it changes in future Unicode standards as well?
>
> Yes. There's a reason why, for example, Java doesn't follow Unicode's
> rules in its regex implementation: it would be a breaking change to do
> that.
>
>
> IMO, this is the wrong way to think about stability w.r.t Unicode. The
> changes that happen to Unicode are bug fixes. If they change the results
> users get when they use a certain API, it's a fix, not a regression.
> Adding an 8-width (or whatever it turns out to be) entry in the table for
> U+FDFD in a later standard falls into that category.
>
>
> >It is important to remember that width estimation is orthogonal to
> memory safety; format_to_n() is there to give you the memory safety part,
> and that will never be impacted by the width estimation piece.
>
> I agree, but the same is true of sprintf vs. snprintf.
>
> Billy3
>
>
> That sounds right to me, but I don't get the implication. Why did you
> bring it up?
>
> Zach
>
>

Received on 2019-11-14 02:45:48