
SG16


Subject: Re: LWG3576 - Clarifying fill character in std::format
From: Charlie Barto (Charles.Barto_at_[hidden])
Date: 2021-08-10 12:44:09


Wording note: “any Unicode grapheme cluster other than { or }” may include grapheme clusters such as }̅ or similar
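To make the concern concrete, here is a minimal sketch, assuming the cluster above is } followed by U+0305 COMBINING OVERLINE and that the format string is UTF-8. The names brace_with_overline and starts_with_brace are hypothetical and only illustrate the point; they are not part of any library.

```cpp
#include <string_view>

// Assumption: the grapheme cluster is U+007D RIGHT CURLY BRACKET followed by
// U+0305 COMBINING OVERLINE, encoded as UTF-8 (one byte plus two bytes).
inline constexpr std::string_view brace_with_overline = "}\xCC\x85";

// Hypothetical helper: true if the first code unit of s is '{' or '}',
// which is what a parser scanning code unit by code unit would see.
constexpr bool starts_with_brace(std::string_view s) {
    return !s.empty() && (s.front() == '{' || s.front() == '}');
}

// The cluster is a single user-perceived character that is "other than { or }",
// yet its first code unit is a plain '}'.
static_assert(brace_with_overline.size() == 3);
static_assert(starts_with_brace(brace_with_overline));
```

A parser that scans for a closing brace would therefore see such a cluster as starting with } unless the wording excludes clusters that begin with { or }.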

From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Victor Zverovich via SG16
Sent: Monday, August 9, 2021 8:36 AM
To: Corentin <corentin.jabot_at_[hidden]>
Cc: Victor Zverovich <victor.zverovich_at_[hidden]>; SG16 <sg16_at_[hidden]>
Subject: Re: [SG16] LWG3576 - Clarifying fill character in std::format

As an additional data point: the {fmt} library and Python's str.format use code points.

- Victor

On Mon, Aug 9, 2021 at 8:34 AM Victor Zverovich <victor.zverovich_at_[hidden]> wrote:
Thanks, Corentin, for bringing this up. I think this should be at least a code point (that was the original intent, which was lost to wording ambiguity); otherwise fill is pretty much useless. A grapheme cluster is an option but might be overkill.

Cheers,
Victor

On Mon, Aug 9, 2021 at 8:30 AM Corentin <corentin.jabot_at_[hidden]> wrote:
Hello,

I wanted to bring this new LWG issue to your attention.
https://cplusplus.github.io/LWG/issue3576

The author asks whether the fill character of std::format is
• a code unit
• a code point
• a grapheme cluster
This might be an ABI-breaking change, and implementations apparently already disagree.

My gut feeling is that it needs to be at least a code point.
I do not know if there are any concerns with allowing a grapheme cluster in terms of implementation or performance. There is definitely some motivation, especially for non-NFC format strings.
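As a rough illustration of why the three readings differ for non-NFC text, the following sketch counts code units and code points for the same fill, "é", in NFC and in NFD. It assumes well-formed UTF-8 input; count_code_units, count_code_points, fill_nfc and fill_nfd are hypothetical names. Grapheme clusters are not computed here, since that needs Unicode property data.

```cpp
#include <cstddef>
#include <string_view>

// Number of UTF-8 code units (bytes) in s.
constexpr std::size_t count_code_units(std::string_view s) {
    return s.size();
}

// Number of code points in s, assuming s is well-formed UTF-8:
// every byte that is not a continuation byte (10xxxxxx) starts a code point.
constexpr std::size_t count_code_points(std::string_view s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// "é" in NFC: U+00E9 (one grapheme cluster).
inline constexpr std::string_view fill_nfc = "\xC3\xA9";
// "é" in NFD: U+0065 U+0301 (still one grapheme cluster).
inline constexpr std::string_view fill_nfd = "e\xCC\x81";

static_assert(count_code_units(fill_nfc) == 2);
static_assert(count_code_points(fill_nfc) == 1);
static_assert(count_code_units(fill_nfd) == 3);
static_assert(count_code_points(fill_nfd) == 2);
```

Under a code-unit reading neither form fits in a single fill; under a code-point reading only the NFC form does; under a grapheme-cluster reading both do.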

This sort of issue illustrates my point that using the term character in the standard can be problematic!

Thanks,
Have a great week,

Corentin


SG16 list run by sg16-owner@lists.isocpp.org