SG16

Subject: Re: LWG3576 - Clarifying fill character in std::format
From: Steve Downey (sdowney_at_[hidden])
Date: 2021-08-09 15:04:18


This is a specific case of the general confusion between "character" and
`char`. The fill is broken for any multi-byte encoding, not just Unicode.

However, I suspect that grapheme clusters might be a rabbit hole. Checking
whether a sequence is a single grapheme cluster is difficult, and IIRC the
rules might change?
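
To make the three candidate units concrete, here is a minimal sketch that is
not taken from the LWG issue. It assumes well-formed UTF-8 input, and the
helper names (count_code_units, count_code_points, e_acute_nfc, e_acute_nfd)
are hypothetical. It shows that a single user-perceived fill "character" can
be more than one `char`, and more than one code point when the text is not
in NFC.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Number of UTF-8 code units, i.e. the number of `char` elements.
constexpr std::size_t count_code_units(std::string_view s) {
    return s.size();
}

// Number of code points, ASSUMING well-formed UTF-8: every byte that is
// not a continuation byte (bit pattern 10xxxxxx) starts a new code point.
constexpr std::size_t count_code_points(std::string_view s) {
    std::size_t n = 0;
    for (char c : s) {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// "é" precomposed (NFC): U+00E9, encoded as two UTF-8 bytes.
inline constexpr std::string_view e_acute_nfc = "\xC3\xA9";

// "é" decomposed (NFD): U+0065 followed by U+0301 (combining acute accent).
inline constexpr std::string_view e_acute_nfd = "e\xCC\x81";

// One code point, but two code units: a `char`-sized fill cannot hold it.
static_assert(count_code_units(e_acute_nfc) == 2);
static_assert(count_code_points(e_acute_nfc) == 1);

// Two code points, three code units, yet still one grapheme cluster:
// a code-point-sized fill cannot hold it either.
static_assert(count_code_units(e_acute_nfd) == 3);
static_assert(count_code_points(e_acute_nfd) == 2);
```

Counting grapheme clusters is deliberately left out of the sketch: it needs
the Unicode text segmentation rules and their property tables, which the
standard library does not expose as a public API.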

On Mon, Aug 9, 2021 at 11:30 AM Corentin via SG16 <sg16_at_[hidden]>
wrote:

> Hello,
>
> I wanted to bring this new LWG issue to your attention.
> https://cplusplus.github.io/LWG/issue3576
>
> The author asks whether the fill character of std::format is
>
> - a code unit
> - a code point
> - a grapheme cluster
>
> This might be an ABI-breaking change, and implementations apparently
> disagree already.
>
> My gut feeling is that it needs to at least be a code point.
> I do not know if there are any concerns with allowing a grapheme cluster
> in terms of implementation or performance. There is definitely some
> motivation, especially for non-NFC format strings.
>
> This sort of issue illustrates my point that using the term character in
> the standard can be problematic!
>
> Thanks,
> Have a great week,
>
> Corentin
>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>


