sg16: Re: [SG16] LWG3576 - Clarifying fill character in std::format

From: Peter Brett <pbrett_at_[hidden]>
Date: Mon, 9 Aug 2021 15:37:10 +0000

Hi Corentin,

Thank you very much for bringing this up!

I think it makes logical sense to expect the ‘fill character’ to be a complete grapheme cluster, since only grapheme clusters have any defined width.

Allowing the fill character to be a code unit would be nonsensical.

How difficult would it be to say that filling should be performed with a grapheme cluster, but filling with non-grapheme-cluster single codepoints is conditionally supported? It would permit the naïve implementation (and be backwards compatible) but would allow implementations to DTRT in the future…
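
As a rough illustration of the "naïve implementation" mentioned above, here is a minimal sketch that repeats a fill sequence verbatim and assumes it occupies exactly one column. The helper name pad_left and its parameters are hypothetical and are not part of std::format; the column count of the text is passed in by the caller, because width estimation is a separate problem.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not a standard API: right-align `text` in `width`
// columns by prepending copies of `fill`.
// Assumptions: `fill` is one well-formed UTF-8 sequence that renders as a
// single column, and `text_columns` is the caller's estimate of the display
// width of `text`.
inline std::string pad_left(std::string_view text, std::size_t text_columns,
                            std::string_view fill, std::size_t width) {
    std::string out;
    if (width > text_columns) {
        const std::size_t copies = width - text_columns;
        out.reserve(copies * fill.size() + text.size());
        for (std::size_t i = 0; i < copies; ++i) {
            out.append(fill);  // the fill is copied as an opaque byte sequence
        }
    }
    out.append(text);
    return out;
}
```

Because the fill is treated as an opaque sequence, the same loop works whether the caller passes a single code point or a longer grapheme cluster; what changes is whether the one-column assumption holds.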

            Peter

From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Corentin via SG16
Sent: 09 August 2021 16:30
To: SG16 <sg16_at_[hidden]>; Victor Zverovich <victor.zverovich_at_[hidden]>
Cc: Corentin <corentin.jabot_at_[hidden]>
Subject: [SG16] LWG3576 - Clarifying fill character in std::format

Hello,

I wanted to bring this new LWG issue to your attention.
https://cplusplus.github.io/LWG/issue3576

The author asks whether the fill character of std::format is

  * a code unit
  * a code point
  * a grapheme cluster
This might be an ABI-breaking change, and implementations apparently already disagree.
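
To make the three options concrete, here is a small sketch for a candidate fill written in decomposed (non-NFC) form: "e" followed by U+0301 COMBINING ACUTE ACCENT. The names count_code_points and kFill are illustrative only, and the counting function assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <string_view>

// Counts code points in well-formed UTF-8 by counting every byte that is
// not a continuation byte (continuation bytes have the form 10xxxxxx).
constexpr std::size_t count_code_points(std::string_view utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// "e" + U+0301: displayed as a single "é", i.e. one grapheme cluster.
constexpr std::string_view kFill = "e\xCC\x81";

static_assert(kFill.size() == 3, "three UTF-8 code units");
static_assert(count_code_points(kFill) == 2, "two code points");
// Counting grapheme clusters needs the Unicode segmentation rules (UAX #29)
// and is not attempted here; for this input the answer would be one.
```

A fill parser that accepts only one code unit or only one code point would have to reject or split this input, which is where the choice between the three options becomes observable.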

My gut feeling is that it needs to at least be a codepoint.
I do not know if there are any concerns with allowing a grapheme in terms of implementation or performance. There is definitely some motivation, especially for non-NFC format strings.

This sort of issue illustrates my point that using the term character in the standard can be problematic!

Thanks,
Have a great week,

Corentin

Received on 2021-08-09 10:37:37