sg16: Re: [SG16] Wording strategy for Unicode std::format

From: Peter Brett <pbrett_at_[hidden]>
Date: Thu, 22 Apr 2021 15:57:33 +0000

Hi Victor,

This is helpful, thank you. I will put full ‘L’ handling for UTF-8/16/32 as the preferred option in the paper.

Just for clarity, what does {fmt} currently do? Obviously if it currently does something different then I will have to do some work to demonstrate implementability.

Best wishes,

           Peter

From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Victor Zverovich via SG16
Sent: 22 April 2021 15:27
To: SG16 <sg16_at_[hidden]>
Cc: Victor Zverovich <victor.zverovich_at_[hidden]>
Subject: Re: [SG16] Wording strategy for Unicode std::format

> Peter:
> What should the following code do?

I think (1) is the only acceptable option because all the rest are inconsistent with existing std::format overloads.

> “std::locale in its current form is pretty much useless,” may be a true statement but it doesn’t help me make progress.

Maybe we are trying to make "progress" in the wrong direction? We don't have to quickly hack something together for new std::format overloads. We didn't have a chance to look at locale in C++20 but now is a great time.

> Corentin:
> Converting between UTF-X and UTF-Y is a lossless operation.

Only for valid input. There is still the question of how to handle transcoding errors.

> what is it that we gain by not allowing format(u8"{}", u""); and format(u8"{}", U"");?

We gain consistency across all std::format overloads, a simpler specification, and no need to deal with transcoding errors. I am not suggesting that it shouldn't be possible, only that it should be explicit, e.g.

  format(u8"{}", xcode(u""))

With an explicit approach you can easily configure error handling.
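
To make the point about configurable error handling concrete, here is a rough sketch of what an explicit transcoding step might look like. Everything in it is an assumption for illustration: `xcode` is only the placeholder name used above, the `on_error` policy enum is hypothetical, and the result is returned as a `std::string` of UTF-8 bytes so that the sketch also compiles as C++17. It is not what {fmt} or any proposal specifies. The only error case for UTF-16 input is an unpaired surrogate, which is either replaced with U+FFFD or reported, depending on the policy.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical error policy; not part of any proposal or of {fmt}.
enum class on_error { replace, fail };

// Append one Unicode scalar value to `out` as UTF-8 bytes.
inline void append_utf8(std::string& out, char32_t cp) {
    // Precondition: cp is a scalar value (in range and not a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical xcode: UTF-16 code units in, UTF-8 bytes out.
// The caller chooses what happens on an unpaired surrogate.
inline std::string xcode(std::u16string_view in,
                         on_error policy = on_error::replace) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t u = in[i];
        bool high = u >= 0xD800 && u <= 0xDBFF;
        bool low = u >= 0xDC00 && u <= 0xDFFF;
        if (high && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Well-formed surrogate pair: combine into one scalar value.
            char32_t lo = in[++i];
            append_utf8(out, 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00));
        } else if (high || low) {
            // Unpaired surrogate: this is the transcoding error case.
            if (policy == on_error::fail)
                throw std::invalid_argument("unpaired surrogate in UTF-16 input");
            append_utf8(out, 0xFFFD);  // U+FFFD REPLACEMENT CHARACTER
        } else {
            append_utf8(out, u);
        }
    }
    return out;
}
```

With something along these lines, the call site decides the policy, e.g. xcode(s, on_error::fail), and the formatting function itself never has to define what a transcoding failure means.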

> we could provide only the u8 overload

Sure, provided that we have transcoding facilities. I don't think u16 and u32 overloads are particularly useful since you can't do much with the result.

> mandate that the existing locale be specialized for char8_t

If this specialization inherits all existing locale problems then I think it's not a good idea.

- Victor

On Mon, Apr 19, 2021 at 3:18 AM Corentin Jabot via SG16 <sg16_at_[hidden]> wrote:
Talking with Peter, we realized we could provide only the u8 overload and mandate that the existing locale be specialized for char8_t.
We believe this would:

  * Satisfy Victor's excellent remark about the need not to be gratuitously inconsistent
  * Put minimum strain on implementers
  * Let us move forward with having a Unicode overload in C++23.


Received on 2021-04-22 10:57:40