Re: [isocpp-sg16] std::format and charN_t

From: Thiago Macieira <thiago_at_[hidden]>
Date: Tue, 02 Jul 2024 17:01:52 +0200
On Tuesday 2 July 2024 13:39:38 CEST Tiago Freire wrote:
> Ok, that seems more inline with how I thought it was working.
> But then again if you are over-allocating and shrinking on a per-parameter
> basis it's not really pre-allocation. Not sure if that is what the OP had
> in mind, if he was worried about the cost of transcoding after formatting
> (and wanting pre-allocation) the cost of that is going to be relatively low
> compared to everything else.

Ivan and I work together in Qt (though not for the same company). I'm actually
the one who asked him to post our concerns to this ML.

We are worried about the cost of transcoding and the cost of memcpying data.
In this particular ask, the question was about allocating an additional buffer
and memcpying data out of it and onto the destination string. Right now,
there's no way to avoid this extra cost while doing transcoding, so we won't
try. Therefore, when formatting a QLatin1StringView or QString onto a
std::string, we will have to:
1) allocate a QByteArray of the maximum size (which is 2x the size of the
latin1 string or 3x the code unit count of the UTF-16 one)
2) transcode onto it
3) shrink it to size
4) allocate a std::string of the correct size
5) copy onto it
6) deallocate the QByteArray
7) use std::formatter<std::string>, which memcpy's it to the destination
std::string

(Steps 1 to 3 happen inside existing Qt functions)
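
As a rough illustration of the seven steps above, here is a minimal sketch in
standard C++ only, so that it compiles without Qt. Everything named in it is
an assumption made for illustration: std::vector<char> stands in for
QByteArray, latin1ToUtf8() stands in for the existing Qt conversion function,
and appendFormatted() stands in for whatever the formatter ends up doing. It
is not Qt's or any Standard Library's actual code.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the existing Qt conversion (steps 1 to 3).
// std::vector<char> plays the role of QByteArray.
std::vector<char> latin1ToUtf8(std::string_view latin1)
{
    // Step 1: allocate the worst case, 2 UTF-8 bytes per Latin-1 character.
    std::vector<char> buf(latin1.size() * 2);
    std::size_t n = 0;
    // Step 2: transcode onto the buffer.
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            buf[n++] = static_cast<char>(c);
        } else {
            buf[n++] = static_cast<char>(0xC0 | (c >> 6));
            buf[n++] = static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    // Step 3: shrink it to the size actually used.
    buf.resize(n);
    return buf;
}

// Hypothetical formatting path; `out` is the destination std::string.
void appendFormatted(std::string &out, std::string_view latin1)
{
    std::string tmp;
    {
        std::vector<char> bytes = latin1ToUtf8(latin1); // steps 1 to 3
        tmp.assign(bytes.begin(), bytes.end());         // steps 4 and 5
    }                                                   // step 6: buffer freed
    out.append(tmp);                                    // step 7: second copy
}
```

The data is written three times here (transcode, copy into tmp, copy into
out) and two temporary buffers are involved, which is the overhead described
above.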

The question about the cost of transcoding was in relation to a possible
workaround / solution to the above. The Standard may provide a std::u16string
formatter onto a std::string, which would eliminate all of the above and
replace it with a vendor's implementation. However, how good is the
implementation of the converter? Of the three major Standard Library
implementations, only one has a vested interest in UTF-16. And because Qt has
been using UTF-16 since 2001, it has very highly optimised converters we would
like to reuse.
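
To make the alternative concrete, here is a sketch of a converter that writes
UTF-8 straight into the destination through an output iterator, with no
intermediate buffer. The function name transcodeUtf16ToUtf8() is hypothetical,
and the choice to replace unpaired surrogates with U+FFFD is an assumption.
This is a plain scalar loop, not the optimised Qt converter and not any
vendor's implementation.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>

// Hypothetical scalar UTF-16 to UTF-8 transcoder that writes directly to an
// output iterator. Unpaired surrogates become U+FFFD (an assumption).
template <typename OutputIt>
OutputIt transcodeUtf16ToUtf8(std::u16string_view in, OutputIt out)
{
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // lone low surrogate
        }

        if (cp < 0x80) {
            *out++ = static_cast<char>(cp);
        } else if (cp < 0x800) {
            *out++ = static_cast<char>(0xC0 | (cp >> 6));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            *out++ = static_cast<char>(0xE0 | (cp >> 12));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            *out++ = static_cast<char>(0xF0 | (cp >> 18));
            *out++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Called with std::back_inserter(dest), this appends to an existing std::string
without a temporary. How fast such a loop is compared to a vectorised
converter is exactly the open question raised above.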

Finally, when formatting a QLatin1StringView or a QUtf8StringView onto a
QString, I will insist that qFormat not do the double allocation and double
memcpy.
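
For the Latin-1 case the single-allocation path is easy to picture, because
every Latin-1 character maps to exactly one UTF-16 code unit and the final
size is known up front. The sketch below uses std::u16string as a stand-in
for QString, and appendLatin1() is a hypothetical name; it is not qFormat's
actual implementation.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: append Latin-1 text to a UTF-16 string in place.
// std::u16string stands in for QString here.
void appendLatin1(std::u16string &dest, std::string_view latin1)
{
    const std::size_t old = dest.size();
    // One growth of the destination; no temporary buffer is needed because
    // Latin-1 to UTF-16 is a 1:1 mapping of characters to code units.
    dest.resize(old + latin1.size());
    for (std::size_t i = 0; i < latin1.size(); ++i)
        dest[old + i] = static_cast<unsigned char>(latin1[i]);
}
```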

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel DCAI Platform & System Engineering

Received on 2024-07-02 15:01:59