On 03/02/2021 19.22, Corentin wrote:
> On Wed, Feb 3, 2021 at 6:41 PM Jens Maurer <Jens.Maurer@gmx.net> wrote:
> I thought we had discussed that the standard library has certain
> facilities with locale-dependent character set.
> I haven't found a mention of "execution character set" in the library
> wording, so I'm interested in learning how these locale-dependent
> character sets are described / referenced.
> There is a whole new paragraph in the library introduction (page 10).
After a few easy fixes, it seems the library is not using the
term "execution character set" at all.
That means we don't need to define it, or the term "execution encoding".
std::isalpha and friends do whatever they do in C.
That doesn't really help: these functions assume an encoding, and not mentioning it won't make the issue go away!
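To make the point about std::isalpha concrete, here is a minimal sketch. The helper names is_alpha_in_current_locale and demo_c_locale are hypothetical, and the demo assumes an ASCII-based platform: the same code unit is classified according to whatever the current C locale (and hence its encoding) says it is.

```cpp
#include <cassert>
#include <cctype>
#include <clocale>

// Hypothetical helper: classify one code unit with the C library,
// using whatever LC_CTYPE locale is currently installed.
bool is_alpha_in_current_locale(unsigned char code_unit) {
    return std::isalpha(code_unit) != 0;
}

// Hypothetical demo, assuming an ASCII-based platform.
void demo_c_locale() {
    std::setlocale(LC_ALL, "C");
    // In the "C" locale only the basic upper- and lower-case letters are alphabetic.
    assert(is_alpha_in_current_locale('a'));
    // 0xE9 is 'é' in ISO 8859-1, but it is not a letter in the "C" locale.
    // In a Latin-1 locale, if one is installed, the answer may differ.
    assert(!is_alpha_in_current_locale(0xE9));
}
```

Calling demo_c_locale() should pass on such a platform. The answer for 0xE9 is a property of the locale in effect at run time, not of the source or literal encoding.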
Also, strcpy and other similar C functions don't really deal in
characters, they deal in code units (= integer values), and they
don't care for the character set at all. If you combine compile-time
literals with runtime values (e.g. typed from a keyboard or taken
from argc/argv), you get what you get (possibly nonsense if one is
EBCDIC and the other is ASCII, for example).
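A small sketch of the code-unit point, with hypothetical names (kAsciiAB, kEbcdicAB, copy_preserves_code_units, same_code_units). The two arrays spell "AB" as explicit byte values in ASCII and in EBCDIC, so the example does not depend on the compiler's literal encoding.

```cpp
#include <cassert>
#include <cstring>

// "AB" written as explicit code units: ASCII 0x41 0x42, EBCDIC 0xC1 0xC2.
const char kAsciiAB[]  = "\x41\x42";
const char kEbcdicAB[] = "\xC1\xC2";

// Hypothetical helper: strcpy copies integer values up to and including
// the terminating zero, without interpreting them as characters.
bool copy_preserves_code_units(const char* src) {
    char dst[16] = {};
    assert(std::strlen(src) < sizeof dst);
    std::strcpy(dst, src);
    return std::memcmp(dst, src, std::strlen(src) + 1) == 0;
}

// Hypothetical helper: strcmp likewise compares code units, not characters.
bool same_code_units(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

Both arrays copy faithfully, yet same_code_units(kAsciiAB, kEbcdicAB) is false: the same abstract text in two encodings is simply two different integer sequences to these functions.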
Sure, the classification functions are utterly broken, but I don't think that is a reason to ignore the issue.
I don't think we should improve on this situation in a
core-language character set cleanup paper.
Well, I understand that LWG will have an opinion on changes to the library specification :)
But fixing the core language by removing the description of all encodings assumed at runtime seems to sweep a pretty important question under the rug!
I've updated my paper: https://wiki.edg.com/pub/Wg21telecons2021/SG16/charset.html
and I'm intending to submit it to the next mailing.