Re: [SG16] Is the concept of basic execution character sets useful?

From: Corentin <corentin.jabot_at_[hidden]>
Date: Thu, 4 Feb 2021 00:16:11 +0100
On Thu, Feb 4, 2021 at 12:03 AM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:

> On 03/02/2021 19.22, Corentin wrote:
> >
> >
> > On Wed, Feb 3, 2021 at 6:41 PM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
> > I thought we had discussed that the standard library has certain
> > facilities with locale-dependent character set.
> > I haven't found a mention of "execution character set" in the library
> > wording, so I'm interested in learning how these locale-dependent
> > character sets are described / referenced.
> >
> >
> > There is a whole new paragraph in the library introduction (page 10).
> After a few easy fixes, it seems the library is not using the
> term "execution character set" at all.
> That means we don't need to define it, or the term "execution encoding".
> std::isalpha and friends do whatever they do in C.

That doesn't really help: these functions assume an encoding, and
not mentioning it won't make the issue go away!
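
To illustrate the point, here is a minimal sketch of how the answer from
std::isalpha depends on an encoding that is never named in the call. The
helper is_alpha_in_locale is hypothetical (it is not a standard function),
and the only locale assumed to exist is "C"; any other locale name is
system-dependent and may not be installed.

```cpp
#include <cctype>
#include <clocale>
#include <string>

// Hypothetical helper: classify one code unit under a named C locale.
// Returns 1 if alphabetic, 0 if not, -1 if the locale is unavailable.
int is_alpha_in_locale(unsigned char code_unit, const char* locale_name) {
    // Save the current LC_CTYPE locale so it can be restored afterwards.
    const std::string saved = std::setlocale(LC_CTYPE, nullptr);

    if (std::setlocale(LC_CTYPE, locale_name) == nullptr) {
        return -1;  // locale not installed; nothing was changed
    }

    // The result depends on the encoding implied by the active locale.
    const int result = std::isalpha(code_unit) ? 1 : 0;

    std::setlocale(LC_CTYPE, saved.c_str());
    return result;
}
```

In the "C" locale the code unit 0xE9 is not alphabetic. Under a Latin-1
locale, if one is installed, the same value may be classified as a letter
(it is U+00E9 in ISO 8859-1), while under a UTF-8 locale it is not a
complete character at all.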

> Also, strcpy and other similar C functions don't really deal in
> characters, they deal in code units (= integer values), and they
> don't care for the character set at all. If you combine compile-time
> literals with runtime values (e.g. typed from a keyboard or taken
> from argc/argv), you get what you get (possibly nonsense if one is
> EBCDIC and the other is ASCII, for example).

Sure, the classification functions are utterly broken, but I don't think
that is a reason to ignore the issue.
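
As a sketch of the mismatch described above, assume the literals are
ASCII-encoded at compile time and the runtime input arrives in EBCDIC
(code page 037, where 'A' is 0xC1 and 'B' is 0xC2). The names
same_code_units and ebcdic_AB are hypothetical, made up for this example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: compares code units only, exactly as strcmp does.
// It has no notion of which character set either argument uses.
bool same_code_units(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    return std::strcmp(a, b) == 0;
}

// The text "AB" as it would arrive at runtime in EBCDIC (code page 037).
const char ebcdic_AB[] = { '\xC1', '\xC2', '\0' };

// The same text as ASCII code units, spelled out numerically so the
// example does not depend on the compiler's literal encoding.
const char ascii_AB[] = { '\x41', '\x42', '\0' };
```

Both arrays denote the same two characters, yet
same_code_units(ascii_AB, ebcdic_AB) is false, because only the integer
values are compared. The library cannot detect this; it silently assumes
both sides use the same encoding.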

> I don't think we should improve on this situation in a
> core-language character set cleanup paper.

Well, I understand that LWG will have an opinion on changes to the library
specification :)
But fixing the core language by removing the description of all encodings
assumed at runtime seems to sweep a pretty important question under the rug!

> I've updated my paper:
> https://wiki.edg.com/pub/Wg21telecons2021/SG16/charset.html
> and I'm intending to submit it to the next mailing.
> Jens

Received on 2021-02-03 17:16:23