On Wed, Jun 10, 2020 at 1:46 PM Corentin Jabot <corentinjabot@gmail.com> wrote:

On Wed, Jun 10, 2020, 19:28 Hubert Tong via SG16 <sg16@lists.isocpp.org> wrote:
On Tue, Jun 9, 2020 at 10:48 AM Tom Honermann via SG16 <sg16@lists.isocpp.org> wrote:

This is your friendly reminder that an SG16 telecon will be held tomorrow, Wednesday June 10th, at 19:30 UTC (timezone conversion). To attend, visit https://bluejeans.com/140274541 at the start of the meeting.

The agenda for the meeting is:

Discuss terminology updates to strive for in C++23

P1859R0: Standard terminology character sets and encodings

Establish priorities for terms to address.

Establish a methodology for drafting wording updates.
This may be useful for an intermediate stage of the process of updating the wording:

Basic source character set:
set of abstract characters used for the description of source code for the purposes of this document

I would ultimately like to see that gone entirely.

I think the "basic source character set" fails to be a "character set". I believe the elements of a character set are mappings of values to abstract characters.

re: Basic execution character set: I believe that these characters have some restrictions on their values and encoding. It would help to call these out with a note to the definition.

I don't think we intend to remove the restrictions from where they are!

That's fine. I'm saying that the restrictions seem to define these in some sense as well. A note to the entry, indicating the restrictions and where they are established, could be useful.

re: Execution character set: I have qualms about claiming representability of these in a char character literal. The term is intended to encompass abstract characters whose encoding is multibyte.

This is one of the reasons why we need to separate character set and character encoding.
I think the expectation is that the code unit size of the execution character set is <= sizeof(char).
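
To make the character set versus encoding distinction concrete, here is a minimal sketch. It uses UTF-8 (via u8 literals) as a stand-in for a multibyte narrow encoding, since the actual execution encoding is implementation-defined; the helper name code_unit_count is hypothetical and not from the paper.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null code unit.
template <typename CharT, std::size_t N>
constexpr std::size_t code_unit_count(const CharT (&)[N]) {
    return N - 1;
}

// 'A' is a single abstract character encoded as a single UTF-8 code unit.
constexpr std::size_t a_units = code_unit_count(u8"A");

// U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is also a single abstract
// character, but UTF-8 encodes it as two code units (0xC3 0xA9).
constexpr std::size_t e_acute_units = code_unit_count(u8"\u00E9");

static_assert(sizeof(char) == 1, "a narrow code unit occupies one char");
static_assert(a_units == 1, "one character, one code unit");
static_assert(e_acute_units == 2, "one character, two code units");
```

Under this assumption, membership in the character set says nothing about whether a character fits in a single char; that is a property of the encoding.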

re: 3.1.2: You mean wchar_t?

re: wording, we should not remove the indication that the interpretation of character or string data is affected by locales.

We should, because it is incorrect. We need to separate the core language, which has an implementation-defined mechanism to decide which encoding to use (and which may or may not depend on locale), from the way in which character and string data are interpreted by the library, which is currently always either locale-dependent or encoding-agnostic.
A note to explain the relation would be useful, though.
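
As an illustration of the library side of that split, here is a small sketch showing that the classification of a narrow code unit by std::isalpha depends on the current C locale. The helper is_alpha_in_locale is hypothetical, and which locale names are installed is an assumption about the host system.

```cpp
#include <cassert>
#include <cctype>
#include <clocale>
#include <string>

// Hypothetical helper: classify one byte under the named C locale.
// Returns 1 if alphabetic, 0 if not, and -1 if the locale could not be set.
int is_alpha_in_locale(unsigned char byte, const char* locale_name) {
    // Copy the current locale name; the pointer returned by setlocale
    // may be invalidated by the next setlocale call.
    const char* current = std::setlocale(LC_CTYPE, nullptr);
    const std::string saved = current ? current : "C";

    if (std::setlocale(LC_CTYPE, locale_name) == nullptr) {
        return -1;  // locale not available on this system
    }
    const int result = std::isalpha(byte) != 0 ? 1 : 0;

    // Restore the locale that was active on entry.
    std::setlocale(LC_CTYPE, saved.c_str());
    return result;
}
```

In the "C" locale, is_alpha_in_locale(0xE9, "C") reports the byte as non-alphabetic. Under a Latin-1 locale (for example "en_US.ISO-8859-1", if one is installed), the same byte may be reported as alphabetic. The byte itself never changes; only the library's locale-dependent interpretation does.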

Yes. A note (in the wording) with cross-references that help to find the library wording that establishes the non-change to the status quo would be appreciated.

The paper is not clear that the definitions section is intended to be considered wording. From a procedural point of view, this is a serious problem.

The terms should come from the ISO/IEC 10646 definition. The Unicode definitions fail to meet drafting requirements from the ISO/IEC Directives, Part 2.

Can you clarify that please?

The definition shall be written in such a form that it can replace the term in its context. It shall not start with an article (“the”, “a”) nor end with a full stop. A definition shall not take the form of, or contain, a requirement.
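
The two mechanical parts of that rule (no leading article, no trailing full stop) can be sketched as a small check. This is only an illustration: the function names are hypothetical, treating "an" as an article alongside "the" and "a" is my assumption, and the check does not attempt to detect whether a definition contains a requirement.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Case-insensitive prefix test (ASCII only).
bool starts_with_ignore_case(const std::string& text, const std::string& prefix) {
    if (text.size() < prefix.size()) {
        return false;
    }
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        const int a = std::tolower(static_cast<unsigned char>(text[i]));
        const int b = std::tolower(static_cast<unsigned char>(prefix[i]));
        if (a != b) {
            return false;
        }
    }
    return true;
}

// Hypothetical lint: true if the definition neither starts with an
// article nor ends with a full stop.
bool follows_definition_form(const std::string& definition) {
    static const char* const articles[] = {"the ", "a ", "an "};
    for (const char* article : articles) {
        if (starts_with_ignore_case(definition, article)) {
            return false;
        }
    }
    return definition.empty() || definition.back() != '.';
}
```

For example, the draft definition quoted earlier in this thread, "set of abstract characters used for the description of source code for the purposes of this document", passes this check, while "A set of abstract characters." fails on both counts.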

Although, we have some consensus that we should copy the definitions we want to use into the standard, to avoid too many cross-references and keep the standard somewhat self-contained.

I would also prefer the copying (with appropriate pointers to the original source) to avoid incorporating-by-reference all the terms and definitions from some other document.

--
SG16 mailing list
SG16@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg16