SG16 will hold a telecon on Wednesday, January 12th at 19:30 UTC (timezone conversion).
The agenda is:
- D2286R5: Formatting Ranges
  - Review pending availability of proposed wording and continued targeting of C++23.
- P2491R0: Text encodings follow-up
  - Initial review.
- P2498R0: Forward compatibility of text_encoding with additional encoding registries
  - Initial review.
I don't yet have confirmation of the existence of a D2286R5 so, unless I hear otherwise, the first item on the agenda won't happen.
We last reviewed a draft of P2286R4 during the 2021-12-15 SG16 telecon, where we approved forwarding it to LEWG despite the absence of wording. Prior to that, we had reviewed P2286R3 during the 2021-12-01 SG16 telecon. LEWG has since reviewed the proposal during its 2022-01-04 telecon and plans to review it again during its 2022-01-18 telecon. During this telecon, we'll review the available wording to look for new SG16 concerns and to validate that the wording reflects previous design guidance. Previous design discussion related to the following concerns:
- Use of P2290 style brace delimited hexadecimal notation to preserve the values of code units that appear in an ill-formed code unit sequence.
- Use of P2290 style brace delimited UCN notation (as opposed to hexadecimal notation) for non-printable characters.
- Whether it is always possible to map an input character to a Unicode character for the purpose of determining printability.
- How characters are determined to be printable or non-printable.
- Handling of lone surrogate code points; whether they are encoded in UCN notation (like a non-printable character) or in hexadecimal notation (like an invalid code unit).
- Handling of unassigned code points.
- Handling of Private Use Area (PUA) code points.
- How to determine the boundaries of ill-formed code unit sequences.
- Whether a replacement character should be emitted for an ill-formed code unit sequence (as opposed to emitting hexadecimal notation for each contributing code unit).
- Stability guarantees.
- Support for non-Unicode platforms.
- Handling of std::filesystem::path.
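To make the two escape notations in the list above concrete, here is a minimal sketch of one possible escaping scheme for UTF-8 input. It is not the wording proposed in P2286. The function names are hypothetical, and the choices it makes are assumptions: only C0 controls and DEL count as non-printable, well-formed non-ASCII sequences pass through unchanged, and each code unit of an ill-formed sequence is escaped individually rather than replaced by U+FFFD. Several of these are exactly the questions listed above.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper: returns the length (1 to 4) of the well-formed UTF-8
// sequence starting at s[i], or 0 if the code units there are ill-formed.
// The second-byte ranges follow the Unicode Standard's table of well-formed
// UTF-8 byte sequences (no overlong forms, no surrogates, max U+10FFFF).
std::size_t utf8_sequence_length(std::string_view s, std::size_t i) {
    auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };
    const unsigned char b0 = byte(i);
    if (b0 < 0x80) return 1;

    std::size_t len = 0;
    unsigned char lo = 0x80, hi = 0xBF;      // allowed range for the second byte
    if (b0 >= 0xC2 && b0 <= 0xDF) {
        len = 2;
    } else if (b0 >= 0xE0 && b0 <= 0xEF) {
        len = 3;
        if (b0 == 0xE0) lo = 0xA0;           // reject overlong encodings
        if (b0 == 0xED) hi = 0x9F;           // reject encoded surrogates
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {
        len = 4;
        if (b0 == 0xF0) lo = 0x90;           // reject overlong encodings
        if (b0 == 0xF4) hi = 0x8F;           // reject values above U+10FFFF
    } else {
        return 0;                            // stray continuation or invalid lead byte
    }

    if (i + len > s.size()) return 0;        // truncated sequence
    if (byte(i + 1) < lo || byte(i + 1) > hi) return 0;
    for (std::size_t k = 2; k < len; ++k) {
        if (byte(i + k) < 0x80 || byte(i + k) > 0xBF) return 0;
    }
    return len;
}

// Hypothetical escaping function illustrating the two notations:
//   \u{...} for a non-printable character (assumed here: C0 controls and DEL),
//   \x{...} for each code unit that is part of an ill-formed sequence.
std::string escape_for_debug(std::string_view s) {
    std::string out;
    char buf[16];
    for (std::size_t i = 0; i < s.size();) {
        const std::size_t len = utf8_sequence_length(s, i);
        const unsigned b = static_cast<unsigned char>(s[i]);
        if (len == 0) {
            // Ill-formed: preserve the code unit value in hexadecimal notation.
            std::snprintf(buf, sizeof buf, "\\x{%x}", b);
            out += buf;
            i += 1;
        } else if (len == 1 && (b < 0x20 || b == 0x7F)) {
            // Well-formed but non-printable: UCN-style notation.
            std::snprintf(buf, sizeof buf, "\\u{%x}", b);
            out += buf;
            i += 1;
        } else {
            // Well-formed and assumed printable: copy through unchanged.
            out.append(s.data() + i, len);
            i += len;
        }
    }
    return out;
}
```

Under these assumptions, a tab character comes out as `\u{9}`, while the three code units of a UTF-8 encoded lone surrogate (0xED 0xA0 0x80) come out as `\x{ed}\x{a0}\x{80}`. A design that emitted a replacement character, or that treated a maximal ill-formed subsequence as one unit, would change only the `len == 0` branch.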
P2491R0 proposes changes to P1885, primarily with regard to the handling of wide encodings. The recently communicated D1885R9 draft revision removes support for wide encodings, which makes much (but not all) of what P2491R0 proposes moot.
P2498R0 also proposes changes to P1885, in this case to make way for the possibility of supporting different encoding registrars in the future. Note that ISO does specify its own registry of encodings that have been registered for use with ISO/IEC 2022. The registry is called ISO-IR (officially, "INTERNATIONAL REGISTER OF CODED CHARACTER SETS TO BE USED WITH ESCAPE SEQUENCES") and the registration procedures are specified in ISO/IEC 2375. Unfortunately, I don't think any of these publications is freely available, though copies can be found online. ISO/IEC 2022 is also published as ECMA-35.
Please review the updates made to D1885R9 prior to this telecon for applicability to our review of the latter two papers.
Tom.