ISOCPP sg16 List: Agenda for the 2024-01-24 SG16 meeting

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 24 Jan 2024 11:29:54 -0500

SG16 will hold a meeting on Wednesday, January 24th, at 19:30 UTC
(timezone conversion
<https://www.timeanddate.com/worldclock/converter.html?iso=20240124T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>).

*That is today!* Yes, I continue to struggle to keep pace with the
world. No, I still have not published the minutes from the last meeting.

The agenda follows.

  * P3045R0: Quantities and units library <https://wg21.link/p3045r0>
  * CWG 2843: Undated reference to Unicode makes C++ a moving target
    <https://cplusplus.github.io/CWG/issues/2843.html>

We discussed a draft of P3045 during the 2023-11-29 SG16 meeting
<https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2023.md#november-29th-2023>.
No polls were taken, as that discussion was mostly an introductory
presentation. Section 13 (Text output)
<https://wg21.link/p3045r0#text-output> discusses formatting and
character encoding considerations. The motivation and proposal for a
fixed_string type have been moved to a new paper that is yet to be
published: P3094 (std::basic_fixed_string) <https://wg21.link/p3094>.
Section 13.6 (Text output open questions)
<https://wg21.link/p3045r0#text-output-open-questions> has the following
list of questions and is what discussion will focus on today:

1. Which C++ character type should be used for symbols in Unicode encoding?
2. Are we OK with the usage of '_' for denoting a subscript identifier?
3. Are we OK with no text output support of quantity types?
4. Which character type should be used by basic_symbol_text in a
    single-argument constructor?
5. How to name a non-Unicode accessor member function (e.g., .ascii())?
    The same name should consistently be used in text_encoding and in
    the formatting grammar.
6. Should unit_symbol() return std::string_view or basic_fixed_string?
7. Do we care about ostreams enough to introduce custom manipulators to
    format units?
8. What about the localization for units? Will we get something like
    ICU in the C++ standard?
9. std::chrono::duration uses 'Q' and 'q' for a number and a unit. In
    the grammar above, we proposed using 'N' and 'U' for them,
    respectively. We also introduced 'D' for dimensions. Are we OK with
    this?
10. Should we provide support for quantity points?
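
To make question 9 a little more concrete, here is a minimal sketch of
how 'N' and 'U' placement types could be expanded. The '%' prefix is an
assumption modeled on the std::chrono "%Q %q" style, and
expand_quantity_spec is a hypothetical helper written for this
illustration; it is not the grammar or the implementation from P3045R0,
and it ignores 'D', width, fill, and localization entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not from P3045R0): expands %N to the already
// formatted number and %U to the unit symbol. "%%" yields a literal '%'.
// Any other "%x" pair is copied through unchanged.
inline std::string expand_quantity_spec(std::string_view spec,
                                        std::string_view number,
                                        std::string_view unit) {
    std::string out;
    for (std::size_t i = 0; i < spec.size(); ++i) {
        if (spec[i] == '%' && i + 1 < spec.size()) {
            const char type = spec[++i];  // consume the placement type
            if (type == 'N') {
                out += number;
            } else if (type == 'U') {
                out += unit;
            } else if (type == '%') {
                out += '%';
            } else {
                out += '%';
                out += type;
            }
        } else {
            out += spec[i];
        }
    }
    return out;
}

// Usage: number and unit in the same relative positions as chrono's %Q %q.
inline void expand_quantity_spec_examples() {
    assert(expand_quantity_spec("%N %U", "42", "m/s") == "42 m/s");
    assert(expand_quantity_spec("%U", "42", "ohm") == "ohm");
}
```

Whether the letters should match chrono's 'Q' and 'q' or differ from
them is exactly the open question; the sketch only shows that the choice
affects spelling and not mechanics.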

The 1st and 4th questions are, I think, the most important ones as they
directly impact both the user interface and the implementation. We need
to determine how to:

  * Specify both default/preferred symbols (e.g., non-ASCII) and
    compatibility/fallback symbols (e.g., text limited to the basic
    literal character set). For example, "Ω" as a default/preferred
    symbol for ohm with "ohm" as a compatibility/fallback. The paper has
    a number of such examples (see dim_thermodynamic_temperature, ohm,
    micro_, and hyperfine_structure_transition_frequency_of_cs in
    section 13.1.1 (Symbol definition examples)
    <https://wg21.link/p3045r0#symbol-definition-examples>)
  * Specify these sets of symbols for each of the ordinary, wide, and
    UTF character encodings.
      o Should it be required to explicitly provide symbol text for each
        of these encodings? Perhaps only when characters outside of the
        basic literal character set are used? Perhaps:

        named_unit<"s", ...>
          // Ok, uses "s" transcoded as necessary for each of the
          // encodings.

        named_unit<{"u", L"u", u8"Ω", u"Ω", U"Ω"}, ...>
          // Ok, uses "u" transcoded as necessary for the
          // compatibility/fallback symbol and the provided text as the
          // default/preferred symbol otherwise.
          // This variant would prohibit use of characters outside the
          // basic literal character set with the ordinary character
          // encoding, thus ensuring portability.

        named_unit<{"u", "Ω", L"Ω", U"Ω"}, ...>
          // Ok, uses "u" transcoded as necessary for the
          // compatibility/fallback symbol and the provided text as the
          // default/preferred symbol otherwise, with the UTF-32 text
          // converted to UTF-8 and UTF-16 as necessary.
          // This requires that the ordinary literal encoding be UTF-8
          // for the code to be well-formed (see P1854
          // <https://wg21.link/p1854>).
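
As a rough illustration of the "preferred plus fallback" idea, the
following sketch stores a UTF-32 preferred symbol next to a fallback
limited to the basic literal character set, and converts the UTF-32 text
to UTF-8 on demand. The names symbol_text_sketch, ohm_symbol, and
to_utf8 are hypothetical and exist only for this example; this is not
the basic_symbol_text design from P3045R0. The UTF-8 result is returned
as bytes in a std::string so that the code also compiles as C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical type (not the paper's basic_symbol_text): a preferred
// UTF-32 symbol plus a fallback in the basic literal character set.
// N and M include the terminating null of each string literal.
template <std::size_t N, std::size_t M>
struct symbol_text_sketch {
    char32_t unicode_[N]{};
    char ascii_[M]{};

    constexpr symbol_text_sketch(const char32_t (&u)[N], const char (&a)[M]) {
        for (std::size_t i = 0; i < N; ++i) unicode_[i] = u[i];
        for (std::size_t i = 0; i < M; ++i) ascii_[i] = a[i];
    }

    constexpr std::u32string_view unicode() const { return {unicode_, N - 1}; }
    constexpr std::string_view ascii() const { return {ascii_, M - 1}; }
};

// Converts UTF-32 to UTF-8 code units. Assumes every element is a
// Unicode scalar value; the assert checks that precondition in debug
// builds.
inline std::string to_utf8(std::u32string_view text) {
    std::string out;
    for (char32_t c : text) {
        assert(c <= 0x10FFFF && (c < 0xD800 || c > 0xDFFF));
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// U+03A9 (GREEK CAPITAL LETTER OMEGA) as preferred, "ohm" as fallback.
inline constexpr symbol_text_sketch ohm_symbol{U"\u03A9", "ohm"};

inline void symbol_text_examples() {
    assert(ohm_symbol.ascii() == "ohm");
    assert(to_utf8(ohm_symbol.unicode()) == "\xCE\xA9");  // UTF-8 for U+03A9
}
```

Storing a single UTF-32 form and deriving the other UTF encodings from
it corresponds to the last variant above; it sidesteps the question of
what the ordinary and wide literal encodings can represent, which is
where the portability concerns come from.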

The other questions will likely require a little introductory discussion
to better understand the context for the question.

If time permits, we'll continue discussion of CWG 2843 from the
2024-01-10 SG16 meeting (for which minutes are not yet published). I
believe there are three questions yet to be answered:

1. The version of the Unicode Standard to be specified as the minimum
    version.
2. Whether implementations are allowed to use different
    implementation-defined Unicode versions for the core language and
    the standard library.
3. Whether the implementation-defined Unicode version should be exposed
    via a new feature test macro (perhaps two new feature test macros
    depending on the previous item).
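
For the third item, a sketch of what user code might look like if two
macros were provided. Both macro names below are hypothetical
placeholders invented for this example, as is the assumption that the
macros expand to a comparable integer; nothing here has been proposed or
specified.

```cpp
#include <cassert>

// HYPOTHETICAL: neither macro exists today. The names are placeholders
// and the integer encoding of the version is left unspecified here.
#if defined(__cpp_unicode_version_core_sketch)
inline constexpr long core_unicode_version = __cpp_unicode_version_core_sketch;
#else
inline constexpr long core_unicode_version = 0;  // 0 means "not reported"
#endif

#if defined(__cpp_lib_unicode_version_sketch)
inline constexpr long library_unicode_version = __cpp_lib_unicode_version_sketch;
#else
inline constexpr long library_unicode_version = 0;  // 0 means "not reported"
#endif

// True when the core language and the standard library report the same
// version (or when neither reports one).
constexpr bool unicode_versions_agree() {
    return core_unicode_version == library_unicode_version;
}

inline void unicode_version_examples() {
    // On today's implementations neither macro is defined.
    assert(core_unicode_version == 0);
    assert(unicode_versions_agree());
}
```

If implementations are required to use a single version for both the
core language and the library, one macro suffices and the comparison
above becomes unnecessary.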

Tom.

Received on 2024-01-24 16:29:56