ISOCPP sg16 List: Re: Agenda for the 2024-01-24 SG16 meeting

From: Steve Downey <sdowney_at_[hidden]>
Date: Wed, 24 Jan 2024 11:39:09 -0500

On Wed, Jan 24, 2024 at 11:29 AM Tom Honermann via SG16 <
sg16_at_[hidden]> wrote:

> SG16 will hold a meeting on Wednesday, January 24th, at 19:30 UTC (timezone
> conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20240124T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>
> ).
>
> *That is today!* Yes, I continue to struggle to keep pace with the world.
> No, I still have not published the minutes from the last meeting.
>
> The agenda follows.
>
> - P3045R0: Quantities and units library <https://wg21.link/p3045r0>
> - CWG 2843: Undated reference to Unicode makes C++ a moving target
> <https://cplusplus.github.io/CWG/issues/2843.html>
>
> We discussed a draft of P3045 during the 2023-11-29 SG16 meeting
> <https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2023.md#november-29th-2023>.
> No polls were taken as that discussion was mostly introductory
> presentation. Section 13 (Text output)
> <https://wg21.link/p3045r0#text-output> discusses formatting and
> character encoding considerations. The motivation and proposal for a
> fixed_string type has been moved to a new paper that is yet to be
> published: P3094 (std::basic_fixed_string) <https://wg21.link/p3094>. Section
> 13.6 (Text output open questions)
> <https://wg21.link/p3045r0#text-output-open-questions> has the following
> list of questions and is what discussion will focus on today:
>
> 1. Which C++ character type should be used for symbols in Unicode
> encoding?
> 2. Are we OK with the usage of '_' for denoting a subscript identifier?
> 3. Are we OK with no text output support of quantity types?
> 4. Which character type should basic_symbol_text use in a
> single-argument constructor?
> 5. How to name a non-Unicode accessor member function (e.g., .ascii())?
> The same name should consistently be used in text_encoding and in the
> formatting grammar.
> 6. Should unit_symbol() return std::string_view or basic_fixed_string?
> 7. Do we care about ostreams enough to introduce custom manipulators
> to format units?
> 8. What about the localization for units? Will we get something like
> ICU in the C++ standard?
> 9. std::chrono::duration uses 'Q' and 'q' for a number and a unit. In
> the grammar above, we proposed using 'N' and 'U' for them, respectively. We
> also introduced 'D' for dimensions. Are we OK with this?
> 10. Should we provide support for quantity points?
>
> The 1st and 4th questions are, I think, the most important ones as they
> directly impact both the user interface and the implementation. We need to
> determine how to:
>
> - Specify both default/preferred symbols (e.g., non-ASCII) and
> compatibility/fallback symbols (e.g., text limited to the basic literal
> character set). For example, "Ω" as a default/preferred symbol for ohm with
> "ohm" as a compatibility/fallback. The paper has a number of such examples
> (see dim_thermodynamic_temperature, ohm, micro_, and
> hyperfine_structure_transition_frequency_of_cs in section 13.1.1
> (Symbol definition examples)
> <https://wg21.link/p3045r0#symbol-definition-examples>)
> - Specify these sets of symbols for each of the ordinary, wide, and
> UTF character encodings.
> - Should it be required to explicitly provide symbol text for each
> of these encodings? Perhaps only when characters outside of the basic
> literal character set are used? Perhaps:
> named_unit<"s", ...>
> // Ok, uses "s" transcoded as necessary for each of the encodings.
> named_unit<{"u", L"u", u8"Ω", u"Ω", U"Ω"}, ...>
> // Ok, uses "u" transcoded as necessary for the compatibility/fallback
> // symbol and the provided text as the default/preferred symbol otherwise.
> // This variant would prohibit use of characters outside the basic
> // literal character set with the ordinary character encoding thus
> // ensuring portability.
> named_unit<{"u", "Ω", L"Ω", U"Ω"}, ...>
> // Ok, uses "u" transcoded as necessary for the compatibility/fallback
> // symbol and the provided text as the default/preferred symbol otherwise
> // with the UTF-32 text converted to UTF-8 and UTF-16 as necessary.
> // This requires that the ordinary literal encoding be UTF-8 for the
> // code to be well-formed (see P1854 <https://wg21.link/p1854>).
>
>
u8"\N{OHM SIGN}" is probably closer to the right thing, since it's
distinguishable from u8"\N{GREEK CAPITAL LETTER OMEGA}".
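
To make the distinction concrete, here is a minimal sketch. It is not
taken from P3045; symbol_pair, ohm_symbol, and is_basic_ascii are
hypothetical names used only for illustration. It uses \u escapes rather
than \N{...} because named escapes need C++23, and it assumes at least
C++17 for constexpr string_view.

```cpp
#include <cassert>
#include <string_view>

// The two candidate spellings are different code points.
constexpr char32_t ohm_sign    = U'\u2126';  // OHM SIGN
constexpr char32_t greek_omega = U'\u03A9';  // GREEK CAPITAL LETTER OMEGA
static_assert(ohm_sign != greek_omega, "distinct code points");

// Their UTF-8 encodings differ in length too (sizeof counts the NUL).
static_assert(sizeof(u8"\u2126") == 4, "E2 84 A6 plus NUL");
static_assert(sizeof(u8"\u03A9") == 3, "CE A9 plus NUL");

// Hypothetical helper, not part of P3045: pairs a preferred symbol with a
// fallback limited to the basic literal character set.
struct symbol_pair {
    std::u32string_view preferred;  // e.g. the OHM SIGN
    std::string_view    fallback;   // e.g. "ohm"
};

constexpr symbol_pair ohm_symbol{U"\u2126", "ohm"};

// Returns true when every code point fits in 7-bit ASCII, i.e. when the
// preferred text would not need a separate fallback spelling.
constexpr bool is_basic_ascii(std::u32string_view s) {
    for (char32_t c : s) {
        if (c > 0x7F) return false;
    }
    return true;
}

static_assert(!is_basic_ascii(ohm_symbol.preferred), "needs a fallback");
```

One caveat I'm fairly sure of: U+2126 has a canonical decomposition to
U+03A9, so the two stop being distinguishable once text has been
normalized (NFC or NFD).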

>
> The other questions will likely require a little introductory discussion
> to better understand the context for the question.
>
> If time permits, we'll continue discussion of CWG 2843 from the 2024-01-10
> SG16 meeting (for which minutes are not yet published). I believe there are
> three questions yet to be answered:
>
> 1. The version of the Unicode Standard to be specified as the minimum
> version.
> 2. Whether implementations are allowed to use different
> implementation-defined Unicode versions for the core language and the
> standard library.
> 3. Whether the implementation-defined Unicode version should be
> exposed via a new feature test macro (perhaps two new feature test macros
> depending on the previous item).
>
> Tom.
>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>

Received on 2024-01-24 16:39:25