Date: Wed, 24 Jan 2024 12:24:36 -0500
I will not be able to attend today.
My only feedback would be that I do want feature-test macros to query which
version of Unicode is in effect at translation time, and I believe that is quite
important rather than nice-to-have.
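
A minimal sketch of what such a translation-time query might look like. This is illustration only: no such macro exists today, so the name `__cpp_unicode_version` and its value encoding (major * 10000 + minor * 100 + update, so 15.1.0 would be 150100) are assumptions, as are the helper names.

```cpp
#include <cassert>
#include <string>

// HYPOTHETICAL macro: the name and the value encoding are assumptions made
// for this sketch. Nothing here is specified by the standard or by CWG 2843.
#if defined(__cpp_unicode_version)
constexpr long unicode_version = __cpp_unicode_version;
#else
constexpr long unicode_version = 0;  // 0 means "the implementation does not say"
#endif

// True when an encoded version is at least major.minor.0.
constexpr bool unicode_at_least(long version, int major, int minor = 0) {
    return version >= major * 10000L + minor * 100L;
}

// Turns an encoded version into "major.minor.update", or "unknown" for 0.
inline std::string describe(long version) {
    if (version == 0) return "unknown";
    return std::to_string(version / 10000) + "." +
           std::to_string(version / 100 % 100) + "." +
           std::to_string(version % 100);
}

// The helpers are usable in constant expressions, which is what makes a
// macro attractive here: the check happens during translation.
static_assert(unicode_at_least(150100, 15, 1), "15.1.0 satisfies a 15.1 minimum");
static_assert(!unicode_at_least(0, 15), "an unknown version satisfies no minimum");

// Runtime spot checks of the helpers (not a test of any real implementation).
inline void check_examples() {
    assert(describe(150100) == "15.1.0");
    assert(describe(0) == "unknown");
}
```

Code that depends on a particular Unicode version could then guard itself with something like `static_assert(unicode_at_least(unicode_version, 15, 1))`, or select a fallback under `#if`. If the core language and the standard library were allowed to use different Unicode versions, two such macros would presumably be needed.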
AlisdairM
> On 24 Jan 2024, at 11:29, Tom Honermann via SG16 <sg16_at_[hidden]> wrote:
>
> SG16 will hold a meeting on Wednesday, January 24th, at 19:30 UTC (timezone conversion).
> That is today! Yes, I continue to struggle to keep pace with the world. No, I still have not published the minutes from the last meeting.
> The agenda follows.
> • P3045R0: Quantities and units library
> • CWG 2843: Undated reference to Unicode makes C++ a moving target
> We discussed a draft of P3045 during the 2023-11-29 SG16 meeting. No polls were taken as that discussion was mostly introductory presentation. Section 13 (Text output) discusses formatting and character encoding considerations. The motivation and proposal for a fixed_string type has been moved to a new paper that is yet to be published; P3094 (std::basic_fixed_string). Section 13.6 (Text output open questions) has the following list of questions and is what discussion will focus on today:
> • Which C++ character type should be used for symbols in Unicode encoding?
> • Are we OK with the usage of '_' for denoting a subscript identifier?
> • Are we OK with no text output support of quantity types?
> • Which character type should basic_symbol_text be used in a single-argument constructor?
> • How to name a non-Unicode accessor member function (e.g., .ascii())? The same name should consistently be used in text_encoding and in the formatting grammar.
> • Should unit_symbol() return std::string_view or basic_fixed_string?
> • Do we care about ostreams enough to introduce custom manipulators to format units?
> • What about the localization for units? Will we get something like ICU in the C++ standard?
> • std::chrono::duration uses 'Q' and 'q' for a number and a unit. In the grammar above, we proposed using 'N' and 'U' for them, respectively. We also introduced 'D' for dimensions. Are we OK with this?
> • Should we provide support for quantity points?
> The 1st and 4th questions are, I think, the most important ones as they directly impact both the user interface and the implementation. We need to determine how to:
> • Specify both default/preferred symbols (e.g., non-ASCII) and compatibility/fallback symbols (e.g., text limited to the basic literal character set). For example, "Ω" as a default/preferred symbol for ohm with "ohm" as a compatibility/fallback. The paper has a number of such examples (see dim_thermodynamic_temperature, ohm, micro_, and hyperfine_structure_transition_frequency_of_cs in section 13.1.1 (Symbol definition examples))
> • Specify these sets of symbols for each of the ordinary, wide, and UTF character encodings.
> • Should it be required to explicitly provide symbol text for each of these encodings? Perhaps only when characters outside of the basic literal character set are used? Perhaps:
> named_unit<"s", ...> // Ok, uses "s" transcoded as necessary for each of the encodings.
> named_unit<{"u", L"u", u8"Ω", u"Ω", U"Ω"}, ...> // Ok, uses "u" transcoded as necessary for the compatibility/fallback symbol and the provided text as the default/preferred symbol otherwise.
> // This variant would prohibit use of characters outside the basic literal character set with the ordinary character encoding thus ensuring portability.
> named_unit<{"u", "Ω", L"Ω", U"Ω"}, ...> // Ok, uses "u" transcoded as necessary for the compatibility/fallback symbol and the provided text as the default/preferred symbol
> // otherwise with the UTF-32 text converted to UTF-8 and UTF-16 as necessary.
> // This requires that the ordinary literal encoding be UTF-8 for the code to be well-formed (see P1854).
> The other questions will likely require a little introductory discussion to better understand the context for the question.
> If time permits, we'll continue discussion of CWG 2843 from the 2024-01-10 SG16 meeting (for which minutes are not yet published). I believe there are three questions yet to be answered:
> • The version of the Unicode Standard to be specified as the minimum version.
> • Whether implementations are allowed to use different implementation-defined Unicode versions for the core language and the standard library.
> • Whether the implementation-defined Unicode version should be exposed via a new feature test macro (perhaps two new feature test macros depending on the previous item).
> Tom.
>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
Received on 2024-01-24 17:24:49