Date: Wed, 23 Oct 2024 23:21:33 +0200
On 23/10/2024 21.59, Mateusz Pusz via SG16 wrote:
> Hi SG16 :-)
>
> At the last meeting about units, I said that we do not have to decide on any specific Unicode symbols for now, as the systems of units are the subject of the next papers. It turns out I was wrong :-(
>
> We need a few Unicode characters for the first proposal. Those are:
Could you give an indication of the respective use-case?
> Name                          Symbol  C++ code        Portable alternative
> Superscript Zero              ⁰       u8"\u2070"      "0"
> Superscript One               ¹       u8"\u00b9"      "1"
> Superscript Two               ²       u8"\u00b2"      "2"
> Superscript Three             ³       u8"\u00b3"      "3"
> Superscript Four              ⁴       u8"\u2074"      "4"
> Superscript Five              ⁵       u8"\u2075"      "5"
> Superscript Six               ⁶       u8"\u2076"      "6"
> Superscript Seven             ⁷       u8"\u2077"      "7"
> Superscript Eight             ⁸       u8"\u2078"      "8"
> Superscript Nine              ⁹       u8"\u2079"      "9"
> Superscript Minus             ⁻       u8"\u207b"      "-"
> Multiplication Sign           ×       u8"\u00d7"      "x"
> Mathematical Italic Small Pi  𝜋       u8"\U0001D70B"  "pi"
> Per Mille Sign                ‰       u8"\u2030"      "%o"
> Dot Operator                  ⋅       u8"\u22C5"      <none>*
>
>
> * This option is valid only for UTF-8 encoding. Otherwise, we propose to throw an exception during format string processing.
Could you give a sample std::format invocation where this would happen?
In particular, I'm opposed to throwing an exception if the value of a
format argument (not the format string) happens to be a little off.
> 2. Is the C++ spelling correct? Should we spell them in the spec with '\u', '\U', or maybe some other way?
That depends on where they appear. If they appear in strings, something like "\uXXXX"
(with quotation marks, with optional encoding-prefix) seems appropriate.
If they appear in source code, see below.
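
A minimal sketch of the string-literal spelling, assuming a compiler that accepts u8 literals. The helper name code_units and the constant names are made up for illustration; they come from neither the proposal nor the standard. It shows that "\uXXXX" is enough for characters in the Basic Multilingual Plane, while U+1D70B lies outside it and needs the eight-digit "\U" form.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// U+00B2 SUPERSCRIPT TWO encodes as two UTF-8 code units (0xC2 0xB2).
constexpr std::size_t superscript_two_units = code_units(u8"\u00b2");

// U+2030 PER MILLE SIGN encodes as three UTF-8 code units (0xE2 0x80 0xB0).
constexpr std::size_t per_mille_units = code_units(u8"\u2030");

// U+1D70B MATHEMATICAL ITALIC SMALL PI is outside the BMP:
// it needs \U with eight hex digits and four UTF-8 code units.
constexpr std::size_t math_pi_units = code_units(u8"\U0001D70B");

static_assert(superscript_two_units == 2, "U+00B2 is two UTF-8 code units");
static_assert(per_mille_units == 3, "U+2030 is three UTF-8 code units");
static_assert(math_pi_units == 4, "U+1D70B is four UTF-8 code units");

// The first code unit of U+00B2 in UTF-8 is 0xC2.
static_assert(static_cast<unsigned char>(u8"\u00b2"[0]) == 0xC2, "lead byte");
```

The casts to unsigned char keep the comparison valid whether the u8 literal has element type char (before C++20) or char8_t (C++20 and later).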
> 3. Are the portable alternatives fine (especially "%o" might be controversial above)?
What would be an alternative if we didn't like "%o"? I'm not seeing one.
> 4. Is there a better way to spell `inline constexpr auto 𝜋 = pi;` to be explicit about the specific Unicode symbol used for the identifier?
So, this is an identifier, not a string literal. I think we shouldn't rely
on glyphs for anything outside of ASCII for unique character identification;
we shouldn't require consumers of the standard to be able to read non-Latin
scripts (even if we're talking about math symbols).
So maybe
inline constexpr auto \u03c0 = pi; // U+03C0 GREEK SMALL LETTER PI
or, actually showing the glyph:
inline constexpr auto π /* U+03C0 GREEK SMALL LETTER PI */ = pi;
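
A small sketch of how the universal-character-name spelling could look in use, assuming C++17 (for inline variables) and a compiler that accepts universal-character-names in identifiers. The namespace sketch, the value given to pi, and the function circumference are placeholders for illustration only.

```cpp
namespace sketch {

// Placeholder for the library's pi constant (assumed value, for illustration).
inline constexpr double pi = 3.141592653589793;

// Alias spelled with a universal-character-name, so the code point is
// unambiguous in the source text: U+03C0 GREEK SMALL LETTER PI.
inline constexpr auto \u03c0 = pi;

// The alias is an ordinary identifier and can be used like any other name.
constexpr double circumference(double radius) {
    return 2.0 * \u03c0 * radius;
}

}  // namespace sketch

static_assert(sketch::\u03c0 == sketch::pi, "alias refers to the same value");
```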
Jens
Received on 2024-10-23 21:21:41