Date: Tue, 29 Nov 2022 18:09:01 -0500
On 11/29/22 6:04 PM, Corentin Jabot wrote:
>
>
> On Tue, Nov 29, 2022 at 11:47 PM Tom Honermann <tom_at_[hidden]> wrote:
>
> On 11/29/22 5:32 PM, Corentin Jabot wrote:
>>
>>
>> On Tue, Nov 29, 2022 at 11:13 PM Tom Honermann via SG16
>> <sg16_at_[hidden]> wrote:
>>
>> Corentin, I'm having a hard time understanding what the
>> screenshots included in P2675R0 are showing. The paper
>> doesn't explain (or I missed it) what was done to produce
>> those screenshots. I can't tell what code points the
>> displayed glyphs correspond to. The set of characters
>> displayed seems to differ; the number of rows displayed is
>> not consistent. It doesn't seem to be possible to dive into
>> the data to determine which characters are being rendered
>> differently. Basically, I'm unable to evaluate whether the
>> screenshots support the proposal.
>>
>> It would also be helpful if the screenshots included
>> information about the terminal encoding (almost always UTF-8
>> I would guess/hope) and the font(s) used. I recognize
>> gathering such data could be challenging.
>>
>>
>> Well, given the first collection took many days and involved a
>> large number of people, that may prove challenging.
>> I did intend to have the characters separated by |, but by the
>> time I realized that it was too late.
>> Knowing the fonts used would be even more challenging, as in the
>> absence of a glyph for a codepoint, renderers will try to find some
>> other similar font.
>> East Asian characters will be aligned no matter the font; emoji
>> may or may not be rendered on the grid.
>>
>> The paper contains a script that was used to generate the list of
>> codepoints rendered.
>> I intended to include that list; however, it proved difficult to
>> handle for Google Docs, and I apparently subsequently forgot to
>> put a reference:
>> https://gist.githubusercontent.com/cor3ntin/e5731f77574b146d806e39283e8c7cb7/raw/01d760e7fbb6a56c637a3ce34688e1da286df287/full_width.hpp
>> Feel free to try on your system.
>>
>> The TL;DR is that on a conforming system with sufficient fonts -
>> which is not all systems, in part because Unicode 15 is recent
>> and not all terminals display wide tofu properly - what is
>> proposed is coherent with existing terminal behaviors. And we
>> should not try to paper over the temporary deficiencies of one
>> specific terminal or another.
>
> Thanks, could you please add that kind of description to the next
> revision of the paper? Not for tomorrow obviously.
>
> Having taken a look at the gist linked above, I have to conclude
> that the screenshots currently in the paper are not beneficial.
> Some screenshots show rows from the top of the gist and others
> from the bottom, but I can only tell that by guessing based on
> glyph shape and being locally able to look at the entire gist.
>
> Having a large number of screenshots is not important to me. I
> would be happy to have examples for just the following. But I
> would like screenshots that provide an apples-to-apples comparison
> and that show the entire gist.
>
> * Linux with Konsole.
> * Linux with gnome-terminal.
> * Linux with non-graphical terminal.
> * Windows with cmd.exe.
> * Windows with Windows Terminal.
> * Windows with Putty.
> * macOS with its default terminal.
>
>
> All of these are provided in the paper.
They are present, but they are incomplete and it is not possible for me
to make sense of them.
> You are free to collect your own of course, not everyone is able to
> show every codepoint at once given screen size, etc.
I understand that taking multiple screenshots would likely be required
to show a complete presentation.
> And it's beside the point.
> My point is that the standard currently has arbitrary rules that do
> not follow any existing practice.
Victor obtained the list by looking at existing practice.
> Can you justify the current list, as it is in the standard?
> Can you justify https://godbolt.org/z/McoG64n1v ?
I'm trying to evaluate your proposal against the status quo.
Unfortunately, the paper is not proving all that helpful in doing so.
Tom.
> Tom.
>
>>
>> Tom.
>>
>> On 11/29/22 3:40 PM, Tom Honermann via SG16 wrote:
>>>
>>> SG16 will hold a telecon on Wednesday, November 30th, at
>>> 19:30 UTC (timezone conversion
>>> <https://www.timeanddate.com/worldclock/converter.html?iso=20221130T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>).
>>>
>>> *This message will also serve as your friendly reminder that
>>> this meeting is taking place tomorrow. I'm sorry for
>>> publishing an agenda so very late.*
>>>
>>> *For participants in the USA, please note that daylight
>>> saving time ended 2022-11-06, so this telecon will start
>>> one hour earlier than our last telecon.*
>>>
>>> The agenda follows. We won't get through all of these. These
>>> are all of the NB comments we have left to address. Whatever
>>> we don't get to in this meeting will be scheduled for the
>>> December 14th meeting.
>>>
>>> * P2713R0: Escaping improvements in std::format
>>> <https://wg21.link/p2713r0>
>>> o US 38-098 22.14.6.4p1 [format.string.escaped]
>>> Escaping for debugging and logging
>>> <https://github.com/cplusplus/nbballot/issues/515>
>>> o FR 005-134 22.14.6.4 [format.string.escaped]
>>> Aggressive escaping
>>> <https://github.com/cplusplus/nbballot/issues/408>
>>> * P2693R0: Formatting thread::id and stacktrace
>>> <https://wg21.link/p2693r0>
>>> o FR-008-011 22.14 [format] Support formatting of
>>> thread::id
>>> <https://github.com/cplusplus/nbballot/issues/410>
>>> * FR-010-133 [Bibliography] Unify references to Unicode
>>> <https://github.com/cplusplus/nbballot/issues/412> and
>>> FR-021-013 5.3p5.2 [lex.charset] Codepoint names in
>>> identifiers
>>> <https://github.com/cplusplus/nbballot/issues/423>
>>> * P2675R0: LWG3780: The Paper (format's width estimation
>>> is too approximate and not forward compatible)
>>> <https://wg21.link/p2675r0>
>>> o LWG #3780: format's width estimation is too
>>> approximate and not forward compatible
>>> <https://cplusplus.github.io/LWG/issue3780>
>>> o FR-007-012 22.14.2.2 [format.string.std] codepoints
>>> with width 2
>>> <https://github.com/cplusplus/nbballot/issues/409>
>>> * FR-020-014 5.3 [lex.charset] Replace "translation
>>> character set" by "Unicode"
>>> <https://github.com/cplusplus/nbballot/issues/422>
>>>
>>> P2713R0 <https://wg21.link/p2713r0> (Escaping improvements
>>> in std::format) implements the SG16 proposed resolutions for
>>> US 38-098 <https://github.com/cplusplus/nbballot/issues/515>
>>> (see the 2022-10-19 SG16 meeting summary
>>> <https://github.com/sg16-unicode/sg16-meetings#october-19th-2022>)
>>> and FR 005-134
>>> <https://github.com/cplusplus/nbballot/issues/408> (see the
>>> 2022-11-02 SG16 meeting summary
>>> <https://github.com/sg16-unicode/sg16-meetings#november-2nd-2022>).
>>> We'll review the wording and then poll forwarding to LEWG as
>>> the resolution of the two NB comments.
>>>
>>> Candidate Poll 1: P2713R0: Forward to LEWG as the
>>> recommended resolution of US 38-098 and FR 005-134
>>> [amended to ...].
>>>
>>> P2693R0 <https://wg21.link/p2693r0> (Formatting thread::id
>>> and stacktrace) is intended to resolve FR-008-011
>>> <https://github.com/cplusplus/nbballot/issues/410>. I did
>>> not initially tag this NB comment as needing SG16 review,
>>> but Bryce requested that SG16 take a look, specifically with
>>> regard to narrow vs wide formatting. Bryce has indicated
>>> this paper will need to be approved soon in order for it to
>>> appear in the electronic polling that will be conducted in
>>> January.
>>>
>>> Candidate Poll 2: P2693R0: Forward to LEWG as the
>>> recommended resolution of FR-008-011 [amended to ...].
>>>
>>> FR-010-133
>>> <https://github.com/cplusplus/nbballot/issues/412> and
>>> FR-021-013
>>> <https://github.com/cplusplus/nbballot/issues/423> were
>>> discussed during the 2022-11-02 SG16 meeting
>>> <https://github.com/sg16-unicode/sg16-meetings#november-2nd-2022>
>>> and concluded with a recommendation to discuss with the
>>> project editor the possibility of preferring the Unicode
>>> Standard over ISO/IEC 10646 within the C++ standard. The
>>> project editor approved this direction and we can now move
>>> forward with drafting wording changes. This will require a
>>> paper produced in short order if it is to be accepted for C++23.
>>>
>>> P2675R0 <https://wg21.link/p2675r0> (LWG3780: The Paper
>>> (format's width estimation is too approximate and not
>>> forward compatible)) is intended to resolve LWG #3780
>>> <https://cplusplus.github.io/LWG/issue3780> and FR-007-012
>>> <https://github.com/cplusplus/nbballot/issues/409>. It seeks
>>> to replace the explicit list of code point ranges in
>>> [format.string.std]p12
>>> <https://eel.is/c++draft/format.string.std#12> with wording
>>> that derives substantially the same set of code points using
>>> Unicode database properties.
>>>
>>> Candidate Poll 3.1: P2675R0: Forward to LEWG as the
>>> recommended resolution of FR-007-012 [amended to ...].
>>> Candidate Poll 3.2: P2675R0: Forward to LEWG for C++26
>>> [amended to ...].
>>> Candidate Poll 3.3: Recommend to LEWG that FR-007-012 be
>>> rejected.
>>>
>>> FR-020-014
>>> <https://github.com/cplusplus/nbballot/issues/422> raises
>>> concerns that were discussed as part of the reviews of P2314
>>> <https://wg21.link/p2314> and P2297
>>> <https://wg21.link/p2297> during the 2021-03-24 SG16 meeting
>>> <https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2021.md#march-24th-2021>.
>>> The comment does not appear to present new information. If
>>> we choose to accept, a paper will need to be quickly produced.
>>>
>>> Candidate poll 4.1: Recommend to CWG that FR-020-014 be
>>> accepted.
>>> Candidate poll 4.2: Recommend to CWG that FR-020-014 be
>>> rejected.
>>>
>>> Tom.
>>>
>>>
>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
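
The width estimation debated above can be illustrated with a minimal sketch.
It assumes a small table of code point ranges that are each treated as
occupying two terminal columns, with everything else counted as one column.
The names code_point_range, wide_ranges, code_point_width and text_width are
hypothetical, and the four ranges are an illustrative subset chosen for the
example, not the normative list in [format.string.std]. The sketch also works
on already-decoded UTF-32 text and ignores grapheme clusters, so it is a
simplification of what an implementation would have to do.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical type: an inclusive range of code points.
struct code_point_range {
    char32_t first;
    char32_t last;
};

// Illustrative subset of "wide" ranges (assumption, not the normative list).
constexpr code_point_range wide_ranges[] = {
    {0x1100, 0x115F},    // Hangul Jamo leading consonants
    {0xAC00, 0xD7A3},    // Hangul syllables
    {0x1F300, 0x1F64F},  // pictographs and emoticons
    {0x20000, 0x2FFFD},  // supplementary ideographic plane
};

// Estimated number of terminal columns for a single code point.
constexpr int code_point_width(char32_t cp) {
    for (const auto& r : wide_ranges) {
        if (cp >= r.first && cp <= r.last) {
            return 2;
        }
    }
    return 1;
}

// Estimated width of decoded text: the sum of the per-code-point widths.
constexpr std::size_t text_width(std::u32string_view text) {
    std::size_t total = 0;
    for (char32_t cp : text) {
        total += static_cast<std::size_t>(code_point_width(cp));
    }
    return total;
}
```

A hard-coded table like wide_ranges has to be revised by hand whenever new
wide characters are assigned, whereas a table generated from a Unicode
database property would pick them up when regenerated. That is the forward
compatibility concern named in the title of LWG #3780.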
Received on 2022-11-29 23:09:06