Date: Tue, 22 Oct 2024 14:28:00 -0400
I am still far behind in getting meeting minutes published. My rough
minutes from the 2024-09-25 discussion of P2019R7 are below.
Corentin, I intend to schedule P3258R0 (Formatting of charN_t)
<https://wg21.link/p3258r0> for the 2024-11-05 meeting. Please let me
know if that won't work for you.
- P2019R7: Thread attributes:
- Corentin introduced the paper.
- The thread name is provided to the OS so that it is available for
display in an OS monitor, debugger, process thread list, etc.
- On POSIX systems, the name is stuffed into a small string buffer
provided by the pthreads library and is generally interpreted as
execution encoding.
- Prior to Windows 10, there were unofficial ways to associate a
name with a thread.
- On Windows 10 there is a public interface.
- On Windows, the name must be provided in wchar_t; there is no
"ANSI" version of the interface.
- A previous revision of this paper supported both char and wchar_t.
- A copy of the string will always be needed, so transcoding costs
aren't significant.
- We should support char8_t where we want to support Unicode, but
that isn't proposed in this paper; can add that later.
- The name is interpreted as an NTBS in the execution encoding.
- The name can be transcoded to wchar_t on Windows.
- If mojibake happens, it doesn't affect users of the software.
- Victor: Execution encoding isn't defined in the standard; we have
execution character set. Is this locale encoding?
- Victor: I think we should use a well-defined encoding like the
literal encoding.
- Corentin: We need a solution that works with MultiByteToWideChar()
on Windows; that only works with execution encoding.
- Corentin: We don't require a relationship between literal encoding
and execution encoding; that technically means that conversions don't work.
- Victor: This should be implementation detail.
- Tom: I'm torn; I agree with Victor, but I also think the proposal
reflects existing implementations.
- Tom: On POSIX systems, the string is likely to be interpreted by
other tools using the execution encoding.
- Tom: Should that NTBS be NTMBS?
- Corentin: Probably.
- Corentin: The proposal is consistent with behavior elsewhere in the
standard library.
- Tom: Is the name exposed in the thread class interface?
- Corentin: No; not all platforms expose it; we would have to store
it internally.
- Corentin: On some platforms, retrieving thread properties requires
a thread ID.
- Steve: If the name is not in the right encoding, you'll get broken
behavior in a predictable way.
- Victor: On POSIX, I think we should do what path does and prohibit
transcoding.
- Victor: On Windows, this is just broken because it uses C locale.
- Tom: Does NTBS imply C locale to you?
- Victor: Yes.
- Jens: Execution encoding is not a term in the C++ standard; we have
in [character.seq.general] "the encodings of the execution character
sets ... are locale specific"; we should use this wording.
- Corentin: We probably should define "execution encoding".
- Jens: Not in this paper.
- Jens: We should specify which locale we mean here.
- Corentin: No, not in each place where we refer to the execution
encoding.
- Jens: In the recent exception class discussions, we determined that
the encodings correspond to the C locale; we should be more specific here.
- Corentin: If we just say NTMBS, then we get the right result.
- Tom: And that gets us C locale.
- Jens: For exception classes we say NTBS with a carve out for NTMBS;
do we want that here?
- Tom: If we could, I think we would prefer to require NTMBS for the
exception classes.
- Jens: <referring to exception class wording>; wording directs to
codecvt and thus C++ locale.
- Jens: So, do we want C or C++ locale here? It looks like NTMBS
comes in two flavors; we should be clear on the semantics.
- Jens: <referring to a link provided by Victor> regarding
SetThreadDescription(); there is a technique that involves use of a char
string.
- Corentin: That only works when a debugger is attached; implementors
won't use that technique.
- Jens: We need to decide what exactly we want this interface to be
compatible with.
- Jens: The level of complexity here is much less than for paths.
- Corentin: Can the C and C++ locales diverge?
- Jens: Yes.
- Eddie: This is similar to path; why not have the name_hint
constructor behave like path where we can provide the string in multiple
encodings.
- Corentin: Path is unique since filesystem encodings may differ from
the other encodings; it is more complicated.
- Corentin: We should avoid exposing programmers to encoding concerns
where we don't need to.
- Corentin: I removed char8_t support because I thought it would
increase consensus; if SG16 wants to re-introduce char8_t or require
literal encoding, I'm ok with that.
- Eddie: What I meant is that we could use the native()
implementation-defined character type.
- Corentin: I want to be able to pass an ordinary string literal and
have it work everywhere.
- Victor: I agree that we don't want most of the complexity of path.
However, path is a good abstract model of what we want here.
- Victor: On POSIX, the bytes should just be passed as is; a binary
identifier could be used if desired.
- Victor: NTMBS means you can't use std::format to produce the thread
name.
- Corentin: Would it increase consensus to require the ordinary
literal encoding for C++26?
- Victor: Yes.
- Jens: What does sprintf do?
- Tom: It uses the execution encoding so that special characters in
trailing code units are not misinterpreted.
- Jens: We don't require the literal encoding to match the execution
encoding though that might be a design bug.
- Jens: Though format is different, I don't think we should spring a
new encoding requirement on programmers here.
- Jens: We could specify an NTBS for POSIX and prohibit
conversions, add wchar_t for Windows, and then have portability issues.
- Corentin: I think printf() is the model to follow.
- Jens: Then make it an NTMBS and reference the C locale.
- Jens: The only conversion concern we have is which conversion
function is to be called on Windows.
- Corentin: The separation of the C and C++ locales is new
information to me.
- Tom: We could restrict the characters used to the basic literal
character set.
- Victor: I agree with the use of NTBS on POSIX, but I think NTMBS is
wrong for Windows as it is incompatible with everything.
- Poll 2: P2019R7: Name hint should be provided in the ordinary
literal encoding.
- Attendees: 8
- SF F N A SA
3 1 3 1 0
- Weak consensus.
- Poll 3: P2019R7: Name hint should be provided as an NTMBS in the C
locale encoding.
- Attendees: 8
- SF F N A SA
1 5 1 0 1
- Consensus.
- Victor: I don't think we're done with this paper; use of
string_view might be problematic.
- Jens: It is called name "hint" for a reason.
- Jens: If there are embedded nulls or the name gets truncated, tough luck.
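As a rough illustration of the direction favored by Poll 3, the sketch below shows one way an implementation might prepare a name hint before handing it to the OS: cut the string_view at the first embedded null, truncate it to a small buffer size, and widen the result as an NTMBS in the current C locale. The helper names clamp_name_hint and widen_name_hint are hypothetical and do not come from P2019R7. The use of std::mbsrtowcs is an assumption standing in for whichever conversion function an implementation would call on Windows, which the discussion left open. The limit of 15 bytes used in the usage note reflects the Linux pthread_setname_np() limit of 16 bytes including the terminator; other platforms differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <string_view>

// Hypothetical helper (not part of P2019R7): cut a name hint at the first
// embedded null and truncate it to at most max_bytes bytes.
// Assumption: a byte-wise cut is acceptable for a "hint"; it may split a
// multibyte character at the truncation point.
inline std::string clamp_name_hint(std::string_view hint, std::size_t max_bytes) {
    assert(max_bytes > 0);
    // find() returns npos when there is no embedded null; substr(0, npos)
    // then keeps the whole view.
    hint = hint.substr(0, hint.find('\0'));
    return std::string(hint.substr(0, max_bytes));
}

// Hypothetical helper (not part of P2019R7): interpret the bytes as an NTMBS
// in the current C locale and convert them to wchar_t.
// Returns an empty string if the input is not valid in that locale.
inline std::wstring widen_name_hint(const std::string& ntmbs) {
    std::mbstate_t state{};
    const char* src = ntmbs.c_str();
    // First pass: a null destination only counts the wide characters needed.
    const std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1)) {
        return {};
    }
    std::wstring out(len, L'\0');
    state = std::mbstate_t{};
    src = ntmbs.c_str();
    // Second pass: write exactly len wide characters into the buffer.
    std::mbsrtowcs(out.data(), &src, out.size(), &state);
    return out;
}
```

For example, clamp_name_hint("worker-thread-with-long-name", 15) keeps only the first 15 bytes, and a hint containing an embedded null is cut at that null. A characteristic of this approach is that the result depends on the C locale in effect at the time of the call, which is the trade-off discussed above.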
Tom.
On 10/22/24 2:24 PM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a meeting *tomorrow*, Wednesday, October 23rd, at 19:30
> UTC (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20241023T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cest>).
>
> The agenda follows.
>
> * P3374R0: Adding formatter for fpos<mbstate_t>
> <https://wg21.link/p3374r0>
> * P2019R7: Thread attributes <https://wg21.link/p2019r7>
>
> P3374R0 comes to us courtesy of Liang Jiaming and seeks to add
> formatting support for std::fpos<std::mbstate_t>. SG16 has not
> previously reviewed this proposal, but previous discussion occurred on
> the std-proposals mailing list in the thread starting here
> <https://lists.isocpp.org/std-proposals/2024/07/10667.php> and on the
> SG16 mailing list in a thread starting here
> <https://lists.isocpp.org/sg16/2024/08/4415.php>. Please try to review
> those before the meeting (and I again apologize for such late notice).
> std::mbstate_t is an implementation-defined type, so we can't
> legislate much about its representation in formatted output, but we
> can provide guidance, particularly with regard to whether formatted
> values of the type should be sufficient to reconstruct values in a
> hypothetical scanner such as that proposed by P1729 (Text Parsing)
> <https://wg21.link/p1729>.
>
> P2019R7 was reviewed by SG16 during the 2024-09-25 meeting in which
> the following polls were taken:
>
> * Poll 2: P2019R7: Name hint should be provided in the ordinary
> literal encoding.
> o Attendees: 8
> o SF F N A SA
> 3 1 3 1 0
>
> o Weak consensus.
> * Poll 3: P2019R7: Name hint should be provided as an NTMBS in the C
> locale encoding.
> o Attendees: 8
> o SF F N A SA
> 1 5 1 0 1
>
> o Consensus
>
> The guidance provided regarding encoding of the name hint is fairly
> clear despite some continued opposition, so new information should be
> provided in order to reopen that discussion. Continued discussion is
> warranted for other concerns raised regarding POSIX vs Windows
> platforms and the use of std::string_view. We'll focus first on those
> topics and then any other concerns that are raised.
>
> Tom.
>
>
Received on 2024-10-22 18:28:03