I am still far behind in getting meeting minutes published. My rough minutes from the 2024-09-25 discussion of P2019R7 are below.
Corentin, I intend to schedule P3258R0 (Formatting of charN_t)
for the 2024-11-05 meeting. Please let me know if that won't work
for you.
- P2019R7: Thread attributes:
- Corentin introduced the paper.
- The thread name is provided to the OS so that it is
available for display in an OS monitor, debugger, process thread
list, etc.
- On POSIX systems, the name is stuffed into a small string
buffer provided by the pthreads library and is generally
interpreted as execution encoding.
- Prior to Windows 10, there were unofficial ways to associate
a name with a thread.
- On Windows 10 there is a public interface.
- On Windows, the name must be provided in wchar_t; there is
no "ANSI" version of the interface.
- A previous revision of this paper supported both char and
wchar_t.
- A copy of the string will always be needed, so transcoding
costs aren't significant.
- We should support char8_t where we want to support Unicode,
but that isn't proposed in this paper; can add that later.
- The name is interpreted as an NTBS in the execution
encoding.
- The name can be transcoded to wchar_t on Windows.
- If mojibake happens, it doesn't affect users of the
software.
- Victor: Execution encoding isn't defined in the standard; we
have execution character set. Is this locale encoding?
- Victor: I think we should use a well-defined encoding like the
literal encoding.
- Corentin: We need a solution that works with
MultiByteToWideChar() on Windows; that only works with execution
encoding.
- Corentin: We don't require a relationship between literal
encoding and execution encoding; that technically means that
conversions don't work.
- Victor: This should be implementation detail.
- Tom: I'm torn, I agree with Victor, but I also think the
proposal reflects existing implementations.
- Tom: On POSIX systems, the string is likely to be interpreted
by other tools using the execution encoding.
- Tom: Should that NTBS be NTMBS?
- Corentin: Probably.
- Corentin: The proposal is consistent with behavior elsewhere
in the standard library.
- Tom: Is the name exposed in the thread class interface?
- Corentin: No; not all platforms expose it; we would have to
store it internally.
- Corentin: On some platforms, retrieving thread properties
requires a thread ID.
- Steve: If the name is not in the right encoding, you'll get
broken behavior in a predictable way.
- Victor: On POSIX, I think we should do what path does and
prohibit transcoding.
- Victor: On Windows, this is just broken because it uses C
locale.
- Tom: Does NTBS imply C locale to you?
- Victor: Yes.
- Jens: Execution encoding is not a term in the C++ standard; we
have in [character.seq.general] "the encodings of the execution
character sets ... are locale specific"; we should use this
wording.
- Corentin: We probably should define "execution encoding".
- Jens: Not in this paper.
- Jens: We should specify which locale we mean here.
- Corentin: No, not in each place where we refer to the
execution encoding.
- Jens: In the recent exception class discussions, we determined
that the encodings correspond to the C locale; we should be more
specific here.
- Corentin: If we just say NTMBS, then we get the right result.
- Tom: And that gets us C locale.
- Jens: For exception classes we say NTBS with a carve out for
NTMBS; do we want that here?
- Tom: If we could, I think we would prefer to require NTMBS for
the exception classes.
- Jens: <referring to exception class wording>; wording
directs to codecvt and thus C++ locale.
- Jens: So, do we want C or C++ locale here? It looks like NTMBS
comes in two flavors; we should be clear on the semantics.
- Jens: <referring to a link provided by Victor> regarding
SetThreadDescription(); there is a technique that involves use of a
char string.
- Corentin: That only works when a debugger is attached;
implementors won't use that technique.
- Jens: We need to decide what exactly we want this interface to
be compatible with.
- Jens: The level of complexity here is much less than for
paths.
- Corentin: Can the C and C++ locales diverge?
- Jens: Yes.
- Eddie: This is similar to path; why not have the name_hint
constructor behave like path where we can provide the string in
multiple encodings.
- Corentin: Path is unique since filesystem encodings may differ
from the other encodings; it is more complicated.
- Corentin: We should avoid exposing programmers to encoding
concerns where we don't need to.
- Corentin: I removed char8_t support because I thought it would
increase consensus; if SG16 wants to re-introduce char8_t or
require literal encoding, I'm ok with that.
- Eddie: What I meant is that we could use the native()
implementation-defined character type.
- Corentin: I want to be able to pass an ordinary string literal
and have it work everywhere.
- Victor: I agree that we don't want most of the complexity of
path. However, path is a good abstract model of what we want here.
- Victor: On POSIX, the bytes should just be passed as is; a
binary identifier could be used if desired.
- Victor: NTMBS means you can't use std::format to produce the
thread name.
- Corentin: Would it increase consensus to require the ordinary
literal encoding for C++26?
- Victor: Yes.
- Jens: What does sprintf do?
- Tom: It uses the execution encoding so that special characters
in trailing code units are not misinterpreted.
- Jens: We don't require the literal encoding to match the
execution encoding though that might be a design bug.
- Jens: Though format is different, I don't think we should
spring a new encoding requirement on programmers here.
- Jens: We could specify an NTBS for POSIX and prohibit
conversions, add wchar_t for Windows, and then have portability
issues.
- Corentin: I think printf() is the model to follow.
- Jens: Then make it an NTMBS and reference the C locale.
- Jens: The only conversion concern we have is which conversion
function is to be called on Windows.
- Corentin: The separation of the C and C++ locales is new
information to me.
- Tom: We could restrict the characters used to the basic
literal character set.
- Victor: I agree with the use of NTBS on POSIX, but I think
NTMBS is wrong for Windows as it is incompatible with everything.
- Poll 2: P2019R7: Name hint should be provided in the ordinary
literal encoding.
- Attendees: 8
- SF F N A SA
3 1 3 1 0
- Weak consensus.
- Poll 3: P2019R7: Name hint should be provided as an NTMBS in
the C locale encoding.
- Attendees: 8
- SF F N A SA
1 5 1 0 1
- Consensus.
- Victor: I don't think we're done with this paper; use of
string_view might be problematic.
- Jens: It is called name "hint" for a reason.
- Jens: If there are embedded nulls or the name gets truncated,
tough luck.
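The string_view concern above can be illustrated with a minimal sketch. The helper name clamp_name_hint and the default limit of 15 bytes are assumptions made for illustration (15 matches the Linux pthread name buffer of 16 bytes including the terminator); neither comes from P2019R7, and an implementation may handle these cases differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of P2019R7: reduce an arbitrary
// string_view to a name hint of at most max_len bytes that can be
// passed to an OS interface expecting a null-terminated string.
std::string clamp_name_hint(std::string_view name, std::size_t max_len = 15) {
    // A string_view may contain embedded nulls; an OS interface that
    // takes a null-terminated string only sees what precedes the first.
    name = name.substr(0, name.find('\0'));

    // Truncate by bytes. In a multibyte encoding this can split a
    // character, which is one reason the name is only a "hint".
    name = name.substr(0, max_len);

    std::string hint(name);
    assert(hint.size() <= max_len);
    return hint;
}
```

With this sketch, a 23-byte name is cut to its first 15 bytes and a name with an embedded null is cut at the null. Whether truncation should respect character boundaries was not settled in the discussion.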
Tom.
SG16 will hold a meeting tomorrow, Wednesday, October 23rd, at 19:30 UTC.
The agenda follows.
P3374R0 comes to us courtesy of Liang Jiaming and seeks to add formatting support for std::fpos<std::mbstate_t>. SG16 has not previously reviewed this proposal, but previous discussion occurred on the std-proposals mailing list and on the SG16 mailing list. Please try to review those threads before the meeting (and I again apologize for such late notice). std::mbstate_t is an implementation-defined type, so we can't legislate much about its representation in formatted output, but we can provide guidance, particularly on whether formatted values of the type should be sufficient to reconstruct values in a hypothetical scanner such as that proposed by P1729 (Text Parsing).
P2019R7 was reviewed by SG16 during the 2024-09-25 meeting in which the following polls were taken:
- Poll 2: P2019R7: Name hint should be provided in the ordinary literal encoding.
- Attendees: 8
- SF F N A SA
3 1 3 1 0
- Weak consensus.
- Poll 3: P2019R7: Name hint should be provided as an NTMBS in the C locale encoding.
- Attendees: 8
- SF F N A SA
1 5 1 0 1
- Consensus
The guidance provided regarding encoding of the name hint is fairly clear despite some continued opposition, so new information should be provided in order to reopen that discussion. Continued discussion is warranted for other concerns raised regarding POSIX vs Windows platforms and the use of std::string_view. We'll focus first on those topics and then any other concerns that are raised.
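As a rough illustration of what Poll 3 implies for a platform with a wchar_t-only interface, the following sketch converts an NTMBS to a wide string using the current C locale. The function name widen_name_hint is hypothetical, and the use of std::mbsrtowcs is an assumption; which conversion function an implementation would call on Windows was left open in the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <optional>
#include <string>

// Hypothetical helper, not part of P2019R7: convert an NTMBS,
// interpreted in the current C locale (as set by std::setlocale),
// to a wide string. Returns nullopt if the input is not a valid
// multibyte sequence in that locale.
std::optional<std::wstring> widen_name_hint(const char* ntmbs) {
    assert(ntmbs != nullptr);

    // First pass: with a null destination, mbsrtowcs only counts the
    // wide characters needed (excluding the terminator).
    std::mbstate_t state{};
    const char* src = ntmbs;
    const std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1)) {
        return std::nullopt;  // invalid sequence for the C locale
    }

    // Second pass: convert into a buffer of exactly that size.
    std::wstring wide(len, L'\0');
    src = ntmbs;
    state = std::mbstate_t{};
    std::mbsrtowcs(wide.data(), &src, len, &state);
    return wide;
}
```

On a platform like Windows, the resulting wide string could then be handed to the thread naming interface; on POSIX no such conversion would be needed since the bytes can be passed through as given.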
Tom.