SG16 will hold a meeting today, Wednesday, January 22nd,
at 19:30 UTC (timezone conversion). That is 11:30am
PST, 12:30pm MST, 1:30pm CST, 2:30pm EST, and 20:30 CET.
Sorry for the delayed scheduling. I've almost unburied myself to the point that I'll be able to devote a reasonable amount of time to SG16 again.
I added this meeting to the shared calendar just a few minutes
ago. If you need a .ics file to
import into your calendar, you can download it here.
The agenda follows.
I normally schedule SG16 meetings for the 2nd and 4th Wednesdays of each month. The Hagenberg meeting is the 2nd week of February (the 10th through the 15th). I'm not planning to hold an in-person SG16 meeting during the Hagenberg meeting because it is too difficult to get quorum. I'd like to get one more meeting scheduled before Hagenberg, so either January 29th or February 5th. I have a slight preference for February 5th and will schedule for that date unless there are requests for the January 29th date instead. Following Hagenberg, we can resume our normal meeting cadence starting on February 26th. If you have thoughts on this, please share.
We last discussed P2019 during the 2024-09-25
SG16 meeting where we had polls that demonstrated consensus
for two different options for the encoding of thread names, but
with neither option being a clear winner. I still haven't
published proper minutes for that meeting, but see the summary
posted in the GitHub tracker. In close calls like this, we
usually defer to the author. Corentin and Victor discussed offline
and reached an agreement on use of the ordinary literal encoding.
Corentin will present and we'll hopefully find agreement on a
solution and forward the paper.
I intentionally kept the agenda rather light for this meeting
given the late notice and anticipation that there will still be
plenty to discuss regarding literal encodings, NTBS, NTMBS, and C
vs C++ vs system/environment locales.
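For anyone who wants a concrete picture of the C vs C++ locale distinction mentioned above, here is a minimal sketch. The names `LocaleNames` and `current_locale_names` are hypothetical and exist only for illustration; the sketch assumes nothing beyond the standard `<clocale>` and `<locale>` facilities and is not taken from any paper.

```cpp
#include <cassert>
#include <clocale>
#include <locale>
#include <string>

// Hypothetical helper type: the names of the two locales a program carries.
struct LocaleNames {
    std::string c_locale;    // the C locale, as managed by std::setlocale()
    std::string cpp_locale;  // the C++ global locale, as managed by std::locale::global()
};

// Hypothetical helper: query both locale names without changing either one.
LocaleNames current_locale_names() {
    // Passing a null pointer queries the current C locale instead of setting it.
    const char* c_name = std::setlocale(LC_ALL, nullptr);
    // A default-constructed std::locale is a copy of the C++ global locale.
    return LocaleNames{c_name ? c_name : "", std::locale().name()};
}
```

Calling `std::setlocale(LC_ALL, "")` changes only the C locale, so a later call to `current_locale_names()` still reports "C" for the C++ global locale; the reported C locale name depends on the environment. Going the other way, `std::locale::global()` with a named locale also updates the C locale.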
My rough notes from the 2024-09-25 discussion of P2019R7 are below for reference.
- P2019R7: Thread attributes:
- Corentin introduced the paper.
- The thread name is provided to the OS so that it is available for display in an OS monitor, debugger, process thread list, etc.
- On POSIX systems, the name is stuffed into a small string buffer provided by the pthreads library and is generally interpreted as execution encoding.
- Prior to Windows 10, there were unofficial ways to associate a name with a thread.
- On Windows 10 there is a public interface.
- On Windows, the name must be provided in wchar_t; there is no "ANSI" version of the interface.
- A previous revision of this paper supported both char and wchar_t.
- A copy of the string will always be needed, so transcoding costs aren't significant.
- We should support char8_t where we want to support Unicode, but that isn't proposed in this paper; can add that later.
- The name is interpreted as an NTBS in the execution encoding.
- The name can be transcoded to wchar_t on Windows.
- If mojibake happens, it doesn't affect users of the software.
- Victor: Execution encoding isn't defined in the standard; we have execution character set. Is this locale encoding?
- Victor: I think we should use a well-defined encoding like the literal encoding.
- Corentin: We need a solution that works with MultiByteToWideChar() on Windows; that only works with execution encoding.
- Corentin: We don't require a relationship between literal encoding and execution encoding; that technically means that conversions don't work.
- Victor: This should be implementation detail.
- Tom: I'm torn, I agree with Victor, but I also think the proposal reflects existing implementations.
- Tom: On POSIX systems, the string is likely to be interpreted by other tools using the execution encoding.
- Tom: Should that NTBS be NTMBS?
- Corentin: Probably.
- Corentin: The proposal is consistent with behavior elsewhere in the standard library.
- Tom: Is the name exposed in the thread class interface?
- Corentin: No; not all platforms expose it; we would have to store it internally.
- Corentin: On some platforms, retrieving thread properties requires a thread ID.
- Steve: If the name is not in the right encoding, you'll get broken behavior in a predictable way.
- Victor: On POSIX, I think we should do what path does and prohibit transcoding.
- Victor: On Windows, this is just broken because it uses C locale.
- Tom: Does NTBS imply C locale to you?
- Victor: Yes.
- Jens: Execution encoding is not a term in the C++ standard; we have in [character.seq.general] "the encodings of the execution character sets ... are locale specific"; we should use this wording.
- Corentin: We probably should define "execution encoding".
- Jens: Not in this paper.
- Jens: We should specify which locale we mean here.
- Corentin: No, not in each place where we refer to the execution encoding.
- Jens: In the recent exception class discussions, we determined that the encodings correspond to the C locale; we should be more specific here.
- Corentin: If we just say NTMBS, then we get the right result.
- Tom: And that gets us C locale.
- Jens: For exception classes we say NTBS with a carve out for NTMBS; do we want that here?
- Tom: If we could, I think we would prefer to require NTMBS for the exception classes.
- Jens: <referring to exception class wording>; wording directs to codecvt and thus C++ locale.
- Jens: So, do we want C or C++ locale here? It looks like NTMBS comes in two flavors; we should be clear on the semantics.
- Jens: <referring to a link provided by Victor> regarding SetThreadDescription(); there is a technique that involves use of a char string.
- Corentin: That only works when a debugger is attached; implementors won't use that technique.
- Jens: We need to decide what exactly we want this interface to be compatible with.
- Jens: The level of complexity here is much less than for paths.
- Corentin: Can the C and C++ locales diverge?
- Jens: Yes.
- Eddie: This is similar to path; why not have the name_hint constructor behave like path where we can provide the string in multiple encodings.
- Corentin: Path is unique since filesystem encodings may differ from the other encodings; it is more complicated.
- Corentin: We should avoid exposing programmers to encoding concerns where we don't need to.
- Corentin: I removed char8_t support because I thought it would increase consensus; if SG16 wants to re-introduce char8_t or require literal encoding, I'm ok with that.
- Eddie: What I meant is that we could use the native() implementation-defined character type.
- Corentin: I want to be able to pass an ordinary string literal and have it work everywhere.
- Victor: I agree that we don't want most of the complexity of path. However, path is a good abstract model of what we want here.
- Victor: On POSIX, the bytes should just be passed as is; a binary identifier could be used if desired.
- Victor: NTMBS means you can't use std::format to produce the thread name.
- Corentin: Would it increase consensus to require the ordinary literal encoding for C++26?
- Victor: Yes.
- Jens: What does sprintf do?
- Tom: It uses the execution encoding so that special characters in trailing code units are not misinterpreted.
- Jens: We don't require the literal encoding to match the execution encoding though that might be a design bug.
- Jens: Though format is different, I don't think we should spring a new encoding requirement on programmers here.
- Jens: We could specify an NTBS for POSIX and prohibit conversions, add wchar_t for Windows, and then have portability issues.
- Corentin: I think printf() is the model to follow.
- Jens: Then make it an NTMBS and reference the C locale.
- Jens: The only conversion concern we have is which conversion function is to be called on Windows.
- Corentin: The separation of the C and C++ locales is new information to me.
- Tom: We could restrict the characters used to the basic literal character set.
- Victor: I agree with the use of NTBS on POSIX, but I think NTMBS is wrong for Windows as it is incompatible with everything.
- Poll 2: P2019R7: Name hint should be provided in the ordinary literal encoding.
- Attendees: 8
- SF: 3, F: 1, N: 3, A: 1, SA: 0
- Weak consensus.
- Poll 3: P2019R7: Name hint should be provided as an NTMBS in the C locale encoding.
- Attendees: 8
- SF: 1, F: 5, N: 1, A: 0, SA: 1
- Consensus.
- Victor: I don't think we're done with this paper; use of string_view might be problematic.
- Jens: It is called name "hint" for a reason.
- Jens: If there are embedded nulls or the name gets truncated, tough luck.
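As a rough illustration of the two platform paths discussed in the notes above, here is a sketch under stated assumptions; it is not the paper's wording or any implementation's code. The helper names `clamp_name_hint` and `widen_name_hint` are hypothetical. The 15-byte limit is an assumption that matches Linux, where a thread name occupies 16 bytes including the terminator; other POSIX systems differ. The widening helper uses `std::mbsrtowcs()`, which converts according to the current C locale, as a portable stand-in for the `MultiByteToWideChar()` call mentioned in the discussion. C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <optional>
#include <string>
#include <string_view>

// Assumed limit for illustration: 15 bytes of name plus a terminator.
inline constexpr std::size_t kMaxNameBytes = 15;

// Hypothetical helper: what would reach a small pthread-style name buffer.
std::string clamp_name_hint(std::string_view hint,
                            std::size_t max_bytes = kMaxNameBytes) {
    // An embedded null ends the name early, as it would for any NTBS consumer.
    hint = hint.substr(0, hint.find('\0'));
    // Truncation is byte-wise, so it may split a multibyte character.
    return std::string(hint.substr(0, max_bytes));
}

// Hypothetical helper: interpret the name as an NTMBS in the current C locale
// encoding and convert it to wchar_t. Returns std::nullopt if the bytes are
// not valid in that encoding.
std::optional<std::wstring> widen_name_hint(const char* name) {
    std::mbstate_t state{};
    const char* src = name;
    // With a null destination, mbsrtowcs only counts the wide characters needed.
    const std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1)) {
        return std::nullopt;
    }
    std::wstring wide(len, L'\0');
    state = std::mbstate_t{};
    src = name;
    // Second pass performs the conversion into the sized buffer.
    std::mbsrtowcs(wide.data(), &src, len, &state);
    return wide;
}
```

The first helper shows why the name is only a hint: a long or null-containing name is silently shortened. The second shows that an NTMBS conversion depends on whatever C locale is active when the conversion runs, which is the concern raised about the C locale option.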
Tom.