ISOCPP sg16 List: Re: [isocpp-sg16] Agenda for the 2025-01-22 SG16 meeting

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 22 Jan 2025 14:29:16 -0500

This meeting is happening now. Be there or be □.

Tom.

On 1/22/25 1:28 PM, Tom Honermann via SG16 wrote:
>
> Reminder that this meeting is happening in one hour.
>
> Tom.
>
> On 1/22/25 12:33 AM, Tom Honermann via SG16 wrote:
>>
>> SG16 will hold a meeting *today*, Wednesday, January 22nd, at 19:30
>> UTC (timezone conversion
>> <https://www.timeanddate.com/worldclock/converter.html?iso=20250122T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>).
>> That is 11:30am PST, 12:30pm MST, 1:30pm CST, 2:30pm EST, and 20:30 CET.
>>
>> Sorry for the delayed scheduling. I've almost unburied myself to the
>> point that I'll be able to devote a reasonable amount of time to SG16
>> again.
>>
>> I added this meeting to the shared calendar just a few minutes ago.
>> If you need a .ics file to import into your calendar, you can
>> download it here
>> <https://documents.isocpp.org/remote.php/dav/public-calendars/R7imgS2LJD9xfeWN/EE81D0AF-4216-4365-89BD-181D7A0D6DB2.ics?export>.
>>
>> The agenda follows.
>>
>> * Decide on a meeting schedule for before/after the Hagenberg meeting.
>> * P2019R7: Thread attributes <https://wg21.link/p2019>
>>
>> I normally schedule SG16 meetings for the 2nd and 4th Wednesdays of
>> each month. The Hagenberg meeting is the 2nd week of February (the
>> 10th through the 15th). I'm not planning to hold an in-person SG16
>> meeting during the Hagenberg meeting because it is too difficult to
>> get quorum. I'd like to get one more meeting scheduled before
>> Hagenberg, so either January 29th or February 5th. I have a slight
>> preference for February 5th and will schedule for that date unless
>> there are requests for the January 29th date instead. Following
>> Hagenberg, we can resume our normal meeting cadence starting on
>> February 26th. If you have thoughts on this, please share.
>>
>> We last discussed P2019 during the 2024-09-25 SG16 meeting
>> <https://github.com/sg16-unicode/sg16-meetings/tree/master#september-25th-2024>
>> where we had polls that demonstrated consensus for two different
>> options for the encoding of thread names, but with neither option
>> being a clear winner. I still haven't published proper minutes for
>> that meeting, but see the summary posted in the github tracker
>> <https://github.com/cplusplus/papers/issues/817#issuecomment-2482513214>.
>> In close calls like this, we usually defer to the author. Corentin
>> and Victor discussed offline and reached an agreement on use of the
>> ordinary literal encoding. Corentin will present and we'll hopefully
>> find agreement on a solution and forward the paper.
>>
>> I intentionally kept the agenda rather light for this meeting given
>> the late notice and anticipation that there will still be plenty to
>> discuss regarding literal encodings, NTBS, NTMBS, and C vs C++ vs
>> system/environment locales.
>>
>> My rough notes from the 2024-09-25 discussion of P2019R7 are below
>> for reference.
>>
>> - P2019R7: Thread attributes:
>> - Corentin introduced the paper.
>> - The thread name is provided to the OS so that it is
>> available for display in an OS monitor, debugger, process thread
>> list, etc...
>> - On POSIX systems, the name is stuffed into a small string
>> buffer provided by the pthreads library and is generally
>> interpreted as execution encoding.
>> - Prior to Windows 10, there were unofficial ways to
>> associate a name with a thread.
>> - On Windows 10 there is a public interface.
>> - On Windows, the name must be provided in wchar_t; there is
>> no "ANSI" version of the interface.
>> - A previous revision of this paper supported both char and
>> wchar_t.
>> - A copy of the string will always be needed, so transcoding
>> costs aren't significant.
>> - We should support char8_t where we want to support Unicode,
>> but that isn't proposed in this paper; can add that later.
>> - The name is interpreted as an NTBS in the execution encoding.
>> - The name can be transcoded to wchar_t on Windows.
>> - If mojibake happens, it doesn't affect users of the software.
>> - Victor: Execution encoding isn't defined in the standard; we
>> have execution character set. Is this locale encoding?
>> - Victor: I think we should use a well-defined encoding like
>> the literal encoding.
>> - Corentin: We need a solution that works with
>> MultiByteToWideChar() on Windows; that only works with execution
>> encoding.
>> - Corentin: We don't require a relationship between literal
>> encoding and execution encoding; that technically means that
>> conversions don't work.
>> - Victor: This should be implementation detail.
>> - Tom: I'm torn, I agree with Victor, but I also think the
>> proposal reflects existing implementations.
>> - Tom: On POSIX systems, the string is likely to be interpreted
>> by other tools using the execution encoding.
>> - Tom: Should that NTBS be NTMBS?
>> - Corentin: Probably.
>> - Corentin: The proposal is consistent with behavior elsewhere
>> in the standard library.
>> - Tom: Is the name exposed in the thread class interface?
>> - Corentin: No; not all platforms expose it; we would have to
>> store it internally.
>> - Corentin: On some platforms, retrieving thread properties
>> requires a thread ID.
>> - Steve: If the name is not in the right encoding, you'll get
>> broken behavior in a predictable way.
>> - Victor: On POSIX, I think we should do what path does and
>> prohibit transcoding.
>> - Victor: On Windows, this is just broken because it uses C locale.
>> - Tom: Does NTBS imply C locale to you?
>> - Victor: Yes.
>> - Jens: Execution encoding is not a term in the C++ standard;
>> we have in [character.seq.general] "the encodings of the
>> execution character sets ... are locale specific"; we should use
>> this wording.
>> - Corentin: We probably should define "execution encoding".
>> - Jens: Not in this paper.
>> - Jens: We should specify which locale we mean here.
>> - Corentin: No, not in each place where we refer to the
>> execution encoding.
>> - Jens: In the recent exception class discussions, we
>> determined that the encodings correspond to the C locale; we
>> should be more specific here.
>> - Corentin: If we just say NTMBS, then we get the right result.
>> - Tom: And that gets us C locale.
>> - Jens: For exception classes we say NTBS with a carve out for
>> NTMBS; do we want that here?
>> - Tom: If we could, I think we would prefer to require NTMBS
>> for the exception classes.
>> - Jens: <referring to exception class wording>; wording directs
>> to codecvt and thus C++ locale.
>> - Jens: So, do we want C or C++ locale here? It looks like
>> NTMBS comes in two flavors; we should be clear on the semantics.
>> - Jens: <referring to a link provided by Victor> regarding
>> SetThreadDescription(); there is a technique that involves use of
>> a char string.
>> - Corentin: That only works when a debugger is attached;
>> implementors won't use that technique.
>> - Jens: We need to decide what exactly we want this interface
>> to be compatible with.
>> - Jens: The level of complexity here is much less than for paths.
>> - Corentin: Can the C and C++ locales diverge?
>> - Jens: Yes.
>> - Eddie: This is similar to path; why not have the name_hint
>> constructor behave like path where we can provide the string in
>> multiple encodings.
>> - Corentin: Path is unique since filesystem encodings may
>> differ from the other encodings; it is more complicated.
>> - Corentin: We should avoid exposing programmers to encoding
>> concerns where we don't need to.
>> - Corentin: I removed char8_t support because I thought it
>> would increase consensus; if SG16 wants to re-introduce char8_t
>> or require literal encoding, I'm ok with that.
>> - Eddie: What I meant is that we could use the native()
>> implementation-defined character type.
>> - Corentin: I want to be able to pass an ordinary string
>> literal and have it work everywhere.
>> - Victor: I agree that we don't want most of the complexity of
>> path. However, path is a good abstract model of what we want here.
>> - Victor: On POSIX, the bytes should just be passed as is; a
>> binary identifier could be used if desired.
>> - Victor: NTMBS means you can't use std::format to produce the
>> thread name.
>> - Corentin: Would it increase consensus to require the ordinary
>> literal encoding for C++26?
>> - Victor: Yes.
>> - Jens: What does sprintf do?
>> - Tom: It uses the execution encoding so that special
>> characters in trailing code units are not misinterpreted.
>> - Jens: We don't require the literal encoding to match the
>> execution encoding though that might be a design bug.
>> - Jens: Though format is different, I don't think we should
>> spring a new encoding requirement on programmers here.
>> - Jens: We could specify an NTBS for POSIX and prohibit
>> conversions, add wchar_t for Windows, and then have portability
>> issues.
>> - Corentin: I think printf() is the model to follow.
>> - Jens: Then make it an NTMBS and reference the C locale.
>> - Jens: The only conversion concern we have is which conversion
>> function is to be called on Windows.
>> - Corentin: The separation of the C and C++ locales is new
>> information to me.
>> - Tom: We could restrict the characters used to the basic
>> literal character set.
>> - Victor: I agree with the use of NTBS on POSIX, but I think
>> NTMBS is wrong for Windows as it is incompatible with everything.
>> - Poll 2: P2019R7: Name hint should be provided in the ordinary
>> literal encoding.
>> - Attendees: 8
>> - SF  F  N  A  SA
>>    3  1  3  1   0
>> - Weak consensus.
>> - Poll 3: P2019R7: Name hint should be provided as an NTMBS in
>> the C locale encoding.
>> - Attendees: 8
>> - SF  F  N  A  SA
>>    1  5  1  0   1
>> - Consensus.
>> - Victor: I don't think we're done with this paper; use of
>> string_view might be problematic.
>> - Jens: It is called name "hint" for a reason.
>> - Jens: If there are embedded nulls or the name gets truncated, tough luck.
>>
>> Tom.
>>
>>
>
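
For readers following the NTMBS option from Poll 3, here is a rough sketch of what "transcode an NTMBS in the C locale encoding to wchar_t" could look like using only standard C++ facilities. The function name widen_ntmbs is hypothetical and is not taken from P2019. A real Windows implementation might well call MultiByteToWideChar() instead, as mentioned in the notes. std::mbsrtowcs() follows the current C locale (LC_CTYPE), which is the "C" locale at program start unless setlocale() has been called.

```cpp
#include <cstddef>
#include <cwchar>
#include <optional>
#include <string>

// Hypothetical helper (not part of P2019): convert a null-terminated
// multibyte string, interpreted in the current C locale, to a wide string.
// Returns std::nullopt if the input is not valid in that locale's encoding.
std::optional<std::wstring> widen_ntmbs(const char* name) {
    std::mbstate_t state{};
    const char* src = name;

    // First pass: with a null destination, mbsrtowcs only counts the wide
    // characters that would be produced (excluding the terminator).
    const std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1)) {
        return std::nullopt;  // invalid multibyte sequence for this locale
    }

    // Second pass: convert into a buffer of exactly the counted size.
    std::wstring out(len, L'\0');
    state = std::mbstate_t{};
    src = name;
    std::mbsrtowcs(out.data(), &src, out.size(), &state);
    return out;
}
```

The wide result is the form that the notes say the Windows interface requires. Whether the input should instead be treated as ordinary literal encoding (Poll 2) is exactly the open design question; this sketch only illustrates the C locale variant.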

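Jens's closing remark about embedded nulls and truncation can be illustrated with a small sketch as well. The helper fit_name_hint and the constant kMaxNameBytes are hypothetical names used for illustration only. The 16-byte limit (including the terminating null) is the one documented for pthread_setname_np() on Linux; other platforms have different limits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Assumed limit for illustration: Linux pthread_setname_np() accepts at most
// 16 bytes including the terminating null. Other systems differ.
inline constexpr std::size_t kMaxNameBytes = 16;

// Hypothetical helper (not part of P2019): reduce a string_view name hint to
// something that fits a null-terminated buffer of `capacity` bytes.
std::string fit_name_hint(std::string_view hint,
                          std::size_t capacity = kMaxNameBytes) {
    assert(capacity > 0);  // need room for at least the terminating null

    // Anything after the first embedded null would be invisible to an
    // API that takes a null-terminated string, so drop it here.
    hint = hint.substr(0, hint.find('\0'));

    // Truncate by bytes. Note that this can split a multibyte character,
    // which is one reason the name is only a "hint".
    hint = hint.substr(0, capacity - 1);
    return std::string(hint);
}
```

With the default capacity, fit_name_hint("a-very-long-thread-name") yields "a-very-long-thr", and a hint containing "ab\0cd" yields "ab".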
Received on 2025-01-22 19:29:21