Re: [isocpp-sg16] Agenda for the 2024-10-23 SG16 meeting

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 22 Oct 2024 14:28:00 -0400
I am still far behind in getting meeting minutes published. My rough
minutes from the 2024-09-25 discussion of P2019R7 are below.

Corentin, I intend to schedule P3258R0 (Formatting of charN_t)
<https://wg21.link/p3258r0> for the 2024-11-05 meeting. Please let me
know if that won't work for you.

- P2019R7: Thread attributes:
   - Corentin introduced the paper.
     - The thread name is provided to the OS so that it is available for
display in an OS monitor, debugger, process thread list, etc.
     - On POSIX systems, the name is stuffed into a small string buffer
provided by the pthreads library and is generally interpreted as
execution encoding.
     - Prior to Windows 10, there were unofficial ways to associate a
name with a thread.
     - On Windows 10 there is a public interface.
     - On Windows, the name must be provided in wchar_t; there is no
"ANSI" version of the interface.
     - A previous revision of this paper supported both char and wchar_t.
     - A copy of the string will always be needed, so transcoding costs
aren't significant.
     - We should support char8_t where we want to support Unicode, but
that isn't proposed in this paper; can add that later.
     - The name is interpreted as an NTBS in the execution encoding.
     - The name can be transcoded to wchar_t on Windows.
     - If mojibake happens, it doesn't affect users of the software.
   - Victor: Execution encoding isn't defined in the standard; we have
execution character set. Is this locale encoding?
   - Victor: I think we should use a well-defined encoding like the
literal encoding.
   - Corentin: We need a solution that works with MultiByteToWideChar()
on Windows; that only works with execution encoding.
   - Corentin: We don't require a relationship between literal encoding
and execution encoding; that technically means that conversions don't work.
   - Victor: This should be implementation detail.
   - Tom: I'm torn; I agree with Victor, but I also think the proposal
reflects existing implementations.
   - Tom: On POSIX systems, the string is likely to be interpreted by
other tools using the execution encoding.
   - Tom: Should that NTBS be NTMBS?
   - Corentin: Probably.
   - Corentin: The proposal is consistent with behavior elsewhere in the
standard library.
   - Tom: Is the name exposed in the thread class interface?
   - Corentin: No; not all platforms expose it; we would have to store
it internally.
   - Corentin: On some platforms, retrieving thread properties requires
a thread ID.
   - Steve: If the name is not in the right encoding, you'll get broken
behavior in a predictable way.
   - Victor: On POSIX, I think we should do what path does and prohibit
transcoding.
   - Victor: On Windows, this is just broken because it uses C locale.
   - Tom: Does NTBS imply C locale to you?
   - Victor: Yes.
   - Jens: Execution encoding is not a term in the C++ standard; we have
in [character.seq.general] "the encodings of the execution character
sets ... are locale specific"; we should use this wording.
   - Corentin: We probably should define "execution encoding".
   - Jens: Not in this paper.
   - Jens: We should specify which locale we mean here.
   - Corentin: No, not in each place where we refer to the execution
encoding.
   - Jens: In the recent exception class discussions, we determined that
the encodings correspond to the C locale; we should be more specific here.
   - Corentin: If we just say NTMBS, then we get the right result.
   - Tom: And that gets us C locale.
   - Jens: For exception classes we say NTBS with a carve out for NTMBS;
do we want that here?
   - Tom: If we could, I think we would prefer to require NTMBS for the
exception classes.
   - Jens: <referring to exception class wording>; wording directs to
codecvt and thus C++ locale.
   - Jens: So, do we want C or C++ locale here? It looks like NTMBS
comes in two flavors; we should be clear on the semantics.
   - Jens: <referring to a link provided by Victor> regarding
SetThreadDescription(); there is a technique that involves use of a char
string.
   - Corentin: That only works when a debugger is attached; implementors
won't use that technique.
   - Jens: We need to decide what exactly we want this interface to be
compatible with.
   - Jens: The level of complexity here is much less than for paths.
   - Corentin: Can the C and C++ locales diverge?
   - Jens: Yes.
   - Eddie: This is similar to path; why not have the name_hint
constructor behave like path where we can provide the string in multiple
encodings.
   - Corentin: Path is unique since filesystem encodings may differ from
the other encodings; it is more complicated.
   - Corentin: We should avoid exposing programmers to encoding concerns
where we don't need to.
   - Corentin: I removed char8_t support because I thought it would
increase consensus; if SG16 wants to re-introduce char8_t or require
literal encoding, I'm ok with that.
   - Eddie: What I meant is that we could use the native()
implementation-defined character type.
   - Corentin: I want to be able to pass an ordinary string literal and
have it work everywhere.
   - Victor: I agree that we don't want most of the complexity of path.
However, path is a good abstract model of what we want here.
   - Victor: On POSIX, the bytes should just be passed as is; a binary
identifier could be used if desired.
   - Victor: NTMBS means you can't use std::format to produce the thread
name.
   - Corentin: Would it increase consensus to require the ordinary
literal encoding for C++26?
   - Victor: Yes.
   - Jens: What does sprintf do?
   - Tom: It uses the execution encoding so that special characters in
trailing code units are not misinterpreted.
   - Jens: We don't require the literal encoding to match the execution
encoding though that might be a design bug.
   - Jens: Though format is different, I don't think we should spring a
new encoding requirement on programmers here.
   - Jens: We could specify an NTBS for POSIX and prohibit conversions,
add wchar_t for Windows, and then have portability issues.
   - Corentin: I think printf() is the model to follow.
   - Jens: Then make it an NTMBS and reference the C locale.
   - Jens: The only conversion concern we have is which conversion
function is to be called on Windows.
   - Corentin: The separation of the C and C++ locales is new
information to me.
   - Tom: We could restrict the characters used to the basic literal
character set.
   - Victor: I agree with the use of NTBS on POSIX, but I think NTMBS is
wrong for Windows as it is incompatible with everything.
   - Poll 2: P2019R7: Name hint should be provided in the ordinary
literal encoding.
     - Attendees: 8
     - SF F N A SA
        3 1 3 1 0
     - Weak consensus.
   - Poll 3: P2019R7: Name hint should be provided as an NTMBS in the C
locale encoding.
     - Attendees: 8
     - SF F N A SA
        1 5 1 0 1
     - Consensus.
   - Victor: I don't think we're done with this paper; use of
string_view might be problematic.
   - Jens: It is called name "hint" for a reason.
   - Jens: If there are embedded nulls or the name gets truncated, tough luck.
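
To illustrate the string_view concern raised at the end of the discussion, here is a minimal sketch of how an implementation might prepare a name hint for a small, fixed-size OS buffer. The function name prepare_name_hint and its capacity parameter are hypothetical and do not come from P2019R7; the sketch only assumes that the hint ends at the first embedded null and is truncated byte-wise to fit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of P2019R7: prepares a name hint for an OS
// buffer holding `capacity` bytes, including the terminating null.
std::string prepare_name_hint(std::string_view hint, std::size_t capacity) {
    assert(capacity > 0 && "the buffer must have room for the terminator");

    // An NTBS/NTMBS ends at the first null, so anything after an embedded
    // null in the string_view is dropped. find() returns npos when there is
    // no null, in which case substr() keeps the whole view.
    hint = hint.substr(0, hint.find('\0'));

    // Truncate to the bytes that fit before the terminator. This is a
    // byte-wise cut, so it may split a multibyte character and leave a
    // partial sequence at the end of the name.
    hint = hint.substr(0, capacity - 1);

    return std::string(hint);
}
```

As an example, Linux limits thread names to 16 bytes including the terminator, so prepare_name_hint("a-very-long-thread-name", 16) would yield "a-very-long-thr". The result could then be handed to a platform call such as pthread_setname_np(); that step is omitted here to keep the sketch portable.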

Tom.

On 10/22/24 2:24 PM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a meeting *tomorrow*, Wednesday, October 23rd, at 19:30
> UTC (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20241023T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cest>).
>
> The agenda follows.
>
> * P3374R0: Adding formatter for fpos<mbstate_t>
> <https://wg21.link/p3374r0>
> * P2019R7: Thread attributes <https://wg21.link/p2019r7>
>
> P3374R0 comes to us courtesy of Liang Jiaming and seeks to add
> formatting support for std::fpos<std::mbstate_t>. SG16 has not
> previously reviewed this proposal, but previous discussion occurred on
> the std-proposals mailing list in the thread starting here
> <https://lists.isocpp.org/std-proposals/2024/07/10667.php> and on the
> SG16 mailing list in a thread starting here
> <https://lists.isocpp.org/sg16/2024/08/4415.php>. Please try to review
> those before the meeting (and I again apologize for such late notice).
> std::mbstate_t is an implementation-defined type, so we can't
> legislate much about its representation in formatted output, but we
> can provide guidance, particularly with regard to whether formatted
> values of the type should be sufficient to reconstruct values in a
> hypothetical scanner such as that proposed by P1729 (Text Parsing)
> <https://wg21.link/p1729>.
>
> P2019R7 was reviewed by SG16 during the 2024-09-25 meeting in which
> the following polls were taken:
>
> * Poll 2: P2019R7: Name hint should be provided in the ordinary
> literal encoding.
> o Attendees: 8
> o SF F N A SA
>    3 1 3 1 0
> o Weak consensus.
> * Poll 3: P2019R7: Name hint should be provided as an NTMBS in the C
> locale encoding.
> o Attendees: 8
> o SF F N A SA
>    1 5 1 0 1
> o Consensus.
>
> The guidance provided regarding encoding of the name hint is fairly
> clear despite some continued opposition, so new information should be
> provided in order to reopen that discussion. Continued discussion is
> warranted for other concerns raised regarding POSIX vs Windows
> platforms and the use of std::string_view. We'll focus first on those
> topics and then any other concerns that are raised.
>
> Tom.
>
>

Received on 2024-10-22 18:28:03