Re: [SG16] On the character encoding of diagnostic text

From: Aaron Ballman <aaron_at_[hidden]>
Date: Tue, 1 Sep 2020 13:05:10 -0400
On Tue, Sep 1, 2020 at 12:08 PM Alisdair Meredith via SG16
<sg16_at_[hidden]> wrote:
>
> For a cross compiler, the basic execution character set should correspond to the target platform, but the diagnostics character set should be for the host?

That matches my understanding.

I suppose a question I could add is whether anyone would like to see a
new character set introduced for diagnostics. My intuition is that it
would be a pretty heavy hammer to bring to bear and that the basic
source character set is probably Good Enough (tm).

~Aaron

>
> AlisdairM
>
> On Sep 1, 2020, at 11:48, JF Bastien via SG16 <sg16_at_[hidden]> wrote:
>
> 
> I've written testcases for this, and compilers disagree on most details: https://godbolt.org/z/vGPdha
> Of course, available in tweet form: https://twitter.com/jfbastien/status/1298307325443231744
> Amusingly, UNIX control characters have been used in the past for nefarious purposes against the host shell.
>
> On Tue, Sep 1, 2020 at 8:01 AM Steve Downey via SG16 <sg16_at_[hidden]> wrote:
>>
>> I'm not sure about C, but for C++ the grammar for static_assert is:
>>
>>   static_assert-declaration:
>>     static_assert ( constant-expression ) ;
>>     static_assert ( constant-expression , string-literal ) ;
>>
>> So the value for the string is going to be encoded in the target
>> environment, rather than the source. While all characters from the
>> basic source character set and the execution character set are going
>> to be available, the values for those characters may not actually be
>> suitable for emitting. No idea what compilers do today. String-literal
>> also allows for an encoding prefix, so the associated encoding might be
>> Unicode, or the literal might even be a wide string.
>> Do we need to have an 'original spelling' rule for static_assert?
>>
>> On Tue, Sep 1, 2020 at 10:13 AM Aaron Ballman via SG16
>> <sg16_at_[hidden]> wrote:
>> >
>> > I've been working on a paper (attached) for WG14 that I also intend to
>> > submit to WG21 (with C++-specific wording) and am looking for some
>> > early feedback from SG16 as the topic relates to text encoding of
>> > diagnostic messages.
>> >
>> > The basic thrust of the paper is that both languages have constructs
>> > which can produce diagnostics including user-defined text and don't
>> > have a "compiler diagnostic character set" defined to convert the text
>> > into. So the paper tries to limit the damage by allowing characters
>> > outside of the basic source character set to be handled as a matter of
>> > QoI. The paper also asks a question that Tom H posed to me, which is
>> > whether we want the restriction to be against the basic execution
>> > character set for slightly easier handling of \r and \n.
>> >
>> > Any feedback you feel like providing is appreciated. Thanks!
>> >
>> > ~Aaron
>> > --
>> > SG16 mailing list
>> > SG16_at_[hidden]
>> > https://lists.isocpp.org/mailman/listinfo.cgi/sg16

Received on 2020-09-01 12:08:56
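
For readers following the thread, the restriction under discussion can be sketched in code. The following is a minimal illustration, not wording from the paper: it assumes C++17, and the helper names is_basic_source_char and message_is_portable are hypothetical. It checks whether a message uses only the 96 characters of the C++17/C++20 basic source character set, which is the subset the paper suggests a diagnostic can be expected to reproduce; anything else would be left to quality of implementation.

```cpp
#include <string_view>

// Hypothetical helper (not from the paper): true if c belongs to the
// C++17/C++20 basic source character set (96 characters).
constexpr bool is_basic_source_char(char c) {
    constexpr std::string_view basic =
        " \t\v\f\n"                       // space and the four control characters
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789"
        "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'"; // 29 punctuation characters
    return basic.find(c) != std::string_view::npos;
}

// Hypothetical helper: true if every character of msg is in the basic
// source character set, i.e. the message stays inside the portable subset.
constexpr bool message_is_portable(std::string_view msg) {
    for (char c : msg) {
        if (!is_basic_source_char(c)) return false;
    }
    return true;
}

// Plain ASCII text stays within the basic source character set.
static_assert(message_is_portable("size must be a power of two"));

// The UTF-8 code units for U+00E9 fall outside it.
static_assert(!message_is_portable("caf\xC3\xA9"));

// \r and \a belong to the basic execution character set but not to the
// basic source character set, which is the difference the paper asks about.
static_assert(!message_is_portable("line one\rline two"));
static_assert(!message_is_portable("bell\a"));

// An ordinary static_assert whose message is inside the portable subset.
static_assert(sizeof(int) >= 2, "int is narrower than 16 bits");
```

The check operates on the values the characters have in the ordinary literal encoding, so it says nothing about how a compiler would transcode a u8, u, U, or L prefixed message for display on the host; that part remains the open question raised above.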