On Tue, Sep 1, 2020, 18:08 Alisdair Meredith via SG16 <sg16@lists.isocpp.org> wrote:

For a cross compiler, the basic execution character set should correspond to the target platform, but the diagnostics character set should be for the host?

If we wanted to do things properly,

* There is no relation between the execution encoding and the encoding used by the compiler in its diagnostic output (which isn't described by the standard).

* Going through the execution encoding might be lossy even when going directly from internal to compiler diagnostic encoding would not be.

* Knowing that, we would have multiple options for static_assert:

  * Make it magic (it is already magic; no other construct is restricted to a string literal)

  * Add u8 overloads

The latter solution would become necessary if static_assert were modified to also accept constant expressions, as these have to be consistent with the execution encoding.

Both solutions are complementary.

The same solution applies to attributes I believe.

They do not apply to #error, which is strictly handled before phase 5.
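To make the phase distinction concrete, here is a minimal sketch. The name kMessage and the message text are made up for illustration, and the comments assume the C++20 numbering of translation phases, where directives run in phase 4 and string literals get their execution-encoding values in phase 5.

```cpp
// Phase 4: the directive below sits in a skipped group, so it is never
// emitted. If it were active, its text would be taken from preprocessing
// tokens, before any string literal is converted to the execution encoding.
#if 0
#error diagnostic text taken straight from preprocessing tokens
#endif

// Phase 5 and later: a string literal has a value in the execution encoding.
// kMessage is a hypothetical name used only for this sketch.
constexpr char kMessage[] = "size check failed";

// 17 characters from the basic character set plus the terminating null.
static_assert(sizeof(kMessage) == 18, "unexpected message length");

// The message operand here is a string-literal token, so a compiler may
// report either its original spelling or its converted value.
static_assert(sizeof(int) >= 2, "int is narrower than 16 bits");
```

Changing the `#if 0` to `#if 1` should stop translation at the directive, before either static_assert is looked at.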

None of that is directly related to Aaron's paper, which clarifies current behavior.

I hope it helps. Sorry for the concise reply, it's challenging on phone!

AlisdairM

Sent from my iPhone

On Sep 1, 2020, at 11:48, JF Bastien via SG16 <sg16@lists.isocpp.org> wrote:

I've written testcases for this, and compilers disagree on most details: https://godbolt.org/z/vGPdha
Of course, available in tweet form: https://twitter.com/jfbastien/status/1298307325443231744
Amusingly, UNIX control characters have been used in the past for nefarious purposes against the host shell.

On Tue, Sep 1, 2020 at 8:01 AM Steve Downey via SG16 <sg16@lists.isocpp.org> wrote:
I'm not sure about C, but for C++ the grammar for static_assert is:
static_assert-declaration:
    static_assert ( constant-expression ) ;
    static_assert ( constant-expression , string-literal ) ;
So the value for the string is going to be encoded in the target
environment, rather than the source. While all characters from the
basic source character set and the execution character set are going
to be available, the values for those characters may not actually be
suitable for emitting. No idea what compilers do today. String-literal
also allows for encoding prefix, so the associated encoding might be
Unicode. Or even a wide string.
Do we need to have an 'original spelling' rule for static_assert?
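As a rough illustration of how much the encoding prefix changes the value of a literal, here is a small sketch. The helper code_units and the constant ordinary_units are made-up names, and the sketch assumes the target's execution encoding can represent U+00E9.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// U+00E9 always takes two code units in a u8 literal.
static_assert(code_units(u8"\u00e9") == 2, "UTF-8 uses two code units");

// U+1F600 takes a surrogate pair in UTF-16 and one code unit in UTF-32.
static_assert(code_units(u"\U0001F600") == 2, "UTF-16 surrogate pair");
static_assert(code_units(U"\U0001F600") == 1, "UTF-32 single code unit");

// For an ordinary literal the count depends on the execution encoding of
// the target: 2 if that is UTF-8, 1 if it is Latin-1, for example.
constexpr std::size_t ordinary_units = code_units("\u00e9");
```

So the same spelling in source can reach the compiler's diagnostic machinery as quite different code unit sequences, depending on the prefix and the target.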

On Tue, Sep 1, 2020 at 10:13 AM Aaron Ballman via SG16
<sg16@lists.isocpp.org> wrote:
>
> I've been working on a paper (attached) for WG14 that I also intend to
> submit to WG21 (with C++-specific wording) and am looking for some
> early feedback from SG16 as the topic relates to text encoding of
> diagnostic messages.
>
> The basic thrust of the paper is that both languages have constructs
> which can produce diagnostics including user-defined text and don't
> have a "compiler diagnostic character set" defined to convert the text
> into. So the paper tries to limit the damage by allowing characters
> outside of the basic source character set to be handled as a matter of
> QoI. The paper also asks a question that Tom H posed to me, which is
> whether we want the restriction to be against the basic execution
> character set for slightly easier handling of \r and \n.
>
> Any feedback you feel like providing is appreciated. Thanks!
>
> ~Aaron
> --
> SG16 mailing list
> SG16@lists.isocpp.org
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16