Date: Tue, 2 Nov 2021 07:47:32 -0400
On Mon, Nov 1, 2021 at 4:40 PM Hubert Tong via Liaison
<liaison_at_[hidden]> wrote:
>
> On Mon, Nov 1, 2021 at 8:56 AM Aaron Ballman <aaron_at_[hidden]> wrote:
>>
>> On Fri, Oct 29, 2021 at 10:00 PM Hubert Tong via Liaison
>> <liaison_at_[hidden]> wrote:
>> >
>> > It seems the C standard has a rule regarding the shift state of the spelling of string literals in the source file, but neither the C standard nor the C++ standard specifies what shift state should be observed just prior to the terminating NUL of a string literal. Is this lack of normative encouragement or requirement for the string to be in the initial shift state prior to the terminating NUL just a defect?
>> >
>> > Note that NUL is required by C to be encodable in any shift state; therefore, the need to insert NUL does not "naturally" cause strings in non-initial shift states to return to the initial shift state.
>>
>> I'm by no means an expert in this area, but does 5.2.1.2p1-2 cover this?
>>
>> 1 ...
>> - A byte with all bits zero shall be interpreted as a null
>> character independent of shift state. Such a byte shall not occur as
>> part of any other multibyte character.
>>
>> 2 For source files, the following shall hold:
>> - An identifier, comment, string literal, character constant, or
>> header name shall begin and end in the initial shift state.
>> ...
>>
>> So NUL is required to be interpreted regardless of shift state (per
>> p1), and after the null, the string literal has to end in the initial
>> shift state (per p2).
>
>
> The shift state in the source has at most a tenuous relationship with the shift state that would be observed in the contents of the string literal at runtime.
Yeah, this isn't the clearest exposition. If you do determine the
intent, an editorial paper to make the words more understandable would
not be a bad idea.
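
To make the runtime side of the question concrete, here is a minimal
sketch of how one might probe the shift state just before the
terminating NUL of a string. It assumes the multibyte encoding of the
current locale is the one of interest, and the function name
ends_in_initial_shift_state is hypothetical (it comes from neither
standard). It only uses the standard mbrtowc and mbsinit functions.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: walks the multibyte string s (excluding its
 * terminating NUL) in the current locale's encoding.
 * Returns 1 if the conversion state after the last byte is the initial
 * state, 0 if it is not, and -1 if an invalid sequence is found. */
int ends_in_initial_shift_state(const char *s)
{
    mbstate_t st;
    size_t remaining = strlen(s);

    /* A zero-valued mbstate_t describes the initial conversion state. */
    memset(&st, 0, sizeof st);

    while (remaining > 0) {
        size_t n = mbrtowc(NULL, s, remaining, &st);

        if (n == (size_t)-1)
            return -1;          /* encoding error */
        if (n == (size_t)-2)
            break;              /* all remaining bytes were consumed into
                                   the state; let mbsinit judge the result */
        if (n == 0)
            break;              /* defensive: no NUL lies within strlen(s) */

        s += n;
        remaining -= n;
    }

    /* mbsinit returns nonzero when st describes the initial state. */
    return mbsinit(&st) != 0;
}
```

In a locale whose encoding has no shift states (the "C" locale, for
example) this reports the initial state for any valid string, so it is
only informative with a state-dependent encoding.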
>>
>> (Sorry if I'm misunderstanding something. This may also be worth
>> asking on the SG16 mailing lists due to their expertise.)
>
>
> I was hoping to get a comment from the C side as well.
If you don't hear enough back on the liaison list, you might also try
the WG14 reflectors directly.
~Aaron
>
>>
>>
>> ~Aaron
>>
>>
>>
>>
>> >
>> > -- HT
>> > _______________________________________________
>> > Liaison mailing list
>> > Liaison_at_[hidden]
>> > Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/liaison
>> > Link to this post: http://lists.isocpp.org/liaison/2021/10/0895.php
>
> _______________________________________________
> Liaison mailing list
> Liaison_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/liaison
> Link to this post: http://lists.isocpp.org/liaison/2021/11/0897.php
Received on 2021-11-02 06:47:47