Date: Sun, 28 Mar 2021 16:56:55 -0400
Thanks - that was exactly my thought, and I missed it when Corentin tried to correct me. All clear now, although is the reference to raw string literals still useful, or is it more likely to lead folks like me into misreading too quickly and getting confused?
I believe there would be no change of behavior if we strike the “except” that can no longer occur. I agree with everyone else that we should not change the status of any existing UB - that is for SG12 (and on my todo list as soon as I get some free cycles).
AlisdairM
Sent from my iPhone
> On Mar 28, 2021, at 16:27, Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
>
> On 28/03/2021 22.20, Alisdair Meredith via SG16 wrote:
>> I’m still not sure I see it. The UB in [lex.phases]p2 is entirely when reverting changes
>> in raw string literals, and my understanding of this paper is the only raw string reversion
>> that remains is line splicing, and the presence of the newline character means that this
>> reversion could never produce a UCN - hence, that particular UB is no longer possible
>> as its circumstances can no longer occur.
>
> Are you talking about this sentence?
>
> "Except for splices reverted in a raw string literal, if a splice results in a
> character sequence that matches the syntax of a universal-character-name, the
> behavior is undefined."
>
> Note that it says "except for .. raw string literal", so it applies to splices
> outside of raw string literals. And there, UCNs definitely exist.
>
> Jens
>
>
>> I have no issue with the p4 UB moving into the preprocessor clause, merely wished to
>> highlight it with an editorial note or similar so that it was clear no UB was changing,
>> merely the place where it is documented is being moved, in the hope of avoiding this
>> kind of thread in a later discussion ;)
>>
>> AlisdairM
>>
>>>> On Mar 28, 2021, at 3:06 PM, Corentin Jabot <corentinjabot_at_[hidden]> wrote:
>>>
>>>
>>>
>>>> On Sun, Mar 28, 2021 at 8:56 PM Alisdair Meredith via SG16 <sg16_at_[hidden]> wrote:
>>>
>>> This paper touches, incidentally, wording that is of concern to SG12 regarding UB.
>>> To ease communication/concerns, it might be worth pointing out that the UB wording
>>> added to [lex.concat] is not new UB, but merely moving where we document the
>>> existing UB in [lex.phases]p4.
>>>
>>> As part of this cleanup, is the UB in [lex.phases]p2 still possible, or does the
>>> remaining raw string reversion no longer have the ability to accidentally form a UCN, as
>>> all we are reverting is line-splicing, which implies there must be a new-line character
>>> embedded in any reversion, which would not enable forming a UCN?
>>>
>>>
>>> There is still UB. While we don't replace Unicode characters with UCNs in phase 1 anymore, there may still be UCNs spelled in the source.
>>>
>>> Removing the UB in lexing is on my radar.
>>> I think there is consensus that the UB should not be there - and that its purpose (dealing with the observability of different models chosen by C and C++) is no longer relevant,
>>> although ideally we would have the same behavior in C, and we can confer with SG22.
>>>
>>>
>>>
>>> AlisdairM
>>>
>>>> On Mar 28, 2021, at 2:15 PM, Tom Honermann via SG16 <sg16_at_[hidden]> wrote:
>>>>
>>>> The summary for the SG16 meeting held March 24th, 2021 is now available. For those that attended, please review and suggest corrections:
>>>>
>>>> * https://github.com/sg16-unicode/sg16-meetings#march-24th-2021
>>>>
>>>> A decision was made to forward Jens' D2314R2: Character sets and encodings <https://wiki.edg.com/pub/Wg21telecons2021/SG16/d2314r2.html> to EWG at this meeting. Per SG16 operating procedures <https://github.com/sg16-unicode/sg16/blob/master/OperatingProcedures.md>, this decision has tentative consensus as of now and will become the SG16 consensus one week from now pending new dissenting perspectives or other new information. Given that the decision was unanimous (though there were abstentions), EWG has already been informed and has tentatively scheduled this paper for discussion on May 6th.
>>>>
>>>> *Poll: Forward D2314R2 as presented on 2021-03-24 to EWG for inclusion in C++23.*
>>>>
>>>> Attendance: 9
>>>>
>>>> SF F N A SA
>>>> 3 5 0 0 0
>>>>
>>>> Consensus is in favor.
>>>>
>>>> Tom.
>>>> --
>>>> SG16 mailing list
>>>> SG16_at_[hidden]
>>>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>>
>>
>>
>
Received on 2021-03-28 15:57:00