Date: Fri, 19 Apr 2024 09:09:40 -0400
> On Apr 19, 2024, at 2:15 AM, Jens Maurer <jens.maurer_at_[hidden]> wrote:
>
>
>
> On 19/04/2024 03.37, Daveed Vandevoorde via SG16 wrote:
>>
>>
>>> On Apr 18, 2024, at 7:21 PM, Tom Honermann <tom_at_[hidden]> wrote:
>>> ... The contents of the string_view consist of characters of the basic source character set only (an implementation can map other characters using universal character names).
>>>
>>
>> Right. You mentioned in Tokyo that that doesn’t work. Can you elaborate what the technical hiccup is?
>
> Universal-character-names are interpreted when transitioning
> the lexical representation of a string-literal into an object
> containing code units [lex.string], or when lexing tokens
> outside of string-literals in translation phase 3 [lex.phases].
>
> Nothing will interpret universal-character-names (as such) in
> a string_view, because it is already assumed to be a range of
> code units.
Right. But that’s mostly a UI issue, no? There is nothing that makes it “not work”; it only means that if you want to get the corresponding code units, some work will be needed on the consumer side. (Interestingly, the compiler already knows how to do that work for round-tripping support.)
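To make the "work on the consumer side" concrete, here is a rough sketch of what a consumer might do, assuming the string_view holds basic-source-character text with other characters spelled as \uXXXX or \UXXXXXXXX, and assuming the consumer wants UTF-8 code units. The names decode_ucns, append_utf8, and hex_value are hypothetical and not part of any proposal; malformed or out-of-range sequences are simply copied through unchanged.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper: append the UTF-8 encoding of code point cp to out.
// Assumes cp is a valid Unicode scalar value (checked by the caller).
inline void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: numeric value of a hexadecimal digit, or -1.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical consumer-side decoder: replaces each well-formed \uXXXX or
// \UXXXXXXXX with the UTF-8 code units of the named code point. Anything
// else (including truncated sequences, surrogates, and values above
// U+10FFFF) is copied through unchanged.
inline std::string decode_ucns(std::string_view in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        std::size_t digits = 0;
        if (in[i] == '\\' && i + 1 < in.size()) {
            if (in[i + 1] == 'u') digits = 4;
            else if (in[i + 1] == 'U') digits = 8;
        }
        bool ok = digits != 0 && i + 2 + digits <= in.size();
        std::uint32_t cp = 0;
        for (std::size_t k = 0; ok && k < digits; ++k) {
            int v = hex_value(in[i + 2 + k]);
            if (v < 0) ok = false;
            else cp = (cp << 4) | static_cast<std::uint32_t>(v);
        }
        if (ok && (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))) ok = false;
        if (ok) {
            append_utf8(out, cp);
            i += 2 + digits;
        } else {
            out += in[i];
            ++i;
        }
    }
    return out;
}
```

For example, decode_ucns("caf\\u00E9") would give the five bytes of "café" in UTF-8. This only shows that the transformation is mechanical; it says nothing about which encoding a consumer would actually want.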
Note that I’m not arguing that producing basic-source-character names/text is my preferred approach. I just want to understand if it has inherent implementation/semantic difficulties as a fallback if no other approach can be made to adequately work.
Daveed
Received on 2024-04-19 13:09:55