Subject: Re: Conversion of grapheme clusters to (wide) execution encoding
From: Jens Maurer (Jens.Maurer_at_[hidden])
Date: 2020-06-01 14:47:46
On 01/06/2020 11.08, Corentin wrote:
> On Mon, 1 Jun 2020 at 08:10, Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
> > String literals also have an inherent length. I'm mildly opposed to normatively
> > specifying a required alteration of the "source-code-apparent" length for types
> > whose encoding are not variable-width to start with (u8, u16). That leaves
> > 1 and 2 for me.
>
> The length of "©" will be different in utf8 or latin1 for example - it should be defined in the number of code units in the execution encoding
Yes, but you sort-of expect a length between 1-5 octets for UTF-8.
Not so for Latin-1: For one character appearing in source code,
I'd expect one length unit.
As I said, it's only a mild preference.
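
To make the "©" example concrete, here is a minimal sketch of the two lengths under discussion. The helper names utf8_code_units and latin1_code_units are made up for illustration and are not part of any proposal; the sketch only counts code units for a single code point, and assumes UTF-8 as currently defined (1 to 4 code units per code point).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of UTF-8 code units (octets) needed
// to encode one Unicode code point.
constexpr std::size_t utf8_code_units(char32_t cp) {
    return cp < 0x80    ? 1
         : cp < 0x800   ? 2
         : cp < 0x10000 ? 3
                        : 4;
}

// Hypothetical helper: number of Latin-1 code units for one code point.
// Latin-1 covers exactly U+0000..U+00FF, one octet each; 0 is used here
// to mean "not representable".
constexpr std::size_t latin1_code_units(char32_t cp) {
    return cp <= 0xFF ? 1 : 0;
}

// U+00A9 COPYRIGHT SIGN: two code units in UTF-8, one in Latin-1.
static_assert(utf8_code_units(0x00A9) == 2, "UTF-8 needs two octets");
static_assert(latin1_code_units(0x00A9) == 1, "Latin-1 needs one octet");

// The same count is visible in the size of a u8 literal
// (minus the terminating null code unit).
static_assert(sizeof(u8"\u00A9") - 1 == 2, "u8 literal holds two code units");
```

So one character in the source yields one length unit in Latin-1 but two in UTF-8, which is the difference in expectations described above.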
However, I must say I'm missing a bit of a big picture here:
What's the actual problem to be solved?