Re: [SG16] Agreeing with Corentin's point re: problem with strict use of abstract characters

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Sun, 14 Jun 2020 22:45:27 +0200

On 14/06/2020 22.19, Corentin Jabot wrote:
> On Sun, 14 Jun 2020 at 21:55, Jens Maurer <Jens.Maurer_at_[hidden]> wrote:

> > No, each code point in a sequence (given Unicode input) is a separate abstract character
> > in my view (after combining surrogate pairs, of course).
>
> For example, diacritics, when preceded by a letter, are not considered abstract characters of their own.

"Abstract character" is defined in https://www.unicode.org/glossary/ as follows:

"A unit of information used for the organization, control, or representation of textual data."
(ISO 10646 does not appear to have a definition in its clause 3.)

I'm not seeing a conflict between that definition and my view that a letter
followed by a diacritic can be viewed as two different abstract characters.
I agree that the alternate viewpoint "single abstract character" is not
in conflict with the definition, either.

What is your statement "are not considered abstract characters of their own"
(which seems to leave little room for alternatives) based on?
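
To make the two viewpoints concrete, here is a minimal sketch (not taken from any paper or proposal) that counts code points in a UTF-16 sequence after combining surrogate pairs. The function name count_code_points is hypothetical, and the handling of unpaired surrogates (each counted as one unit) is a simplifying assumption. Under the "one abstract character per code point" view, a letter followed by a combining diacritic counts as two; the precomposed form counts as one.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts code points in UTF-16 text, treating a
// high surrogate followed by a low surrogate as a single code point.
// Assumption: an unpaired surrogate is counted as one unit on its own.
constexpr std::size_t count_code_points(std::u16string_view text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t unit = text[i];
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        if (is_high && i + 1 < text.size()) {
            const char16_t next = text[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                ++i;  // consume the low surrogate of the pair
            }
        }
        ++count;
    }
    return count;
}

// "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points.
static_assert(count_code_points(u"e\u0301") == 2);
// Precomposed U+00E9 LATIN SMALL LETTER E WITH ACUTE: one code point.
static_assert(count_code_points(u"\u00E9") == 1);
// U+1F600 is two UTF-16 code units (a surrogate pair) but one code point.
static_assert(count_code_points(u"\U0001F600") == 1);
```

The sketch only counts code points; whether "e" plus U+0301 is one abstract character or two is exactly the question under discussion, and the code does not settle it.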


Received on 2020-06-14 15:48:47