Re: [SG16] Redefining Lexing in terms of Unicode

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Fri, 29 May 2020 14:24:15 +0200
On 29/05/2020 05.23, Steve Downey via SG16 wrote:
> Unicode-oriented people are essentially everyone today. No one who works with text today, including people who work on computer language text outside C and C++ Standard, expect or will deal with anything else.

Let me respectfully disagree here. We might all wish that to be the
case, but reality is often different. I seriously doubt that the
majority of FORTRAN or COBOL programs were written with Unicode in mind,
for example.

> Discussing decoding actual source files into 'source character set' and encoding into 'execution character set' values is a lossy translation hindering actual discussion of what C++ implementations do.

Having terms for text-related concepts in the compilation environment
vs. in the execution environment seems like a useful distinction,
at least insofar as literal encodings are concerned.

However, that seems mostly orthogonal to the quest to describe
lexing in Unicode. Again, what exactly is the pain point with
the status quo description framework? Is this even worth spending
time on?


Received on 2020-05-29 07:27:24