Re: [SG16] What do we want from source to internal conversion?

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 15 Jun 2020 11:56:08 -0400
On 6/15/20 11:38 AM, Corentin wrote:
> On Mon, 15 Jun 2020 at 17:17, Tom Honermann <tom_at_[hidden]
> <mailto:tom_at_[hidden]>> wrote:
> > On 6/15/20 7:14 AM, Corentin via SG16 wrote:
> > Hubert has specifically requested better support for unmappable
> > characters, so I don't agree with the parenthetical.
>
> I don't think that's a fair characterisation. Again, there is a mapping
> for all characters in EBCDIC. That mapping is prescriptive rather than
> semantic, but both Unicode and IBM agree on that mapping (the
> code points they map to do not have any associated semantics whatsoever
> and are meant to be used that way). The wording trick will be to make
> sure we don't prevent that mapping.

The claim that Unicode and IBM agree on this mapping seems overreaching
to me. Yes, there is a specification for how EBCDIC code pages can be
mapped to Unicode code points in a way that preserves round tripping. I
don't think that should be read as an endorsement of conflating the
semantic meanings of characters that represent distinct abstract
characters before and after such a mapping. I believe there have been
requests to be able to differentiate between one of these control
characters appearing directly in the source input and the mapped Unicode
code point being written as a UCN.
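
To make that distinction concrete, here is a minimal sketch of one way an
implementation could keep the two cases apart. It is only an illustration,
not anything that has been proposed: every name in it (Origin,
SourceElement, map_ebcdic_control, and so on) is hypothetical, and the
three table entries assume the IBM-037 round-trip assignments to C1
control code points.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// How a code point entered the translation unit (hypothetical names).
enum class Origin { SourceCharacter, UniversalCharacterName };

struct SourceElement {
    char32_t code_point;
    Origin origin;
};

// Partial, illustrative table: three EBCDIC control bytes and the code
// points they are assumed to round-trip through (IBM-037 assumed).
constexpr std::optional<char32_t> map_ebcdic_control(std::uint8_t byte) {
    switch (byte) {
        case 0x04: return char32_t{0x009C};
        case 0x06: return char32_t{0x0086};
        case 0x15: return char32_t{0x0085};
        default:   return std::nullopt;  // not covered by this sketch
    }
}

// Element produced by decoding a raw byte from the source file.
inline std::optional<SourceElement> from_source_byte(std::uint8_t byte) {
    if (auto cp = map_ebcdic_control(byte))
        return SourceElement{*cp, Origin::SourceCharacter};
    return std::nullopt;
}

// Element produced by a UCN such as \u0085 spelled out in the source.
inline SourceElement from_ucn(char32_t cp) {
    assert(cp <= 0x10FFFF);  // must be within the Unicode code space
    return SourceElement{cp, Origin::UniversalCharacterName};
}

// True when both elements carry the same code point, however written.
constexpr bool same_code_point(const SourceElement& a, const SourceElement& b) {
    return a.code_point == b.code_point;
}

// True only when the code point and the way it was written both match.
constexpr bool same_spelling(const SourceElement& a, const SourceElement& b) {
    return a.code_point == b.code_point && a.origin == b.origin;
}
```

With something like this, byte 0x15 in the source and a written-out
\u0085 compare equal under same_code_point but not under same_spelling,
which is the kind of differentiation I understand has been requested.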


Received on 2020-06-15 10:59:19