sg16: Re: [SG16-Unicode] code_unit_sequence and code_point

From: keld_at <keld_at_[hidden]>
Date: Wed, 20 Jun 2018 19:24:52 +0200

On Wed, Jun 20, 2018 at 12:13:41PM -0400, Tom Honermann wrote:
> On 06/20/2018 05:34 AM, keld_at_[hidden] wrote:
> >On Tue, Jun 19, 2018 at 09:52:05PM -0400, Tom Honermann wrote:
> >>On 06/19/2018 04:19 PM, Lyberta wrote:
> >>>keld_at_[hidden]:
> >
> >Using a 16 bit wchar_t is ok if you restrict yourself to only a 16 bit
> >subset of UCS.
>
> I don't disagree, but for modern applications, limiting support to the
> BMP is a pretty significant restriction. And modern applications need
> to work on Windows and interact with the wchar_t based Win32 UTF-16 APIs.

I agree that this is not the state of the art. But it once was, and I think
that is the reason Microsoft uses 16 bits for wchar_t.

> >I am happy to have a specific type to handle code points that are defined
> >to have UCS code point values. I just note that I think APIs to handle such
> >a type would need to have exactly the same functionality as for handling
> >wchar_t entities.
>
> If I'm reading this correctly, it sounds like you are expressing a
> preference that text interfaces should be consistently provided for
> char, wchar_t, char16_t, char32_t (and char8_t). If so, I agree.

My thoughts were only wchar_t and char32_t. The other types would need another
layer: they cannot generally hold a code point of the processing character
type, so they cannot be used for portable programs that can be used everywhere.
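
To illustrate the "another layer" point, here is a minimal sketch of why a
16-bit code unit cannot generally hold a code point. The helper names
to_utf16 and demo_non_bmp are hypothetical, made up for this example, and
the code assumes its input is a valid UCS scalar value (it does not reject
surrogate values or anything above U+10FFFF).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: encode one UCS code point as UTF-16 code units.
// Assumes cp is a valid scalar value; no validation is performed.
std::vector<char16_t> to_utf16(char32_t cp) {
    if (cp < 0x10000) {
        // BMP code point: fits in a single 16-bit code unit.
        return {static_cast<char16_t>(cp)};
    }
    // Outside the BMP: split the 20-bit offset into a surrogate pair.
    char32_t v = cp - 0x10000;
    return {static_cast<char16_t>(0xD800 + (v >> 10)),
            static_cast<char16_t>(0xDC00 + (v & 0x3FF))};
}

// Hypothetical demo: one char32_t value, but two char16_t values.
void demo_non_bmp() {
    char32_t cp = U'\U0001F600';           // a code point outside the BMP
    std::vector<char16_t> units = to_utf16(cp);
    assert(units.size() == 2);             // needs a surrogate pair
    assert(units[0] == 0xD83D && units[1] == 0xDE00);
}
```

A single char32_t holds the whole code point, while the 16-bit form needs
two units, so any per-character API on 16-bit units needs an extra decoding
step on top.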

Most programs I work with are made for the global market, and IMHO, you
should program for the global market.

Best regards
keld

Received on 2018-06-20 19:24:52