Subject: Re: [SG16-Unicode] D1628R0 (Unicode character properties)
From: Steve Downey (sdowney_at_[hidden])
Date: 2019-03-28 07:58:03
These are plumbing for writing Unicode algorithms, such as regex, portably.
I don't think they are something most users should interact with. As such,
I think the wide contract is appropriate. It's reasonable for the binary
properties to return false for values that are not code points.
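To make the wide contract concrete, here is a minimal sketch. The names is_code_point and has_white_space are hypothetical stand-ins, not the interface proposed in D1628R0; the White_Space ranges are the ones listed in the Unicode Character Database. Every char32_t value is accepted, and anything above U+10FFFF simply yields false.

```cpp
// Hypothetical helper: true for any value in the Unicode code point range.
constexpr bool is_code_point(char32_t v) noexcept {
    return v <= 0x10FFFF;
}

// Hypothetical binary property query with a wide contract: no preconditions,
// every char32_t value is accepted, and non code point values answer false.
constexpr bool has_white_space(char32_t v) noexcept {
    return is_code_point(v) &&
           ((v >= 0x0009 && v <= 0x000D) || v == 0x0020 || v == 0x0085 ||
            v == 0x00A0 || v == 0x1680 || (v >= 0x2000 && v <= 0x200A) ||
            v == 0x2028 || v == 0x2029 || v == 0x202F || v == 0x205F ||
            v == 0x3000);
}
```

A caller such as a regex engine can then pass whatever integer it decoded without validating it first; a value like 0x110000 is not an error, it just has no properties.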
On reflection, I also think excluding char and wchar_t is misguided. The
contract is that char8_t etc. are UTF-8 etc., but it's also frequently the case
that the narrow and wide execution encodings are Unicode as well. That you
have possible GIGO errors isn't a good reason to block possibly correct
use. It just encourages casting.
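As a rough illustration of what accepting char and wchar_t could look like, the sketch below adds overloads that forward to a char32_t query. The function name is_ascii_hex_digit is hypothetical (it stands in for the ASCII_Hex_Digit binary property), and the overloads assume the execution encodings are Unicode, which the language does not guarantee. A lone char only identifies a code point when it is ASCII, so a UTF-8 lead or continuation byte is the garbage-in case; the sketch makes no attempt to detect it.

```cpp
// Hypothetical char32_t query standing in for the ASCII_Hex_Digit property.
constexpr bool is_ascii_hex_digit(char32_t v) noexcept {
    return (v >= U'0' && v <= U'9') || (v >= U'A' && v <= U'F') ||
           (v >= U'a' && v <= U'f');
}

// Assumption: the narrow execution encoding is Unicode (e.g. UTF-8).
// Go through unsigned char so a negative char does not sign-extend.
constexpr bool is_ascii_hex_digit(char c) noexcept {
    return is_ascii_hex_digit(
        static_cast<char32_t>(static_cast<unsigned char>(c)));
}

// Assumption: the wide execution encoding is Unicode (UTF-16 or UTF-32).
constexpr bool is_ascii_hex_digit(wchar_t c) noexcept {
    return is_ascii_hex_digit(static_cast<char32_t>(c));
}
```

With these in place, is_ascii_hex_digit('f') and is_ascii_hex_digit(L'7') compile without a cast at the call site. Without them the caller has to write the same static_cast by hand, with the same GIGO risk.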
On Thu, Mar 28, 2019, 05:10 Lyberta <lyberta_at_[hidden]> wrote:
> >> I guess, but do we really want our users to shove random integers in it
> > Yes. I really want a wide contract there
> But... why?
> >> Yes, contract or invariant means strong type, not dumb char32_t
> > TR 44 is purposefully dumb by design too.
> I guess it was written by people with more of a C mindset. I'm looking
> at std::chrono and love how I can never shove an integer there because
> it is ambiguous. Same with text - an integer is ambiguous without
> character set or encoding. I know this API has Unicode in its name
> but... I think I gotta try to come up with properties design that is
> compatible with my design and see if there are any bad points.
> Also, I know this is a bit obscure, but what about non-Unicode? I think
> having relatively universal free functions is fine and then if they get
> std::unicode_code_point as template parameter, they will select unicode
> implementation. Hence again, strong types are important.
> Also, consider std::ascii_character, std::shift_jis_something.. I don't
> know Shift-JIS. :/