std-proposals

Re: Enhancement of std::regex

From: Lyberta <lyberta_at_[hidden]>
Date: Mon, 29 Jul 2019 02:11:00 +0000
Nozomu Katō via Std-Proposals:
> Hello,
>
> I am preparing a proposal document to enhance std::regex:
> http://www.akenotsuki.com/misc/srell/en/proposal/draft.html
>
> The document is almost done except the Technical Specifications section.
> But this proposal needs a presenter as I cannot attend a face-to-face
> meeting. If you like this proposal and can attend face-to-face committee
> meetings of C++, please contact me.
>
> Regards,
> Nozomu
>

In my experience, anything based on char or wchar_t can't really
support Unicode as a drop-in.
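
A minimal sketch of the kind of problem I mean, assuming the narrow
string holds UTF-8 and the regex is built in the default locale. The
helper names matches_whole and demo_bytewise_dot are made up for this
example. With std::regex over char, "." consumes one char, so a single
non-ASCII character encoded as two bytes is not matched by a single dot:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches the whole of `text`.
// std::regex here works on individual char units, not on characters.
inline bool matches_whole(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// Self-check illustrating the behaviour; call it from main() to run it.
inline void demo_bytewise_dot() {
    // U+00E9 (e with acute) encoded as UTF-8 occupies two bytes.
    const std::string e_acute = "\xC3\xA9";

    // One dot matches one char, so it cannot cover both bytes.
    assert(!matches_whole(e_acute, "."));

    // Two dots match, although the user sees a single character.
    assert(matches_whole(e_acute, ".."));
}
```

This only shows the char-unit behaviour of the existing interface. It
says nothing about how the proposal itself would handle the case.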

I'm working on my own Unicode proposal that ditches any "execution
character set" nonsense and provides a clean interface:

https://github.com/Lyberta/cpp-unicode

Regex is at the very end of the list. We don't even have the means to
iterate over scalar values or grapheme clusters, or to read or write
Unicode to files, so going straight to regex feels strange.

Your text only mentions code points, yet well-formed UTF can only
contain scalar values, and I'm a strong opponent of code point
interfaces.
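
To spell out the distinction, here is a minimal sketch using the ranges
from the Unicode Standard. The function names are hypothetical and not
taken from either proposal. A code point is any value from 0 to
0x10FFFF, while a scalar value is a code point outside the surrogate
range 0xD800 to 0xDFFF:

```cpp
#include <cstdint>

// Hypothetical helper names, for illustration only.

// Any value in the Unicode codespace, surrogates included.
constexpr bool is_code_point(std::uint32_t v) {
    return v <= 0x10FFFF;
}

// Surrogates are reserved for UTF-16 and are not characters themselves.
constexpr bool is_surrogate(std::uint32_t v) {
    return v >= 0xD800 && v <= 0xDFFF;
}

// A scalar value is a code point that is not a surrogate.
constexpr bool is_scalar_value(std::uint32_t v) {
    return is_code_point(v) && !is_surrogate(v);
}

// U+1F600 is both a code point and a scalar value.
static_assert(is_scalar_value(0x1F600), "U+1F600 is a scalar value");

// U+D800 is a code point but can never appear in well-formed UTF.
static_assert(is_code_point(0xD800) && !is_scalar_value(0xD800),
              "lone surrogate is not a scalar value");
```

An interface typed on scalar values can rule out the second case by
construction, while a code point interface has to deal with it at
every call site.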

Do you know how regex handles grapheme clusters? Is it ever valid to
break a grapheme cluster when matching or replacing?

In any case, feel free to get acquainted with SG16:

https://github.com/sg16-unicode/sg16


Received on 2019-07-28 21:13:26