std-discussion

Re: Problem of the type requirement of the regex library

From: Lyberta <lyberta_at_[hidden]>
Date: Sat, 28 Dec 2019 10:26:00 +0000

Thiago Macieira via Std-Discussion:
> std::char_traits is used by std::string, which is not a text container. The
> ones being created for std::text is a different story.
>

The latest SG16 proposal for std::text from ThePhD still uses
std::basic_string as the default container for code units. I find that
short-sighted.

Looking at the design of std::basic_string, it looks like the Traits
template parameter is meant to hold all the metadata about the encoding,
while CharT is just a C compatibility thing. So to get a proper text
encoding you would provide your own traits like this:

using ascii_string = std::basic_string<char, ascii_traits>;
using ebcdic_string = std::basic_string<char, ebcdic_traits>;

And "char" is this polymorphic type with wild semantics that can't be
operated on without a locale and so on. Basically, an insane type-unsafe
design inherited from C and char*.
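
To make the traits-based approach above concrete, here is a minimal
sketch. The names ascii_traits and ebcdic_traits are hypothetical (the
standard library provides no such types), and here they are assumed to
be pure tags that inherit every operation from std::char_traits<char>.
It shows that the traits parameter does give distinct string types, and
also that nothing checks the bytes against the claimed encoding.

```cpp
#include <string>
#include <type_traits>

// Hypothetical tag traits: both inherit every operation from
// std::char_traits<char>, so they differ only in their type identity.
struct ascii_traits : std::char_traits<char> {};
struct ebcdic_traits : std::char_traits<char> {};

using ascii_string = std::basic_string<char, ascii_traits>;
using ebcdic_string = std::basic_string<char, ebcdic_traits>;

// The two aliases name unrelated types, so passing one where the other
// is expected does not compile.
static_assert(!std::is_same<ascii_string, ebcdic_string>::value,
              "different traits give different string types");
static_assert(!std::is_convertible<ascii_string, ebcdic_string>::value,
              "no implicit conversion between the two");

// Both are still built from the same plain char literal: the type records
// the intended encoding, but the bytes themselves are never validated.
inline ascii_string make_ascii() { return "abc"; }
inline ebcdic_string make_ebcdic() { return "abc"; }
```

Note that CharT is char in both cases, so any API that takes a plain
const char* sees no difference between the two.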

Of course, a proper design would take the text encoding as the template
parameter, like this:

using ascii_string = std::basic_string<ascii>;
using ebcdic_string = std::basic_string<ebcdic>;
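
std::basic_string does not accept an encoding as its first parameter
today, so the aliases above cannot be compiled as written. A rough
sketch of the idea, using a hypothetical encoded_string wrapper and
hypothetical ascii and ebcdic descriptor types (none of these names come
from the standard or from the SG16 proposal), might look like this:

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical encoding descriptors; neither exists in the standard library.
struct ascii {
    using code_unit = char;
    // ASCII only defines the code units 0x00..0x7F.
    static bool is_valid(code_unit c) {
        return static_cast<unsigned char>(c) < 0x80;
    }
};

struct ebcdic {
    using code_unit = char;
    // Simplification for this sketch: accept every 8-bit value.
    static bool is_valid(code_unit) { return true; }
};

// Hypothetical stand-in for an encoding-parameterised std::basic_string.
template <typename Encoding>
class encoded_string {
public:
    using code_unit = typename Encoding::code_unit;
    using storage = std::basic_string<code_unit>;

    // Reject code units that the encoding does not define.
    explicit encoded_string(storage units) : units_(std::move(units)) {
        for (code_unit c : units_) {
            if (!Encoding::is_valid(c)) {
                throw std::invalid_argument("invalid code unit for encoding");
            }
        }
    }

    const storage& code_units() const { return units_; }

private:
    storage units_;
};

using ascii_string = encoded_string<ascii>;
using ebcdic_string = encoded_string<ebcdic>;
```

Here the code unit type is derived from the encoding rather than the
other way round, and the encoding gets a chance to validate its input.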

And then coming back to the topic of regex:

using ascii_regex = std::basic_regex<ascii>;
using ebcdic_regex = std::basic_regex<ebcdic>;

Which means that the current design of regex from C++11 (and of text
processing in general) is broken: it needs a new design and deprecation
of the old one.


Received on 2019-12-28 04:29:05