std-proposals

Re: Beyond regex

From: Justin Bassett <jbassett271_at_[hidden]>
Date: Tue, 29 Oct 2019 08:23:41 -0700
On Tue, Oct 29, 2019, 1:52 AM Dejan Milosavljevic via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> 2. Is the implementation useful to you/to others?
> The implementation itself is useful for anyone. In the documentation I
> give a simple example. Is this example useful to you?
> Or, to simplify completely, take the next example. In a stream of
> characters, what comes next: a word or a keyword? How do you solve that
> within existing C++?
> Parse 1024 characters and then apply std::regex for the word and
> keyword(s)? That is very inefficient (slow execution) and not practical
> to code and maintain.
> Let me generalize the question: is `lex` useful? See: 1.
>

Yes, lexing is undoubtedly useful, but the question is about this
particular implementation of the broader concept. Why is this interface the
one you settled on?

It looks like this library creates the lexer at runtime, in contrast to the
software 'lex', which generates the lexer at compile-time. I would argue
that compile-time generation makes more sense, since building a lexer
involves algorithms with exponential worst-case time complexity.
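
As a rough illustration of what compile-time generation can look like in
C++17, here is a minimal sketch. It is not how 'lex' or the proposed library
works; every name in it (`build_table`, `next_token`, `Kind`, `State`) is
hypothetical, and it assumes an ASCII-like character set. The transition
table is computed by the compiler, so no table construction happens at
runtime:

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical token kinds and DFA states (illustration only).
enum class Kind { Word, Number, Unknown };
enum State : unsigned char { Start, InWord, InNumber, Dead, StateCount };

using Table = std::array<std::array<State, 256>, StateCount>;

// Builds the transition table. Assumes letters and digits are contiguous,
// as in ASCII.
constexpr Table build_table() {
    Table t{};
    for (std::size_t s = 0; s < StateCount; ++s)
        for (std::size_t c = 0; c < 256; ++c)
            t[s][c] = Dead;  // default: no transition
    for (unsigned c = 'a'; c <= 'z'; ++c) { t[Start][c] = InWord; t[InWord][c] = InWord; }
    for (unsigned c = 'A'; c <= 'Z'; ++c) { t[Start][c] = InWord; t[InWord][c] = InWord; }
    for (unsigned c = '0'; c <= '9'; ++c) { t[Start][c] = InNumber; t[InNumber][c] = InNumber; }
    return t;
}

// Initializing a constexpr variable forces evaluation in the compiler.
constexpr Table kTable = build_table();

struct Token {
    Kind kind;
    std::size_t length;  // number of characters consumed
};

// Scans the longest prefix of `input` accepted by the table (one pass).
constexpr Token next_token(std::string_view input) {
    State s = Start;
    std::size_t i = 0;
    while (i < input.size()) {
        State next = kTable[s][static_cast<unsigned char>(input[i])];
        if (next == Dead) break;
        s = next;
        ++i;
    }
    Kind k = s == InWord ? Kind::Word : s == InNumber ? Kind::Number : Kind::Unknown;
    return {k, i};
}
```

Because `next_token` is itself `constexpr`, a call such as
`next_token("abc123")` can be checked with `static_assert`; it yields a
`Word` token of length 3 under the rules above.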

I'm also uncomfortable with the integration of std::regex. It provides
"regular expressions" in the form that most programming languages allow,
but these are not the true regular expressions that are desirable for
lexical analysis (which allow the implementation to use DFAs).
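
To make the DFA point concrete, here is a small sketch of the "word or
keyword?" question from earlier in the thread, answered in one pass with no
buffering and no regex engine. It is only an illustration under simplifying
assumptions: the single keyword is `if`, words consist of ASCII letters, and
the names `Next`, `S`, `step`, and `classify_next` are made up for this
example:

```cpp
#include <string_view>

// Hypothetical result of looking at the start of the input.
enum class Next { Keyword, Word, Nothing };

// DFA states: nothing read, read "i", read "if", read any other word.
enum class S { Start, I, IF, Word };

constexpr bool is_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// One DFA transition on a letter.
constexpr S step(S s, char c) {
    switch (s) {
        case S::Start: return c == 'i' ? S::I : S::Word;
        case S::I:     return c == 'f' ? S::IF : S::Word;
        case S::IF:    return S::Word;  // "if" followed by more letters
        case S::Word:  return S::Word;
    }
    return S::Word;
}

// Reads letters until the first non-letter; each character is visited once.
constexpr Next classify_next(std::string_view in) {
    S s = S::Start;
    for (char c : in) {
        if (!is_letter(c)) break;
        s = step(s, c);
    }
    switch (s) {
        case S::Start: return Next::Nothing;
        case S::IF:    return Next::Keyword;
        default:       return Next::Word;
    }
}
```

With these rules, `classify_next("if (x)")` gives `Next::Keyword`, while
`classify_next("iffy")` gives `Next::Word`, since the automaton leaves the
`IF` state as soon as another letter follows.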

> 3. Have you tried implementing it?
> Yes and no.
> Yes. I implemented my own `regex` and `lex` on top of that. No
> one will use that because it does not rely on the standard.
> No. Implementing `lex` on top of std::regex requires internal
> knowledge of a specific compiler vendor and compiler version.
> The standard does not specify what the implementation will look like.
> New versions arrive very quickly. Any attempt is pointless before it
> starts.
>

Language recognition is niche, and there are already many well-supported
tools. If you implement the library and it turns out that the community
loves it, it would be much more likely for this to be standardized. That
is, community support could provide an argument for standardizing this
implementation of language recognition instead of the other existing
implementations.

--Justin

Received on 2019-10-29 10:26:10