[Feature Request] make boost::regex constructor O(1) using metaparsing #240

Open
denzor200 opened this issue Jan 7, 2025 · 0 comments

The constructors of std::regex and boost::regex are known to be slow for large regular expressions, so nobody wants to call them inside a loop.

Given the example..

void process()
{
   for (int i=0; i<10000; ++i) {
      boost::regex r("([^[:blank:]]+)|(\"[^\"]+\")|(\\([^\\)]+\\))");
      // ...
   }
}

..should be changed to..

boost::regex r("([^[:blank:]]+)|(\"[^\"]+\")|(\\([^\\)]+\\))");
void process()
{
   for (int i=0; i<10000; ++i) {
      // ...
   }
}

..for performance reasons.

As we can see, the constructor parses the string literal at run time, and this is the bottleneck. We know that it is possible to parse a literal at compile time, but at the moment boost::regex is unable to do it.
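
To illustrate that a pattern literal really can be inspected during compilation, here is a minimal C++14 sketch. The helper name count_groups is hypothetical and is not part of Boost.Regex or Boost.Metaparse. It only counts opening parentheses that are neither escaped nor inside a bracket expression, so it is far from a full regex parser (for instance, it would also count non-capturing groups such as (?:...)).

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only: counts unescaped '(' that
// appear outside bracket expressions. Simplification: it does not tell
// capturing groups apart from non-capturing ones.
constexpr std::size_t count_groups(const char* p)
{
    std::size_t groups = 0;
    bool in_bracket = false;
    for (std::size_t i = 0; p[i] != '\0'; ++i) {
        const char c = p[i];
        if (c == '\\' && p[i + 1] != '\0') {
            ++i;  // skip the escaped character
        } else if (in_bracket) {
            if (c == '[' && p[i + 1] == ':') {
                // Skip a POSIX character class such as [:blank:].
                i += 2;
                while (p[i] != '\0' && !(p[i] == ':' && p[i + 1] == ']')) {
                    ++i;
                }
                if (p[i] == '\0') {
                    break;  // unterminated class, stop scanning
                }
                ++i;  // now on the ']' that closes ":]"
            } else if (c == ']') {
                in_bracket = false;
            }
        } else if (c == '[') {
            in_bracket = true;
        } else if (c == '(') {
            ++groups;
        }
    }
    return groups;
}

// Evaluated by the compiler; no work is left for run time.
static_assert(
    count_groups("([^[:blank:]]+)|(\"[^\"]+\")|(\\([^\\)]+\\))") == 3,
    "the pattern from this issue has three groups");
```

The static_assert is checked entirely by the compiler, which is the property the proposal relies on. Building a whole matcher this way is a much bigger job than counting parentheses.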

I've found a library which was designed to write compile-time parsers:
https://github.com/boostorg/metaparse
I guess this library could become one more dependency of the Boost Regex library, couldn't it?

This library even contains a sample that parses a simple regex at compile time:
https://github.com/boostorg/metaparse/blob/master/example/regexp/main.cpp

I suppose we can extend that sample and then integrate it into boost::regex. We don't even need to make boost::regex constexpr; all we need is to add one more overload of the regex constructor:

template<char... Chars>
explicit basic_regex(boost::metaparse::string<Chars...>,
                     flag_type f = regex_constants::normal);
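
To make the shape of that overload more concrete, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: ct_string is a hypothetical stand-in for boost::metaparse::string, sketch_regex is a hypothetical wrapper around std::regex, and neither is part of Boost. The sketch only shows how the constructor deduces the pattern characters from the argument's type. It still parses at run time, whereas the real proposal would replace that step with a parser evaluated during compilation.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for boost::metaparse::string<Chars...>:
// the pattern is carried by the type, not by a run-time value.
template <char... Chars>
struct ct_string {};

// Hypothetical wrapper class; not part of Boost.Regex or the standard library.
class sketch_regex {
public:
    // The overload deduces the pattern characters from the argument's type.
    template <char... Chars>
    explicit sketch_regex(ct_string<Chars...>,
                          std::regex::flag_type f = std::regex::ECMAScript)
        // Placeholder: this still parses at run time. A real implementation
        // would build the state machine from Chars... during compilation.
        : impl_(std::string{Chars...}, f)
    {}

    // True if the whole text matches the pattern.
    bool matches(const std::string& text) const
    {
        return std::regex_match(text, impl_);
    }

private:
    std::regex impl_;
};
```

With this, sketch_regex r(ct_string<'a', '+'>{}); constructs a matcher for the pattern a+. Because every distinct pattern is a distinct type, the compiler has the full pattern available when it instantiates the constructor.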

Our example could then be rewritten like this:

void process()
{
   for (int i=0; i<10000; ++i) {
      boost::regex r(BOOST_METAPARSE_STRING_VALUE("([^[:blank:]]+)|(\"[^\"]+\")|(\\([^\\)]+\\))"));
      // ...
   }
}

No more global variables and no performance issues. Looks good.
