It causes error_stack
even when using cluster group there (?:.)+
I can tell you how to curtail this exception if you want.
And I can tell you what is causing it too.
Use regex.h
as a reference..
First, you'll notice these defines near the top
#ifndef _REGEX_MAX_COMPLEXITY_COUNT
#define _REGEX_MAX_COMPLEXITY_COUNT 10000000L /* set to 0 to disable */
#endif /* _REGEX_MAX_COMPLEXITY_COUNT */
#ifndef _REGEX_MAX_STACK_COUNT
#ifdef _WIN64
#define _REGEX_MAX_STACK_COUNT 600L /* set to 0 to disable */
#else /* _WIN64 */
#define _REGEX_MAX_STACK_COUNT 1000L /* set to 0 to disable */
#endif /* _WIN64 */
#endif /* _REGEX_MAX_STACK_COUNT */
To disable either complexity or max stack, at compile time put the definition
you'd like before the include file <regex>
.
For example, set to 0 to disable these protections.
For your test, we can set the stack to 200,000 like this
#define _REGEX_MAX_STACK_COUNT 200000
#include <regex>
This now will actually not throw an exception now for your example.
Second, what causes this in the code.
I just did a glance with a debug session, and found out a general cause.
It all comes in the _Matcher class.
Inside _Match_pat()
it firsts decrements _Max_stack_count,
before exiting it increments _Max_stack_count.
_Max_stack_count
gets set when _Matcher is instantiated.
template<>class _Matcher
{ // provides ways to match a regular expression to a text sequence
_Max_stack_count = _REGEX_MAX_STACK_COUNT;
}
The recursion comes from _Match_pat()
being called recursively by either
_Do_if()
and _Do_rep()
which are initially called from _Match_pat()
And the behavior comes mainly from open ended quantified groups (capture or cluster) that can match many, many times.
In your case, its (.)+
Unusually though, all this is doing is matching a character and putting it
in the capture group 1, but appending it to group 0.
In my opinion, the closure should register a increment of the stack count available
(pop), but instead the recursion is being used as a crutch to support
other states that would have to be duplicated to take account of this.
Anyway, you can set breakpoints and try it out for yourself.
Below are the code segments of interest.
bool _Matcher<>::_Match_pat(_Node_base *_Nx)
{
if (0 < _Max_stack_count && --_Max_stack_count <= 0)
_Xregex_error(regex_constants::error_stack);
while (_Nx != 0)
{
case _N_if:
if (!_Do_if((_Node_if *)_Nx))
_Failed = true;
_Nx = 0;
break;
case _N_endif:
break;
case _N_rep:
if (!_Do_rep((_Node_rep *)_Nx,
(_Nx->_Flags & _Fl_greedy) != 0, 0))
_Failed = true;
_Nx = 0;
break;
case _N_end_rep:
{ // return at end of loop
_Node_rep *_Nr = ((_Node_end_rep *)_Nx)->_Begin_rep;
_Loop_vals_t *_Psav = &_Loop_vals[_Nr->_Loop_number];
if (_Nr->_Simple_loop == 0 && !_Do_rep(_Nr,
(_Nr->_Flags & _Fl_greedy) != 0, _Psav->_Loop_idx))
_Failed = true; // recurse only if loop contains if/do
_Nx = 0;
break;
}
}
if (0 < _Max_stack_count)
++_Max_stack_count;
}
bool _Matcher<>::_Do_rep0()
{ // apply repetition to loop with no nested if/do
if (!_Match_pat(_Node->_Next)) // Several of these
}
bool _Matcher<>::_Do_if()
{ // apply if node
if (_Match_pat(_Node->_Next)) // try to match this branch
}