Your regex runs into catastrophic backtracking because you have nested quantifiers (`([...]+)*`). Since your regex requires the string to end in `/` (which fails on your example), the regex engine tries every possible way of dividing the string among the repetitions in the vain hope of finding a matching combination. That's where it gets stuck.
To illustrate, let's assume `"A*BCD"` as the input to your regex and see what happens:
- `(\w+)` matches `A`. Good.
- `\*` matches `*`. Yay.
- `[\w\s]+` matches `BCD`. OK.
- `/` fails to match (no characters left to match). OK, let's back up one character.
- `/` fails to match `D`. Hmm. Let's back up some more.
- `[\w\s]+` matches `BC`, and the repeated `[\w\s]+` matches `D`.
- `/` fails to match. Back up.
- `/` fails to match `D`. Back up some more.
- `[\w\s]+` matches `B`, and the repeated `[\w\s]+` matches `CD`.
- `/` fails to match. Back up again.
- `/` fails to match `D`. Back up some more, again.
- How about `[\w\s]+` matches `B`, repeated `[\w\s]+` matches `C`, repeated `[\w\s]+` matches `D`? No? Let's try something else.
- `[\w\s]+` matches `BC`. Let's stop here and see what happens.
- Darn, `/` still doesn't match `D`.
- `[\w\s]+` matches `B`.
- Still no luck. `/` doesn't match `C`.
- Hey, the whole group is optional: `(...)*`.
- Nope, `/` still doesn't match `B`.
- OK, I give up.
Now that was a string of just three letters. Yours had about 30, and the number of ways to divide a string among the repetitions roughly doubles with every extra character, so trying all of them would keep your computer busy until the end of days.
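To get a feel for the growth, here is a minimal sketch that lists every way to cut a string into consecutive, non-empty chunks, which is what the repeated group has to try before giving up. The helper name `splits` is made up for this illustration, and a real engine visits the candidates in a different (greedy) order and also tries shorter prefixes, so treat the count as a lower bound rather than an exact step count.

```python
def splits(s):
    """Yield every way to cut s into consecutive, non-empty chunks."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        # First chunk is s[:i]; the rest of the string is split recursively.
        for rest in splits(s[i:]):
            yield [s[:i]] + rest


def count_splits(n):
    """Number of ways to split a string of n >= 1 characters: 2 ** (n - 1)."""
    return 2 ** (n - 1)


if __name__ == "__main__":
    for chunks in splits("BCD"):
        print(chunks)
    for n in (3, 10, 30):
        print(n, count_splits(n))
```

For `"BCD"` this yields four combinations; for a 30-character tail the same formula gives 2 ** 29, i.e. more than half a billion combinations.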
I suppose what you're trying to do is to get the strings before/after `*`, in which case, use
`pattern = r"(\w+)\*([\w\s]+)$"`
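As a quick check, the suggested pattern can be tried like this. This is only a sketch: it assumes the pattern is meant with its escapes written out (`\w`, `\*`, `\s`), and the helper name `split_on_star` is invented for the example.

```python
import re

# Suggested pattern: no nested quantifier, so there is only one group to backtrack into.
pattern = r"(\w+)\*([\w\s]+)$"


def split_on_star(text):
    """Return (before, after) around the '*', or None if the text doesn't fit."""
    m = re.search(pattern, text)
    return m.groups() if m else None


if __name__ == "__main__":
    print(split_on_star("A*BCD"))   # ('A', 'BCD')
    print(split_on_star("A*BCD/"))  # None: '/' is neither a word character nor whitespace
```

Because the character class is repeated only once, a failing input is rejected after a handful of backtracking steps instead of an exponential number.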