Here on SO, people sometimes say something like "you cannot parse X with regular expressions, because X is not a regular language". As I understand it, however, modern regular expression engines can match more than just regular languages in Chomsky's sense. My question:
Given a regular expression engine that supports
- backreferences
- lookaround assertions of unlimited width
- recursion, like `(?R)`
what kinds of languages can it parse? Can it parse any context-free language, and if not, what would be a counterexample?
(To be precise, by "parse" I mean "build a single regular expression that would accept all strings generated by the grammar X and reject all other strings").
Addendum: I'm particularly interested in seeing an example of a context-free language that modern regex engines (Perl, .NET, Python's regex module) would be unable to parse.
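To illustrate the premise that these engines go beyond regular languages, here is a minimal sketch using only backreferences in Python's standard `re` module (no recursion or lookaround needed). It accepts the "copy" language of strings of the form ww with w non-empty, which over an alphabet of two or more symbols is known to be neither regular nor context-free. The names `SQUARE` and `is_square` are my own, chosen for illustration only.

```python
import re

# (.+) captures a non-empty prefix w; the backreference \1 then requires
# the rest of the string to be an exact copy of w.
# re.DOTALL lets "." match newlines too, so any character may appear in w.
SQUARE = re.compile(r"(.+)\1", re.DOTALL)


def is_square(s: str) -> bool:
    """Return True if s has the form ww for some non-empty w (hypothetical helper)."""
    # fullmatch anchors the pattern to the whole string, matching the
    # accept/reject definition of "parse" used in this question.
    return SQUARE.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["abcabc", "abcabd", "aa", "a"]:
        print(candidate, is_square(candidate))
```

This only shows that backreferences alone already leave the regular (and even the context-free) class; it says nothing about whether every context-free language can be matched, which is the actual question.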