Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - Translate Perl regular expressions to .NET

I have some useful regular expressions in Perl. Is there a simple way to translate them to .NET's dialect of regular expressions?

If not, is there a concise reference of differences?


1 Reply


There is a big comparison table in http://www.regular-expressions.info/refflavors.html.


Most of the basic elements are the same. The differences are:

Minor differences:

  • Unicode escape sequences. In .NET it is \u200A, in Perl it is \x{200A}.
  • \v in .NET is just the vertical tab (U+000B); in Perl it stands for the "vertical whitespace" class. Perl accordingly also has \V, the negated class.
  • The conditional expression for a named reference is (?(name)yes|no) in .NET, but (?(<name>)yes|no) in Perl.
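Two of these minor differences are mechanical enough to rewrite automatically. The following is a minimal sketch in Python, not a complete translator: the function name translate_minor is hypothetical, it only handles \x{HEX} and the (?(<name>)…) conditional head, and it does not track character classes, so a conditional-looking sequence inside [...] would be rewritten too. It scans escapes as whole tokens, so an escaped backslash followed by x{…} is left alone.

```python
import re

# One pass over the pattern; escapes are consumed as whole tokens so that
# "\\x{41}" (escaped backslash + literal text) is not mistaken for \x{41}.
_MINOR_TOKEN = re.compile(
    r"\\x\{([0-9A-Fa-f]+)\}"   # group 1: Perl \x{HEX}
    r"|\\."                    # any other escape, copied through unchanged
    r"|\(\?\(<(\w+)>\)",       # group 2: Perl (?(<name>) conditional head
    re.DOTALL,
)

def translate_minor(perl_pattern: str) -> str:
    """Rewrite \\x{HEX} and (?(<name>)...) into their .NET spellings."""
    def rewrite(m):
        if m.group(1) is not None:
            code_point = int(m.group(1), 16)
            if code_point > 0xFFFF:
                # \uXXXX takes exactly four hex digits; larger code points
                # are out of scope for this sketch.
                raise ValueError("code point above U+FFFF: " + m.group(0))
            return "\\u%04X" % code_point
        if m.group(2) is not None:
            return "(?(%s)" % m.group(2)
        return m.group(0)
    return _MINOR_TOKEN.sub(rewrite, perl_pattern)
```

For example, translate_minor(r"\x{200A}") gives \u200A, and the conditional head (?(<q>) becomes (?(q).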

Some elements are Perl-only:

  • Possessive quantifiers (x?+, x*+, x++ etc). Use a non-backtracking subexpression, (?>…), instead.
  • Named Unicode escape sequences \N{LATIN SMALL LETTER X}, \N{U+200A}.
  • Case folding and escaping
    • \l (lower case next char), \u (upper case next char).
    • \L (lower case), \U (upper case), \Q (quote meta characters) until \E.
  • Shorthand notation for Unicode properties, \pL and \PL. You have to include the braces in .NET, e.g. \p{L}.
  • Odd things like \X, \C.
  • Special character classes like \v, \V, \h, \H, \N, \R.
  • Backreference to a specific or previous group, \g1, \g{-1}. You can only use an absolute group index in .NET.
  • Named backreference \g{name}. Use \k<name> instead.
  • POSIX character class [[:alpha:]].
  • Branch-reset pattern (?|…)
  • \K. Use look-behind, (?<=…), instead.
  • Code evaluation assertion (?{…}), postponed subexpression (??{…}).
  • Subexpression reference (recursive pattern) (?0), (?R), (?1), (?-1), (?+1), (?&name).
  • Some conditional expression predicates are Perl-specific:
    • code (?{…})
    • recursive (R), (R1), (R&name)
    • define (DEFINE).
  • Special Backtracking Control Verbs (*VERB:ARG)
  • Python syntax
    • (?P<name>…). Use (?<name>…) instead.
    • (?P=name). Use \k<name> instead.
    • (?P>name). No equivalent in .NET.
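The items above that come with a direct "use X instead" replacement can be rewritten token by token. The sketch below, in Python, covers only those: \pL and \PL, \g{name}, (?P<name>…) and (?P=name). It raises on (?P>name), which has no .NET equivalent. The function name translate_perl_only is hypothetical, and everything else in the list (possessive quantifiers, \K, \g{-1}, recursion and so on) is passed through untouched and still has to be handled by hand.

```python
import re

# Specific escapes are listed before the generic "\\." alternative so they
# win; the generic one keeps sequences such as "\\(" from being misread.
_PERL_ONLY_TOKEN = re.compile(
    r"\\([pP])([A-Za-z])"        # groups 1-2: \pL / \PL shorthand
    r"|\\g\{([A-Za-z_]\w*)\}"    # group 3: \g{name}
    r"|\\."                      # any other escape, copied through
    r"|\(\?P<(\w+)>"             # group 4: (?P<name>
    r"|\(\?P=(\w+)\)"            # group 5: (?P=name)
    r"|\(\?P>(\w+)\)",           # group 6: (?P>name), no .NET equivalent
    re.DOTALL,
)

def translate_perl_only(perl_pattern: str) -> str:
    """Rewrite the Perl-only constructs that have a direct .NET spelling."""
    def rewrite(m):
        if m.group(1):
            return "\\%s{%s}" % (m.group(1), m.group(2))
        if m.group(3):
            return "\\k<%s>" % m.group(3)
        if m.group(4):
            return "(?<%s>" % m.group(4)
        if m.group(5):
            return "\\k<%s>" % m.group(5)
        if m.group(6):
            raise ValueError("no .NET equivalent for " + m.group(0))
        return m.group(0)
    return _PERL_ONLY_TOKEN.sub(rewrite, perl_pattern)
```

As a usage example, translate_perl_only(r"(?P<w>\pL+)-(?P=w)") produces (?<w>\p{L}+)-\k<w>.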

Some elements are .NET only:

  • Variable-length look-behind. In Perl, for positive look-behind, use \K instead.
  • Arbitrary regular expression in conditional expression (?(pattern)yes|no).
  • Character class subtraction [a-z-[d-w]]
  • Balancing Group (?<-name>…). This could be simulated with code evaluation assertion (?{…}) followed by a (?&name).
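Going in the other direction, a simple character class subtraction can be emulated in flavors that lack it by putting a negative look-ahead in front of the base class. This is a substitute technique, not something the list above prescribes. The Python sketch below uses a hypothetical helper, expand_class_subtraction, and assumes a single, non-nested subtraction with no escaped brackets inside either class.

```python
import re

# Matches a simple .NET subtraction such as [a-z-[d-w]].
# Group 1 is the base class body, group 2 the excluded class body.
_SUBTRACTION = re.compile(r"\[([^\[\]]+)-\[([^\[\]]+)\]\]")

def expand_class_subtraction(dotnet_pattern: str) -> str:
    """Rewrite [base-[excluded]] as (?![excluded])[base]."""
    return _SUBTRACTION.sub(r"(?![\2])[\1]", dotnet_pattern)
```

With this, [a-z-[d-w]] becomes (?![d-w])[a-z], which accepts a to c and x to z.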
