Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
261 views
in Technique[技术] by (71.8m points)

regex - Using explicitly numbered repetition instead of question mark, star and plus

I've seen regex patterns that use explicitly numbered repetition instead of ?, * and +, i.e.:

Explicit            Shorthand
(something){0,1}    (something)?
(something){1}      (something)
(something){0,}     (something)*
(something){1,}     (something)+

The questions are:

  • Are these two forms identical? What if you add possessive/reluctant modifiers?
  • If they are identical, which one is more idiomatic? More readable? Simply "better"?
Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

To my knowledge they are identical. I think there maybe a few engines out there that don't support the numbered syntax but I'm not sure which. I vaguely recall a question on SO a few days ago where explicit notation wouldn't work in Notepad++.

The only time I would use explicitly numbered repetition is when the repetition is greater than 1:

  • Exactly two: {2}
  • Two or more: {2,}
  • Two to four: {2,4}

I tend to prefer these especially when the repeated pattern is more than a few characters. If you have to match 3 numbers, some people like to write: ddd but I would rather write d{3} since it emphasizes the number of repetitions involved. Furthermore, down the road if that number ever needs to change, I only need to change {3} to {n} and not re-parse the regex in my head or worry about messing it up; it requires less mental effort.

If that criteria isn't met, I prefer the shorthand. Using the "explicit" notation quickly clutters up the pattern and makes it hard to read. I've worked on a project where some developers didn't know regex too well (it's not exactly everyone's favorite topic) and I saw a lot of {1} and {0,1} occurrences. A few people would ask me to code review their pattern and that's when I would suggest changing those occurrences to shorthand notation and save space and, IMO, improve readability.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...