Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - Java String.replaceAll() with back reference

This is a Java regex question: given a string, keep a "*" if it is at the start or the end of the string; otherwise, remove it. For example:

  1. * --> *
  2. ** --> **
  3. ******* --> **
  4. *abc**def* --> *abcdef*

The answer is:

str.replaceAll("(^\\*)|(\\*$)|\\*", "$1$2");

I tried the answer on my machine and it works. But I don't know how it works.
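For reference, here is a minimal sketch that runs the four examples above through the answer. The class name StarCheck and the helper trimInnerStars are made up for illustration and are not part of the original answer.

```java
class StarCheck {
    // Applies the regex from the answer: a '*' survives only if it is
    // the first or the last character of the input.
    static String trimInnerStars(String str) {
        return str.replaceAll("(^\\*)|(\\*$)|\\*", "$1$2");
    }

    public static void main(String[] args) {
        String[] inputs = {"*", "**", "*******", "*abc**def*"};
        for (String in : inputs) {
            System.out.println(in + " --> " + trimInnerStars(in));
        }
    }
}
```

Note that the backslashes are doubled because the pattern is written as a Java string literal.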

From my understanding, all matched substrings should be replaced with $1$2. However, it works as:

  1. (^\*) replaced with $1,
  2. (\*$) replaced with $2,
  3. \* replaced with empty.

Could someone explain how it works? More specifically, when the expressions are separated by |, how does String.replaceAll() handle the back references?

Thank you in advance.


1 Reply


I'll try to explain what's happening in the regex.

str.replaceAll("(^\\*)|(\\*$)|\\*", "$1$2");

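$1 refers to the first capture group, (^\*), and $2 refers to the second, (\*$).

str.replaceAll does not replace everything at once. It finds each match in turn and evaluates the replacement "$1$2" for that match alone. Because the alternatives are separated by |, only one of them takes part in any given match, and a group that did not take part is substituted as an empty string.

Example: *abc**def* --> *abcdef*

The * at the start matches (^\*), so $1 is * and $2 is empty, and that * is replaced with itself. The * at the end matches (\*$), so $2 is * and $1 is empty, and it is kept as well. Every other * matches only the bare \* alternative, which captures nothing, so both $1 and $2 are empty and that * is replaced with an empty string, i.e. removed.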
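To see this match by match, here is a rough sketch that walks the same pattern over the example with Matcher.find() and records what each group holds. The class name GroupTrace and the trace helper are hypothetical, added only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupTrace {
    // Returns one line per match, showing the contents of both groups.
    static List<String> trace(String input) {
        Matcher m = Pattern.compile("(^\\*)|(\\*$)|\\*").matcher(input);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            // group(n) is null when that alternative did not take part in
            // this match; replaceAll substitutes an empty string for it.
            lines.add("index " + m.start() + ": $1=" + m.group(1) + ", $2=" + m.group(2));
        }
        return lines;
    }

    public static void main(String[] args) {
        trace("*abc**def*").forEach(System.out::println);
    }
}
```

For *abc**def* this reports four matches, at indexes 0, 4, 5 and 9. Only the first has a non-null $1 and only the last has a non-null $2, which is why just those two * characters survive.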

For more information see Capture Groups

