Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
783 views
in Technique[技术] by (71.8m points)

regex - Non-greedy Regular Expression in Java

I have next code:

public static void createTokens(){
    String test = "test is a word word word word big small";
    Matcher mtch = Pattern.compile("test is a (\s*.+?\s*) word (\s*.+?\s*)").matcher(test);
    while (mtch.find()){
        for (int i = 1; i <= mtch.groupCount(); i++){
            System.out.println(mtch.group(i));
        }
    }
}

And have next output:

word
w

But in my opinion it must be:

word
word

Somebody please explain me why so?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Because your patterns are non-greedy, so they matched as little text as possible while still consisting of a match.

Remove the ? in the second group, and you'll get
word
word word big small

Matcher mtch = Pattern.compile("test is a (\s*.+?\s*) word (\s*.+\s*)").matcher(test);

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...