If you want to make sure you are really matching a URL and not just any word that starts with 'www.', you can use the expression DVK mentioned earlier. I modified it slightly and wrote a small code snippet as a starting point for you:
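import java.util.*;
import java.util.regex.*;

class FindUrls
{
    // Returns every substring of input that matches the URL pattern, in order of appearance.
    public static List<String> extractUrls(String input) {
        List<String> result = new ArrayList<String>();

        // Backslashes are doubled because the regex is written as a Java string literal.
        Pattern pattern = Pattern.compile(
            "\\b(((ht|f)tp(s?)\\:\\/\\/|~\\/|\\/)|www\\.)" +          // scheme, path prefix or "www."
            "(\\w+:\\w+@)?(([-\\w]+\\.)+(com|org|net|gov" +           // optional user:password@, then host labels
            "|mil|biz|info|mobi|name|aero|jobs|museum" +              // known top-level domains
            "|travel|[a-z]{2}))(:[\\d]{1,5})?" +                      // two-letter country code, optional port
            "(((\\/([-\\w~!$+|.,=]|%[a-f\\d]{2})+)+|\\/)+|\\?|#)?" +  // path segments
            "((\\?([-\\w~!$+|.,*:]|%[a-f\\d]{2})+=?" +                // first query parameter name
            "([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)" +                     // first query parameter value
            "(&(?:[-\\w~!$+|.,*:]|%[a-f\\d]{2})+=?" +                 // further parameter names
            "([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)*)*" +                  // further parameter values
            "(#([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)?\\b");               // optional fragment

        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }

        return result;
    }
}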
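To illustrate why checking the host and top-level domain matters, here is a minimal sketch that compares a naive "starts with www." pattern against a stricter one. Both patterns, the class name UrlMatchComparison and the helper findAll are made up for this illustration; the stricter pattern is a heavily simplified stand-in for the full expression above (short TLD list, no path, query or fragment), not a replacement for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original snippet.
class UrlMatchComparison {
    // Naive: any run of non-whitespace that starts with "www." counts as a URL.
    static final Pattern NAIVE = Pattern.compile("\\bwww\\.\\S+");

    // Stricter (simplified assumption): a scheme or "www.", then one or more
    // dot-terminated host labels, then a known or two-letter top-level domain.
    static final Pattern STRICT = Pattern.compile(
        "\\b(?:(?:https?|ftp)://|www\\.)(?:[-\\w]+\\.)+(?:com|org|net|[a-z]{2})\\b");

    // Collects all matches of the given pattern in the text, in order.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "Try www.example.com but skip the www.readme file";
        System.out.println(findAll(NAIVE, sample));   // [www.example.com, www.readme]
        System.out.println(findAll(STRICT, sample));  // [www.example.com]
    }
}
```

The naive pattern accepts "www.readme" because it only looks at the prefix, while the stricter one rejects it because no top-level domain follows the host label. FindUrls.extractUrls applies the same idea with a longer TLD list plus path, query and fragment handling.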