Is there any method in Java or any open source library for escaping (not quoting) a special character (meta-character), in order to use it as a regular expression?
This would be very handy in dynamically building a regular expression, without having to manually escape each individual character.
For example, consider a simple regex like \d+\.\d+ that matches numbers with a decimal point like 1.2, as well as the following code:
String digit = "d";
String point = ".";
String regex1 = "\\d+\\.\\d+";
String regex2 = Pattern.quote(digit + "+" + point + digit + "+");
Pattern numbers1 = Pattern.compile(regex1);
Pattern numbers2 = Pattern.compile(regex2);
System.out.println("Regex 1: " + regex1);
if (numbers1.matcher("1.2").matches()) {
    System.out.println("Match");
} else {
    System.out.println("No match");
}
System.out.println("Regex 2: " + regex2);
if (numbers2.matcher("1.2").matches()) {
    System.out.println("Match");
} else {
    System.out.println("No match");
}
Not surprisingly, the output produced by the above code is:
Regex 1: \d+\.\d+
Match
Regex 2: \Qd+.d+\E
No match
That is, regex1 matches 1.2 but regex2 (which is "dynamically" built) does not (instead, it matches the literal string d+.d+).
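To make the quoting behaviour concrete, here is a minimal sketch of what Pattern.quote() does with the same input. The class name QuoteDemo and the helper matchesQuoted are made up for illustration; only Pattern.quote() and Pattern.compile() come from java.util.regex.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of any library.
final class QuoteDemo {

    // Returns true if 'input' matches 'literal' after the whole literal
    // has been quoted, i.e. every character in it is taken literally.
    static boolean matchesQuoted(String literal, String input) {
        Pattern p = Pattern.compile(Pattern.quote(literal));
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The quoted pattern only matches the literal text itself.
        System.out.println(matchesQuoted("d+.d+", "d+.d+")); // true
        System.out.println(matchesQuoted("d+.d+", "1.2"));   // false
    }
}
```

So quoting turns the whole string into a literal, which is the opposite of what I am after.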
So, is there a method that would automatically escape each regex meta-character?
If there were, let's say, a static escape() method in java.util.regex.Pattern, the output of Pattern.escape('.') would be the string "\.", but Pattern.escape(',') should just produce ",", since it is not a meta-character. Similarly, Pattern.escape('d') could produce "\d", since 'd' is used to denote digits (although escaping may not make sense in this case, as 'd' could mean the literal 'd', which wouldn't be misunderstood by the regex interpreter to be something else, as would be the case with '.').
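To illustrate the kind of method I have in mind, here is a rough sketch. RegexEscaper and its escape() method are hypothetical (no such method exists in java.util.regex.Pattern), and the list of meta-characters is my own assumption of the usual set, not something taken from the JDK. It leaves 'd' alone, because prefixing it with a backslash would change its meaning rather than make it literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of the JDK or any library.
final class RegexEscaper {

    // Assumed set of regex meta-characters: \ ^ $ . | ? * + ( ) [ ] { }
    private static final String META = "\\^$.|?*+()[]{}";

    // Prefixes each assumed meta-character with a backslash and
    // leaves every other character untouched.
    static String escape(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() * 2);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (META.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Build the regex dynamically: the digit class is written by hand,
        // only the literal point goes through escape().
        String regex = "\\d+" + escape(".") + "\\d+";
        System.out.println(regex);                        // \d+\.\d+
        System.out.println(Pattern.matches(regex, "1.2")); // true
        System.out.println(Pattern.matches(regex, "1x2")); // false
    }
}
```

With something like this, the pieces that are meant to be regex syntax stay as they are, and only the pieces meant literally are escaped character by character.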