Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

java - Regex to match exactly n occurrences of letters and m occurrences of digits

I have to match an 8 character string, which can contain exactly 2 letters (1 uppercase and 1 lowercase), and exactly 6 digits, but they can be permutated arbitrarily.

So, basically:

  • K82v6686 would pass
  • 3w28E020 would pass
  • 1276eQ900 would fail (too long)
  • 98Y78k9k would fail (three letters)
  • A09B2197 would fail (two capital letters)

I've tried using the positive lookahead to make sure that the string contains digits, uppercase and lowercase letters, but I have trouble with limiting it to a certain number of occurrences. I suppose I could go about it by including all possible combinations of where the letters and digits can occur:

(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z]) ([A-Z][a-z][0-9]{6})|([A-Z][0-9][a-z][0-9]{5})| ... | ([0-9]{6}[a-z][A-Z])

But that's a very roundabout way of doing it, and I'm wondering if there's a better solution.

1 Reply

You can use

^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$

See the regex demo (a bit modified due to the multiline input). In a Java string literal, do not forget to double the backslashes (e.g. write \\d to match a digit).

Here is a breakdown:

  • ^ - start of string (assuming no multiline flag is to be used)
  • (?=[^A-Z]*[A-Z][^A-Z]*$) - check that there is exactly 1 uppercase letter (use \p{Lu} to match any Unicode uppercase letter and \P{Lu} to match any character other than that)
  • (?=[^a-z]*[a-z][^a-z]*$) - similar check that there is exactly 1 lowercase letter (alternatively, use \p{Ll} and \P{Ll} to match Unicode letters)
  • (?=(?:\D*\d){6}\D*$) - check that there are exactly six digits in the string: starting from the beginning, 0 or more non-digit characters (\D matches any character but a digit; you may also replace it with [^0-9]) followed by a digit (\d), repeated six times, and then 0 or more non-digit characters (\D*) up to the end of the string ($)
  • [a-zA-Z0-9]{8} - match exactly 8 alphanumeric characters.
  • $ - end of string.
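
The breakdown above can be checked against the five sample strings from the question. This is only a minimal sketch: the class name CodeFormatCheck and the helper isValid are made up for illustration and are not part of the original answer. Note the doubled backslashes in the Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper class (name chosen for this example only).
class CodeFormatCheck {
    // The full three-lookahead pattern, compiled once.
    static final Pattern FULL = Pattern.compile(
        "^(?=[^A-Z]*[A-Z][^A-Z]*$)"   // exactly one uppercase letter
      + "(?=[^a-z]*[a-z][^a-z]*$)"    // exactly one lowercase letter
      + "(?=(?:\\D*\\d){6}\\D*$)"     // exactly six digits
      + "[a-zA-Z0-9]{8}$");           // eight alphanumeric characters in total

    static boolean isValid(String s) {
        return FULL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // The sample strings from the question.
        String[] samples = {"K82v6686", "3w28E020", "1276eQ900", "98Y78k9k", "A09B2197"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The first two samples should print true and the last three false (too long, three letters, two capital letters).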

Following the logic, we can even reduce this to just

^(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\D*\d){6}\D*$)[a-zA-Z0-9]{8}$

One condition can be removed because [a-zA-Z0-9]{8} only allows lowercase letters, uppercase letters, and digits. Once two of the counts are fixed (exactly 1 lowercase letter and exactly 6 digits), the one remaining character of the 8 must be an uppercase letter, so the third condition holds automatically.
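
If you want some reassurance that dropping the uppercase lookahead changes nothing, a brute-force comparison over a small alphabet is one way to get it. The sketch below is an illustration only (the class ReducedPatternCheck and the method countDisagreements are hypothetical names): it enumerates every string of a given length over a few representative characters and counts the inputs on which the full and reduced patterns disagree.

```java
import java.util.regex.Pattern;

// Hypothetical class for this example; compares the full and reduced patterns.
class ReducedPatternCheck {
    static final Pattern FULL = Pattern.compile(
        "^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=[^a-z]*[a-z][^a-z]*$)"
      + "(?=(?:\\D*\\d){6}\\D*$)[a-zA-Z0-9]{8}$");

    static final Pattern REDUCED = Pattern.compile(
        "^(?=[^a-z]*[a-z][^a-z]*$)(?=(?:\\D*\\d){6}\\D*$)[a-zA-Z0-9]{8}$");

    // Enumerates all strings of the given length over the alphabet and
    // returns how many of them the two patterns classify differently.
    static int countDisagreements(char[] alphabet, int length) {
        int total = 1;
        for (int i = 0; i < length; i++) {
            total *= alphabet.length;
        }
        char[] buf = new char[length];
        int disagreements = 0;
        for (int n = 0; n < total; n++) {
            int x = n;
            // Decode n as a base-(alphabet.length) number into characters.
            for (int i = 0; i < length; i++) {
                buf[i] = alphabet[x % alphabet.length];
                x /= alphabet.length;
            }
            String s = new String(buf);
            if (FULL.matcher(s).matches() != REDUCED.matcher(s).matches()) {
                disagreements++;
            }
        }
        return disagreements;
    }

    public static void main(String[] args) {
        // One lowercase, one uppercase, one digit, one disallowed character.
        char[] alphabet = {'a', 'B', '7', '_'};
        System.out.println(countDisagreements(alphabet, 8));
    }
}
```

A result of 0 is what the argument above predicts. This is a spot check on a tiny alphabet, not a proof; the counting argument in the previous paragraph is what actually justifies the reduction.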

When using the pattern with Java's String.matches() method, the ^ and $ anchors at the start and end of the pattern are not needed, because matches() requires the whole string to match. The $ inside each lookahead is still required:

String s = "K82v6686";
String rx = "(?=[^a-z]*[a-z][^a-z]*$)" +      // exactly 1 lowercase letter
            "(?=(?:\\D*\\d){6}\\D*$)" +       // exactly 6 digits (backslashes doubled for Java)
            "[a-zA-Z0-9]{8}";                 // exactly 8 alphanumeric chars
if (s.matches(rx)) {
    System.out.println("Valid");
}
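
The anchors only become optional because matches() tests the entire input. With Matcher.find(), which searches for a match anywhere in the input, the unanchored pattern can accept a substring of a longer string. A small sketch of the difference, assuming a made-up class name AnchorDemo and a made-up 10-character input:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; contrasts matches() with find().
class AnchorDemo {
    // The reduced pattern without the outer ^ and $ anchors.
    static final String BODY =
        "(?=[^a-z]*[a-z][^a-z]*$)"
      + "(?=(?:\\D*\\d){6}\\D*$)"
      + "[a-zA-Z0-9]{8}";

    static final Pattern UNANCHORED = Pattern.compile(BODY);
    static final Pattern ANCHORED = Pattern.compile("^" + BODY + "$");

    public static void main(String[] args) {
        String tooLong = "99K82v6686"; // 10 characters, so it should be rejected

        // matches() requires the whole input to match: rejected.
        System.out.println(tooLong.matches(BODY));

        // find() may start at index 2 and match "K82v6686": accepted by mistake.
        System.out.println(UNANCHORED.matcher(tooLong).find());

        // With explicit ^ and $, find() rejects the input again.
        System.out.println(ANCHORED.matcher(tooLong).find());
    }
}
```

So if the pattern is ever used with find(), or in another tool that searches rather than matches in full, keep the outer ^ and $.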
