Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - Regex match numbers not followed by a hyphen

I've worked with regexes for years and never had this much trouble working out a regex. I am using PHP 7.2.2's preg_match() to return an array of matching numbers for parsing, hence the parentheses in the regex.

I am trying to match one or more numbers followed by an x followed by one or more numbers where the entire string is not followed by a hyphen. When $input is 18x18, 18x18- or 18x18size, the matches are 18 and 1. When the $input is 8x8, there are no matches.

I seem to be doing something fundamentally wrong here.

<?php
$input = "18x18";    
preg_match("/(\d+)x(\d+)[^-]/", $input, $matches);

Calling the print_r($matches) results in:

Array
(
    [0] => 18x18
    [1] => 18
    [2] => 1
)

The parens are there because I am using PHP's preg_match to return an array of matches. I understand when hyphens should be escaped and I've tried both ways to be sure but get the same results. Why doesn't this match?


1 Reply


You may use

'~(\d+)x(\d++)(?!-)~'

It can also be written without a possessive quantifier as '~(\d+)x(\d+)(?![-\d])~', since the \d inside the lookahead also prevents the second digit chunk from being matched only partially.
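To illustrate why a plain greedy \d+ followed by (?!-) is not enough, here is a minimal sketch. The variable names ($greedy, $possessive, $classLookahead) are illustrative and not part of the original answer.

```php
<?php
$input = "18x18-";

// Plain greedy \d+ with (?!-): the engine may give back a digit to satisfy the lookahead.
// Group 2 shrinks to "1", because "1" is followed by "8", not by "-".
preg_match('~(\d+)x(\d+)(?!-)~', $input, $greedy);
print_r($greedy);

// Possessive \d++ never gives digits back, so the lookahead fails and nothing matches.
$possessive = preg_match('~(\d+)x(\d++)(?!-)~', $input);

// Adding \d to the lookahead rejects the shortened match too.
$classLookahead = preg_match('~(\d+)x(\d+)(?![-\d])~', $input);

var_dump($possessive, $classLookahead);
```

The first call still finds 18x1, which is the partial match that both the possessive quantifier and the [-\d] lookahead are meant to rule out.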

Alternatively, in addition to the lookahead, you may use word boundaries:

'~\b(\d+)x(\d+)\b(?!-)~'

See the regex demo #1 and regex demo #2.

Details

  • (\d+)x(\d++)(?!-) / (\d+)x(\d+)(?![-\d]) - matches and captures one or more digits into Group 1, matches x, and then captures one or more digits into Group 2. With \d++ the digits are matched possessively, so the engine cannot backtrack into them, and the (?!-) negative lookahead (which makes sure there is no - immediately after the current position) is checked only once, after \d++ has consumed all the digits it can. With \d+(?![-\d]), the digits are matched first, and the negative lookahead then makes sure that neither a digit nor a - is immediately to the right of the current location.
  • \b(\d+)x(\d+)\b(?!-) - matches a word boundary first, then matches and captures one or more digits into Group 1, then matches x, then matches and captures one or more digits into Group 2, then asserts that there is a word boundary, and only then makes sure there is no - right after.
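The three variants described above can be compared on the inputs from the question. This is only a sketch: the describe() helper and the $patterns, $inputs and $results arrays are hypothetical names introduced here for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): returns "g1 - g2"
// for the first match, or "no match".
function describe(string $pattern, string $input): string
{
    return preg_match($pattern, $input, $m) === 1
        ? $m[1] . " - " . $m[2]
        : "no match";
}

$patterns = [
    'possessive' => '~(\d+)x(\d++)(?!-)~',
    'lookahead'  => '~(\d+)x(\d+)(?![-\d])~',
    'boundaries' => '~\b(\d+)x(\d+)\b(?!-)~',
];
$inputs = ["18x18", "18x18-", "18x18size", "8x8"];

$results = [];
foreach ($patterns as $name => $pattern) {
    foreach ($inputs as $input) {
        $results[$name][$input] = describe($pattern, $input);
        echo $name, " | ", $input, " | ", $results[$name][$input], "\n";
    }
}
```

One difference worth knowing: the word-boundary variant rejects 18x18size, because there is no word boundary between the last digit and s, while the two lookahead-only variants accept it. Which behavior is wanted depends on whether trailing letters should be allowed.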

See a PHP demo:

// "18x18" is not followed by a hyphen, so both patterns match.
if (preg_match('~(\d+)x(\d++)(?!-)~', "18x18", $m)) {
    echo "18x18: " . $m[1] . " - " . $m[2] . "\n";
}
if (preg_match('~\b(\d+)x(\d+)\b(?!-)~', "18x18", $m)) {
    echo "18x18: " . $m[1] . " - " . $m[2] . "\n";
}
// "18x18-" is followed by a hyphen, so neither pattern matches and nothing is printed.
if (preg_match('~(\d+)x(\d++)(?!-)~', "18x18-", $m)) {
    echo "18x18-: " . $m[1] . " - " . $m[2] . "\n";
}
if (preg_match('~\b(\d+)x(\d+)\b(?!-)~', "18x18-", $m)) {
    echo "18x18-: " . $m[1] . " - " . $m[2];
}

Output:

18x18: 18 - 18
18x18: 18 - 18
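
For contrast, the pattern from the question misbehaves because [^-] is a character class that must consume one character, whereas a lookahead is zero-width and only inspects what comes next. A small sketch of that difference, with illustrative variable names that are not in the original post:

```php
<?php
// Pattern from the question: [^-] has to consume a character after the digits.
$original = '~(\d+)x(\d+)[^-]~';

// "8x8": nothing is left for [^-] after 8x8, and Group 2 has no spare digit to give back.
$shortInput = preg_match($original, "8x8");
var_dump($shortInput);

// "18x18": Group 2 gives back its last digit so that [^-] can consume the "8".
preg_match($original, "18x18", $consumed);
echo $consumed[1], " - ", $consumed[2], "\n";

// The lookahead consumes nothing, so the end of the string is acceptable.
preg_match('~(\d+)x(\d++)(?!-)~', "8x8", $lookahead);
echo $lookahead[1], " - ", $lookahead[2], "\n";
```

This reproduces the 18 and 1 result and the missing 8x8 match reported in the question, and shows that the lookahead version handles 8x8.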
