Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - How could I find all whitespaces excluding the ones between quotes?

I need to split a string on spaces, but a phrase in quotes should be kept intact rather than split. Example:

  word1 word2 "this is a phrase" word3 word4 "this is a second phrase" word5

After preg_split, this should result in the following array:

array(
 [0] => 'word1',
 [1] => 'word2',
 [2] => 'this is a phrase',
 [3] => 'word3',
 [4] => 'word4',
 [5] => 'this is a second phrase',
 [6] => 'word5'
)

How should I compose my regexp to do that?

PS. There is a related question, but I don't think it applies to my case: its accepted answer provides a regexp that finds words rather than whitespace.


1 Reply


With the help of user MizardX from the #regex IRC channel (irc.freenode.net), a solution was found. It even supports single quotes.

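// Single quotes inside a single-quoted PHP string must be escaped as \'
$str = 'word1 word2 \'this is a phrase\' word3 word4 "this is a second phrase" word5 word1 word2 "this is a phrase" word3 word4 "this is a second phrase" word5';

// \G anchors at the end of the previous match, the group skips quoted phrases and bare words,
// \K drops what was consumed so far, so only the whitespace (\s+) is treated as the separator
$regexp = '/\G(?:"[^"]*"|\'[^\']*\'|[^"\'\s]+)*\K\s+/';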

$arr = preg_split($regexp, $str);

print_r($arr);

Result is:

Array (
    [0] => word1
    [1] => word2
    [2] => 'this is a phrase'
    [3] => word3
    [4] => word4
    [5] => "this is a second phrase"
    [6] => word5
    [7] => word1
    [8] => word2
    [9] => "this is a phrase"
    [10] => word3
    [11] => word4
    [12] => "this is a second phrase"
    [13] => word5  
)
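Note that the result above keeps the quote characters, while the array in the question has them removed. One possible way to get that form is a post-processing step, sketched below. The helper name strip_quotes and the sample string $input are made up for illustration and are not part of the original answer.

```php
<?php
// Illustrative sample input (not the string from the answer above).
$input = 'word1 word2 "this is a phrase" word3 \'another phrase\' word5';

// Same splitting pattern as in the answer: split on whitespace that is
// not inside a double- or single-quoted phrase.
$pattern = '/\G(?:"[^"]*"|\'[^\']*\'|[^"\'\s]+)*\K\s+/';

// Hypothetical helper: remove one matching pair of quotes, but only
// when that pair wraps the whole token.
function strip_quotes(string $token): string
{
    return preg_replace('/^(["\'])(.*)\1$/s', '$2', $token);
}

$tokens = array_map('strip_quotes', preg_split($pattern, $input));
print_r($tokens);
```

Tokens without surrounding quotes pass through unchanged, so the helper can be applied to every element.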

PS. The only disadvantage is that this regexp works only with PCRE 7.

It turned out that I do not have PCRE 7 support on the production server; only PCRE 6 is installed there. Although it is not as flexible as the PCRE 7 version above, a regexp that will work (after getting rid of \G and \K) is:

/(?:"[^"]*"|'[^']*'|[^"'\s]+)+/

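This pattern matches the tokens themselves rather than the whitespace between them, so it has to be applied with preg_match_all instead of preg_split. For the given input, the result is the same as above.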
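A minimal sketch of how the PCRE 6-compatible pattern might be applied, assuming the matches are collected with preg_match_all. The sample string $input is shortened and made up for illustration.

```php
<?php
// Illustrative sample input with one single-quoted and one double-quoted phrase.
$input = 'word1 word2 \'this is a phrase\' word3 "this is a second phrase" word5';

// Pattern without \G and \K: each match is one token, i.e. a run of
// quoted phrases and/or bare non-whitespace characters.
$pattern = '/(?:"[^"]*"|\'[^\']*\'|[^"\'\s]+)+/';

// preg_match_all() stores every full-pattern match, in order, in $matches[0].
preg_match_all($pattern, $input, $matches);
$tokens = $matches[0];
print_r($tokens);
```

The two approaches agree on well-formed input but can differ when a quote is left unbalanced: this pattern skips the stray quote character and keeps matching, while the \G-based split stops splitting at that point.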

