Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
253 views
in Technique[技术] by (71.8m points)

c# - Regex split string preserving quotes

I need to split a string like the one below, based on space as the delimiter. But any space within a quote should be preserved.

research library "not available" author:"Bernard Shaw"

to

research
library
"not available"
author:"Bernard Shaw"

I am trying to do this in C Sharp, I have this Regex: @"(?<="")|w[ws]*(?="")|w+|""[ws]*""" from another post in SO, which splits the string into

research
library
"not available"
author
"Bernard Shaw"

which unfortunately does not meet my exact requirements.

I'm looking for any Regex, that would do the trick.

Any help appreciated.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

As long as there can be no escaped quoted inside quoted strings, the following should work:

splitArray = Regex.Split(subjectString, "(?<=^[^"]*(?:"[^"]*"[^"]*)*) (?=(?:[^"]*"[^"]*")*[^"]*$)");

This regex splits on space characters only if they are preceded and followed by an even number of quotes.

The regex without all those escaped quotes, explained:

(?<=      # Assert that it's possible to match this before the current position (positive lookbehind):
 ^        # The start of the string
 [^"]*    # Any number of non-quote characters
 (?:      # Match the following group...
  "[^"]*  # a quote, followed by any number of non-quote characters
  "[^"]*  # the same
 )*       # ...zero or more times (so 0, 2, 4, ... quotes will match)
)         # End of lookbehind assertion.
[ ]       # Match a space
(?=       # Assert that it's possible to match this after the current position (positive lookahead):
 (?:      # Match the following group...
  [^"]*"  # see above
  [^"]*"  # see above
 )*       # ...zero or more times.
 [^"]*    # Match any number of non-quote characters
 $        # Match the end of the string
)         # End of lookahead assertion

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...