Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c# - Regex pattern to choose data BETWEEN matching quotation marks

Suppose I had the following string I wanted to run a Regular expression on:

This is a test string with "quotation-marks" within it.
The "problem" I am having, per-se, is "knowing" which "quotation-marks"
go with which words.

Now, suppose I wanted to replace all the - characters between the quotation marks with, say, a space. I was thinking I could do so with a regex looking as follows:

Find What:      ("[^"]*?)(-)([^"]*?")
Replace With:   $1 $3

The problem I'm having is that this pattern does not take into account whether a quotation mark is opening or closing a quoted phrase.

So, in the example above, the - character in per-se will be replaced by a space, since it sits between 2 quotation marks. Those marks are a closing mark followed by an opening mark, though, and I specifically want to look only at the text between an opening mark and its closing mark.
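
To make the failure concrete, here is a minimal sketch that runs the pattern above against one line of the sample text. The variable names are only illustrative, and it assumes C# 9 or later so that top-level statements compile.

```csharp
using System;
using System.Text.RegularExpressions;

// One line of the sample text, written as a verbatim string ("" is an escaped quote).
string sample = @"The ""problem"" I am having, per-se, is ""knowing"" which ""quotation-marks""";

// The pattern from the question: a quote, some non-quote text, a hyphen,
// more non-quote text, and another quote. Nothing says which quote opens a phrase.
string naive = Regex.Replace(sample, @"(""[^""]*?)(-)([^""]*?"")", "$1 $3");

// The hyphen in per-se is replaced too, because it lies between the quote
// that closes "problem" and the quote that opens "knowing".
Console.WriteLine(naive);
// The "problem" I am having, per se, is "knowing" which "quotation marks"
```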

How do you account for this in such a regular expression?

I hope this makes sense.

I'm using VB / C# Regex.


Just to complete the question (and hopefully elaborate a bit more if necessary), the end result I would like to get would be:

This is a test string with "quotation marks" within it.
The "problem" I am having, per-se, is "knowing" which "quotation marks"
go with which words.

Thanks!!



1 Reply


You are having the same problem as someone who is trying to match HTML or balanced opening and closing parentheses. Regex can only match regular languages, and knowing which " is an opening mark and which is a closing one is out of its reach for anything but the trivial cases.

EDIT: As shown in Vasili Syrakis's answer, sometimes it can be done but regex is a fragile solution for this type of problem.
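
For comparison, the open/close bookkeeping is easy to do with a plain character scan instead of a pattern. The following is only a sketch, assuming quotes are never escaped or nested; `ReplaceInsideQuotes` is a hypothetical helper name, not something from the question.

```csharp
using System;
using System.Text;

// Hypothetical helper: replaces '-' with ' ' only while inside a quoted run.
static string ReplaceInsideQuotes(string input)
{
    var sb = new StringBuilder(input.Length);
    bool insideQuotes = false;
    foreach (char c in input)
    {
        if (c == '"')
            insideQuotes = !insideQuotes;   // every quote flips the state
        sb.Append(insideQuotes && c == '-' ? ' ' : c);
    }
    return sb.ToString();
}

Console.WriteLine(ReplaceInsideQuotes("a \"b-c\" d-e \"f-g\""));
// a "b c" d-e "f g"
```

Note that an unmatched quote leaves the rest of the text treated as quoted, so this is no more robust than the regex on malformed input.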

With that said, you can reduce your problem to the trivial case. Since you are using .NET, you can simply match every quoted string and use the Regex.Replace overload that takes a MatchEvaluator.

Regex.Replace(text, @""".*?""", m => m.Value.Replace("-", " "))

Test:

var text = @"This is a test string with ""quotation-marks"" within it.
The ""problem"" I am having, per-se, is ""knowing"" which ""quotation-marks""
go with which words.";

Console.Write(Regex.Replace(text, @""".*?""", m => m.Value.Replace("-", " ")));
//This is a test string with "quotation marks" within it.
//The "problem" I am having, per-se, is "knowing" which "quotation marks"
//go with which words. 
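
One possible variation, offered as a sketch rather than as part of the answer above: by default `.` does not match a line break, so a quoted phrase that wraps onto the next line is skipped unless RegexOptions.Singleline is set. A negated character class avoids that. The input string and variable names below are made up for illustration, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// "[^"]*" matches one quoted run, including a run that spans a line break.
var quoted = new Regex(@"""[^""]*""");

// Illustrative input: hyphens outside quotes, several hyphens inside one
// quoted phrase, and a quoted phrase split across two lines.
string input = "well-known \"multi-part-name\" and \"split-\nacross-lines\"";

// The evaluator runs once per quoted run and replaces every hyphen in it.
string result = quoted.Replace(input, m => m.Value.Replace("-", " "));

Console.WriteLine(result);
```

Because the evaluator works on the whole match, any number of hyphens inside one quoted phrase is handled in a single pass, which a fixed `$1 $3` replacement cannot do.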
