Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - Check if String Contains Number from List, Remove that Number from String

I want to check if a string contains a word or number from a list and remove it from the string.

I want to do this for multiple matches found.


The sentence reads

This is a 01 02 03 (01) (02) (03) no01 no02 no03 test


I need the Regex.Replace to remove only the full 01, 02, 03, not the ones inside other words.

This is a (01) (02) (03) no01 no02 no03 test


But it only removes the occurrences of 03, the last item in the match list, in all places.

This is a 01 02 (01) (02) () no01 no02 no test


http://rextester.com/BCEXTJ37204

C#

List<string> filters = new List<string>();
List<string> matches = new List<string>();

string sentence = "This is a 01 02 03 (01) (02) (03) no01 no02 no03 test";
string newSentence = string.Empty;

// Create Filters List
for (int i = 0; i < 101; i++)
{
    filters.Add(string.Format("{0:00}", i)); // "00" through "100"
}

// Find Matches
for (int i = 0; i < filters.Count; i++)
{
    // Add to Matches List
    if (sentence.Contains(filters[i]))
    {
        matches.Add(filters[i]); // will be 01, 02, 03
    }
}

// Filter Sentence
for (int i = 0; i < matches.Count; i++)
{
    newSentence = Regex.Replace(sentence, matches[i], "", RegexOptions.IgnoreCase);
}

// Display New Sentence
Console.WriteLine(newSentence);

I tried changing the string.Format() pattern to @"\b{0:00}\b" to match whole words, but it doesn't work.


1 Reply


The problem is that you are invoking your regex matcher repeatedly on the original string. That's why only the last change "sticks", while the rest get discarded:

newSentence = Regex.Replace(sentence, matches[i], "", RegexOptions.IgnoreCase);

If you change this to call Replace on newSentence, each replacement builds on the previous one and all of them are kept:

newSentence = sentence;
for (int i = 0; i < matches.Count; i++) {
    newSentence = Regex.Replace(newSentence, matches[i], "", RegexOptions.IgnoreCase);
}
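
To make the difference between the two loops concrete, here is a minimal sketch that runs both variants side by side. The class name, method names, and the short sample sentence are hypothetical, chosen only for illustration; Regex.Escape is added as a precaution and is not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReplaceLoopDemo
{
    // Mirrors the question's loop: every pass starts again from the
    // untouched input, so the result reflects only the last pattern.
    public static string ReplaceFromOriginal(string sentence, IEnumerable<string> matches)
    {
        string result = string.Empty;
        foreach (string match in matches)
        {
            result = Regex.Replace(sentence, Regex.Escape(match), "");
        }
        return result;
    }

    // Mirrors the corrected loop: every pass starts from the previous result.
    public static string ReplaceAccumulating(string sentence, IEnumerable<string> matches)
    {
        string result = sentence;
        foreach (string match in matches)
        {
            result = Regex.Replace(result, Regex.Escape(match), "");
        }
        return result;
    }

    public static void Main()
    {
        var matches = new List<string> { "01", "02" };
        string sentence = "a 01 02 end";

        // The first line still contains "01"; the second contains neither number.
        Console.WriteLine("[" + ReplaceFromOriginal(sentence, matches) + "]");
        Console.WriteLine("[" + ReplaceAccumulating(sentence, matches) + "]");
    }
}
```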

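However, this is suboptimal, and the bare patterns still match inside tokens such as (01) and no01. You would be better off combining all alternatives into a single regex that only matches between whitespace or at the ends of the string, like this:

newSentence = Regex.Replace(
    sentence
,   @"(?<=\s|^)(" + string.Join("|", matches) + @")(?=\s|$)"
,   ""
,   RegexOptions.IgnoreCase
);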
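
For reference, a self-contained sketch of the single-regex approach might look like the following. It assumes the pattern uses whitespace lookarounds (\s) around the alternation. The helper name RemoveWholeTokens, the Regex.Escape call, and the final whitespace cleanup are additions for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeTokenFilter
{
    // Hypothetical helper: removes each token only where it stands alone,
    // i.e. is surrounded by whitespace or the start/end of the string.
    public static string RemoveWholeTokens(string sentence, IEnumerable<string> tokens)
    {
        // Escape tokens so regex metacharacters in them are matched literally.
        string[] escaped = tokens.Select(Regex.Escape).ToArray();
        if (escaped.Length == 0)
        {
            return sentence;
        }

        string pattern = @"(?<=\s|^)(?:" + string.Join("|", escaped) + @")(?=\s|$)";
        string stripped = Regex.Replace(sentence, pattern, "");

        // Assumed cleanup step: collapse the whitespace runs left behind.
        return Regex.Replace(stripped, @"\s{2,}", " ").Trim();
    }

    public static void Main()
    {
        string sentence = "This is a 01 02 03 (01) (02) (03) no01 no02 no03 test";
        var matches = new List<string> { "01", "02", "03" };

        // Prints: This is a (01) (02) (03) no01 no02 no03 test
        Console.WriteLine(RemoveWholeTokens(sentence, matches));
    }
}
```

Without the cleanup step, the removed numbers leave their surrounding spaces behind, so the output would contain a run of four spaces after "a".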

You can also drop the pre-check loop that builds matches and join filters directly, because the regex engine handles the alternation efficiently.
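
As a rough sketch of that simplification, the pattern can be built once from the whole filter range and applied directly to the sentence. The names AllFiltersDemo and BuildFilterRegex are hypothetical, and the \s lookarounds are assumed to be the intended whole-token boundary.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AllFiltersDemo
{
    // Hypothetical helper: builds one regex covering "00" through maxInclusive,
    // with no separate Contains() pass over the sentence.
    public static Regex BuildFilterRegex(int maxInclusive)
    {
        var filters = Enumerable.Range(0, maxInclusive + 1)
                                .Select(i => i.ToString("00"));
        string pattern = @"(?<=\s|^)(?:" + string.Join("|", filters) + @")(?=\s|$)";
        return new Regex(pattern);
    }

    public static void Main()
    {
        Regex filterRegex = BuildFilterRegex(100);
        string sentence = "This is a 01 02 03 (01) (02) (03) no01 no02 no03 test";

        // Standalone 01, 02 and 03 are removed; (01), no01 and so on are untouched.
        Console.WriteLine(filterRegex.Replace(sentence, ""));
    }
}
```

Because the lookahead requires whitespace or the end of the string after the match, an input such as "100" is matched by the "100" alternative even though "10" is tried first.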


