Perhaps I'm overlooking something obvious but I was wondering what the fastest way to implement whole-word string replacement in C++ might be. At first I considered simply concatenating spaces to the search word, but this does not consider the string boundaries or punctuation.
This is my current abstraction for (non-whole-word) replacement:
void Replace(wstring& input, wstring find, wstring replace_with) {
if (find.empty() || find == replace_with || input.length() < find.length()) {
return;
}
for (size_t pos = input.find(find);
pos != wstring::npos;
pos = input.find(find, pos)) {
input.replace(pos, find.length(), replace_with);
pos += replace_with.length();
}
}
If I only consider spaces as a word boundary, I could probably implement this by comparing the beginning and end of the search string against the find string to cover the string boundaries, and then following with a Replace(L' ' + find + L' ').... but I was wondering if there was a more elegant solution that would include punctuation efficiently.
Let's consider a word to be any collection of characters that is separated by either whitespace or punctuation (to keep it simple let's say !"#$%&'()*+,-./ at minimum -- which happen to correspond to (c > 31 && c < 48)
).
In my application I have to call this function over a rather large array of short strings, which may include various Unicode which I don't want to split new words. I would also like to avoid including any external libraries, but STL is fine.
The purpose of not using regular expressions is the promise of less overhead and the goal of a fast function suited to this particular task over a large dataset.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…