Why not use a word boundary?
match_string = r'\b' + word + r'\b'
match_string = r'\b{}\b'.format(word)
match_string = rf'\b{word}\b' # Python 3.6+ required
If you have a list of words (say, in a words variable) to be matched as whole words, use
match_string = r'\b(?:{})\b'.format('|'.join(words))
match_string = rf'\b(?:{"|".join(words)})\b' # Python 3.6+ required
In this case, you make sure the word is only matched when it is surrounded by non-word characters or the string boundaries. Also note that \b matches at the string start and end when the adjacent character is a word character, so there is no need to add 3 alternatives.
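As a minimal sketch of the word-list case, the snippet below builds the \b(?:...)\b pattern from a list and runs it with re.findall. The words and text values are made-up sample data, not part of the original question.

```python
import re

# Hypothetical sample data, chosen only for illustration.
words = ["cat", "dog"]
text = "cat concat dog, dogma cat"

# The non-capturing group makes both \b anchors apply to every alternative.
match_string = r'\b(?:{})\b'.format('|'.join(words))

result = re.findall(match_string, text)
print(result)  # ['cat', 'dog', 'cat']
```

"concat" and "dogma" are skipped because the candidate match is adjacent to another word character on one side.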
Sample code:
import re
strn = "word hereword word, there word"
search = "word"
print(re.findall(r"\b" + search + r"\b", strn))
And we found our 3 matches:
['word', 'word', 'word']
NOTE ON "WORD" BOUNDARIES
When the "words" are in fact arbitrary strings that may contain regex special characters, you should re.escape
them before inserting them into the regex pattern:
match_string = r'\b{}\b'.format(re.escape(word)) # a single escaped "word" string passed
match_string = r'\b(?:{})\b'.format("|".join(map(re.escape, words))) # words list is escaped
match_string = rf'\b(?:{"|".join(map(re.escape, words))})\b' # Same as above for Python 3.6+
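To see why escaping matters, here is a small sketch comparing an unescaped and an escaped pattern. The word "a.b" and the sample text are hypothetical, picked because the dot is a regex metacharacter.

```python
import re

# Hypothetical "word" containing a regex metacharacter (the dot).
word = "a.b"
text = "a.b axb"

# Without escaping, "." matches any character, so "axb" is matched too.
unescaped = re.findall(r'\b{}\b'.format(word), text)

# With re.escape, the dot is matched literally.
escaped = re.findall(r'\b{}\b'.format(re.escape(word)), text)

print(unescaped)  # ['a.b', 'axb']
print(escaped)    # ['a.b']
```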
If the words to be matched as whole words may start/end with special (non-word) characters, \b
won't work; use unambiguous word boundaries:
match_string = r'(?<!\w){}(?!\w)'.format(re.escape(word))
match_string = r'(?<!\w)(?:{})(?!\w)'.format("|".join(map(re.escape, words)))
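The difference between \b and the (?<!\w)...(?!\w) lookarounds can be illustrated with a word that ends in non-word characters. The sketch below uses "c++" and a made-up sample string; both are assumptions for demonstration only.

```python
import re

# Hypothetical word ending in non-word characters.
word = "c++"
text = "c++ and c++x, (c++)"

# \b requires a word/non-word transition, so it fails after "+" before a
# space or ")", and wrongly succeeds between "+" and "x" in "c++x".
with_b = re.findall(r'\b{}\b'.format(re.escape(word)), text)

# The lookarounds only require that no word character is adjacent.
unambiguous = re.findall(r'(?<!\w){}(?!\w)'.format(re.escape(word)), text)

print(with_b)       # ['c++']  (the one inside "c++x")
print(unambiguous)  # ['c++', 'c++']  (the standalone ones)
```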
If the word boundaries are whitespace chars or start/end of string, use whitespace boundaries, (?<!\S)...(?!\S):
match_string = r'(?<!\S){}(?!\S)'.format(re.escape(word))
match_string = r'(?<!\S)(?:{})(?!\S)'.format("|".join(map(re.escape, words)))
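A short sketch of whitespace boundaries in action follows. The sample text is hypothetical; it places the word next to punctuation, brackets, and a hyphen to show which occurrences the (?<!\S)...(?!\S) pattern accepts.

```python
import re

# Hypothetical sample data, chosen only for illustration.
word = "word"
text = "word word, (word) some-word word"

# Match only when the word is delimited by whitespace or the string ends.
match_string = r'(?<!\S){}(?!\S)'.format(re.escape(word))

starts = [m.start() for m in re.finditer(match_string, text)]
print(starts)  # [0, 28]
```

Only the first and last occurrences qualify; the ones touching ",", "(", ")" or "-" are rejected, whereas \b would accept all five.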