I'm trying to match and remove all words in a list from a string using a compiled regex, but I'm struggling to avoid matching occurrences inside other words.
Current:
REMOVE_LIST = ["a", "an", "as", "at", ...]
remove = '|'.join(REMOVE_LIST)
regex = re.compile(r'('+remove+')', flags=re.IGNORECASE)
out = regex.sub("", text)
In: "The quick brown fox jumped over an ant"
Out: "quick brown fox jumped over t"
Expected: "quick brown fox jumped over ant"
I've tried changing the string to compile to the following but to no avail:
regex = re.compile(r'('+remove+')', flags=re.IGNORECASE)
Any suggestions, or am I missing something glaringly obvious?
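For reference, one common way to restrict matches to whole words is to wrap the alternation in \b word-boundary anchors, written as raw strings so that \b is not read as a backspace character. The following is only a minimal sketch under assumptions: the full REMOVE_LIST is elided above, so the entries here (including "the") are illustrative, and remove_words is a hypothetical helper name, not something from the original code.

```python
import re

# Illustrative list only: the real REMOVE_LIST is elided in the question.
REMOVE_LIST = ["a", "an", "as", "at", "the"]

# Escape each entry, then anchor the whole alternation with \b on both sides
# so that "an" matches as a standalone word but not inside "ant".
pattern = r"\b(?:" + "|".join(map(re.escape, REMOVE_LIST)) + r")\b"
regex = re.compile(pattern, flags=re.IGNORECASE)


def remove_words(text):
    """Remove whole-word matches, then collapse the leftover whitespace."""
    stripped = regex.sub("", text)
    return " ".join(stripped.split())


print(remove_words("The quick brown fox jumped over an ant"))
# quick brown fox jumped over ant
```

Note that "ant" survives here because it is a whole word that is not in the list. The split/join step is an extra assumption about the desired output: it tidies the double spaces left behind by removed words.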