I have a string variable from which I want to remove certain words. Many other words contain these words as substrings, and I don't want those touched: a word should be removed if and only if it is a complete match.
clear
* Add in some example data
input index str50 words
1 "more mor morph test"
2 "ten tennis tenner tenth keeper"
3 "badder baddy bad other"
end
* I create a copy to compare before/after the strip
gen strip_words = words
* This is a list of words I want removed. In reality, this is a fairly long list
local removs "mor ten bad"
* For each of these words, remove the complete word from the string
foreach w of local removs {
    replace strip_words = subinstr(strip_words, "`w'", "", .)
}
list
     +------------------------------------------------------------+
     | index   words                            strip_words       |
     |------------------------------------------------------------|
  1. |     1   more mor morph test              e ph test         |
  2. |     2   ten tennis tenner tenth keeper   nis ner th keeper |
  3. |     3   badder baddy bad other           der dy other      |
     +------------------------------------------------------------+
I've tried padding the string with spaces, using replace strip_words = " " + strip_words + " ", but this also removes the spaces separating the other words. My desired output would be:
     +---------------------------------------------------------------------+
     | index   words                            strip_words                |
     |---------------------------------------------------------------------|
  1. |     1   more mor morph test              more morph test            |
  2. |     2   ten tennis tenner tenth keeper   tennis tenner tenth keeper |
  3. |     3   badder baddy bad other           badder baddy other         |
     +---------------------------------------------------------------------+
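The padding idea described above might be made to work by replacing each space-delimited match with a single space rather than with an empty string, so that the neighbouring words stay separated. The following is a minimal sketch under that assumption, not a definitive solution; it reuses the example data and the local `removs` from the question and relies only on the built-in functions `subinstr()` and `strtrim()`.

```stata
clear
* Same example data as in the question
input index str50 words
1 "more mor morph test"
2 "ten tennis tenner tenth keeper"
3 "badder baddy bad other"
end

* Pad with one blank on each side so that every word,
* including the first and the last, is delimited by blanks
gen strip_words = " " + words + " "

local removs "mor ten bad"
foreach w of local removs {
    * Match the word together with its delimiting blanks and
    * put a single blank back, so the neighbours stay separated
    replace strip_words = subinstr(strip_words, " `w' ", " ", .)
}

* Remove the padding blanks added above
replace strip_words = strtrim(strip_words)
list
```

One caveat of this sketch: if a word to be removed appears twice in a row (for example "ten ten"), the two matches share a blank, so a single pass of `subinstr()` removes only the first one and the loop would have to be run again.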
'''
question from:
https://stackoverflow.com/questions/65598883/stata-remove-entire-word-from-string