We have this test file:
$ cat file
abc, def, abc, def
To remove duplicate words:
$ sed -r ':a; s/([[:alnum:]]+)(.*)\1/\1\2/g; ta; s/(, )+/, /g; s/, *$//' file
abc, def
How it works
:a
This defines a label named a.
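s/([[:alnum:]]+)(.*)\1/\1\2/g
This looks for a run of alphanumeric characters (group 1) that appears again later on the line, with everything in between captured as group 2. The replacement \1\2 keeps the first occurrence and the text in between, and drops the second occurrence.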
ta
If the last substitution command resulted in a change, this jumps back to label a to try again.
In this way, the code keeps looking for duplicates until none remain.
s/(, )+/, /g; s/, *$//
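These two substitution commands clean up the leftover comma-space combinations. The first collapses each run of separators left behind by a removal into a single one, and the second strips a separator left at the end of the line.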
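To watch the loop at work, one option is to add the p flag to the first substitution, so that sed prints the pattern space after every pass that changed something. This is only a sketch for illustration, not part of the original command: the function name trace_dedupe is made up, and it assumes a sed that accepts -E (recent GNU sed and BSD sed both do).

```shell
# trace_dedupe is a hypothetical helper name used only for this illustration.
# It reads lines on stdin and runs the same script as above, except that the
# p flag on the first substitution prints the pattern space after each pass
# in which a duplicate was removed.
trace_dedupe() {
  sed -E -e ':a' -e 's/([[:alnum:]]+)(.*)\1/\1\2/gp' -e 'ta' \
         -e 's/(, )+/, /g' -e 's/, *$//'
}

echo 'abc, def, abc, def' | trace_dedupe
# abc, def, , def    <- after pass 1: the second "abc" is gone
# abc, def, ,        <- after pass 2: the second "def" is gone (line ends in a space)
# abc, def           <- final line, printed after the comma cleanup
```

The two intermediate lines show why the cleanup commands are needed: each removal leaves its separator behind.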
Mac OSX or other BSD System
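BSD sed uses -E rather than -r for extended regular expressions and wants the label in its own -e expression. On Mac OSX or another BSD system, try: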
sed -E -e ':a' -e 's/([[:alnum:]]+)(.*)\1/\1\2/g' -e 'ta' -e 's/(, )+/, /g' -e 's/, *$//' file
Using a string instead of a file
sed easily handles input either from a file, as shown above, or from a shell string as shown below:
$ echo 'ab, cd, cd, ab, ef' | sed -r ':a; s/([[:alnum:]]+)(.*)\1/\1\2/g; ta; s/(, )+/, /g; s/, *$//'
ab, cd, ef
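If the string lives in a shell variable, the same command can be wrapped in a small function and fed through a pipe. This is a minimal sketch: the names dedupe_words, words and result are invented for the example, and the -E/-e form is used on the assumption that it runs on both GNU and BSD sed.

```shell
# dedupe_words is a hypothetical wrapper name, not something provided by sed.
# It reads lines on stdin and removes repeated words from each line.
dedupe_words() {
  sed -E -e ':a' -e 's/([[:alnum:]]+)(.*)\1/\1\2/g' -e 'ta' \
         -e 's/(, )+/, /g' -e 's/, *$//'
}

words='ab, cd, cd, ab, ef'
# printf is used instead of echo so the string is passed through unchanged
result=$(printf '%s\n' "$words" | dedupe_words)
echo "$result"
# ab, cd, ef
```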