Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


R - How do I replace an exact string match using gsub()?

I have a corpus: txt = "a patterned layer within a microelectronic pattern." I would like to replace the term "pattern" with "form", matching it exactly, so I tried this code:

txt_replaced = gsub("pattern","form",txt)

However, the resulting txt_replaced is: "a formed layer within a microelectronic form."

As you can see, "patterned" is wrongly replaced by "formed" because the first seven characters of "patterned" match "pattern".

Is there a way to replace the string exactly using gsub()? That is, only a whole-word, exact match should be replaced.

The result I want is: "a patterned layer within a microelectronic form."

Many thanks!


1 Reply


As @koshke noted, a very similar question has been answered before (by me). ...But that was grep and this is gsub, so I'll answer it again:

\< is an escape sequence for the beginning of a word, and \> is the end of a word. In R strings you need to double the backslashes, so:

txt <- "a patterned layer within a microelectronic pattern."
txt_replaced <- gsub("\\<pattern\\>","form",txt)
txt_replaced
# [1] "a patterned layer within a microelectronic form."

Or, you could use \b instead of \< and \>. \b matches a word boundary, so it can be used at both ends:

txt_replaced <- gsub("\\bpattern\\b","form",txt)
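
To check that both word-boundary forms behave the same on the question's text, here is a minimal sketch. The helper replace_word() is hypothetical (it is not part of base R), and it assumes the word you pass in contains no regex metacharacters:

```r
txt <- "a patterned layer within a microelectronic pattern."

# "\\<" and "\\>" anchor the start and end of a word;
# "\\b" matches a word boundary at either end.
with_angle <- gsub("\\<pattern\\>", "form", txt)
with_b     <- gsub("\\bpattern\\b", "form", txt)

identical(with_angle, with_b)
# [1] TRUE

# Hypothetical convenience wrapper: builds the pattern from a plain word.
# Assumption: `word` has no special regex characters such as "." or "(".
replace_word <- function(x, word, replacement) {
  gsub(paste0("\\b", word, "\\b"), replacement, x)
}

replace_word(txt, "pattern", "form")
# [1] "a patterned layer within a microelectronic form."
```

If the word might contain regex metacharacters, it would need escaping before being pasted into the pattern; that is left out of this sketch.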

Also note that if you want to replace only ONE occurrence, you should use sub instead of gsub.
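
A short illustration of the sub vs. gsub difference, using a made-up string (txt2 is just an example, not from the question):

```r
txt2 <- "pattern after pattern after pattern"

# sub() replaces only the first match.
once_only <- sub("\\bpattern\\b", "form", txt2)
once_only
# [1] "form after pattern after pattern"

# gsub() replaces every match.
every <- gsub("\\bpattern\\b", "form", txt2)
every
# [1] "form after form after form"
```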

