Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

string - How to replace repeated words with a single word

I have a string variable response:

where where where is it         
I'm going there
where where did you say
sometimes it is where you think
i think its where where you go
its everywhere where you are
i am planning on going where where where i want to

As you can see, the word "where" is repeated quite often. I want to replace strings "where where" and "where where where" (or even "where where where where") with "where".

However, I don't want to replace "everywhere where" with "where".

I know I can do this manually, but I was hoping to condense the code into as few lines as possible.

This is what I have been trying so far:

gen temp = regexr(response, " (where)+ where ", " where ") 
replace temp = regexr(response, "^(where)+ where ", "where ")

These are my results after running the code above:

where where is it  
I'm going there
where did you say
sometimes it is where you think
i think its where where you go
its everywhere where you are
i am planning on going where where where i want to

Instead, I want the final data to look like this:

where is it         
I'm going there
where did you say
sometimes it is where you think
i think its where you go
its everywhere where you are
i am planning on going where i want to

I have been using "(where)+" to capture both "where where" and "where where where", but it doesn't seem to work. I also split the code into two commands, one beginning with "^(where)" and the other with " (where)", to avoid capturing the "where" in "everywhere". However, the code does not seem to capture "where where" when it occurs in the middle of a sentence.


1 Reply


A quick fix using Stata's string functions is the following:

clear

input str50 string1
"where where where is it"        
"I'm going there"
"where where did you say"
"sometimes it is where you think"
"i think its where where you go"
"its everywhere where you are"
"i am planning on going where where where i want to"
end

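* tag1 = 1 unless the row contains "everywhere where", which must be left alone
generate tag1 = !strmatch(string1, "*everywhere where*")

* tag2 = number of times "where" occurs (each occurrence is 5 characters long)
generate tag2 = ( length(string1) - length(subinstr(string1, "where", "", .)) ) / 5

* Drop all but the last "where", then collapse and trim the leftover blanks
generate string2 = cond(tag1 == 1, strtrim(stritrim(subinstr(string1, "where", "", tag2-1))), string1)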


list string2, separator(0)

     +----------------------------------------+
     |                                string2 |
     |----------------------------------------|
  1. |                            where is it |
  2. |                        I'm going there |
  3. |                      where did you say |
  4. |        sometimes it is where you think |
  5. |               i think its where you go |
  6. |           its everywhere where you are |
  7. | i am planning on going where i want to |
     +----------------------------------------+
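
As a possible alternative, assuming Stata 14 or later (where the Unicode regex function ustrregexra() is available), the whole task can probably be done with a single regular expression. Unlike regexr(), which replaces only the first match, ustrregexra() replaces every match, and the \b word boundary keeps the "where" inside "everywhere" from being treated as a separate word. The variable name string3 is only illustrative:

```stata
* Sketch: collapse any run of the standalone word "where" into one "where".
* Assumes Stata 14+ and the string1 variable created by the input block above.
*   \bwhere        a "where" that starts at a word boundary
*   (\s+where)+    followed by one or more further "where" words
*   \b             and ending at a word boundary
generate string3 = ustrregexra(string1, "\bwhere(\s+where)+\b", "where")

list string3, separator(0)
```

On the example data this should leave "its everywhere where you are" unchanged, because no word boundary precedes the "where" inside "everywhere". Note also that in the original attempt the second command reads from response again rather than from temp, so it overwrites whatever the first command had fixed; this would explain why the mid-sentence repeats appeared untouched.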
