I have two strings:
a <- "Roy lives in Japan and travels to Africa"
b <- "Roy travels Africa with this wife"
I want to count the words that these two strings have in common.
The answer should be 3, the common words being "Roy", "travels" and "Africa".
This is what I tried:
stra <- as.data.frame(t(read.table(textConnection(a), sep = " ")))
strb <- as.data.frame(t(read.table(textConnection(b), sep = " ")))
Then I take the unique words to avoid counting repeats:
stra_unique <- as.data.frame(unique(stra$V1))
strb_unique <- as.data.frame(unique(strb$V1))
colnames(stra_unique) <- c("V1")
colnames(strb_unique) <- c("V1")
common_words <- length(merge(stra_unique, strb_unique, by = "V1")$V1)
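For comparison, the same count can probably be obtained without any data frames. This is only a sketch using base R's strsplit() and intersect(), and it assumes words are separated by single spaces and that matching is case-sensitive:

```r
a <- "Roy lives in Japan and travels to Africa"
b <- "Roy travels Africa with this wife"

# Split each string into a character vector of words
words_a <- strsplit(a, " ", fixed = TRUE)[[1]]
words_b <- strsplit(b, " ", fixed = TRUE)[[1]]

# intersect() drops duplicates, so repeated words are counted once
common <- intersect(words_a, words_b)
common_words <- length(common)
```

Here common holds the shared words and common_words is their count.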
I need to do this for two data sets, one with over 2000 strings and one with over 1200.
That means evaluating 2000 x 1200 pairs of strings in total. Is there a quick way to do this without using loops?
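To make the pairwise requirement concrete, here is a rough sketch of the full count matrix. The vectors x and y are made-up stand-ins for the two data sets, and sapply() only hides the loops rather than removing them, so this illustrates the shape of the result and is not a claim about the fastest method:

```r
# Hypothetical example vectors standing in for the two data sets
x <- c("Roy lives in Japan and travels to Africa", "Ann lives in Peru")
y <- c("Roy travels Africa with this wife", "Ann travels to Japan", "nobody here")

# Split every string once, up front, instead of once per pair
words_x <- strsplit(x, " ", fixed = TRUE)
words_y <- strsplit(y, " ", fixed = TRUE)

# counts[i, j] = number of distinct words shared by x[i] and y[j]
counts <- sapply(words_y, function(wy) {
  sapply(words_x, function(wx) length(intersect(wx, wy)))
})
```

Each row of counts corresponds to a string in x and each column to a string in y. If x has only one element, sapply() returns a plain vector instead of a matrix.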