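You can reverse the order of the tables in the join, so that the LOCATIONS strings are searched for the check values rather than the other way round. This avoids partial matches from the LOCATIONS table (for example, TANGA matching inside BATANGAS).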
library(dplyr)      # for %>%
library(stringr)    # for str_detect
library(fuzzyjoin)
check <- data.frame(STRING = c("BATANGAS", "QINGDAO"))
LOCATIONS <- data.frame(STRING = c("BATANGAS LUZON", "QINGDAO PT", "TANGA"))
LOCATIONS %>%
  fuzzy_right_join(check, by = c("STRING" = "STRING"), match_fun = str_detect)
        STRING.x STRING.y
1 BATANGAS LUZON BATANGAS
2     QINGDAO PT  QINGDAO
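Why the order matters can be checked directly with str_detect, whose first argument is the string being searched and whose second is the pattern. A minimal sketch, using only values from the tables above (the variable names are just for illustration):

```r
library(stringr)

# str_detect(string, pattern): is `pattern` found inside `string`?
loc_contains_check <- str_detect("BATANGAS LUZON", "BATANGAS")  # LOCATIONS value searched for a check value
check_contains_loc <- str_detect("BATANGAS", "TANGA")           # check value searched for a LOCATIONS value
tanga_vs_batangas  <- str_detect("TANGA", "BATANGAS")           # the same pair with LOCATIONS on the left

loc_contains_check  # TRUE
check_contains_loc  # TRUE, the unwanted partial match
tanga_vs_batangas   # FALSE
```

With LOCATIONS on the left-hand side of the join, TANGA is the string and BATANGAS is the pattern, so that pair no longer matches.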
To match whole words only, wrap each pattern in regex word boundaries (\b) before joining:
check <- structure(list(To_check = c("BATANGAS", "QINGDAO", "ABC", "DEF"
), id = 1:4), class = "data.frame", row.names = c(NA, -4L))
> check
  To_check id
1 BATANGAS  1
2  QINGDAO  2
3      ABC  3
4      DEF  4
> LOCATIONS <- data.frame(STRING = c("BATANGAS LUZON", "QINGDAO PT", "TANGA", "ABCD"))
> LOCATIONS
          STRING
1 BATANGAS LUZON
2     QINGDAO PT
3          TANGA
4           ABCD
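# '\\b' is the regex word boundary; a single-backslash '\b' in an R string is a backspace character
LOCATIONS %>%
  fuzzy_right_join(check %>% mutate(dummy = paste0('\\b', To_check, '\\b')),
                   by = c("STRING" = "dummy"), match_fun = str_detect) %>%
  select(-dummy)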
          STRING To_check id
1 BATANGAS LUZON BATANGAS  1
2     QINGDAO PT  QINGDAO  2
3           <NA>      ABC  3
4           <NA>      DEF  4
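The effect of the word boundaries can be seen on single strings. A small sketch: in R source the boundary has to be written with a doubled backslash, and "ABC PORT" is a made-up value used only for illustration (it is not in the LOCATIONS table).

```r
library(stringr)

# Build the same kind of pattern the join uses: \bABC\b
pattern <- paste0("\\b", "ABC", "\\b")

plain_hit    <- str_detect("ABCD", "ABC")        # TRUE: plain substring match
bounded_miss <- str_detect("ABCD", pattern)      # FALSE: ABC is not a whole word in ABCD
bounded_hit  <- str_detect("ABC PORT", pattern)  # TRUE: ABC stands alone (hypothetical value)

# A single backslash gives a one-character backspace, not a regex boundary
nchar("\b")   # 1
nchar("\\b")  # 2
```

This is why ABC in check finds no match against ABCD in the result above.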
To keep only the matched rows, use fuzzy_inner_join instead of fuzzy_right_join.
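A sketch of the inner-join variant, reusing the tables from above (check is rebuilt here with data.frame() for brevity, and the name matched is only illustrative):

```r
library(dplyr)
library(stringr)
library(fuzzyjoin)

check <- data.frame(To_check = c("BATANGAS", "QINGDAO", "ABC", "DEF"), id = 1:4)
LOCATIONS <- data.frame(STRING = c("BATANGAS LUZON", "QINGDAO PT", "TANGA", "ABCD"))

# Same whole-word patterns as before, but only rows with a match are kept
matched <- LOCATIONS %>%
  fuzzy_inner_join(check %>% mutate(dummy = paste0("\\b", To_check, "\\b")),
                   by = c("STRING" = "dummy"), match_fun = str_detect) %>%
  select(-dummy)

matched
```

With this data the ABC and DEF rows should drop out, leaving only the BATANGAS and QINGDAO matches.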