I want to be able to fuzzy match one column and exact match another column.
Say df1 looks like this:
And df2 looks like this:
I want to fuzzy match on "Name" but exact match on "Year", so "Ashley" and "Ashlee" in the same year would be a match. This is what I have so far:
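library(fuzzyjoin)
library(stringdist)
library(dplyr)

res <- fuzzy_left_join(
  df1,
  df2,
  by = c("Year", "Name"),
  # one match function per column in `by`: exact on Year, fuzzy on Name
  match_fun = list(`==`, function(x, y) stringdist(tolower(x), tolower(y), method = "lv") <= 3)
)
# keep a single Year column
res %>%
  select(Year = Year.x, everything(), -Year.y)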
It appears to be over-matching, though: names that are clearly different are being joined. I'm not sure what's going on.
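One possible cause, sketched below, is the distance cutoff: a Levenshtein distance of 3 or less is very permissive for short names, since any two names of three letters or fewer are always within 3 edits of each other. The data frames here are hypothetical stand-ins (the originals are not shown), and the tighter cutoff of 1 is only an assumption to illustrate the effect, not a recommended value.

```r
library(fuzzyjoin)
library(stringdist)

# Hypothetical data: the real df1 and df2 are not shown in the question.
df1 <- data.frame(Name = c("Ashley", "Amy"),
                  Year = c(2000, 2001),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("Ashlee", "Ann", "Bob"),
                  Year = c(2000, 2001, 2001),
                  stringsAsFactors = FALSE)

# Edit distances for these names
stringdist("ashley", "ashlee", method = "lv")  # 1
stringdist("amy", "ann", method = "lv")        # 2
stringdist("amy", "bob", method = "lv")        # 3

# Build a name matcher for a given maximum edit distance
# (make_name_match is a hypothetical helper, not part of fuzzyjoin).
make_name_match <- function(max_dist) {
  function(x, y) stringdist(tolower(x), tolower(y), method = "lv") <= max_dist
}

# Exact match on Year, fuzzy match on Name, with two different cutoffs
res_loose <- fuzzy_left_join(
  df1, df2,
  by = c("Year", "Name"),
  match_fun = list(`==`, make_name_match(3))
)
res_strict <- fuzzy_left_join(
  df1, df2,
  by = c("Year", "Name"),
  match_fun = list(`==`, make_name_match(1))
)

nrow(res_loose)   # "Amy" joins to both "Ann" and "Bob"
nrow(res_strict)  # "Amy" is kept but has no match (NA in the df2 columns)
```

With the looser cutoff, "Amy" is joined to every short name in the same year, which looks like over-matching. If this is what is happening with the real data, lowering the cutoff or scaling it by name length may help.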