Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - dplyr join warning: joining factors with different levels

When using the join function in the dplyr package, I get this warning:

Warning message:
In left_join_impl(x, y, by$x, by$y) :
  joining factors with different levels, coercing to character vector

There is not a lot of information online about this. Any idea what it could be? Thanks!


1 Reply

That's a warning, not an error. It's telling you that one of the columns you joined on was a factor, and that the factor had different levels in the two data sets. So as not to lose any information, the factor columns were converted to character vectors. For example:

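library(dplyr)

# On R >= 4.0, data.frame() no longer turns strings into factors by default,
# so ask for factors explicitly to reproduce the situation
x <- data.frame(a = letters[1:7], stringsAsFactors = TRUE)
y <- data.frame(a = letters[4:10], stringsAsFactors = TRUE)

class(x$a)
# [1] "factor"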

# NOTE these are different
levels(x$a)
# [1] "a" "b" "c" "d" "e" "f" "g"
levels(y$a)
# [1] "d" "e" "f" "g" "h" "i" "j"

m <- left_join(x,y)
# Joining by: "a"
# Warning message:
# joining factors with different levels, coercing to character vector 

class(m$a)
# [1] "character"
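
To see why the two factors cannot simply be matched on their stored values, it may help to look at the integer codes behind them. This is a minimal base R sketch (no dplyr needed); fx, fy and the other variable names are illustrative and not part of the original example.

```r
# Two factors whose level sets differ, mirroring x$a and y$a above
fx <- factor(letters[1:7])   # levels "a" through "g"
fy <- factor(letters[4:10])  # levels "d" through "j"

# A factor stores integer codes that index into its own levels
code_x <- as.integer(fx)[1]
code_y <- as.integer(fy)[1]

# The same code points at a different label in each factor
label_x <- levels(fx)[code_x]
label_y <- levels(fy)[code_y]

# Comparing the labels as character strings is unambiguous
shared <- intersect(as.character(fx), as.character(fy))
```

Here code_x and code_y are both 1, yet they stand for "a" and "d" respectively, which is why comparing the labels as plain strings is the safe common ground.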

If you want the result to stay a factor, make sure both factors have the same levels before joining:

combined <- sort(union(levels(x$a), levels(y$a)))
n <- left_join(mutate(x, a = factor(a, levels = combined)),
               mutate(y, a = factor(a, levels = combined)))
# Joining by: "a"
class(n$a)
# [1] "factor"
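
Another option, if you don't need a shared level set, is to convert the key columns to character yourself before joining, so there is nothing left for the join to coerce. The following is only a sketch under that assumption; x_chr, y_chr, joined and key_class are illustrative names, and stringsAsFactors = TRUE is there only to recreate the factor columns on R >= 4.0.

```r
library(dplyr)

x <- data.frame(a = letters[1:7], stringsAsFactors = TRUE)
y <- data.frame(a = letters[4:10], stringsAsFactors = TRUE)

# Convert the key columns to character up front
x_chr <- mutate(x, a = as.character(a))
y_chr <- mutate(y, a = as.character(a))

# Naming the key with by = "a" also avoids the "Joining by" message
joined <- left_join(x_chr, y_chr, by = "a")
key_class <- class(joined$a)
# key_class is "character"

# Optionally rebuild a factor from the joined values afterwards
joined <- mutate(joined, a = factor(a))
```

Behaviour may differ between dplyr versions, so checking class() on the key column after a join is a cheap safeguard.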
