Could anyone systematically explain to me the hierarchy of type conversion between character/numeric/factor while using rbind and data.frame?
As I understand it, rbind combines its arguments into a matrix, which can hold only one type. So if there is a type conflict, which type does everything get converted to? Do other matrix-creation functions (e.g. cbind, matrix) work the same way? Example:
> sapply(rbind("a", "b"), class)
          a           b
"character" "character"
> sapply(rbind(1, "b"), class)
          1           b
"character" "character"
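To probe the conversion order directly, here is a minimal sketch, assuming the usual atomic coercion order in base R (logical, then integer, then double, then character). The variable names m1 to m6 and f are made up for illustration. It uses typeof() on the whole matrix instead of sapply(..., class) on each element, and it also checks what seems to happen to a factor handed to rbind: the levels appear to be dropped and only the integer codes kept.

```r
# Mixing atomic types in matrix-building functions: the "highest" type wins.
m1 <- rbind(TRUE, 2L)      # logical + integer
m2 <- rbind(2L, 3.5)       # integer + double
m3 <- rbind(1, "b")        # double + character
m4 <- cbind(1, "b")        # same inputs through cbind
m5 <- matrix(c(1, "b"))    # c() coerces before matrix() ever sees the data

types <- c(typeof(m1), typeof(m2), typeof(m3), typeof(m4), typeof(m5))
print(types)

# A factor given to rbind() is treated as its underlying integer codes,
# so the level labels "x" and "y" do not survive in the matrix.
f <- factor(c("x", "y"))
m6 <- rbind(f, c(10, 20))
print(typeof(m6))
print(m6)
```

If that holds, rbind, cbind and matrix all follow the same rule as c(), and a factor is not part of that hierarchy at all: it simply enters as integers.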
On the other hand, a data frame can hold multiple types, so data.frame preserves the original type of each column, EXCEPT that it always tries to convert character into factor. (Is this correct? This is very counter-intuitive to me.)
By the same logic, is it correct that a factor always remains a factor, no matter whether it is factor(c(1,2)) or factor(c("a", "b"))?
> sapply(data.frame("a", "b"), class)
    X.a.     X.b.
"factor" "factor"
> sapply(data.frame(1, "b"), class)
       X1      X.b.
"numeric"  "factor"
> sapply(data.frame(1, factor("a")), class)
         X1 factor..a..
  "numeric"    "factor"
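The character-to-factor conversion shown above is controlled by the stringsAsFactors argument of data.frame, whose default changed from TRUE to FALSE in R 4.0.0, so the output depends on the R version. A small sketch that sets the argument explicitly to compare both behaviours; the column names x, y, n and s and the object names are made up for illustration:

```r
# Setting stringsAsFactors explicitly makes the result independent of the R version.
df_fac <- data.frame(x = 1, y = "b", stringsAsFactors = TRUE)
df_chr <- data.frame(x = 1, y = "b", stringsAsFactors = FALSE)

print(sapply(df_fac, class))   # character column converted to factor
print(sapply(df_chr, class))   # character column left as character

# Columns that are already factors stay factors, whether the levels
# look numeric or look like text, even with stringsAsFactors = FALSE.
df_f <- data.frame(n = factor(c(1, 2)),
                   s = factor(c("a", "b")),
                   stringsAsFactors = FALSE)
print(sapply(df_f, class))
```

So the conversion only ever goes from character to factor, and only when stringsAsFactors is TRUE; numeric columns and existing factors are stored as given.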