Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Joining factor levels of two columns

I have two columns holding the same type of data (strings), stored as factors.

I want to join the levels of the two columns, i.e. we have:

col1   col2
Bob    John
Tom    Bob
Frank  Jane
Jim    Bob
Tom    Bob
...    ... (and so on)

Now col1 has 4 levels (Bob, Tom, Frank, Jim) and col2 has 3 levels (John, Jane, Bob).

But I want both columns to have all the factor levels (Bob, Tom, Frank, Jim, John, Jane), so that I can later replace each of the names with a unique id, such that the final output would be:

col1   col2
1      5
2      1
3      6
4      1
2      1

That is, Bob -> 1, Tom -> 2, etc., in both columns.

Any ideas :) ?

edit: Thanks all for the wonderful answers! You are all awesome as far as I know :)

1 Reply

Using the example data as a data frame of two factors:

x <- structure(
  list(
    col1 = structure(c(1L, 4L, 2L, 3L, 4L),
                     .Label = c("Bob", "Frank", "Jim", "Tom"),
                     class = "factor"),
    col2 = structure(c(3L, 1L, 2L, 1L, 1L),
                     .Label = c("Bob", "Jane", "John"),
                     class = "factor")
  ),
  .Names = c("col1", "col2"),
  class = "data.frame",
  row.names = c(NA, -5L)
)

Take a simple union of the two sets of factor levels:

both <- union(levels(x$col1), levels(x$col2))

Then rebuild the two factors so that both use the combined levels:

x$col1 <- factor(x$col1, levels=both)
x$col2 <- factor(x$col2, levels=both)
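
As a quick check of the two steps above, here is a minimal sketch that rebuilds the sample data with data.frame() (an assumption for readability; the structure() call above is equivalent) and confirms that the names are untouched while the levels are now shared. The variable `before` is only an illustrative helper.

```r
# Sample data mirroring the question; stringsAsFactors = TRUE is needed
# in R >= 4.0 to get factor columns
x <- data.frame(
  col1 = c("Bob", "Tom", "Frank", "Jim", "Tom"),
  col2 = c("John", "Bob", "Jane", "Bob", "Bob"),
  stringsAsFactors = TRUE
)

both <- union(levels(x$col1), levels(x$col2))
before <- as.character(x$col2)  # hypothetical helper: names before re-levelling

x$col1 <- factor(x$col1, levels = both)
x$col2 <- factor(x$col2, levels = both)

# Both columns now share one level set, and the stored names are unchanged
same_levels <- identical(levels(x$col1), levels(x$col2))
same_names  <- identical(as.character(x$col2), before)

# A given name now has the same integer code in either column
codes1 <- as.integer(x$col1)
codes2 <- as.integer(x$col2)
```

Note that the codes follow the order of `both`, which starts with the alphabetically sorted levels of col1, so the numbering need not match the order in which names first appear in the data.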

Edit: added an example of converting the factors to numeric values.

Once the levels are shared, you can convert a factor to its underlying integer codes, e.g.:

as.numeric(x$col1)

Or, a simpler and nicer solution based on @Gavin Simpson's hint, which converts the whole data frame in one step:

data.matrix(x)
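
If the ids should follow the order in which names first appear (Bob -> 1, Tom -> 2, and so on, as in the question's desired output), one possible variation is to build the level set with unique() on the character values instead of union() on the levels. This is a sketch under that assumption, not part of the original answer; `ids` is an illustrative name.

```r
# Sample data mirroring the question
x <- data.frame(
  col1 = c("Bob", "Tom", "Frank", "Jim", "Tom"),
  col2 = c("John", "Bob", "Jane", "Bob", "Bob"),
  stringsAsFactors = TRUE
)

# Levels in order of first appearance, scanning col1 first and then col2
both <- unique(c(as.character(x$col1), as.character(x$col2)))

# Re-level every column with the shared level set
x[] <- lapply(x, factor, levels = both)

# Convert all factor columns to their integer codes in one step
ids <- data.matrix(x)
ids
```

With this ordering, col1 becomes 1, 2, 3, 4, 2 and col2 becomes 5, 1, 6, 1, 1.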
