Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
699 views
in Technique[技术] by (71.8m points)

r - Count number of time combination of events appear in dataframe columns ext

This is an extension of the question asked in Count number of times combination of events occurs in dataframe columns, I will reword the question again so it is all here:

I have a data frame and I want to calculate the number of times each combination of events in two columns occur (in any order), with a zero if a combination doesn't appear.

For example say I have

df <- data.frame('x' = c('a', 'b', 'c', 'c', 'c'), 
                 'y' = c('c', 'c', 'a', 'a', 'b'))

So

x y  
a c  
b c  
c a  
c a  
c a  
c b

a and b do not occur together, a and c 4 times (rows 2, 4, 5, 6) and b and c twice (3rd and 7th rows) so I would want to return

x-y num  
a-b 0  
a-c 4  
b-c 2  

I hope this makes sense? Thanks in advance

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This should do it:

res = table(df)

To convert to data frame:

resdf = as.data.frame(res)

The resdf data.frame looks like:

  x y Freq
1 a a    0
2 b a    0
3 c a    2
4 a b    0
5 b b    0
6 c b    1
7 a c    1
8 b c    1
9 c c    0

Note that this answer takes order into account. If ordering of the columns is unimportant, then modifying the original data.frame prior to the process will remove the effect of ordering (a-c treated the same as c-a).

df1 = as.data.frame(t(apply(df,1,sort)))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...