I have a dataframe like this:
sample_df<-data.frame(
client=c('John', 'John','Mary','Mary'),
date=c('2016-07-13','2016-07-13','2016-07-13','2016-07-13'),
cluster=c('A','B','A','A'))
#sample data frame
client date cluster
1 John 2016-07-13 A
2 John 2016-07-13 B
3 Mary 2016-07-13 A
4 Mary 2016-07-13 A
I would like to transform it into a different format, like this:
#ideal data frame
client date cluster
1 John 2016-07-13 c('A','B')
2 Mary 2016-07-13 A
The 'cluster' column should hold a list whenever a client belongs to more than one cluster on the same date.
I thought I could do it with the dplyr package, using a command like the one below:
library(dplyr)
ideal_df<-sample_df %>%
group_by(client, date) %>%
summarize(
# some anonymous function
)
However, I don't know how to write the anonymous function in this situation. Is there a way to transform the data into the ideal format?
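One possible direction, as a rough sketch rather than a definitive answer: wrapping the per-group values in list() makes summarize() produce a list-column, and unique() drops repeated clusters. This assumes dplyr 1.0 or later (for the .groups argument) and that 'cluster' is stored as character, which is why stringsAsFactors = FALSE is set here.

```r
library(dplyr)

# Same sample data as above; stringsAsFactors = FALSE keeps 'cluster' as character
sample_df <- data.frame(
  client = c('John', 'John', 'Mary', 'Mary'),
  date = c('2016-07-13', '2016-07-13', '2016-07-13', '2016-07-13'),
  cluster = c('A', 'B', 'A', 'A'),
  stringsAsFactors = FALSE
)

ideal_df <- sample_df %>%
  group_by(client, date) %>%
  # list() stores one character vector per group in a list-column;
  # unique() removes duplicate clusters within each group
  summarize(cluster = list(unique(cluster)), .groups = "drop")
```

With this, ideal_df has one row per client and date, and ideal_df$cluster[[1]] returns the clusters for the first group as a character vector. If a plain string is enough, paste(unique(cluster), collapse = ",") could be used in place of list(unique(cluster)).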