Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Transpose / reshape dataframe without "timevar" from long to wide format

I have a data frame in the following long format:

   Name          MedName
  Name1    atenolol 25mg
  Name1     aspirin 81mg
  Name1 sildenafil 100mg
  Name2    atenolol 50mg
  Name2   enalapril 20mg

I would like to get the following (I do not care whether the columns end up with these names, I just want the data in this format):

   Name   medication1    medication2      medication3
  Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
  Name2 atenolol 50mg enalapril 20mg             NA

Through this very site I have become somewhat familiar with the reshape/reshape2 packages, and I have gone through several attempts to get this to work, but so far I have failed.

When I try dcast(dataframe, Name ~ MedName, value.var='MedName') I just get one flag column per medication name (the transposed values are 1 or 0), for example:

 Name  atenolol 25mg  aspirin 81mg
Name1              1             1
Name2              0             0 

I also tried dcast(dataset, Name ~ variable) after melting the dataset, but this only counts how many medications each person has:

 Name  MedName
Name1        3
Name2        2

Finally, I tried melting the data and then calling reshape with idvar="Name" and timevar="variable" (whose values are all just "MedName"). This does not seem to fit my problem: when several rows match the same idvar, reshape keeps only the first MedName and ignores the rest.

Does anyone know how to do this with reshape or another R function? I realize there is probably a messier way to do it with for loops and conditionals that split and re-paste the data, but I was hoping for a simpler solution. Thank you so much!



1 Reply


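Assuming your data is in the object dataset, add a per-Name medication index and then cast on that index instead of on MedName:

library(plyr)      # ddply(), mutate()
library(reshape2)  # dcast()
## Add a medication index that restarts at 1 within each Name
data_with_index <- ddply(dataset, .(Name), mutate,
                         index = paste0('medication', seq_along(Name)))
## Spread the index into columns, filling each cell with the medication name
dcast(data_with_index, Name ~ index, value.var = 'MedName')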

##    Name   medication1    medication2      medication3
## 1 Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
## 2 Name2 atenolol 50mg enalapril 20mg             <NA>
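
If you would rather avoid extra packages, the same idea (build the missing "timevar" yourself, then reshape) can be sketched in base R. This is only a minimal sketch under assumptions: the data frame is rebuilt here from the sample in the question, and the names index and wide are made up for illustration.

```r
# Sample data copied from the question
dataset <- data.frame(
  Name    = c("Name1", "Name1", "Name1", "Name2", "Name2"),
  MedName = c("atenolol 25mg", "aspirin 81mg", "sildenafil 100mg",
              "atenolol 50mg", "enalapril 20mg"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: a counter that restarts at 1 for each Name.
# This plays the role of the "timevar" the original data lacks.
dataset$index <- ave(seq_along(dataset$Name), dataset$Name, FUN = seq_along)

# Long to wide: one row per Name, one MedName column per index value
wide <- reshape(dataset, idvar = "Name", timevar = "index",
                direction = "wide", sep = "")

# Cosmetic: rename MedName1, MedName2, ... to medication1, medication2, ...
names(wide) <- sub("^MedName", "medication", names(wide))
rownames(wide) <- NULL

print(wide)
```

Because the counter is unique within each Name, reshape no longer finds several rows for the same idvar/timevar pair, so nothing gets dropped; names with fewer medications are padded with NA.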
