Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Merge data.frames with duplicates

I have many data.frames, for example:

df1 = data.frame(names=c('a','b','c','c','d'),data1=c(1,2,3,4,5))
df2 = data.frame(names=c('a','e','e','c','c','d'),data2=c(1,2,3,4,5,6))
df3 = data.frame(names=c('c','e'),data3=c(1,2))

and I need to merge these data.frames without deleting the duplicated names:

> result
  names data1 data2 data3
1   'a'     1     1    NA
2   'b'     2    NA    NA
3   'c'     3     4     1
4   'c'     4     5    NA
5   'd'     5     6    NA
6   'e'    NA     2     2
7   'e'    NA     3    NA

I can't find a function like merge with an option for handling duplicated names. Thank you for your help. To clarify my problem: the data come from a biological experiment in which each sample has a different number of replicates. I need to merge all the experiments to produce this table, and I can't generate a unique identifier for the replicates.


1 Reply


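First define a function, run.seq, that numbers the occurrences of each name (1 for the first occurrence, 2 for the second, and so on). Judging from the desired output, the i-th occurrence of a name in one data frame should be matched with the i-th occurrence of that name in the others. Then put the data frames in a list and add a run.seq column to each one. Finally, use Reduce to merge them all; merge joins on the columns the data frames share, here names and run.seq, and the trailing [-2] drops the run.seq column from the result.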

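# number the occurrences of each value: 1, 2, ... within each group
run.seq <- function(x) as.numeric(ave(paste(x), x, FUN = seq_along))

# add a run.seq column to every data frame
L <- list(df1, df2, df3)
L2 <- lapply(L, function(x) cbind(x, run.seq = run.seq(x$names)))

# full outer merge on names and run.seq, then drop run.seq (column 2)
out <- Reduce(function(...) merge(..., all = TRUE), L2)[-2]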

The last line gives:

> out
  names data1 data2 data3
1     a     1     1    NA
2     b     2    NA    NA
3     c     3     4     1
4     c     4     5    NA
5     d     5     6    NA
6     e    NA     2     2
7     e    NA     3    NA
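
If you need this more than once, the same steps can be wrapped in a small helper. This is only a sketch: merge_dups, its key argument, and the temporary .occ column are names I made up for illustration and are not part of the code above. It names the join columns explicitly instead of relying on merge's default, and it sorts the result itself.

```r
df1 <- data.frame(names = c('a','b','c','c','d'), data1 = c(1,2,3,4,5),
                  stringsAsFactors = FALSE)
df2 <- data.frame(names = c('a','e','e','c','c','d'), data2 = c(1,2,3,4,5,6),
                  stringsAsFactors = FALSE)
df3 <- data.frame(names = c('c','e'), data3 = c(1,2),
                  stringsAsFactors = FALSE)

# Hypothetical helper: merge a list of data frames on `key`,
# pairing the i-th occurrence of each key value across data frames.
merge_dups <- function(dfs, key = "names") {
  # add an occurrence counter (.occ) per key value to each data frame
  with_occ <- lapply(dfs, function(d) {
    d$.occ <- ave(seq_len(nrow(d)), d[[key]], FUN = seq_along)
    d
  })
  # full outer merge on the key plus the occurrence counter
  out <- Reduce(function(x, y) merge(x, y, by = c(key, ".occ"), all = TRUE),
                with_occ)
  # sort explicitly, then drop the helper column and reset row names
  out <- out[order(out[[key]], out$.occ), ]
  out$.occ <- NULL
  rownames(out) <- NULL
  out
}

res <- merge_dups(list(df1, df2, df3))
print(res)
```

The explicit by = c(key, ".occ") matters if the data frames happen to share other column names, since merge would otherwise join on those as well.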

EDIT: Revised run.seq so that input need not be sorted.
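
As a quick check of the unsorted case, here is run.seq applied to a vector where the duplicates are not adjacent (the vector ids is just made-up test input):

```r
run.seq <- function(x) as.numeric(ave(paste(x), x, FUN = seq_along))

# made-up input: 'e' and 'c' each appear twice, not next to each other
ids <- c('a', 'e', 'c', 'e', 'c', 'd')
occ <- run.seq(ids)
print(occ)  # 1 1 1 2 2 1
```

Each value gets its occurrence number in the order it appears, regardless of where the duplicates sit in the vector.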