Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
504 views
in Technique[技术] by (71.8m points)

r - cbind 2 dataframes with different number of rows

I have two lists named h and g. They each contain 244 dataframes and they look like the following:

h[[1]]
   year  avg    hr   sal
1  2010  0.300  31   2000
2  2011  0.290  30   4000
3  2012  0.275  14    600
4  2013  0.280  24    800 
5  2014  0.295  18   1000
6  2015  0.330  26   7000
7  2016  0.315  40   9000

g[[1]]
   year  pos  fld     
1  2010  A   0.990
2  2011  B   0.995
3  2013  C   0.970
4  2014  B   0.980
5  2015  D   0.990

I want to cbind these two dataframes. But as you see, they have different number of rows. I want to combine these dataframes so that the rows with the same year will be combined in one row. And I want the empty spaces to be filled with NA. The result I expect looks like this:

   year  avg    hr   sal   pos   fld
1  2010  0.300  31   2000   A   0.990
2  2011  0.290  30   4000   B   0.995
3  2012  0.275  14    600   NA    NA
4  2013  0.280  24    800   C   0.970
5  2014  0.295  18   1000   B   0.980
6  2015  0.330  26   7000   D   0.990
7  2016  0.315  40   9000   NA    NA

Also, I want to repeat this for all the 244 dataframes in each list, h and g. I'd like to make a new list named final which contains the 244 combined dataframes.

How can I do this...? All answers will be greatly appreciated :)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I think you should instead use merge:

merge(df1, df2, by="year", all = T)

For your data:

df1 = data.frame(matrix(0, 7, 4))
names(df1) = c("year", "avg", "hr", "sal")
df1$year = 2010:2016
df1$avg = c(.3, .29, .275, .280, .295, .33, .315)
df1$hr = c(31, 30, 14, 24, 18, 26, 40)
df1$sal = c(2000, 4000, 600, 800, 1000, 7000, 9000)
df2 = data.frame(matrix(0, 5, 3))
names(df2) = c("year", "pos", "fld")
df2$year = c(2010, 2011, 2013, 2014, 2015)
df2$pos = c('A', 'B', 'C', 'B', 'D')
df2$fld = c(.99,.995,.97,.98,.99)

cbind is meant to column-bind two dataframes that are in all sense compatible. But what you aim to do is actual merge, where you want the elements from the two data frames not be discarded, and for missing values you get NA instead.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...