Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Calculate row means based on (partial) matching column names

I am starting with three large data tables (named A1, A2, A3). Each table has thousands of rows, four data columns (V1 to V4), and one "Date" column that is identical across all three tables.

Here is some dummy data that approximates my tables.

A1.V1<-c(1,2,3,4)
A1.V2<-c(2,4,6,8)
A1.V3<-c(1,3,5,7)
A1.V4<-c(1,2,3,4)


A2.V1<-c(1,2,3,4)
A2.V2<-c(2,4,6,8)
A2.V3<-c(1,3,5,7)
A2.V4<-c(1,2,3,4)


A3.V1<-c(1,2,3,4)
A3.V2<-c(2,4,6,8)
A3.V3<-c(1,3,5,7)
A3.V4<-c(1,2,3,4)

Date<-c(2001,2002,2003,2004)

DF<-data.frame(Date, A1.V1,A1.V2,A1.V3,A1.V4,A2.V1,A2.V2,A2.V3,A2.V4,A3.V1,A3.V2,A3.V3,A3.V4)

So this is what my data frame ends up looking like:

  Date A1.V1 A1.V2 A1.V3 A1.V4 A2.V1 A2.V2 A2.V3 A2.V4 A3.V1 A3.V2 A3.V3 A3.V4
1 2001     1     2     1     1     1     2     1     1     1     2     1     1
2 2002     2     4     3     2     2     4     3     2     2     4     3     2
3 2003     3     6     5     3     3     6     5     3     3     6     5     3
4 2004     4     8     7     4     4     8     7     4     4     8     7     4

My goal is to calculate the row mean for each of the matching columns from each data table. So in this instance, I would want row means for all columns ending in V1, all columns ending in V2, all columns ending in V3 and all columns ending in V4.

The end result would look like this:

      V1  V2  V3  V4
2001   1   2   1   1
2002   2   4   3   2
2003   3   6   5   3
2004   4   8   7   4

So my question is: how do I go about calculating row means based on a partial match in the column name?

Thanks

1 Reply

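# Suffixes shared by the columns to be averaged (renamed so it does not shadow base::colnames)
suffixes <- c("V1", "V2", "V3", "V4")
# For each suffix, average the columns whose names end in it; "$" anchors the match
# at the end of the name, and drop = FALSE keeps a data frame even for a single match
res <- sapply(suffixes, function(x) rowMeans(DF[, grep(paste0(x, "$"), names(DF)), drop = FALSE]))
rownames(res) <- DF$Date
res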
     V1 V2 V3 V4
2001  1  2  1  1
2002  2  4  3  2
2003  3  6  5  3
2004  4  8  7  4

The R grep function returns an integer vector giving the positions of the column names that match the pattern. That vector is then used to select the matching "V" columns from the larger data frame, and rowMeans averages them row by row.
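
To make the indexing step concrete, here is a small sketch of what grep returns for a single suffix. The vector nms is only a stand-in for names(DF) from the question, and the "V10" name at the end is a hypothetical case (not in the original data) showing why anchoring the pattern with "$" can matter.

```r
# Column names as they appear in the question's example data frame
nms <- c("Date", "A1.V1", "A1.V2", "A1.V3", "A1.V4",
         "A2.V1", "A2.V2", "A2.V3", "A2.V4",
         "A3.V1", "A3.V2", "A3.V3", "A3.V4")

# grep() returns the integer positions of the names that match the pattern
idx <- grep("V1", nms)
idx
# [1]  2  6 10

# Those positions can then be used to subset names (or data frame columns)
nms[idx]
# [1] "A1.V1" "A2.V1" "A3.V1"

# Hypothetical case: an unanchored "V1" would also match a name such as "A1.V10";
# anchoring with "$" restricts the match to names that end in "V1"
loose  <- grep("V1",  c("A1.V1", "A1.V10"))
strict <- grep("V1$", c("A1.V1", "A1.V10"))
```

With only V1 to V4 in the data, the anchored and unanchored patterns select the same columns, so the anchor is a precaution rather than a requirement.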

If you need to generate the suffixes automatically from the column names:

> unique(sapply(strsplit(names(DF)[-1], ".", fixed=TRUE), "[", 2) )
[1] "V1" "V2" "V3" "V4"
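
Putting both pieces together, the following is a minimal end-to-end sketch rather than a definitive implementation. It assumes every column other than Date is named in the form table.variable (for example A1.V1), and it matches on the extracted suffix exactly instead of using grep. The names value_cols and suffix are introduced here purely for illustration.

```r
# Rebuild the example data frame from the question
DF <- data.frame(
  Date  = c(2001, 2002, 2003, 2004),
  A1.V1 = c(1, 2, 3, 4), A1.V2 = c(2, 4, 6, 8), A1.V3 = c(1, 3, 5, 7), A1.V4 = c(1, 2, 3, 4),
  A2.V1 = c(1, 2, 3, 4), A2.V2 = c(2, 4, 6, 8), A2.V3 = c(1, 3, 5, 7), A2.V4 = c(1, 2, 3, 4),
  A3.V1 = c(1, 2, 3, 4), A3.V2 = c(2, 4, 6, 8), A3.V3 = c(1, 3, 5, 7), A3.V4 = c(1, 2, 3, 4)
)

# All columns except Date hold values (assumption: names look like "A1.V1")
value_cols <- setdiff(names(DF), "Date")

# Take the part after the "." as the grouping suffix for each column
suffix <- sapply(strsplit(value_cols, ".", fixed = TRUE), "[", 2)

# For each distinct suffix, average the columns that carry exactly that suffix
res <- sapply(unique(suffix), function(s) {
  rowMeans(DF[, value_cols[suffix == s], drop = FALSE])
})
rownames(res) <- DF$Date
res
```

Because the comparison suffix == s is an exact match, a column such as V10 would never be grouped with V1, which sidesteps the partial-match issue without any regular expression.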

...