Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Ragged rowSums in R

I am trying to compute a row sum over the actuals columns. For some observations, however, the sum should include every quarter up to the one named in the UpTo column. Here is the data frame:

dat <- structure(list(Company = c("ABC", "DEF", "XYZ"), UpTo = c(NA, 
"Q2", "Q3"), Actual.Q1 = c(100L, 80L, 100L), Actual.Q2 = c(50L, 
75L, 50L), Forecast.Q3 = c(80L, 50L, 80L), Forecast.Q4 = c(90L, 
80L, 100L)), .Names = c("Company", "UpTo", "Actual.Q1", "Actual.Q2", 
"Forecast.Q3", "Forecast.Q4"), class = "data.frame", row.names = c("1", 
"2", "3"))

  Company UpTo Actual.Q1 Actual.Q2 Forecast.Q3 Forecast.Q4
1     ABC   NA       100        50          80          90
2     DEF   Q2        80        75          50          80
3     XYZ   Q3       100        50          80         100

  • For company ABC, since there is no UpTo date, it will just be Actual.Q1 + Actual.Q2, which is 150.
  • For company DEF, since the UpTo date is Q2, it will be Actual.Q1 + Actual.Q2, which is 155.
  • For company XYZ, since the UpTo date is Q3, it will be Actual.Q1 + Actual.Q2 + Forecast.Q3, which is 230.

The resulting data frame would look like this:

  Company UpTo Actual.Q1 Actual.Q2 Forecast.Q3 Forecast.Q4 SumRecent
1     ABC   NA       100        50          80          90       150
2     DEF   Q2        80        75          50          80       155
3     XYZ   Q3       100        50          80         100       230

I have tried the rowSums function, but it does not take the UpTo variable into account. Any help is appreciated. Thanks!

1 Reply

Here is a possibility:

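# The data frame in the question is named dat, so that name is used here.
dat$SumRecent <- sapply(1:nrow(dat), function(x) {
  last <- grep(dat[x, 2], colnames(dat))[1]   # first column whose name contains UpTo
  sum(dat[x, 3:ifelse(is.na(last), 4, last)]) # no match (NA): stop at column 4
})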


#   Company UpTo Actual.Q1 Actual.Q2 Forecast.Q3 Forecast.Q4 SumRecent
# 1     ABC <NA>       100        50          80          90       150
# 2     DEF   Q2        80        75          50          80       155
# 3     XYZ   Q3       100        50          80         100       230

For each row, grep searches the column names for the value in the UpTo column. If a match is found, we sum from column 3 through the matching column. If there is no match (for example when UpTo is NA), we sum only columns 3 and 4, the two actuals.
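
The same idea can also be written without a row-by-row loop. The following is only a sketch under some assumptions that are not stated in the question: the quarterly values sit in columns 3 to 6 in chronological order, the quarter label is whatever follows the last dot in the column name, and rows without a usable UpTo value fall back to Q2. The names value_cols, quarters, last_q and keep are made up for this illustration.

```r
# Rebuild the example data from the question.
dat <- data.frame(
  Company     = c("ABC", "DEF", "XYZ"),
  UpTo        = c(NA, "Q2", "Q3"),
  Actual.Q1   = c(100L, 80L, 100L),
  Actual.Q2   = c(50L, 75L, 50L),
  Forecast.Q3 = c(80L, 50L, 80L),
  Forecast.Q4 = c(90L, 80L, 100L),
  stringsAsFactors = FALSE
)

value_cols <- 3:6                                        # assumed: quarterly columns, in order
quarters   <- sub(".*\\.", "", names(dat)[value_cols])   # "Q1" "Q2" "Q3" "Q4"

# Position of each row's UpTo quarter; NA or unmatched rows fall back to Q2 (assumption).
last_q <- match(dat$UpTo, quarters)
last_q[is.na(last_q)] <- match("Q2", quarters)

# Logical mask: TRUE where a column's position is at or before the row's cutoff.
keep <- outer(last_q, seq_along(value_cols), ">=")

# Zero out the columns past the cutoff, then sum each row.
dat$SumRecent <- rowSums(as.matrix(dat[value_cols]) * keep)
dat$SumRecent
# [1] 150 155 230
```

Unlike grep, match compares whole strings, so an UpTo value has to equal the quarter label exactly; whether that is preferable depends on how the real column names look.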

