Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
691 views
in Technique[技术] by (71.8m points)

r - How to format a pivot like table that includes records for all time and id values?

I'm collecting logs in R with 3 columns:
week, probe, and number of observations.
There aren't record when there is no observation.

week=c(1,2,2,4)  
probe=c("A","C","B","C")  
obs=c(2,4,3,1)
logs=data.frame(week,probe,obs)

logs

week probe obs  
1     A   2
2     C   4
2     B   3
4     C   1

I want to reformat the data so that it includes all weeks and all probes even if there was no observation, so that it looks like this:

week probe obs  
1     A   2  
1     B   0  
1     C   0  
1     D   0  
2     A   0  
2     B   0  
2     C   3  
2     D   4  
3     A   0  
3     B   0  
3     C   0  
3     D   0  
4     A   0  
4     B   0  
4     C   1  
4     D   0  

I have the list of all probes here:

allprobes=c("A","B","C","D")

and I want to look at these weeks:

allweeks=c(1:4)

I've been looking at melt, cast, reshape, but I only manage to get 1 line per id or month... as I actually want to keep the original format of the logs. It seems easy enough at first but I'm now stuck... Any advice on how to get the data formatted this way?

Thanks a lot for any help.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Two options in base R:

Use expand.grid and merge:

> fullFrame <- expand.grid(allweeks, allprobes)
> names(fullFrame) <- c("week", "probe")
> merge(fullFrame, logs, all = TRUE)
   week probe obs
1     1     A   2
2     1     B  NA
3     1     C  NA
4     1     D  NA
5     2     A  NA
6     2     B   3
7     2     C   4
8     2     D  NA
9     3     A  NA
10    3     B  NA
11    3     C  NA
12    3     D  NA
13    4     A  NA
14    4     B  NA
15    4     C   1
16    4     D  NA

expand.grid will create a data.frame of all the possible combinations of your "allprobes" and "allweeks" objects. Then, we rename the columns of that new data.frame to match the relevant columns from your "logs" data.frame when using merge. The argument all = TRUE tells merge to fill in the missing values with NA.

If you want zeroes instead of NA, proceed as follows:

fullFrame <- expand.grid(allweeks, allprobes)
names(fullFrame) <- c("week", "probe")
finalLogs <- merge(fullFrame, logs, all = TRUE)
finalLogs[is.na(finalLogs)] <- 0

Use xtabs after converting "week" and "probe" to factors

If you convert "week" and "probe" to factors that includes all the relevant levels, then you can simply use xtabs wrapped in data.frame:

logs$week <- factor(logs$week, levels=c(1, 2, 3, 4))
logs$probe <- factor(logs$probe, levels=c("A", "B", "C", "D"))
data.frame(xtabs(obs ~ week + probe, logs))
#    week probe Freq
# 1     1     A    2
# 2     2     A    0
# 3     3     A    0
# 4     4     A    0
# 5     1     B    0
# 6     2     B    3
# 7     3     B    0
# 8     4     B    0
# 9     1     C    0
# 10    2     C    4
# 11    3     C    0
# 12    4     C    1
# 13    1     D    0
# 14    2     D    0
# 15    3     D    0
# 16    4     D    0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...