r - Fill missing combinations in a dataframe

Question

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

My example dataset:

df <- data.frame(
 REGION = c("REGION A", "REGION A", "REGION B"), 
 CATEGORY = c("A", "B", "B"), 
 VALUE1 = c(2,3,4),
 VALUE2 = c(1,2,3)
)

Result:

    REGION CATEGORY VALUE1 VALUE2
1 REGION A        A      2      1
2 REGION A        B      3      2
3 REGION B        B      4      3

Now I want every combination of REGION and CATEGORY that does not appear in the data set to be added with VALUE1 and VALUE2 set to 0. The resulting df should be:

    REGION CATEGORY VALUE1 VALUE2
1 REGION A        A      2      1
2 REGION A        B      3      2
3 REGION B        A      0      0
4 REGION B        B      4      3

I already wrote a large function for this that builds a dynamic string with for-loops, but I have the feeling there is a much simpler way to do it in only a few lines of code. I guess I am overcomplicating it. Any ideas? Thank you in advance.

1 Reply

深蓝 · Answer 1 · 2021-10-17T00:11:41+0000

Using complete from tidyr:

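library(tidyr)
# complete() adds a row for every REGION/CATEGORY combination missing from df;
# fill gives the value to use instead of NA in VALUE1 and VALUE2.
as.data.frame(complete(df, REGION, CATEGORY, fill = list(VALUE1 = 0, VALUE2 = 0)))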

Output:

    REGION CATEGORY VALUE1 VALUE2
1 REGION A        A      2      1
2 REGION A        B      3      2
3 REGION B        A      0      0
4 REGION B        B      4      3
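For readers without tidyr, roughly the same result can be sketched in base R. This is not part of the original answer, only an illustration of the idea behind complete(): build every REGION/CATEGORY pair, left-join the data onto it, and zero out the gaps. The names all_pairs, filled and value_cols are made up for this sketch.

```r
# Same example data as in the question.
df <- data.frame(
  REGION = c("REGION A", "REGION A", "REGION B"),
  CATEGORY = c("A", "B", "B"),
  VALUE1 = c(2, 3, 4),
  VALUE2 = c(1, 2, 3)
)

# Every REGION x CATEGORY pair built from the values present in df.
all_pairs <- expand.grid(
  REGION = unique(df$REGION),
  CATEGORY = unique(df$CATEGORY),
  stringsAsFactors = FALSE
)

# Left join: pairs that are missing from df get NA in the value columns.
filled <- merge(all_pairs, df, by = c("REGION", "CATEGORY"), all.x = TRUE)

# Replace NA with 0 in the value columns only (column names taken from the example).
value_cols <- c("VALUE1", "VALUE2")
filled[value_cols][is.na(filled[value_cols])] <- 0

filled
```

merge() sorts by the join columns by default, so the rows should come out in the same order as the tidyr output above.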

If there are many value columns, you could also just call as.data.frame(complete(df, REGION, CATEGORY)) and replace the NAs afterwards.
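A minimal sketch of that two-step variant, assuming every numeric column should get 0 in place of NA. The names completed and num_cols are only for illustration. Note that this also overwrites any NA that was already in the original data, which may or may not be what you want.

```r
library(tidyr)

# Same example data as in the question.
df <- data.frame(
  REGION = c("REGION A", "REGION A", "REGION B"),
  CATEGORY = c("A", "B", "B"),
  VALUE1 = c(2, 3, 4),
  VALUE2 = c(1, 2, 3)
)

# Step 1: add the missing REGION x CATEGORY rows; their values come back as NA.
completed <- as.data.frame(complete(df, REGION, CATEGORY))

# Step 2: find the numeric columns and set every NA in them to 0,
# without having to list the columns by name.
num_cols <- sapply(completed, is.numeric)
completed[num_cols][is.na(completed[num_cols])] <- 0

completed
```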

Hope this helps!
