Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
438 views
in Technique[技术] by (71.8m points)

r - How to randomize (or permute) a dataframe rowwise and columnwise?

I have a dataframe (df1) like this.

     f1   f2   f3   f4   f5
d1   1    0    1    1    1  
d2   1    0    0    1    0
d3   0    0    0    1    1
d4   0    1    0    0    1

The d1...d4 column is the rowname, the f1...f5 row is the columnname.

To do sample(df1), I get a new dataframe with count of 1 same as df1. So, the count of 1 is conserved for the whole dataframe but not for each row or each column.

Is it possible to do the randomization row-wise or column-wise?

I want to randomize the df1 column-wise for each column, i.e. the number of 1 in each column remains the same. and each column need to be changed by at least once. For example, I may have a randomized df2 like this: (Noted that the count of 1 in each column remains the same but the count of 1 in each row is different.

     f1   f2   f3   f4   f5
d1   1    0    0    0    1  
d2   0    1    0    1    1
d3   1    0    0    1    1
d4   0    0    1    1    0

Likewise, I also want to randomize the df1 row-wise for each row, i.e. the no. of 1 in each row remains the same, and each row need to be changed (but the no of changed entries could be different). For example, a randomized df3 could be something like this:

     f1   f2   f3   f4   f5
d1   0    1    1    1    1  <- two entries are different
d2   0    0    1    0    1  <- four entries are different
d3   1    0    0    0    1  <- two entries are different
d4   0    0    1    0    1  <- two entries are different

PS. Many thanks for the help from Gavin Simpson, Joris Meys and Chase for the previous answers to my previous question on randomizing two columns.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Given the R data.frame:

> df1
  a b c
1 1 1 0
2 1 0 0
3 0 1 0
4 0 0 0

Shuffle row-wise:

> df2 <- df1[sample(nrow(df1)),]
> df2
  a b c
3 0 1 0
4 0 0 0
2 1 0 0
1 1 1 0

By default sample() randomly reorders the elements passed as the first argument. This means that the default size is the size of the passed array. Passing parameter replace=FALSE (the default) to sample(...) ensures that sampling is done without replacement which accomplishes a row wise shuffle.

Shuffle column-wise:

> df3 <- df1[,sample(ncol(df1))]
> df3
  c a b
1 0 1 1
2 0 1 0
3 0 0 1
4 0 0 0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...