Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
333 views
in Technique[技术] by (71.8m points)

r - Extracting a random sample of rows in a data.frame with a nested conditional

This question builds from the SO post found here and uses code that was modified from a post on the R-help mailing list which can be seen here

I am trying to extract a random sample of rows in a data frame but with a conditional. Using the R iris data which looks like:

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa 

To take a simple random sample, the code below works fine to take a sample of 2 rows.

iris[sample(nrow(iris), 2), ]

However I am unsure how to condition the Species field. For example how to take the random sample as indicated above but only when Species != “setosa”

There are three categories of iris$Species

> summary(iris$Species)
    setosa versicolor  virginica 
        50         50         50

I am unsure how to correctly nest conditionals. One of my earlier attempts is below with the obviously incorrect results included….

> iris[sample(nrow(iris)[iris$Species != "setosa"], 2), ]
     Sepal.Length Sepal.Width Petal.Length Petal.Width Species
NA             NA          NA           NA          NA    <NA>
NA.1           NA          NA           NA          NA    <NA>

Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I'd use which to get the vector of rows numbers from which you can sample given your condition....

iris[ sample( which( iris$Species != "setosa" ) , 2 ) , ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#59           6.6         2.9          4.6         1.3 versicolor
#133          6.4         2.8          5.6         2.2  virginica

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...