Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

dataframe - getting a sample of a data.frame in R

I have the following data frame in R:

id<-c(1,2,3,4,10,2,4,5,6,8,2,1,5,7,7)
date<-c(19970807,19970902,19971010,19970715,19991212,19961212,19980909,19990910,19980707,19991111,19970203,19990302,19970605,19990808,19990706)
spent<-c(1997,19,199,134,654,37,876,890,873,234,643,567,23,25,576)
df<-data.frame(id,date,spent)

I need to take a random sample of 3 customers (based on id) such that all observations for each sampled customer are extracted.


1 Reply


You can use %in% together with unique: sample 3 of the distinct ids, then keep every row whose id is among them.

df[df$id %in% sample(unique(df$id),3),]
##    id     date spent
## 4   4 19970715   134
## 7   4 19980909   876
## 8   5 19990910   890
## 10  8 19991111   234
## 13  5 19970605    23
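
Because sample() is random, the rows returned will differ from run to run. Here is a minimal sketch of a reproducible version, using the data from the question. The helper name sample_by_id and the seed value 42 are illustrative assumptions, not part of the original answer:

```r
# Data from the question
id    <- c(1,2,3,4,10,2,4,5,6,8,2,1,5,7,7)
date  <- c(19970807,19970902,19971010,19970715,19991212,19961212,19980909,
           19990910,19980707,19991111,19970203,19990302,19970605,19990808,19990706)
spent <- c(1997,19,199,134,654,37,876,890,873,234,643,567,23,25,576)
df <- data.frame(id, date, spent)

# Hypothetical helper: keep every row belonging to n randomly chosen ids
sample_by_id <- function(data, n) {
  ids <- unique(data$id)
  # sample.int() draws positions, which sidesteps sample()'s special case
  # when `ids` happens to be a single number
  chosen <- ids[sample.int(length(ids), n)]
  data[data$id %in% chosen, ]
}

set.seed(42)  # arbitrary seed, only so the same draw can be repeated
res <- sample_by_id(df, 3)
res
```

Calling set.seed() with the same value before each call should reproduce the same three customers. Without it, every call draws a fresh sample.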

Alternatively, use data.table to avoid the repeated df$ references:

library(data.table)
DT <- data.table(df)

DT[id %in% sample(unique(id),3)]
##    id     date spent
## 1:  1 19970807  1997
## 2:  4 19970715   134
## 3:  4 19980909   876
## 4:  1 19990302   567
## 5:  7 19990808    25
## 6:  7 19990706   576

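Within DT[...], names such as id are looked up among the data.table's columns first, so the expression is evaluated inside the data.table rather than against same-named variables in the calling environment.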
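
If adding a package is not an option, base R's subset() offers a similar way to refer to columns by bare name. This is a sketch of an alternative, not the approach from the answer above. The data frame is built without global vectors to show that id resolves to the column, and the seed value is arbitrary:

```r
# Same data as in the question, built directly so no global `id` vector exists
df <- data.frame(
  id    = c(1,2,3,4,10,2,4,5,6,8,2,1,5,7,7),
  date  = c(19970807,19970902,19971010,19970715,19991212,19961212,19980909,
            19990910,19980707,19991111,19970203,19990302,19970605,19990808,19990706),
  spent = c(1997,19,199,134,654,37,876,890,873,234,643,567,23,25,576)
)

set.seed(1)  # arbitrary seed, only for repeatability
# Inside subset(), `id` is evaluated as a column of df
res_subset <- subset(df, id %in% sample(unique(id), 3))
res_subset
```

subset() is intended mainly for interactive use. Inside functions, explicit indexing such as df[df$id %in% ..., ] is usually the safer choice.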

