Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Get a list of the data sets in a particular package

I would like to list, in the console, all the data sets in a particular R package. I know that calling data() lists the data sets in all loaded packages, but that is not what I want: I need the list for one specific package. The following attempt does not work.

data()
data('arules')
# Warning message:
# In data("arules") : data set ‘arules’ not found

I would also like to get the dimensions (dim) of every data set in that package.



1 Reply


There's some good info on this in the Details section of help(data). Here are the basics, using the plyr package as an example. For starters, let's see which elements the list returned by data() contains.

names(data())
#[1] "title"   "header"  "results" "footer" 
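As a quick way to look inside those elements, here is a small sketch. It uses the datasets package only because it ships with every R installation; that choice is mine, not part of the original example. The results element is a character matrix, and its column names show where the data set names live.

```r
## list the data sets of one package (datasets is used here as an assumption,
## since it is always installed)
d <- data(package = "datasets")

## the results element is a character matrix with these columns
colnames(d$results)
# [1] "Package" "LibPath" "Item"    "Title"

## peek at the first few data set names and their titles
head(d$results[, c("Item", "Title")], 3)
```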

Further investigation of those elements will reveal what's in them. Next, we can pass the package argument to data() and then subset the results element of the returned list to find the names of the data sets in the package.

d <- data(package = "plyr")
## names of data sets in the package
d$results[, "Item"]
# [1] "baseball" "ozone"   
## assign it to use later
nm <- d$results[, "Item"]
## call the promised data
data(list = nm, package = "plyr")
## get the dimensions of each data set
lapply(mget(nm), dim)
# $baseball
# [1] 21699    22
#
# $ozone
# [1] 24 24 72
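If you'd rather not load everything into your workspace, here is a rough sketch of the same steps wrapped in a function that loads into a temporary environment. The name dataset_dims is made up for illustration and is not part of any package. It also assumes that Item entries of the form "object (file)", such as "BJsales.lead (BJsales)" in the datasets package, have to be loaded through the name in parentheses.

```r
## hypothetical helper: dimensions of every data set in one package
dataset_dims <- function(pkg) {
  items <- data(package = pkg)$results[, "Item"]

  ## entries like "BJsales.lead (BJsales)": object name first, file name in parentheses
  objs  <- sub(" \\(.*\\)$", "", items)
  files <- ifelse(grepl("\\(", items),
                  sub("^.*\\((.*)\\)$", "\\1", items),
                  items)

  ## load into a scratch environment instead of the global workspace
  env <- new.env()
  data(list = unique(files), package = pkg, envir = env)

  ## keep only objects that were actually loaded, then take their dims
  objs <- intersect(objs, ls(env, all.names = TRUE))
  lapply(mget(objs, envir = env), dim)
}

dims <- dataset_dims("datasets")
dims$iris
# [1] 150   5
```

Objects without a dim attribute (plain vectors, univariate time series) come back as NULL, just as they would with the lapply(mget(nm), dim) call above.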

Edit/Update: If you wish to find the names of the data sets in all installed packages, you can use the following. .packages(TRUE) returns the names of all packages available in the library paths given by lib.loc. Since the data sets in the base and stats packages have been moved to the datasets package, we need to account for that by taking those two packages out with setdiff().

## names of all packages sans base and stats
pkgs <- setdiff(.packages(TRUE), c("base", "stats"))
## get the names of all the data sets
dsets <- data(package = pkgs)$results[, "Item"]
## look at the first few in our result
head(dsets)
# [1] "AirPassengers"          "BJsales"                "BJsales.lead (BJsales)"
# [4] "BOD"                    "CO2"                    "ChickWeight"   
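The plain vector of names loses track of which package each data set belongs to. As a sketch of one way around that, you can keep the Package column alongside Item. The name all_sets is just an illustrative choice, and the result depends on what is installed on your machine.

```r
## names of all installed packages sans base and stats
pkgs <- setdiff(.packages(all.available = TRUE), c("base", "stats"))

## full results matrix: one row per data set
res <- data(package = pkgs)$results

## keep package and data set name side by side
all_sets <- data.frame(
  Package = res[, "Package"],
  Item    = res[, "Item"],
  stringsAsFactors = FALSE
)

## e.g. the first few data sets that belong to the datasets package
head(all_sets[all_sets$Package == "datasets", ])
```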
