• 设为首页
  • 点击收藏
  • 手机版
    手机扫一扫访问
    迪恩网络手机版
  • 关注官方公众号
    微信扫一扫关注
    迪恩网络公众号

stedy/Machine-Learning-with-R-datasets: Formatted datasets for Machine Learning ...

原作者: [db:作者] 来自: 网络 收藏 邀请

开源软件名称(OpenSource Name):

stedy/Machine-Learning-with-R-datasets

开源软件地址(OpenSource Url):

https://github.com/stedy/Machine-Learning-with-R-datasets

开源编程语言(OpenSource Language):


开源软件介绍(OpenSource Introduction):

Data for Machine Learning with R

Machine Learning with R by Brett Lantz is a book that provides an introduction to machine learning using R. As far as I can tell, Packt Publishing does not make its datasets available online unless you buy the book and create a user account which can be a problem if you are checking the book out from the library or borrowing the book from a friend. All of these datasets are in the public domain but simply needed some cleaning up and recoding to match the format in the book.

How to download the data

  1. In your Mac or Linux envirounment, open a terminal and change to the directory where you want your data to be downloaded.
  2. Go to the github page you want to download it's data (for example the challenger data in chapter 6: https://github.com/stedy/Machine-Learning-with-R-datasets/blob/master/challenger.csv)
  3. On the right side, you will find a button called "raw". Click on it.
  4. Copy the url you will get for the new page (in our example I got https://raw.githubusercontent.com/stedy/Machine-Learning-with-R-datasets/master/challenger.csv)
  5. put the following command in the terminal screen wget name_of_url

so in our example it should be like this wget https://raw.githubusercontent.com/stedy/Machine-Learning-with-R-datasets/master/challenger.csv

Chapter 1

No datasets used

Chapter 2

usedcars.csv could not be found online

Chapter 3

wisc_bc_data.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/

Chapter 4

sms_spam.csv from http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/

Chapter 5

credit.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/

mushrooms.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/mushroom/

Chapter 6

challenger.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/space-shuttle/

insurance.csv could not be found online

whitewines.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/

Chapter 7

concrete.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/concrete/compressive/

letterdata.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/

Chapter 8

groceries.csv is from arules package but probably just easier to call library(arules); data(Groceries)

Chapter 9

snsdata.csv could not be found online

Chapter 10

sms_results.csv is likely from the sms_test_pred object in Chapter 4 but difficult to be sure.

credit.csv is likely the same file from Chapter 5.

Chapter 11

credit.csv from Chapter 5 is reused.

Chapter 12

No datasets used




鲜花

握手

雷人

路过

鸡蛋
该文章已有0人参与评论

请发表评论

全部评论

专题导读
热门推荐
阅读排行榜

扫描微信二维码

查看手机版网站

随时了解更新最新资讯

139-2527-9053

在线客服(服务时间 9:00~18:00)

在线QQ客服
地址:深圳市南山区西丽大学城创智工业园
电邮:jeky_zhao#qq.com
移动电话:139-2527-9053

Powered by 互联科技 X3.4© 2001-2213 极客世界.|Sitemap