Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to import csv data file into scikit-learn?

From my understanding, scikit-learn accepts data as a 2D array of shape (n_samples, n_features). Assuming I have data in the form ...

Stock prices    indicator1    indicator2
2.0             123           1252
1.0             ..            ..
..              .             . 
.

How do I import this?

Question from: https://stackoverflow.com/questions/11023411/how-to-import-csv-data-file-into-scikit-learn


1 Reply

This is not a CSV file; it is a whitespace-separated file. Assuming there are no missing values, you can easily load it into a NumPy array called data with

import numpy as np

# The with block closes the file once loading is done
with open("filename.txt") as f:
    f.readline()  # skip the header
    data = np.loadtxt(f)  # splits on whitespace by default

If the stock price is what you want to predict (your y value, in scikit-learn terms), then you should split data using

X = data[:, 1:]  # select columns 1 through end
y = data[:, 0]   # select column 0, the stock price
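
To check the shapes end to end without a file on disk, here is a minimal sketch that feeds the same steps an in-memory sample. Only the first data row comes from the question; the other rows are made-up placeholder values, and the name sample is hypothetical.

```python
import io

import numpy as np

# In-memory stand-in for filename.txt. Only the first data row is from the
# question; the remaining rows are made-up illustrative values.
sample = io.StringIO(
    "Stock prices    indicator1    indicator2\n"
    "2.0             123           1252\n"
    "1.0             118           1300\n"
    "1.5             121           1275\n"
)

sample.readline()          # skip the header
data = np.loadtxt(sample)  # whitespace-delimited by default

X = data[:, 1:]  # indicators, shape (n_samples, n_features)
y = data[:, 0]   # stock price, shape (n_samples,)

print(data.shape, X.shape, y.shape)
```

X is the 2D (n_samples, n_features) array and y is the 1D target array, which is the layout the question describes scikit-learn as expecting.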

Alternatively, you might be able to massage the standard Python csv module into handling this type of file.
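
One way this might look, as a rough sketch rather than a definitive recipe: set the delimiter to a single space and pass skipinitialspace=True so that runs of padding spaces collapse. The helper name read_space_separated and the sample rows after the first are made up for illustration.

```python
import csv
import io


def read_space_separated(lines):
    """Parse space-aligned rows into lists of floats, skipping the header."""
    reader = csv.reader(lines, delimiter=" ", skipinitialspace=True)
    next(reader)  # skip the header
    rows = []
    for row in reader:
        # Trailing spaces produce an empty last field; drop empty fields.
        fields = [field for field in row if field]
        if fields:  # ignore blank lines
            rows.append([float(field) for field in fields])
    return rows


# Hypothetical sample; only the first data row is from the question.
sample = io.StringIO(
    "Stock prices    indicator1    indicator2\n"
    "2.0             123           1252\n"
    "1.0             118           1300 \n"
)
rows = read_space_separated(sample)
X = [row[1:] for row in rows]  # indicators
y = [row[0] for row in rows]   # stock price
```

This yields plain Python lists, so np.loadtxt remains the shorter route when NumPy is already in use.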

