I am trying to evaluate a multiple linear regression model. I have a data set like this:
This data set has 157 rows and 54 columns.
I need to predict the ground_truth value from the article columns. As predictors, I want to use the 7 article columns from en_Amantadine through en_Common.
I have this code for multiple linear regression:
from sklearn.linear_model import LinearRegression
X = [[6, 2], [8, 1], [10, 0], [14, 2], [18, 0]]  # need to modify for my problem
y = [[7], [9], [13], [17.5], [18]]  # need to modify
model = LinearRegression()
model.fit(X, y)
My problem is that I cannot extract the data for the X and y variables from my DataFrame. In my code, X and y should look like this:
X = [[4984, 94, 2837, 857, 356, 1678, 29901],
[4428, 101, 4245, 906, 477, 2313, 34176],
....
]
y = [[3.135999], [2.53356] ....]
I cannot convert the DataFrame to this type of structure.
How can I do this?
Any help is appreciated.
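One possible way to do this, as a minimal sketch: select the predictor columns with a label slice and the target with a one-column selection, then convert each to nested lists. This assumes the 7 article columns sit next to each other in the DataFrame, with en_Amantadine first and en_Common last. The DataFrame below is only a two-row stand-in built from the values shown above; the `week` column and the `en_article_2` through `en_article_6` names are hypothetical placeholders, not the real column names.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Stand-in for the real 157 x 54 DataFrame. Only en_Amantadine, en_Common and
# ground_truth are names from the question; "week" and the en_article_* names
# are hypothetical placeholders for columns whose names are not shown.
df = pd.DataFrame({
    "week": [1, 2],
    "en_Amantadine": [4984, 4428],
    "en_article_2": [94, 101],
    "en_article_3": [2837, 4245],
    "en_article_4": [857, 906],
    "en_article_5": [356, 477],
    "en_article_6": [1678, 2313],
    "en_Common": [29901, 34176],
    "ground_truth": [3.135999, 2.53356],
})

# Label-based slicing with .loc includes both end columns, so this picks
# en_Amantadine, en_Common and every column between them.
X = df.loc[:, "en_Amantadine":"en_Common"].values.tolist()

# Double brackets keep the result two-dimensional, giving one inner list per row.
y = df[["ground_truth"]].values.tolist()

model = LinearRegression()
model.fit(X, y)
```

If the 7 columns are not contiguous, listing them explicitly, as in `df[["en_Amantadine", ..., "en_Common"]]`, should work the same way. The `.values.tolist()` step is only there to match the nested-list structure asked for; scikit-learn also accepts the DataFrame selections directly, so `model.fit(df.loc[:, "en_Amantadine":"en_Common"], df[["ground_truth"]])` is an alternative.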